{"id":784,"date":"2017-11-02T06:40:11","date_gmt":"2017-11-02T14:40:11","guid":{"rendered":"https:\/\/www.pnfsoftware.com\/blog\/?p=784"},"modified":"2017-11-02T06:40:11","modified_gmt":"2017-11-02T14:40:11","slug":"having-fun-with-obfuscated-mach-o-files","status":"publish","type":"post","link":"https:\/\/www.pnfsoftware.com\/blog\/having-fun-with-obfuscated-mach-o-files\/","title":{"rendered":"Having Fun with Obfuscated Mach-O Files"},"content":{"rendered":"<p>Last week was the release of <a href=\"https:\/\/www.pnfsoftware.com\/jeb\/changelist\">JEB 2.3.7<\/a> with a brand new parser for Mach-O, the executable file format of Apple\u2019s macOS and iOS operating systems. This file format, like its cousins PE and ELF, contains a lot of technical peculiarities and implementing a reliable parser is not a trivial task.<\/p>\n<p>During the journey leading to this first Mach-O release, we encountered some interesting executables. This short blog post is about one of them, which uses some Mach-O features to make reverse-engineering harder.<\/p>\n<h1 style=\"text-align: justify;\">Recon<\/h1>\n<p>The executable in question belongs to a well-known adware family dubbed InstallCore, which is <a href=\"https:\/\/blog.malwarebytes.com\/detections\/pup-optional-installcore\/\">usually bundled with other applications to display ads to users<\/a>.<\/p>\n<p>The sample we will be using in this post is the following:<\/p>\n<pre>57e4ce2f2f1262f442effc118993058f541cf3fd: Mach-O 64-bit x86_64 executable<\/pre>\n<p>Let\u2019s first take a look at the Mach-O sections:<\/p>\n<figure id=\"attachment_785\" aria-describedby=\"caption-attachment-785\" style=\"width: 536px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img0.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-785 size-full\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img0.png\" alt=\"\" width=\"536\" height=\"617\" 
srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img0.png 536w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img0-261x300.png 261w\" sizes=\"auto, (max-width: 536px) 100vw, 536px\" \/><\/a><figcaption id=\"caption-attachment-785\" class=\"wp-caption-text\">Figure 1 &#8211; Mach-O sections<\/figcaption><\/figure>\n<p>Interestingly, there are some sections related to the Objective-C language (&#8220;<em>__objc_&#8230;<\/em>&#8221;). Roughly summarized, Objective-C was the main programming language for OS X and iOS applications prior to the introduction of Swift. It adds object-oriented features on top of C, and it can be difficult to analyze at first, in particular because of <a href=\"https:\/\/blog.zynamics.com\/2010\/04\/27\/objective-c-reversing-i\/\">its way of calling methods<\/a> by &#8220;sending messages&#8221;.<\/p>\n<p>Nevertheless, the good news is that Objective-C binaries usually come with a lot of metadata describing methods and classes, which the Objective-C runtime uses to implement message passing. 
This metadata is stored in the &#8220;<em>__objc_&#8230;<\/em>&#8221;\u00a0sections previously mentioned, and the JEB Mach-O parser processes it to find and properly name Objective-C methods.<\/p>\n<p>After the initial analysis, JEB leaves us at the entry point of the program (the yellow line below):<\/p>\n<figure id=\"attachment_797\" aria-describedby=\"caption-attachment-797\" style=\"width: 843px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img1-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-797 size-full\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img1-1.png\" alt=\"\" width=\"843\" height=\"338\" srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img1-1.png 843w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img1-1-300x120.png 300w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img1-1-768x308.png 768w\" sizes=\"auto, (max-width: 843px) 100vw, 843px\" \/><\/a><figcaption id=\"caption-attachment-797\" class=\"wp-caption-text\">Figure 2 &#8211; Entry point<\/figcaption><\/figure>\n<p>Wait a minute\u2026 there is no routine here, and this is not even valid x86-64 machine code!<\/p>\n<p>Most of the detected routines do not look good either. First, there are a few Objective-C methods with random-looking names, like this one:<\/p>\n<figure id=\"attachment_795\" aria-describedby=\"caption-attachment-795\" style=\"width: 362px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img2-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-795\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img2-1.png\" alt=\"\" width=\"362\" height=\"271\" srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img2-1.png 441w, 
https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img2-1-300x224.png 300w\" sizes=\"auto, (max-width: 362px) 100vw, 362px\" \/><\/a><figcaption id=\"caption-attachment-795\" class=\"wp-caption-text\">Figure 3 &#8211; Objective-C method<\/figcaption><\/figure>\n<p>Again, the code makes very little sense\u2026<\/p>\n<p>Then come around 50 native routines, whose code clearly cannot be executed &#8220;as is&#8221; either, for example:<\/p>\n<figure id=\"attachment_791\" aria-describedby=\"caption-attachment-791\" style=\"width: 899px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-791 size-full\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img3.png\" alt=\"\" width=\"899\" height=\"342\" srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img3.png 899w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img3-300x114.png 300w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img3-768x292.png 768w\" sizes=\"auto, (max-width: 899px) 100vw, 899px\" \/><\/a><figcaption id=\"caption-attachment-791\" class=\"wp-caption-text\">Figure 4 &#8211; Native routine<\/figcaption><\/figure>\n<p>Moreover, there are no cross-references to any of these routines! Why would JEB\u2019s disassembler engine, which follows a recursive algorithm combined with heuristics, even think there are routines here?!<\/p>\n<h1>Time for a Deep Dive<\/h1>\n<h3>Code Versus Data<\/h3>\n<p>First, let\u2019s deal with the numerous unreferenced routines that contain no valid machine code. After some digging, we found that they are declared in the <em>LC_FUNCTION_STARTS<\/em> Mach-O command (&#8220;command&#8221;, or load command, being the Mach-O term for an entry in the file header).<\/p>\n<p>This command provides a table of function entry points in the executable. 
It allows debuggers, for example, to determine function boundaries without symbols. At first, this may seem like a blessing for program analysis tools, because distinguishing code from data in a stripped executable is usually a hard problem, to say the least. Hence JEB, like other analysis tools, uses this command to enrich its analysis.<\/p>\n<p>But this gift from Mach-O comes with a drawback: nothing prevents miscreants from declaring function entry points where there are none, and analysis tools will end up analyzing random data as code.<\/p>\n<p>In this binary, <strong>all<\/strong> the routines declared in the <em>LC_FUNCTION_STARTS<\/em> command are actually not executable. Knowing that, we can simply remove the command from the Mach-O header (i.e. nullify the entry) and ask JEB to re-analyze the file, which makes the disassembly easier to read. We end up with a much shorter routine list:<\/p>\n<figure id=\"attachment_792\" aria-describedby=\"caption-attachment-792\" style=\"width: 497px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-792 size-full\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img4.png\" alt=\"\" width=\"497\" height=\"275\" srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img4.png 497w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img4-300x166.png 300w\" sizes=\"auto, (max-width: 497px) 100vw, 497px\" \/><\/a><figcaption id=\"caption-attachment-792\" class=\"wp-caption-text\">Figure 5 &#8211; Routine list<\/figcaption><\/figure>\n<p>The remaining routines are mostly Objective-C methods declared in the metadata. Once again, nothing prevents developers from forging this metadata to declare method entry points in data. 
For now, let\u2019s keep those methods here and focus on a more pressing question\u2026<\/p>\n<h3>Where Is the Entry Point?<\/h3>\n<p>The entry point value used by JEB comes from the <em>LC_UNIXTHREAD<\/em> command contained in the Mach-O header, which specifies a CPU state to load at startup. How could this program even be executable if the declared entry point is not valid machine code (see Figure 2)?<\/p>\n<p>Surely, there has to be another entry point, which is executed first. There is one indeed, and it has to do with the way the Objective-C runtime initializes classes. An Objective-C class can implement a method named &#8220;<em>+load<\/em>&#8221; (the + means this is a class method rather than an instance method), <a href=\"https:\/\/www.mikeash.com\/pyblog\/friday-qa-2009-05-22-objective-c-class-loading-and-initialization.html\">which is called during the executable\u2019s initialization<\/a>, that is, before the program\u2019s <em>main()<\/em>\u00a0function is executed.<\/p>\n<p>If we look back at Figure 5, we see that among the random-looking method names there is one class with this famous <em>+load<\/em> method, and here is the beginning of its code:<\/p>\n<figure id=\"attachment_793\" aria-describedby=\"caption-attachment-793\" style=\"width: 562px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-793 size-full\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img5.png\" alt=\"\" width=\"562\" height=\"447\" srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img5.png 562w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2017\/11\/img5-300x239.png 300w\" sizes=\"auto, (max-width: 562px) 100vw, 562px\" \/><\/a><figcaption id=\"caption-attachment-793\" class=\"wp-caption-text\">Figure 6 &#8211; +load method<\/figcaption><\/figure>\n<p>Finally, 
some decent-looking machine code! We just found the real entry point of the binary, and now the adventure can really begin\u2026<\/p>\n<p>That\u2019s it for today; stay tuned for more technical sweetness on the JEB blog!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week was the release of JEB 2.3.7 with a brand new parser for Mach-O, the executable file format of Apple\u2019s macOS and iOS operating systems. This file format, like its cousins PE and ELF, contains a lot of technical peculiarities and implementing a reliable parser is not a trivial task. During the journey leading &hellip; <a href=\"https:\/\/www.pnfsoftware.com\/blog\/having-fun-with-obfuscated-mach-o-files\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Having Fun with Obfuscated Mach-O Files<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8,2,13,5],"tags":[],"class_list":["post-784","post","type-post","status-publish","format-standard","hentry","category-jeb2","category-malware","category-native-code","category-obfuscation"],"_links":{"self":[{"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/posts\/784","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/comments?post=784"}],"version-history":[{"count":0,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/posts\/784\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/media?parent=784"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/
\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/categories?post=784"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/tags?post=784"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}