{"id":975,"date":"2019-01-03T11:49:47","date_gmt":"2019-01-03T19:49:47","guid":{"rendered":"https:\/\/www.pnfsoftware.com\/blog\/?p=975"},"modified":"2019-04-25T09:41:27","modified_gmt":"2019-04-25T17:41:27","slug":"jeb-native-pipeline-intermediate-representation","status":"publish","type":"post","link":"https:\/\/www.pnfsoftware.com\/blog\/jeb-native-pipeline-intermediate-representation\/","title":{"rendered":"JEB Native Analysis Pipeline &#8211; Part 1: Intermediate Representation"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">JEB native code analysis components make use of a custom intermediate representation (IR) to perform code analysis.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Some background: after analysis of a code object, the native assembly of a reconstructed routine is converted to an intermediate representation. <sup class='footnote'><a href='#fn-975-1' id='fnref-975-1' onclick='return fdfootnote_show(975)'>1<\/a><\/sup> That IR subsequently goes through a series of transformation passes, including massages and optimizations. Final stages include the generation of high-level C-like code. Most stages in this pipeline can be customized by users via the use of plugins. 
A high-level, simplified view of the pipeline could be as follows:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">CodeObject (*)<br>  -&gt; Reconstructed Routines &amp; Data<br>    -&gt; Conversion to IR (low-level, non-optimized)<br>      -&gt; IR Optimizations<br>        -&gt; Final IR (higher-level, optimized, typed) <br>          -&gt; Generation of AST<br>            -&gt; AST Optimizations<br>              -&gt; Final AST (final, cleaned) <br>                -&gt; High-level output (eg, C variant)<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\" style=\"font-size:14px\"><em>(*) Examples of code objects:&nbsp;a Windows PE&nbsp;file with&nbsp;x86&nbsp;code,&nbsp;an ELF library with&nbsp;MIPS&nbsp;code,&nbsp;a headless ARM&nbsp;firmware,&nbsp;a&nbsp;Wasm&nbsp;binary&nbsp;file,&nbsp;an&nbsp;Ethereum&nbsp;smart contract,&nbsp;etc.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Two important JEB API components to hook into and customize the native analysis pipeline are:<br>&#8211; The IR classes<br>&#8211; The AST classes<br>The rest of this part 1 focuses on the IR components.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">IR Description<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">JEB IR can be seen as a low-level, imperative assembly language, made of <em>expressions<\/em>. The highest-level expressions are <em>statements<\/em>. Statements contain expressions and, generally, expressions can themselves contain expressions. IR can be accessed via interfaces in the JEB API. The top-level interface for all IR expressions is <a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEGeneric.html\">IEGeneric<\/a>. The names of all IR elements follow the pattern <em>IExxx<\/em>. 
<sup class='footnote'><a href='#fn-975-2' id='fnref-975-2' onclick='return fdfootnote_show(975)'>2<\/a><\/sup><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The diagram below shows the current hierarchy of IR expression interfaces:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"896\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image.png\" alt=\"\" class=\"wp-image-976\" srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image.png 750w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-251x300.png 251w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Note that IEGeneric sits at the top. All other IRE&#8217;s (short for <em>IR Expressions<\/em> from now on) derive from it. Let&#8217;s go through those interfaces:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEImm.html\">IEImm<\/a><\/strong>: Integer immediate of arbitrary length. Eg,  <br>Imm(0x1122, 64) would represent the 64-bit integer value 0x1122.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEVar.html\">IEVar<\/a><\/strong>: Generic IRE to represent variables. Variables can represent underlying physical registers, virtual registers, local function variables, global program variables, etc.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEMem.html\">IEMem<\/a><\/strong>:  Piece of memory of arbitrary length. 
The memory address itself is an IRE; the accessed bitsize is not.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IECond.html\">IECond<\/a><\/strong>: A ternary expression &#8220;c ? a : b&#8221;, where a, b and c are IRE&#8217;s.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IERange.html\">IERange<\/a><\/strong>: A fixed integer range, commonly used with Slice.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IESlice.html\">IESlice<\/a><\/strong>: A chunk (contents range) of an existing IR. Eg, Slice(Imm(0x11223344, 32), 16, 24) can be simplified to Imm(0x22, 8).<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IECompose.html\">IECompose<\/a><\/strong>: The concatenation of two or more IRE&#8217;s (IR0, IR1, &#8230;), resulting in an IR of size SUM(i=0-&gt;n, bitsize(IRi)).<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEOperation.html\">IEOperation<\/a><\/strong>: A generic operation expression, with IRE operands and an <a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/OperationType.html\">operator<\/a>. Eg, Operation(ADD,Imm(0x10,8),Mem(Imm(0x10000,32),8)). 
Most standard operators are supported, as well as less standard operators such as the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Parity_function\">Parity function<\/a> or <a href=\"https:\/\/en.wikipedia.org\/wiki\/Carry_(arithmetic)\">Carry function<\/a>.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEStatement.html\">IEStatement<\/a><\/strong>: The super-interface for IR statements, which are detailed below.<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">An IR <em>translation unit<\/em>, resulting from the conversion of a native routine, consists of a sequential list of IEStatement objects. An IR statement has a size (generally, but not necessarily, 1) and an address (generally, a 0-based offset relative to its position in the translation unit).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As of JEB 3.0.8, IR statements can be:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEAssign.html\">IEAssign<\/a><\/strong>: The most common of all statements: an assignment from a right-side source to a left-side destination. 
While the source can be virtually anything, the destination IRE is restricted to a subset of expressions.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IENop.html\">IENop<\/a><\/strong>: This statement does nothing, but it still occupies virtual size in the translation unit.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEJump.html\">IEJump<\/a><\/strong>: An unconditional or conditional jump within the translation unit, expressed using <em>IR offsets<\/em>.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEJumpFar.html\">IEJumpFar<\/a><\/strong>: An unconditional or conditional far jump (can be outside the translation unit), expressed using <em>native<\/em> addresses.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IESwitch.html\">IESwitch<\/a><\/strong>: The N-branch equivalent of IEJump.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IECall.html\">IECall<\/a><\/strong>: Represents a well-formed static or dynamic dispatch to another IR translation unit. The dispatch expression can be any IRE (eg, an Imm for a static dispatch; a Var or Mem for a dynamic dispatch).<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEReturn.html\">IEReturn<\/a><\/strong>: A high-level expression used to denote a return-to-caller from a translation unit representing a routine. 
This IRE is always introduced by later optimization passes.<\/li><li><strong><a href=\"https:\/\/www.pnfsoftware.com\/jeb\/apidoc\/reference\/com\/pnfsoftware\/jeb\/core\/units\/code\/asm\/decompiler\/ir\/IEUntranslatedInstruction.html\">IEUntranslatedInstruction<\/a><\/strong>: This powerful statement can be used to express anything. It is generally used to represent native instructions that cannot be readily translated using other IR expressions. (Users may see it as an IECall on steroids, using native addresses. In that sense, it is to IECall what IEJumpFar is to IEJump.)<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Now, let&#8217;s look at a few examples of conversions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">IR Examples<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Let&#8217;s assume the following EVars were previously defined by an Intel x86 (or x86-64) converter: <em>tmp<\/em> (a 32-bit EVar representing a virtual placeholder register); <em>eax<\/em> (an EVar representing the physical register %eax); <em>?f<\/em> (1-bit EVars representing standard x86 flags).<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>x86: <strong>mov eax, 1<\/strong><\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>s32:_eax = s32:00000001h<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Translating this mov instruction is straightforward, and can be done with a single Assign IR statement.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>x86-64: <strong>not r9d<\/strong><\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>s64:_r9 = C(~(s64:_r9[0:32[), i32:00000000h)<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Translating a <em>not<\/em> applied to a 32-bit register on an x86-64 platform is slightly more complex, as the upper 32 bits of the register are zeroed out. 
Here, the converter is making use of three nested IREs: Compose(Operation(NOT, Slice(r9, 0, 32)), Imm(0, 32)).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><em>Reading IR.<\/em>&nbsp;<\/strong><em>IECompose&nbsp;are&nbsp;pretty-printed&nbsp;as C(lo, &#8230;, hi),&nbsp;IESlice&nbsp;as Expr[m:n[&nbsp;<\/em><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>x86-64: <strong>xor rax, qword ds:[ecx+1]<\/strong><\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>0000 : s64:_rax = (s64:_rax ^ 64&lt;s16:_ds>[(s64:_rcx[0:32[ + i32:00000001h)])\n0001 : s1:_zf = (s64:_rax ? i1:0 : i1:1)\n0002 : s1:_sf = s64:_rax[63:64[\n0003 : s1:_pf = PARITY(s64:_rax[0:8[)\n0004 : s1:_of = i1:0\n0005 : s1:_cf = i1:0<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">One side effect of arithmetic and logical operations on x86 is the modification of flags. A converter makes those side effects explicit. Consequently, translating the exclusive-or above results in several Assign IR statements that represent the register and flag updates. <sup class='footnote'><a href='#fn-975-3' id='fnref-975-3' onclick='return fdfootnote_show(975)'>3<\/a><\/sup><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><em>Reading IR.<\/em><\/strong><em> IEMem are pretty-printed as bitsize&lt;SegmentIR&gt;[AddressIR]<\/em><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>x86: <strong>add eax, 2 <\/strong><\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>0000 : s32:_tmp = s32:_eax\n0001 : s32:_eax = (s32:_eax + i32:00000002h)\n0002 : s1:_zf = (s32:_eax ? i1:0 : i1:1)\n0003 : s1:_sf = s32:_eax[31:32[\n0004 : s1:_pf = PARITY(s32:_eax[0:8[)\n0005 : s1:_af = ((s32:_tmp ^ i32:00000002h) ^ s32:_eax)[4:5[\n0006 : s1:_cf = (s32:_tmp CARRY i32:00000002h)\n0007 : s1:_of = ((s32:_tmp ^ s32:_eax) &amp; ~((s32:_tmp ^ i32:00000002h)))[31:32[\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The translation of add makes use of the temporary, virtual EVar <em>tmp<\/em>. It holds the original value of %eax, before the addition was done. 
That value is necessary for some flag update computations (eg, the overflow flag). Also take note of the use of the special operators Parity and Carry in the converted stub.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>x86-64: <strong>@100000h: jz $+1<\/strong><\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>s64:_rip = (s1:_zf ? i64:0000001000000003h : i64:0000001000000002h)<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Note that a native address is written to the RIP-IEVar (or any EVar representing the Program Counter &#8211; PC). PC-assignments like this one can later be optimized to IEJump, making use of IR Offsets instead of Native Addresses.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Also note that the Control Flow Graphs (CFG) of the native instructions in the examples thus far are isomorphic to the IR-CFG&#8217;s of their translated counterparts. That is not always the case, as seen in the example below.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>x86: <strong>repe cmpsb<\/strong><\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>0000 : if (s32:_ecx == i32:00000000h) goto 000B\n0001 : s1:_zf = ((8&lt;s16:_ds>[s32:_esi] - 8&lt;s16:_es>[s32:_edi]) ? i1:0 : i1:1)\n0002 : s1:_sf = (8&lt;s16:_ds>[s32:_esi] - 8&lt;s16:_es>[s32:_edi])[7:8[\n0003 : s1:_pf = PARITY((8&lt;s16:_ds>[s32:_esi] - 8&lt;s16:_es>[s32:_edi]))\n0004 : s1:_cf = (8&lt;s16:_ds>[s32:_esi] &lt;u 8&lt;s16:_es>[s32:_edi])\n0005 : s1:_of = ((8&lt;s16:_ds>[s32:_esi] ^ (8&lt;s16:_ds>[s32:_esi] - 8&lt;s16:_es>\n       [s32:_edi])) &amp; (8&lt;s16:_ds>[s32:_esi] ^ 8&lt;s16:_es>[s32:_edi]))[7:8[\n0006 : s1:_af = ((8&lt;s16:_ds>[s32:_esi] ^ 8&lt;s16:_es>[s32:_edi]) ^ (8&lt;s16:_ds>\n       [s32:_esi] - 8&lt;s16:_es>[s32:_edi]))[4:5[\n0007 : s32:_esi = (s32:_esi + (s1:_df ? i32:FFFFFFFFh : i32:00000001h))\n0008 : s32:_edi = (s32:_edi + (s1:_df ? 
i32:FFFFFFFFh : i32:00000001h))\n0009 : s32:_ecx = (s32:_ecx - i32:00000001h)\n000A : if s1:_zf goto 0000<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><em><strong>Reading IR.<\/strong> Conditional IEJump statements are pretty-printed as &#8220;if (cond) goto IROffset&#8221;. Unconditional IEJump statements are rendered as a simple &#8220;goto IROffset&#8221;.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This IR-CFG is not isomorphic to the native CFG. Additional edges (introduced by the two IEJump statements) are used to represent the compare &#8220;[esi+xxx] to [edi+xxx]&#8221; loop.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Accessing IR<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The JEB back-end API allows full access to several IR-CFG&#8217;s, from low-level, raw IR to partially optimized IR, to fully lifted IR just before the AST generation phases.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Navigating the IR in the GUI<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The UI client currently provides access to the most optimized IR of routines. Those IR-CFG&#8217;s can be examined in the aptly named fragment right next to the source fragment showing decompiled code. Here is an example of side-by-side assemblies (x86, IR). 
The next screenshot shows the decompiled source.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"404\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-1-1024x404.png\" alt=\"\" class=\"wp-image-979\" srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-1-1024x404.png 1024w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-1-300x118.png 300w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-1-768x303.png 768w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-1.png 1572w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption>Left-side: x86 routine \/ Right-side: optimized IR of the converted routine<br>(Click to enlarge)<\/figcaption><\/figure>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"272\" height=\"320\" src=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-2.png\" alt=\"\" class=\"wp-image-980\" srcset=\"https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-2.png 272w, https:\/\/www.pnfsoftware.com\/blog\/wp-content\/uploads\/2019\/01\/image-2-255x300.png 255w\" sizes=\"auto, (max-width: 272px) 100vw, 272px\" \/><figcaption>Decompiled source<\/figcaption><\/figure><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">IR via API<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The API is the preferred method when it comes to power-users wanting to manipulate the IR for specific needs, such as writing a custom optimizer, as we will see in the next blog in this series.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>Reminder: JEB back-end plugins can be written in Java (preferably) or Python. 
JEB front-end scripts can be written in Python, and can run both in headless clients (eg, using the built-in command line client) and in the UI client.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For now, let&#8217;s see how to write a Python script to:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Retrieve a decompiled routine<\/li><li>Get the generated Intermediate Representation<\/li><li>Print it out<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">The following script retrieves the first internal routine of a native unit, decompiles it, retrieves the default (latest) IR, and prints out its CFG. The full script is available on <a href=\"https:\/\/github.com\/pnfsoftware\/jeb2-samplecode\/blob\/master\/scripts\/PrintNativeRoutineIR.py\">GitHub<\/a>. <\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# excerpt from the full script; the bare `return` statements below\n# assume this code sits inside a function body\n# retrieve `unit`, the code unit\n\n# GlobalAnalysis is assumed to be on (default)\ndecomp = DecompilerHelper.getDecompiler(unit)\nif not decomp:\n  print(&#039;No decompiler unit found&#039;)\n  return\n\n# retrieve a handle on the method we wish to examine\nmethod = unit.getInternalMethods().get(0)#(&#039;sub_1001929&#039;)\nsrc = decomp.decompile(method.getName(True))\nif not src:\n  print(&#039;Routine was not decompiled&#039;)\n  return\nprint(src)\n    \ndecompTargets = src.getDecompilationTargets()\nprint(decompTargets)\n\ndecompTarget = decompTargets.get(0)\nircfg = decompTarget.getContext().getCfg()\n# CFG object reference\n# see package com.pnfsoftware.jeb.core.units.code.asm.cfg\nprint(&quot;+++ IR-CFG for %s +++&quot; % method)\nprint(ircfg.formatSimple())\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>Running on Desktop Client.<\/strong> Run this script in the UI client via <em>File, Scripts, Run&#8230;<\/em> (hotkey: F3). 
Remember to open a binary file first, with a version of JEB that ships with the decompiler for that file&#8217;s architecture.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Running on the command-line.<\/strong> You may also decide to run it on the command-line. Example, on Windows:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ jeb_wincon.bat -c --srv2 --script=PrintNativeRoutineIR.py -- winxp32bit\/notepad.exe<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Example output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>... &lt;trimmed>\n...\n+++ IR-CFG for Method{sub_1001929}@1001929h +++\n0000\/1>  s32:_$eax = 32&lt;s16:_$ds>[s32:_gvar_100A4A8]\n0001\/1:  if !(s32:_$eax) goto 0003\n0002\/1+  call s32:_GlobalFree(s32:_$eax)->(s32:_$eax){i32:0100193Ch}\n0003\/1+  s32:_$eax = 32&lt;s16:_$ds>[s32:_gvar_100A4AC]\n0004\/1:  if !(s32:_$eax) goto 0006\n0005\/1+  call s32:_GlobalFree(s32:_$eax)->(s32:_$eax){i32:01001948h}\n0006\/1+  32&lt;s16:_$ds>[s32:_gvar_100A4A8] = i32:00000000h\n0007\/1:  32&lt;s16:_$ds>[s32:_gvar_100A4AC] = i32:00000000h\n0008\/1:  return s32:_$eax<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">That is it for part 1. In part 2, we will continue our exploration of the IR and see how we can hook into the decompilation pipeline to write our custom optimizers to clean packer-specific obfuscation, as well as make use of the data flow analysis components available with the IR-CFG. Stay tuned!<\/p>\n\n\n<div class='footnotes' id='footnotes-975'><div class='footnotedivider'><\/div><ol><li id='fn-975-1'> Working on IR presents several advantages, two of which being: a\/ the reduction of coupling between the analysis pipeline and the input native architecture; b\/ and offering a side-effect free representation of a program. <span class='footnotereverse'><a href='#fnref-975-1'>&#8617;<\/a><\/span><\/li><li id='fn-975-2'> The design choices of JEB IR are out-of-scope for this blog. 
They may be the subject of a separate document. <span class='footnotereverse'><a href='#fnref-975-2'>&#8617;<\/a><\/span><\/li><li id='fn-975-3'> When decompiling routines, IR optimization passes will iteratively refactor and clean-up unnecessary operations. In practice, most flag assignments will end up being removed or consolidated. <span class='footnotereverse'><a href='#fnref-975-3'>&#8617;<\/a><\/span><\/li><\/ol><\/div>","protected":false},"excerpt":{"rendered":"<p>JEB native code analysis components make use of a custom intermediate representation (IR) to perform code analysis. Some background: after analysis of a code object, the native assembly of a reconstructed routine is converted to an intermediate representation. 1 That IR subsequently goes through a series of transformation passes, including massages and optimizations. Final stages &hellip; <a href=\"https:\/\/www.pnfsoftware.com\/blog\/jeb-native-pipeline-intermediate-representation\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">JEB Native Analysis Pipeline &#8211; Part 1: Intermediate 
Representation<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21,3,18,13],"tags":[],"class_list":["post-975","post","type-post","status-publish","format-standard","hentry","category-api-jeb3","category-decompilation","category-jeb3","category-native-code"],"_links":{"self":[{"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/posts\/975","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/comments?post=975"}],"version-history":[{"count":0,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/posts\/975\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/media?parent=975"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/categories?post=975"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pnfsoftware.com\/blog\/wp-json\/wp\/v2\/tags?post=975"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}