JEB 2.3 and MIPS Decompilation

We recently released our latest decompiler for MIPS 32-bit binary code. It is the first interactive decompiler in a series of native code analysis modules that will be released this year with JEB 2.3.

If you haven’t done so: feel free to download the demo, or if you own a Pro or Embedded license, ask for the beta 2.3 build.

The 2.3 branch contains tons of under-the-hood updates, required to power the decompilation modules — as well as the future advanced static and dynamic analysis modules that we have on our roadmap. Changes such as:

  • A generic code parsing framework for interactive disassembly and analysis of code objects.
  • A generic decompilation framework using a custom Intermediate Representation as well as a partially-customizable decompilation pipeline.
  • API additions to allow third-party to develop things as simple as instrumentation tools for the decompilers, or as complex as IR refining plugins to thwart custom obfuscation.

MIPS is the first native decompiler we made publicly available, and while the beta can be a bit rough around the edges, we believe it will be of a tremendous help to any reverser pouring though lines of embedded firmware or application code.

Decompiling MIPS

MIPS programs exhibit a level of complexity that experienced reverse-engineers may feel overwhelmed or unprepared to deal with. Unlike well understood and well practiced x86, even the simplest of operations do not seem to “stand out” in a MIPS disassembly. Not to mention other intricacies inherent to a RISC instruction set, such as unaligned reads and writes; or counter-intuitive idioms closely tight to the MIPS architecture itself, such as the branch delay slots.

Have a look at this “trivial” piece of code:

A trivial, yet “unreadable” chunk of MIPS code.

If you’ve never reversed MIPS code, you may experience a temporary brain-freeze moment. This code contains typical MIPS idioms:

  • $gp building for globals access
  • convoluted arithmetic, usually 16-bit based
  • delay-slots (unused here)

The pseudo code is simply this:

 for(i = 0; i < 64; ++i) {
    array[i] = i;
 }

Here is the full routine disassembly and decompiled code:

Unannotated decompiled code. Note the presence of canary-checking code introduced by the compiler.

How about something more complex. Below, you will find the raw decompilation (non annotated) of the domain-generating algorithm used in Mirai:

DGA of Mirai for MIPS; decompiled output is unannotated.

JEB allows you to set types, prototypes, rename, comment, create custom data structures, etc. in order to clean up the disassembly and the pseudo-code.

Augmented Disassembly

Not everything warrants use of a decompiler. Navigating the disassembly to get the overall sense of a piece of code is a common task. Unfortunately, raw MIPS assembly can be tricky to read for a bunch of reasons. The main problem lies in the presence of memory access relative to the dynamically computed $gp register. Those non-static references prevent straightforward determination of callsites or data references (eg, string references).

What is the target callsite of the JALR (=call subroutine) instruction?

In order to resolve those references and produce readable assembly code, disassemblers have several strategies. The cheapest one is to resort to pattern matching or instruction(s) matching and make inference 1. This strategy can provide fast results, however, it is extremely limiting, and would perform poorly on non-standard or obfuscated code.

The strategy used by our code analyzer is to emulate the intermediate representation (IR) resulting from the conversion of the MIPS code. That controlled simulation is fast and allows the resolution of the most complex cases. Currently, the results are shown in the assembly comments.

See the examples below:

Advanced analysis resolving a target branch site held in $t9.
Advanced analysis resolving pre- and post-execution register data.

Type libraries and Syscalls

JEB 2.3 ships with  type libraries for several platforms, including the GNU Linux API for Linux MIPS 32-bit systems. Soon we will also release the signatures of common libraries.

Type libraries loaded by JEB.

Combined with the advanced analyzer, the controlled simulation step described above also resolves MIPS system calls. The resolution of $v0, holding the syscall number, is resolved during simulation – therefore handling complex obfuscation cases; under the hood,  a virtual method reference is created to represent the syscall as a standard routine. See the example below:

Syscall #4013 (time) resolved during the advanced analysis phase.

Conclusion

We presented some of the most interesting features of our new MIPS decompiler specifically, and more generally, JEB 2.3. It is still in beta mode and actively developed, feel free to try it out and let us know your feedback. Other decompilers will be released in the coming weeks/months, stay tuned.

  1. For example, prologues such as “lui $gp / addiu / addu” are common and could be looked for statically.