In this post, we’re having a look at the first release of dProtect (v 1.0) by Romain Thomas. dProtect is a fork of ProGuard that provides four additional self-explanatory configuration flags:
-obfuscate-strings
-obfuscate-constants
-obfuscate-arithmetic
-obfuscate-control-flow (via flattening & opaque predicates — unfortunately, I was unable to get this flag to work, so it’s something we’ll have to revisit in the future.)
Let’s see how JEB’s dexdec’s built-in optimizers as well as custom IR plugins can be used to defeat some implementations of strings obfuscation, constants obfuscation, and arithmetic operations obfuscation.
Let’s disable dexdec’s built-in deobfuscators (CTRL+TAB to decompile, untick “Enable deobfuscators”) to get a chance to look at the obfuscated code. It decompiles to:
A decryptor method a(String):String was generated by dProtect. It performs various computations to decrypt the input string.
One built-in optimizer that ships with JEB’s dexdec uses the IDState object to perform emulation (explained in a previous blog). It cleans up such code automatically:
provideString() is auto-deobfuscated by JEB’s dexdec
Arithmetic Operations Obfuscation
The test method is as follows:
// targeted by: -obfuscate-arithmetic
public int calculate(int x) {
return 100 + x;
}
With standard JEB settings (re-tick “Enable deobfuscators” if you had disabled it), the obfuscated code decompiles to:
As can be seen, the constant 100 has been replaced by an arithmetic operation: here, an XOR operating on an immediate and a static array element set up in the class initializer.
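To illustrate, the obfuscated method may look something like the following sketch (identifier names and values are made up for illustration; they do not come from the actual sample):

public class Example {
    // hypothetical: filled by the obfuscator in the class initializer; 0x392 ^ 0x3F6 == 100
    private static final int[] K = {0, 0, 0x392};

    // hypothetical obfuscated form of "return 100 + x;"
    public int calculate(int x) {
        return (K[2] ^ 0x3F6) + x;
    }
}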
JEB does not ship with overly complex deobfuscators operating on arrays, because it is near-impossible in the general case to assess their finality (i.e. to definitively answer the question "will these values be changed during program execution?"). However, to solve particular cases of obfuscation, writing a custom IR plugin is an acceptable solution. (Have a look at this post to get started on dexdec IR plugins.)
Let’s check DOptUnsafeArrayAccessSubst.java, a sample IR plugin that ships with JEB (folder coreplugins/scripts/) and does exactly what we need: detecting the use of static array elements and replacing them with their actual values. We can enable the plugin by removing the “.DISABLED” extension. Now redecompile (CTRL+TAB). And… well, nothing has changed! It is time to examine the plugin code carefully, maybe even use your favorite IDE to troubleshoot and augment it. Here is what prevented the original plugin from kicking in: the plugin was looking for IR elements such as IDArrayElt ^ IDImm. However, the IR it got was (<int>IDArrayElt) ^ IDImm, that is, the array element was cast to int, making the IR expression an IDOperation, not an IDArrayElt.
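A minimal sketch of the kind of fix needed, assuming hypothetical accessor names on the IR interfaces (the actual API calls used in the modified plugin may differ; refer to the shipped plugin code for the real ones):

// Sketch only: unwrap a possible integer cast wrapping the array element.
// isCast() and getOperand() are assumed accessors, not verified JEB API signatures.
static IDExpression unwrapCast(IDExpression e) {
    if(e instanceof IDOperation && ((IDOperation)e).isCast()) {
        return ((IDOperation)e).getOperand(0);
    }
    return e;
}
// the matching logic can then check: unwrapCast(operand) instanceof IDArrayElt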
Now we can redecompile, and things are deobfuscated as expected:
calculate() is deobfuscated by DOptUnsafeArrayAccessSubstV2
Constants Scrambling
Finally, let’s have a look at how constants obfuscation is achieved. The documentation gives examples of cryptographic-like S-boxes being initialized. The test method is as follows:
Note that synthetic static arrays are used here as well, as was the case for the arithmetic operations obfuscation pass. Therefore, let’s try the DOptUnsafeArrayAccessSubstV2 plugin. As careful examination of the above code may hint at, the plugin fails to deobfuscate this code on the first go. The reason: if you examine the IR produced while debugging the plugin, you will notice that the static array elements are accessed via a variable (v0, above). In IR, those elements are IDVar. Therefore, we need to check whether this variable references a static array. We will do that by using the data flow analysis facility made available to all dexdec plugins (the public field dfa of optimizers sub-classing AbstractDOptimizer):
...
analyzeChains(); // initialize the `dfa` member field
Long defaddr = dfa.checkSingleDef(insnAddress, varid); // use-def chains
...
The obfuscated code is now processed as expected, and dexdec generates the following decompilation:
initArray() is deobfuscated by DOptUnsafeArrayAccessSubstV3
Conclusion and Future Work
dProtect is a great project to provide code obfuscation for the masses. Its compatibility with ProGuard makes integration into new and existing Android projects a breeze. I have little doubt many developers will try it out in the future. Let’s see how upcoming upgrades to the obfuscators fare against the decompiler!
In future blogs, we will have a look at dProtect’s control-flow obfuscation (once I’ve got it to work!) and we will see how O-MVLL, the LLVM-based native code obfuscator counterpart, does against JEB’s gendec (generic decompiler for native code).
Under some circumstances, JEB’s generic decompiler is able to detect inline decryptors, and subsequently attempt to emulate the underlying IR to generate plaintext data items, both in the disassembly view and, most importantly, in decompiled views 1.
This feature is available starting with JEB 4.0.3-beta. It makes use of the IREmulator object, available in the public API for scripting and plugins.
–
Here’s an example of a protected ELF file 2 (aarch64) that was encountered a few months ago:
Disassembly of the target routine
GENDEC’s unsafe optimizers are enabled by default. Let’s disable them before performing a first decompilation, in order to see what the inline decryptor looks like.
To bring up the decompilation options on-demand, use CTRL+TAB (or Command+TAB), or alternatively, menu Action, command Decompile with Options.
Decompilation #1: unsafe optimizers disabled
That decryptor’s control flow is obfuscated (flattened, controlled by the state variable v5). It is called once, depending on the boolean value at 0x2F227. Here, the decrypted contents are used by system_property_get.
Below, the contents in virtual memory, pre-decryption:
Encrypted contents.
Let’s perform another decompilation of the same routine, with the unsafe optimizers enabled this time. GENDEC now will:
detect something that potentially could be decryption code
start emulating the underlying IR (not visible here, but you can easily read/write the Intermediate Representation via the API)
collect and apply results
See the decrypted contents below. (A data item existed beforehand at 0x2F137, and the decompiler chose not to erase it.) The decompiled code on the right panel no longer shows the decryption loop: an optimizer has discarded it since it can no longer be executed.
Decompilation #2: unsafe optimizers enabled
We may convert the data item (or bytes) to a string by pressing the A key (menu Native, command Create String). The decompiled code will pick it up and refresh the AST as well.
The final result looks like:
The VM and decompiled view show the decrypted code, “ro.build.version.sdk”
A few additional comments:
This optimizer is considered unsafe 3 because it is allowed to modify the VM of the underlying native code unit, as seen above.
The optimizer is generic (architecture-agnostic). It performs its work on the underlying IR mid-stage in the decompilation pipeline, when various optimizations are applied.
It makes use of public API methods only, mostly the IREmulator class. Advanced users can write similar optimizers if they choose to. (We will also publish the code of this optimizer on GitHub shortly, as it will serve as a good real-life example of how to use the IR emulator to write powerful optimizers. It’s slightly more than 100 lines of Java.)
We hope you enjoy using JEB 4 Beta. There is a license type for everyone, so feel free to try things out. Do not hesitate to reach out to us on Twitter, Slack, or privately over email! Thanks, and until next time 🙂
Users familiar with JEB’s Dex decompilers will remember that a similar feature was introduced to JEB 3 in 2020, for Android Dalvik code. ↩
sha256 43816c47315aab27e50e6f895774a7b86d591807179e1d3262446ab7d68a56ef also available as lib/arm64-v8a/libd.so in 309d848275aa128ebb7e27e570e5a2876977122625638630a6c61f7434b771c3 ↩
“unsafe” in the context of decompilation; unsafe here is not to be understood as, “could any code be executed on the machine”, etc. ↩
Disclaimer: a long time ago in our galaxy, we published part 1 of this blog post; then we decided to wait for the next major release of JEB decompiler before publishing the rest. A year and a half later, JEB 4.0 is finally out! So it is time for us to publish our complete adventure with MarsAnalytica crackme. This time as one blog covering the full story.
In this blog post, we will describe our journey toward analyzing a heavily obfuscated crackme dubbed “MarsAnalytica”, by working with JEB’s decompiled C code 1.
To reproduce the analysis presented here, make sure to update JEB to version 4.0+.
Part 1: Reconnaissance
MarsAnalytica crackme was created by 0xTowel for NorthSec CTF 2018. The challenge was made public after the CTF with an intriguing presentation by its author:
My reverse engineering challenge ‘MarsAnalytica’ went unsolved at #nsec18 #CTF. Think you can be the first to solve it? It features heavy #obfuscation and a unique virtualization design.
Given that exciting presentation, we decided to use this challenge mainly as a playground to explore and push JEB’s limits (and if we happen to solve it on the road, that would be great!).
The MarsAnalytica sample analyzed in this blog post is the one available on 0xTowel’s GitHub2. Another version seems to be available on RingZer0 website, called “MarsReloaded”.
So, let’s examine the beast! The program is a large x86-64 ELF (around 10.8 MB) which, once executed, greets the user like this:
Inserting a dummy input gives:
It appears we have to find a correct Citizen ID! Now let’s open the executable in JEB. First, the entry point routine:
A few interesting imports: getchar() to read user input, and putchar() and puts() to write. Also, some memory manipulation routines, malloc() and memcpy(). No particular strings stand out though, not even the greeting message we previously saw. This suggests we might be missing something.
Actually, looking at the native navigation bar (right-side of the screen by default), it seems JEB analyzed very few areas of the executable:
Navigation Bar (green is cursor’s location, grey represents area without any code or data)
To understand what happened let’s first look at JEB’s notifications window (File > Notifications):
Notifications Window
An interesting notification concerns the “Initial native analysis styles”, which indicates that code gaps were processed in PROLOGUES_ONLY mode (also known as a “conservative” analysis). As its name implies, code gaps are then disassembled only if they match a known routine prologue pattern (for the identified compiler and architecture).
This likely explains why most of the executable was not analyzed: the control-flow could not be safely followed and unreferenced code does not start with common prologue patterns.
Why did JEB use conservative analysis by default? JEB usually employs aggressive analysis on standard Linux executables, and disassembles (almost) anything within code areas (also known as “linear sweep disassembly”). In this case, JEB went conservative because the ELF file looks non-standard (e.g., its sections were stripped).
So, first a few memcpy() calls copy large memory areas onto the stack, followed by a series of “obfuscated” computations on this data. The main() routine eventually returns to an address computed in the rax register. JEB’s disassembler was not able to get this value statically, hence it stopped analyzing there.
Let’s open the binary in JEB’s debugger, and retrieve the final rax value at runtime: 0x402335. We ask JEB to create a routine at this address (“Create Procedure”, P), and end up on very similar code. After manually following the control flow, we end up on very large routines (around 8 KB each), with complex control flow, built on similar obfuscated patterns.
And yet at this point we have only seen a fraction of this 10MB executable… We might naively estimate that there are more than 1000 routines like these, if the whole binary is built this way (10MB / 8KB = 1250)!
Most obfuscated routines re-use the same stack frame (initialized in main() with the series of memcpy()). In other words, it looks like a very large function has been divided into chunks, connected to each other by obfuscated control flow computations.
At this point, it seems pretty clear that a first objective would be to properly retrieve all native routines. Arguably the most robust and elegant way to do that would be to follow the control flow, starting from the entry point routine. But how to follow through all these obfuscated computations?
Explore The Code (At C Level)
Let’s now take a look at the pseudo-C code produced by JEB for those first routines. For example, here is main():
Decompiled main()
Overall, around 40 lines of C code, most of them being simple assignments, and a few others being complex operations. In comparison to the 200 non-trivial assembly instructions previously shown, that’s pretty encouraging.
What Do We Know
Let’s sum up what we noticed so far: MarsAnalytica’s executable is divided into (pretty large) handler routines, each of them passing control to the next one by computing its address. For that purpose, each handler reads values from a large stack, makes a series of non-trivial computations on them, then writes back new values onto the stack.
As originally mentioned by 0xTowel, the crackme author, it looks like a virtual-machine style obfuscation, where bytecodes are read from memory, and are interpreted to guide the execution. It should be noted that virtual machine handlers are never re-executed: execution seems to go from lower to higher addresses, with new handlers being discovered and executed.
Also, let’s notice that while the executable is strongly obfuscated, there are some “good news”:
There does not seem to be any self-modifying code, meaning that all the code is statically visible; we “just” have to compute the control flow to find it.
JEB’s decompiled C code looks (pretty) simple; most C statements are simple assignments, except for some lengthy expressions always based on the same operations. The decompilation pipeline simplified away parts of the complexity of the various assembly code patterns.
There are very few subroutine calls (we will come back to those later), and only a few system API calls, so most of the logic is contained within the chain of obfuscated handlers.
What Can We Do
Given all we know, we could try to trace MarsAnalytica execution by implementing a C emulator working on JEB decompiled code. The emulator would simulate the execution of each handler routine, update a memory state, and retrieve the address of the next handler.
The emulator would then produce an execution trace, and provide us access to the exact memory state at each step. Hence, we should find at some point where the user’s input is processed (typically, a call to getchar()), and then hopefully be able to follow how this input gets processed.
The main advantage of this approach is that we are going to work on (small) C routines, rather than large and complex assembly routines.
There are a few additional reasons we decided to go down that road:
– The C emulator would be architecture-independent (several native architectures are decompiled to C by JEB), allowing us to re-use it in situations where we cannot easily execute the target (e.g. MIPS/ARM).
– It will be an interesting use-case for JEB public API to manipulate C code. Users could then extend the emulator to suit their needs.
– This approach can only work if the decompilation is correct, i.e. if the C code remains faithful to the original native code. In other words, it allows us to “test” the correctness of JEB’s decompilation pipeline, which is, as JEB developers, always interesting!
Nevertheless, a major drawback of emulating C code on this particular executable is that we need the C code in the first place! Decompiling 10MB of obfuscated code is going to take a while; therefore this “plan” is certainly not the best one for time-limited Capture-The-Flag competitions.
Part 2: Building a (Simple) C Emulator
The emulator comes as a JEB back-end plugin, whose code can be found on our GitHub page. It starts in CEmulatorPlugin.java, whose logic roughly boils down to: decompile the current handler routine, emulate it, retrieve the next handler’s address, and repeat.
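In simplified pseudo-code, that logic might look like this (a sketch of ours; names and helpers are made up, not the actual plugin code):

// Simplified sketch of the plugin's main loop (names and helpers are ours):
long address = entryPointAddress;
EmulatorState state = buildInitialState();          // memory + registers
while(!shouldStop(address)) {
    ICMethod handler = decompileMethodAt(address);  // JEB's decompiled C AST
    state = emulate(handler, state);                // simulate the handler's execution
    address = state.getNextHandlerAddress();        // e.g. the value computed in rax
}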
In this part we will focus on the emulate() method. This method’s purpose is to simulate the execution of a given C routine from a given machine state, and to provide in return the final machine state at the end of the routine.
Decompiled C Code
First things first, let’s explore what JEB decompiled code looks like, as it will be emulate()’s input. JEB decompiled C code is stored in a tree-structured representation, akin to an Abstract Syntax Tree (AST).
For example, let’s take the following C function:
int myfunction()
{
    int a = 1;
    while(a < 3) {
        a = a + 1;
    }
    return a;
}
The JEB representation of myfunction body would then be:
AST Representation (rectangles are JEB interfaces, circles are values)
As of JEB 4.0, the hierarchy of interfaces representing AST elements (i.e. nodes in the graph) is the following:
AST ICElement Hierarchy
Two parts of this hierarchy are of particular interest to us, in the context of building an emulator:
ICExpression represents C expressions, for example ICIdentifier (a variable), or ICOperation (any operation). Our emulator is going to evaluate those expressions, i.e. assign concrete values to them.
ICStatement represents C statements, for example ICAssignment (an assignment) or ICWhileStm (a while loop). Our emulator is going to execute those statements one after the other, following the control flow.
While an AST provides a precise representation of C elements, it does not provide explicitly the control flow. That is, the order of execution of statements is not normally provided by an AST, which rather shows how some elements contain others from a syntactic point-of-view.
In order to simulate a C function execution, we are going to need the control flow. So here is our first step: compute the control flow of a C method and make it usable by our emulator.
To do so, we implemented a very simple Control-Flow Graph (CFG), which is computed from an AST. The code can be found in CFG.java, please refer to the documentation for the known limitations.
Here is for example the CFG for the routine previously presented myfunction():
myfunction() CFG
Why does JEB not provide a CFG for decompiled C code? Mainly because at this point the JEB decompiler does not need it. The most important optimizations are done on JEB’s Intermediate Representation — for which there is indeed a CFG. On the other hand, C optimizations are mainly about “beautifying” the code (i.e. pure syntactic transformations), which can be done on the AST only 3.
Before digging into the emulation logic, let’s see how emulator state is represented and initialized.
Emulator State
The emulator state is a representation of the machine’s state during emulation; it mainly comprises the state of the memory and of the CPU registers.
The memory state is an IVirtualMemory object — JEB’s interface to represent virtual memory state. This memory state is created with MarsAnalytica’s initial memory space (set by the JEB loader), and we allocate a large area at an arbitrary address to use as the stack during emulation:
// initialize from executable memory
memory = nativeUnit.getMemory();
// allocate large stack from BASE_STACK_POINTER_DEFAULT_VALUE (grows downward)
VirtualMemoryUtil.allocateFillGaps(memory, BASE_STACK_POINTER_DEFAULT_VALUE - 0x10_0000, 0x11_0000, IVirtualMemory.ACCESS_RW);
The CPU registers state is simply a Map from register IDs — JEB-specific values identifying native registers — to values.
Emulating a routine then consists of processing its statements one by one, following the control flow. For each statement, the emulator has to:
Update the state according to the statement semantics, i.e. propagate all side-effects of the statement to the emulator state.
Determine which statement should be executed next; this might involve evaluating some predicates.
For example, let’s examine the logic to emulate a simple assignment like a = b + 0x17 4:
void evaluateAssignment(ICAssignment assign) {
    // evaluate right-hand side
    Long rightValue = evaluateExpression(assign.getRight());
    // assign to left-hand side
    state.setValue(assign.getLeft(), rightValue);
}
The method evaluateExpression() is in charge of getting a concrete value for a C expression (i.e. anything under ICExpression), which involves recursively processing all the subexpressions of this expression.
In our example, the right-hand side expression to evaluate is an ICOperation (b + 0x17). The code in charge of evaluating such operations recursively evaluates the operands and then applies the operator (a sketch is shown below, after the explanation of operand evaluation).
If b is a local variable, i.e. mapped in stack memory, the method ICIdentifier.getAddress() provides us its offset from the stack base address. Also note that an ICIdentifier has an associated ICType, which provides us the variable’s size (through the type manager, see emulator’s getTypeSize()).
Finally, evaluating the constant 0x17 in the operation b + 0x17 simply means returning its raw value.
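Here is a minimal sketch of that evaluation logic (accessor names on ICOperation/ICConstantInteger are assumptions for illustration; refer to the actual plugin code on GitHub):

// Sketch of expression evaluation (accessor names are assumed, not verified):
Long evaluateOperation(ICOperation op) {
    Long left = evaluateExpression(op.getFirstOperand());   // recursive evaluation
    Long right = evaluateExpression(op.getSecondOperand());
    switch(op.getOperatorType()) {
        case ADD: return left + right;
        case SUB: return left - right;
        case XOR: return left ^ right;
        // ... other operators elided ...
        default: throw new RuntimeException("unsupported operator");
    }
}

Long evaluateConstant(ICConstantInteger cst) {
    return cst.getValueAsLong(); // a constant simply evaluates to its raw value
}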
For statements with more complex control flow than an assignment, the emulator has to select the correct next statement from the CFG. For example, here is the emulation of a while loop wStm (ICWhileStm):
// if predicate is true, next statement is while loop body...
if(evaluateExpression(wStm.getPredicate()) != 0) {
    return cfg.getNextTrueStatement(wStm);
}
// ...otherwise next statement is the one following while(){..}
else {
    return cfg.getNextStatement(wStm);
}
In MarsAnalytica there are only a few system APIs that get called during the execution. Among those APIs, only memcpy() is actually needed for our emulation, as it serves to initialize the stack (remember main()). Here is the API emulation logic:
Long simulateWellKnownMethods(ICMethod calledMethod,
List<ICExpression> parameters) {
if(calledMethod.getName().equals("→time")) {
return 42L; // value does not matter
}
else if(calledMethod.getName().equals("→srand")) {
return 37L; // value does not matter
}
else if(calledMethod.getName().equals("→memcpy")) {
ICExpression dst = parameters.get(0);
ICExpression src = parameters.get(1);
ICExpression n = parameters.get(2);
// evaluate parameters concrete values
[...REDACTED...]
state.copyMemory(src_, dst_, n_);
return dst_;
}
}
}
Demo Time
The final implementation of our tracer can be found in our GitHub page. Once executed, the plugin logs in JEB’s console an execution trace of the emulated methods, each of them providing the address of the next one:
Good news everyone: the handlers addresses are correct (we double-checked them with a debugger). In other words, JEB decompilation is correct and our emulator remains faithful to the executable logic. Phew…!
Part 3: Solving The Challenge
Plot Twist: It Does Not Work
The first goal of the emulator was to find where user’s input is manipulated. We are looking in particular for a call to getchar(). So we let the emulator run for a long time, and…
…it never reached a call to getchar().
The emulator was correctly passing through the obfuscated handlers (we regularly double-checked their addresses with a debugger), but after a few days the executed code was still printing MarsAnalytica’s magnificent ASCII art prompt (reproduced below).
MarsAnalytica Prompt
After investigating, it appears that characters are printed one by one with putchar(), and each of these calls is in the middle of one heavily obfuscated handler, which will be executed once only. More precisely, after executing more than one third of the whole 10MB, the program is still not done with printing the prompt!
As mentioned previously, the “problem” with emulating decompiled C code is that we need the decompiled code in the first place, and decompiling lots of obfuscated routines takes time…
Let’s Cheat
Ok, we cannot reach in a decent time the point where the user’s input is processed by the program. But the execution until this point should be deterministic. What if… we start the emulation at the point where getchar() is called, rather than from the entry-point?
In other words, we are going to assume that we “found” the place where user’s input starts to be processed, and use the emulator to analyze how this input is processed.
To do so, we used the GDB debugger to set a breakpoint on getchar() and dumped both stack and heap memories at this point 5. Then, we extended the emulator to be able to initialize its memory state from stack/heap memory dumps, and changed the emulation start address to be the first call to getchar().
What Now?
At this point getchar() is called to get the first input character, so we let the emulator simulate this API by returning a pseudo-randomly chosen character, such that we can follow the rest of the execution. After 19 calls to getchar() we finally enter the place where user’s input is processed. Hooray…
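Concretely, this can be sketched as an addition to the API simulation logic shown earlier (a hypothetical snippet of ours; the actual plugin code may differ):

// Possible addition to simulateWellKnownMethods() — 'random' would be a
// java.util.Random instance kept by the emulator:
else if(calledMethod.getName().equals("→getchar")) {
    // feed a pseudo-randomly chosen lowercase character to the emulated program
    return (long)('a' + random.nextInt(26));
}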
Then, we let the emulator run for a whole day, which provided the execution trace we will be working on for the rest of this blog. After digging into the trace we noticed that input characters were passed as arguments to a few special routines.
Introducing The Stack Machine
When we first skimmed through MarsAnalytica’s code, we noticed a few routines that seemed special for two reasons:
While obfuscated routines are executed only once and in a linear fashion (i.e. from low to high memory addresses), these “special” routines are at the very beginning of the executable and are called very often during the execution.
These routines’ code is not obfuscated and, at first sight, seems to be related to memory management.
For example, here is JEB decompiled code for the first of them (comments are ours):
long sub_400AAE(unsigned long* param0, int param1) {
    long result;
    unsigned long* ptr0 = param0;
    int v0 = param1;
    if(!ptr0) {
        result = 0xffffffffL;
    }
    else {
        // allocate new slot
        void* ptr1 = →malloc(16L);
        if(!ptr1) {
            /*NO_RETURN*/ →exit(0);
        }
        // set value in new slot
        *(int*)((long)ptr1 + 8L) = v0;
        // insert new slot in first position
        *(long*)ptr1 = *ptr0;
        *ptr0 = ptr1;
        result = 0L;
    }
    return result;
}
What we have here is basically a “push” operation for a stack implemented as a linked list (param0 is a pointer to the top of the stack, param1 the value to be pushed).
Each slot of the stack is 16 bytes, with the first 8 bytes being a pointer to the next slot and the next 4 bytes containing the value (remaining 4 bytes are not used).
It now seemed clear that these special routines are the crux of the challenge. So we reimplemented most of them in the emulator, mainly as a way to fully understand them. For example, here is our “push” implementation:
/** PUSH(STACK_PTR, VALUE) */
if(calledMethod.getName().equals("sub_400AAE")) {
    Long pStackPtr = evaluateExpression(parameters.get(0));
    Long pValue = evaluateExpression(parameters.get(1));
    long newChunkAddr = allocateNewChunk();
    // write value
    state.writeMemory(newChunkAddr + 8, pValue, 4);
    // link new chunk to existing stack
    Long stackAdr = state.readMemory(pStackPtr, 8);
    state.writeMemory(newChunkAddr, stackAdr, 8);
    // make new chunk the new stack head
    state.writeMemory(pStackPtr, newChunkAddr, 8);
}
Overall, these operations are implementing a custom data-structure that can be operated in a last-in, first-out fashion, but also with direct accesses through indexes. Let’s call this data structure the “stack machine”.
Here are the most used operators:
Address     Operator (names are ours)     Argument(s)
0x400AAE    PUSH                          VALUE
0x4009D7    POP                           VALUE
0x400D08    GET                           INDEX
0x400D55    SET                           INDEX, VALUE

Stack Machine’s Main Operators
Tracing The Stack Machine
At this point, we modified the emulator to log only stack operations with their arguments, starting from the first call to getchar(). The full trace can be found here, and here is an extract:
S: SET index:7 value:97
S: SET index:8 value:98
S: SET index:13 value:99
S: SET index:15 value:100
S: SET index:16 value:101
[...REDACTED...]
S: PUSH 2700
S: POP (2700)
S: SET index:32 value:2700
S: GET index:32
S: PUSH 2700
S: PUSH 2
S: POP (2)
S: POP (2700)
S: PUSH 2702
[...REDACTED...]
The trace starts with a long series of SET operations, which are storing the result of getchar() at specific indexes in the stack machine (97, 98, 99,… are characters provided by the emulator).
And then, a long series of operations happen, combining the input characters with some constant values. Some interesting patterns appeared at this point, for example:
S: POP (2)
S: POP (2700)
S: PUSH 2702
Here an addition was made between the two popped values, and the result was then pushed. Digging into the trace, it appears there are also handlers popping two values and pushing back a subtraction, multiplication, exclusive or, etc.
Another interesting pattern appears at several places:
S: POP (16335)
S: POP (1234764)
S: PUSH 1
Looking at the corresponding C code, it is actually a comparison between the two popped values (“greater than” in this case), and the boolean result (0 or 1) is then pushed. Once again, different comparison operators (equal, not equal, …) are used in different handlers.
Finally, something suspicious also stood out in the trace:
S: PUSH 137
S: PUSH 99
S: POP (137)
S: POP (99)
The popped values do not match the order in which they were pushed!
Our objective here is to understand how input characters are manipulated, and what tests are done on them. In other words, we want to know, for each POP/POP/PUSH pattern, whether it is an operation (and which operation: addition, subtraction, etc.) or a test (and which test: equal, greater than, etc.).
Again, note that routines implementing POP/POP/PUSH patterns are executed only once. So we cannot individually analyze them and rely on their addresses.
This is where working on decompiled C code becomes particularly handy. For each POP/POP/PUSH series:
We search in the method’s decompiled code if a C operator was used on the PUSH operand. To do so, it is as simple as looking at the operand itself, thanks to JEB decompiler’s optimizations! For example, here is a subtraction:
...
long v1 = pop(v0 - 0x65f48L);
long v2 = pop(v0 - 0x65f48L);
push(v0 - 0x65f48L, v1 - v2);
...
When a C operator is found in push() second operand, the emulator adds the info (with the number of operands) in the trace:
S: POP (137)
S: POP (99)
S: PUSH 38
| operation: (-,#op=2)
Also, we check if there is a “if” statement following a POP in the C code. For example, here is a “greater-than” check between popped values:
...
long v2 = pop(v0 - 0x65f48L);
long v3 = pop(v0 - 0x65f48L);
if(v2 > v3) {
...
If so, the emulator extracts the C operator used in the if statement and logs it in the trace (as a pseudo stack operator named TEST):
S: POP (16335)
S: POP (1234764)
S: TEST (>,#op=2)
S: PUSH 0
It should be noted that operands are always ordered in the same way: the first popped value is on the left side of the operator. So operators and operands are the only things we need to reconstruct the whole operation.
Time To Go Symbolic
At this point, our execution trace shows how the user’s input is stored onto the stack, and which operations and tests are then done. Our emulator is providing a “bad” input, so there are certainly failed checks in our execution trace. Our goal is now to find these checks, and then the correct input characters.
At this point, it is time to introduce “symbolic” inputs, rather than using concrete values as we have in our trace. To do so, we made a quick and dirty Python script to replay stack machine trace using symbolic variables rather than concrete values.
First, we initialize a Python “stack” with symbols (the stack is a list(), and the symbols are strings representing each character “c0“, “c1“, “c2“…). We put those symbols at the same indexes used by the initial SET operations:
# fill stack with 'symbolic' variables (ie, characters)
# at the initial offset retrieved from the trace
stack = [None] * 50 # arbitrary size
charCounter = 0
stack[7] = 'c' + str(charCounter) # S: SET index:7 value:c0
charCounter+=1
stack[8] = 'c' + str(charCounter) # S: SET index:8 value:c1
[... REDACTED ...]
We also need a temporary storage for expressions that get popped from the stack.
Then, we read the trace file and for each stack operation we execute the equivalent operation on our Python stack:
if operator == "SWAP":
last = stack.pop()
secondToLast = stack.pop()
stack.append(last)
stack.append(secondToLast)
elif operator == "GET":
index = readIndexFromLine(curLine)
temporaryStorage.append(stack[int(index)])
elif operator == "SET":
index = readIndexFromLine(curLine)
stack[int(index)] = temporaryStorage.pop()
elif operator == "POP":
value = stack.pop()
temporaryStorage.append(value)
[... REDACTED ...]
Now here is the important part: whenever there is an operation, we build a new symbol by “joining” the symbol operands and the operator. Here is an example of an addition between symbols “c5” and “c9“, corresponding respectively to the concrete input characters initially stored at index 26 and 4:
Concrete Trace:
...
GET index:26
PUSH 102
GET index:4
PUSH 106
POP (106)
POP (102)
PUSH 208 | operation: (+,#op=2)
...

Symbolic Trace:
...
GET index:26
PUSH "c5"
GET index:4
PUSH "c9"
POP ("c9")
POP ("c5")
PUSH "c9+c5"
...
Concrete execution trace, and its corresponding symbolic trace; on the symbolic side, rather than pushing the actual result of 106 + 102, we build an addition between the two symbols corresponding to the two concrete values
Note that our symbolic executor starts with a clean stack, containing only input symbols. All constants used during the computation are indeed coming from the bytecode (the large memory area copied on the (native) stack at the beginning of the execution), and not from the stack machine.
We can then observe series of operations on input symbols getting built by successive POP/POP/PUSH patterns, and finally being checked against specific values. Here is an extract of our stack at the end:
Here is another advantage of working with C code: the expressions built from our emulator’s trace use high-level operators, which are directly understood by Z3.
Finally, we ask Z3 for a possible solution to the constraints, and we build the final string from c0, c1,… values:
m = s.model()
result = ''
result += chr(m[c0].as_long())
result += chr(m[c1].as_long())
result += chr(m[c2].as_long())
result += chr(m[c3].as_long())
...
And…
Hurray!
Conclusion
We hope you enjoyed this blog post, where we used JEB’s decompiled C code to analyze a heavily obfuscated executable.
Please refer to our GitHub page for the emulator code. While it has been tailored for the MarsAnalytica crackme, it can be extended to emulate any executable’s decompiled C code (MarsAnalytica-specific emulation logic is contained in the subclass MarsAnalyticaCEmulator).
You can run the plugin directly from JEB UI (refer to README):
By default, it will show emulation traces as text subunits in JEB project (stack machine trace in MarsAnalytica mode, or just C statements trace):
Plugin output: left panel is MarsAnalytica stack machine trace (when MarsAnalytica specific emulation logic is enabled), while right panel shows C statements emulation trace
Alternatively, the plugin comes with a headless client, more suitable to gather long running emulation traces.
Finally, kudos to 0xTowel for the awesome challenge! You can also check Scud’s excellent solution.
Feel free to message us on Slack if you have any questions. In particular, we would be super interested if you attempt to solve complex challenges like this one with JEB!
While JEB’s default decompiled code follows (most of) C syntactic rules and their semantics, some custom operators might be inserted to represent low-level operations and ease the reading; hence, strictly speaking, JEB’s decompiled code should be called pseudo-C. The decompiled output can also be a variant of C, e.g. the Ethereum decompiler produces pseudo-Solidity code. ↩
SHA1 of the UPX-packed executable: fea9d1b1eb9d3f93cea6749f4a07ffb635b5a0bc ↩
Implementing a complete CFG on decompiled C code will likely be done in future versions of JEB, in order to provide more complex C optimizations. ↩
The actual implementation is more complex than that, e.g. it has to deal with pointers dereferencement, refer to emulateStatement() for details. ↩
Dumping memory was done with peda for GDB, and commands dumpmem stack.mem stack and dumpmem heap.mem heap↩
The third part of this series is about bytecode virtualization. The analyses that follow were done statically.
Bytecode virtualization is the most interesting and technically challenging feature of this protector.
TL;DR:
– JEB Pro can un-virtualize protected methods.
– A Global Analysis (Android menu) will point you to p-code VM routines.
– Make sure to disable Parse Exceptions when decompiling such methods.
– For even clearer results, rename opaque predicates of the method to guard0/guard1 (refer to part 1 of this blog for details).
What Is Code Virtualization
Relatively novel, code virtualization is possibly one of the most effective protection techniques there is 1. It comes with relatively heavy disadvantages, such as hampered speed of execution 2 and the difficulty of troubleshooting production code. The advantage is heightened reverse-engineering hurdles compared to other, more traditional software protection techniques.
Virtualization in the context of code protection means:
Generating a virtual machine M1
Translating an original code object C0, meant to be executed on a machine M0 3, into a semantically-equivalent code object C1, to be run on M1.
While the general features of M1 are likely to be fixed (e.g., all generations of M1 are stack machines with such and such characteristics), the Instruction Set Architecture (ISA) of M1 may not necessarily be. For example, opcodes, microcodes and their implementation may vary from generation to generation. As for C1, the characteristics of a generation are only constrained by the capabilities of the converter. Needless to say, standard obfuscation techniques can be applied on C1. The virtualization process can possibly be recursive (C1 could be a VM implementing the specifications of a machine M2, executing a code object C2, emulating the original behavior of C0, etc.).
All in all, in practice, this makes M1 and C1 unique and hard to reverse-engineer.
Before and after virtualization of a code object C0 into C1
Example of a Protected Method
Note: all identifier names had been obfuscated. They were renamed for clarity and understanding.
Below, the class VClass was found to be “virtualized”. A virtualized class means that all non-constructor (all but <init>(*)V and <clinit>()V) methods were virtualized.
Interestingly, the constructors were not virtualized
The method d(byte[])byte[] is virtualized:
It was converted into an interpreter loop over two large switch constructs that branch on pseudo-code entries stored in the local array pcode.
A PCodeVM class was added. It is a modified stack-based virtual machine (more below) that performs basic load/store operations, custom loads/stores, as well as some arithmetic, binary and logical operations.
Virtualized method. Note the pcode array. The opcode handlers are located in two switches. This picture shows the second switch, used to handle specific operations and API calls.
A snippet of the p-code VM class. Full code here, also contains the virtualized class.
The generic interpreter is called via vm.exec(opcode). Execution falls back to a second switch entry, in the virtualized method, if the operation was not handled.
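Schematically, the interpreter loop of a virtualized method can be pictured as follows (a simplified reconstruction; identifier names and the handler layout below are ours, not taken from the actual sample):

// Simplified reconstruction of the dispatch loop (identifier names are ours):
int pc = 0;
while(true) {
    int opcode = pcode[pc++];
    if(opcode >= 0) {
        vm.exec(opcode);   // semi-generic operation, handled by the shared PCodeVM
        continue;
    }
    // negative ids: method-specific handlers (invocations, branching, return, ...)
    switch(opcode) {
        // case -1, -2: branching (update pc)
        // case -25: if-greater-or-equal conditional branch
        // ...
    }
}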
Please refer to the gist linked above for a full list of “generic” VM operations. Three examples, including one showing that the operations are not as generic as the term implies:
– (specific to this VM) opcode 6, used to peek the most recently pushed object
– (specific to this VM) opcode 8, a push-int operation
– (specific to this VM) opcode 23, relatively specialized: it implements an add-xor stack operation (pop, pop, push)
It is quite interesting to see that the protection system does not simply generate one-to-one, dalvik-to-VM opcodes. Instead, the target routine is thoroughly analyzed, most likely lifted, high-level (compounded) arithmetic operations isolated, and pseudo-generic (in PCodeVM) or specialized (in the virtualized method) opcodes generated.
As said, negative opcodes represent custom operations specific to a virtualized method, including control flow changes. An example:
opcode -25: an if(a >= b) goto LABEL operation (first, call into opcode 55 to do a GE operation on the top two integers; then, use the result to do conditional branching)
Characteristics of the P-code VM
From the analysis of that code as well as virtualized methods found in other binaries, the characteristics of the p-code VM generated by the app protector can be inferred:
The VM is a hybrid stack machine that uses 5 parallel stacks of the same height, stored in arrays of:
java.lang.Object (accommodating all objects, including arrays)
int (accommodating all small integers, including boolean and char)
long
float
double
For each one of the 5 stack types above, the VM uses two additional registers for storing and loading
Two stack pointers are used: one indicates the stack TOP, the other one seems to be used more liberally, and is akin to a peek register
The stack has a reserved area to store the virtualized method parameters (including this if the method is non-static)
The ISA encoding is trivial: each instruction is exactly one word long; it is the opcode of the p-code instruction to be executed. There is no concept of register, index, or immediate value embedded in the instruction, as most stack machine ISAs have.
Because the ISA is so simple, the implementation of an instruction’s semantics falls almost entirely on the p-code handlers. For this reason, they were grouped into two categories:
Semi-generic VM operations (load/store, arithmetic, binary, tests) are handled by the VM class and have a positive id. (A VM object is used by every virtualized method in a virtualized class.)
Operations specific to a given virtualized method (e.g., method invocations) use negative ids and are handled within the virtualized method itself.
While the PCodeVM opcodes are all “useful”, many specific opcodes of a virtualized method (negative ids) achieve nothing but the execution of code semantically equivalent to NOP or GOTO.
opcodes -2, -1: essentially branching instructions. A substantial amount of those can be found, including some branching to blocks with no other input but that source (i.e., an unnecessary GOTO, aka spaghetti code, or a NOP operation if the next block is the follow).
Rebuilding Virtualized Methods
Below, we explain the process used to rebuild a virtualized method. The CFGs presented are IR-CFGs (Intermediate Representation CFGs) used by the dexdec 4 pipeline. Note that unlike gendec‘s IR 5, dexdec‘s IR is not exposed publicly, but its textual representation is mostly self-explanatory.
Overall, a virtualized routine, once processed by dexdec like any other routine, looks like the following: a loop over p-code entries (stored in x8 below), processed by a() at 0xE first, or by the large routine switch.
Virtualized method, optimized, virtualized
The routine a() is PCodeVM.exec(), and its optimized IR boils down to a large single switch. 6
PCodeVM.exec()
The unvirtualizer needs to identify key items in order to get started, such as the p-code entries, identifiers used as indices into the p-code array, etc. Once they have been gathered, concolic execution of the virtualized routine becomes possible, and allows rebuilding a raw version of the original execution flow. Multiple caveats need to be taken care of, such as p-code inlining, branching, or flow termination. In its current state, the unvirtualizer disregards exceptional control flow.
Below, a raw version of the unflattened CFG. Note that all operations are stack-based; the code itself has not been modified at this point, it still consists of VM stack-based operations.
Virtualized method after unflattening, raw
dexdec’s standard IR optimization passes (dead-code removal, constant and variable propagation, folding, arithmetic simplification, flow simplifications, etc.) clean up the code substantially:
Virtualized method after unflattening and IR optimizations (opt1)
At this stage, all operations are stack-based. The high-level code generated from the above would be quite unwieldy and difficult to analyze, although substantially better than the original double-switch.
The next stage is to analyze the stack-based operations to recover stack slot uses and convert them back to identifiers (which can be viewed as virtual registers; essentially, we realize the conversion of stack-based operations into register-based ones). Stack analysis can be done in a variety of ways, for example using fixed-point analysis. Again, several caveats apply, and properly identifying the stacks as well as their indices is crucial for these operations.
Virtualized method after unflattening, IR optimizations, VM stack analysis (opt2)
After another round of optimizations:
Virtualized method after unflattening, IR optimizations, VM stack analysis, IR optimizations (opt2_1)
Once the stack analysis is complete, we can replace stack slot accesses by identifier accesses.
Virtualized method after unflattening, IR optimizations, VM stack analysis, IR optimizations, virtual registers insertion (opt3)
After a round of optimizations:
Virtualized method after unflattening, IR optimizations, VM stack analysis, IR optimizations, virtual registers insertion, IR optimizations (opt3)
At this point, the “original” CFG is essentially reconstructed, and other advanced deobfuscation passes (e.g., emulated-based deobfuscators) can be applied.
The high-level code generation yields a clean, unvirtualized routine:
High-level code, unvirtualized, unmarked
After reversing, it appears to be a modified RC4 algorithm. Note the +3/+4 added to the key.
High-level code, unvirtualized, marked
Detecting Virtualized Methods
All versions of JEB detect virtualized methods and classes: run Global Analysis (GUI menu: Android) on your APK/DEX and look for those special events:
JEB Pro version 3.22 7 ships with the unvirtualizer module.
Tips:
Make sure to enable the Obfuscators, and enable Unvirtualization (enabled by default in the options).
The try-blocks analysis must be disabled for the class to unvirtualize. (Use MOD1+TAB to redecompile, untick “Parse Exception Blocks”).
After a first decompilation pass, it may be easier to identify guard0/guard1, rename them, and redecompile, else opaque predicate obfuscation will remain and make the code unnecessarily difficult to read. (Refer to part 1 of this series to learn about what renaming those fields to those special names means and does when a protected app is detected.)
Conclusion
We hope you enjoyed this third installment on code (un)virtualization.
There may be a fourth and final chapter to this series on native code protection. Until next time!
—
On a personal note, my first foray into VM-based protection dates back to 2009 with the analysis of Trojan.Clampi, a Windows malware protected with VMProtect ↩
Although one could argue that with current hardware (fast x64/ARM64 processors) and software (JIT’er and AOT compilers), that drawback may not be as relevant as it used to be. ↩
Machine here may be understood as physical machine or virtual machine ↩
Note the similarities with CFGs flattened by chenxification and similar techniques. One key difference here is that the next block may be determined using the p-code array, instead of a key variable updated after each operation. I.e., is the FSM – controlling what the next state (= the next basic block) is – embedded in the flattened code itself, or implemented as a p-code array? ↩
JEB Android and JEB demo builds do not ship the unvirtualizer module. I initially wrote this module as a proof-of-concept not intended for release, but eventually decided to offer it to our professional users who have legitimate (non malicious) use cases, e.g. code audits and black-box assessments. ↩
The second part of this series focuses on encryption:
Asset encryption
Class encryption
Full application encryption
Those analyses were done statically using JEB 3.21.
Asset Encryption
Assets can be encrypted, in combination with other techniques such as class encryption (seen in several high-profile apps) and bytecode obfuscation (control-flow obfuscation, string encryption, reflected API access). With most bytecode obfuscation automatically cleaned up, assets are accessed in the following way:
Purple and cyan tokens represent auto-decrypted code. The assets decryptor method was renamed to ‘dec’; it provides a FilterInputStream that transparently decrypts contents.
The DecryptorFilterStream (renamed) factory method
The DecryptorFilterStream object implements a variant of TEA (Tiny Encryption Algorithm), known for its simplicity of implementation and great performance 1.
Note the convoluted generation of Q_w, instead of hard-coding the immediate 0x9E37. Incidentally, a variant of that constant is also used by RC5 and RC6.
read() decrypts and buffers 64 bits of data at a time. The decryption loop consists of a variable number of rounds, between 5 and 16. Note that Q_w is used as a multiplier instead of an offset, as TEA/XTEA normally does.
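For reference, here is what a standard XTEA block decryption looks like; the variant observed here differs, e.g. the delta-like constant is used as a multiplier and the round count varies between 5 and 16:

// Standard XTEA decryption of one 64-bit block (for comparison only; this is
// not the protector's exact variant):
static void xteaDecryptBlock(int[] v, int[] key, int rounds) {
    final int DELTA = 0x9E3779B9;
    int v0 = v[0], v1 = v[1];
    int sum = DELTA * rounds;
    for(int i = 0; i < rounds; i++) {
        v1 -= (((v0 << 4) ^ (v0 >>> 5)) + v0) ^ (sum + key[(sum >>> 11) & 3]);
        sum -= DELTA;
        v0 -= (((v1 << 4) ^ (v1 >>> 5)) + v1) ^ (sum + key[sum & 3]);
    }
    v[0] = v0;
    v[1] = v1;
}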
It seems reasonable to assume that the encryption and decryption algorithms may not always be the same as this one. Since this app protector makes extensive use of polymorphism throughout its protection layers, it could be that, during the protection phase, the encryption primitive is either user-selected or selected semi-randomly.
JEB can automatically emulate throughout this code and extract assets, and in fact, this is how encrypted classes, described in the next section, were extracted for analysis. However, this functionality is not present in current JEB Release builds. Since the vast majority of uses are legitimate, we thought that shipping one-click auto-decryptors for data and code at this time was unnecessary, and would jeopardize the app security of several high-profile vendors.
Class Encryption
Class encryption, as seen in multiple recent apps as well, works as follows:
The class to be protected, CP, is encrypted, compressed, and stored in a file within the app folder. (The filename is random and seems to be terminated by a dot, although that could easily change.) Once decrypted, the file is a JAR containing a DEX holding CP and related classes.
CL is also encrypted, compressed, and stored in a file within the app folder. Once decrypted, the file is a JAR containing a DEX holding the custom class loader CL.
Within the application, code using CP (that is, any client that loads CP, invokes CP methods, or accesses CP fields) is replaced by code using CM, a class manager responsible for extracting CP and CL, and loading CL. CM offers bridge methods to the clients of CP, in order to achieve the original functionality.
The following diagram summarizes this mechanism:
Class encryption mechanism
Since protected applications use the extensive RASP (Runtime Application Self-Protection) facility to validate the environment they’re running on, the dynamic retrieval of CL and CP may prove difficult. In this analysis, it was retrieved statically by JEB.
Below, some client code using CM to create an encrypted-class object CP and execute a method on it. Everything is done via reflection. Items were renamed for enhanced clarity.
Encrypted class loading and virtual method invocation
CM is a heavy, highly obfuscated class. The first step in understanding it is to disable the rendering of catch-blocks that clutter the view.
With auto-decryption and auto-unreflection enabled, the result is quite readable. A few snippets follow:
Decrypted files are deleted after loading. On older devices, loading is done with DexFile; on newer devices, it is done using InMemoryDexClassLoader.
In this case, the first encrypted JAR file (holding CL) is stored as “/e.”.
In this case, the second encrypted JAR file (holding CP and related) is stored as “/f.”. The application held two additional couples, (“/a.”, “/b.”) and (“/c.”, “/d.”)
Once retrieved, those additional files can easily be “added” to the current DEX unit with IDexUnit.addDex() of your JEB project. Switch to the Terminal fragment, activate the Python interpreter (use py), and issue a command like:
Using Jython to add code to an existing DEX unit
The bnz class (CL) is a ClassLoader for the protected class (CP).
The protected class CP and other related classes, stored in “/f.” contained… anti-tampering verification code, which is part of the RASP facility! In other instances that were looked at, the protected classes contained: encrypted assets manager, custom code, API key maps, more RASP code, etc.
Full Application Encryption
“Full” encryption is taking class encryption to the extreme by encrypting almost all classes of an application. A custom Application object is generated, which simply overloads attachBaseContext(). On execution, the encrypted class manager will be called to decrypt and load the “original” application (all other protections still apply).
Custom application object used to provide full program encryption.
Note that activities can be encrypted as well. In the above case, the main activity is part of the encrypted jar.
Conclusion
That’s it for part 2. We focused on the encryption features. Both offer relatively limited protection for reverse-engineers willing to go the extra mile to retrieve original assets and bytecodes.
In Part 3, we will present what I think is the most interesting feature of this protector, code virtualization.
Until next time!
—
The TEA encryption family is used by many win32 packers ↩
What started as a ProGuard + basic string encryption + code reflection tool evolved into a multi-platform, complex solution including: control-flow obfuscation, complex and varied data and resources encryption, bytecode encryption, virtual environment and rooted system detection, application signature and certificate pinning enforcement, native code protection, as well as bytecode virtualization 1, and more.
This article presents the obfuscation techniques used by this app protector, as well as the facilities made available at runtime to protected programs 2. The analysis that follows was done statically, with JEB 3.20.
Identification
Identifying apps protected by this protector is relatively easy. It seems the default bytecode obfuscation settings place most classes in the o package, and some will be renamed to names that are invalid on a Windows system, such as con or aux. Closer inspection of the code will reveal stronger hints than obfuscated names: decryption stubs, specific encrypted data, and the presence of certain .so library files are all telltale signs, as shown below.
Running a Global Analysis
Let’s run a Global Analysis (menu Android, Global analysis…) with standard settings on the file and see what gets auto-decrypted and auto-unreflected:
Results subset of Global Analysis (redacted areas are meant to keep the analyzed program anonymous; it is a clean app, whose business logic is irrelevant to the analysis of the app protector)
Lots of strings were decrypted, many of them specific to the app’s business logic itself, others related to RASP – that is, library code embedded within the APK, responsible for performing app signature verification for instance. That gives us valuable pointers to where we should be looking if we’d like to focus on the protection code specifically.
Deobfuscating Code
The first section of this blog focuses on bytecode obfuscation and how JEB deals with it. It is mostly automated, but a final step requires manual assistance to achieve the best results.
Most obfuscated routines exhibit the following characteristics:
Dynamically generated strings via the use of per-class decryption routines
Most calls to external routines are done via reflection
Flow obfuscation via the use of a couple of opaque integer fields – let’s call them OPI0, OPI1. They are class fields generally initialized to 0 and 1.
Arithmetic operation obfuscation
Garbage code insertion
Unusual protected-block structure, leading to fragmented try-blocks that are unavoidable in order to produce semantically accurate raw code
As an example, the following class is used to perform app certificate validation in order, for instance, to prevent resigned apps from functioning. A few items were renamed for clarity; decompilation is done with disabled Deobfuscators (MOD1+TAB, untick “Enable deobfuscators”):
Take #1 (snippet) – The protected class is decompiled without deobfuscation in order to show semi-raw output (a few optimizers doing all sort of code cleanup are not categorized as deobfuscators internally, and will perform even if Deobfuscation is disabled). Note that a few items were also renamed for clarity.
In practice, such code is quite hard to comprehend on complex methods. With obfuscators enabled (the default setting), most of the above will be cleared.
See the re-decompilation of the same class, below.
strings are decrypted…
…enabling unreflection
most obfuscation is removed…
except for some control flow obfuscation that remains, because JEB was unable to process OPI0/OPI1 directly (see below)
Take #2 (full routine) – obfuscators enabled (default). The red blocks highlight use of opaque variables used to obfuscate control flow.
Let’s give a hint to JEB as to what OPI0/OPI1 are.
When analyzing protected apps, you can rename OPI0 and OPI1 to guard0 and guard1, respectively, to allow JEB to aggressively clean the code
Redecompile the class after renaming the fields
Take #3 (full routine) – with explicit guard0/guard1
That final output is clean and readable.
Other obfuscation techniques not exposed in this short routine above are arithmetic obfuscation and other operation complexification techniques. JEB will seamlessly deal with many of them. Example:
is optimized to
To summarize bytecode obfuscation:
decryption and unreflection are done automatically 3
garbage clean-up and code clean-up are also generic and done automatically
control flow deobfuscation needs a bit of guidance to operate (guard0/guard1 renaming)
Runtime Verification
RASP library routines are used at the developers’ discretion. They consist of a set of classes that the application code can call at any time, to perform tasks such as:
App signing verification
Debuggability/debugger detection
Emulator detection
Root detection
Instrumentation toolkits detection
Certificate pinning
Manifest check
Permission checks
The client decides when and where to use them as well as what action should be taken on the results. The code itself is protected, that goes without saying.
App Signing Verification
Certificate verification uses the PackageManager to retrieve app’s signatures: PackageManager.getPackageInfo(packageName, GET_SIGNATURES).signatures
The signatures are hashed and compared to caller-provided values in an IntBuffer or LongBuffer.
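As a rough illustration, here is a minimal sketch of what such a verifier boils down to. The class name, the SHA-256 digest and the EXPECTED constant are our own placeholders: the real, obfuscated code uses its own hashing and compares against caller-provided IntBuffer/LongBuffer values.

import android.content.Context;
import android.content.pm.PackageInfo;
import android.content.pm.PackageManager;
import android.content.pm.Signature;
import java.security.MessageDigest;
import java.util.Arrays;

public class SignatureCheck {
    // EXPECTED is hypothetical; the protector receives the reference values from the caller
    static final byte[] EXPECTED = new byte[] { /* reference digest */ };

    static boolean signatureMatches(Context ctx) throws Exception {
        PackageInfo pi = ctx.getPackageManager().getPackageInfo(
                ctx.getPackageName(), PackageManager.GET_SIGNATURES);
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (Signature sig : pi.signatures) {
            // hash each signing certificate and compare it to the expected digest
            if (Arrays.equals(md.digest(sig.toByteArray()), EXPECTED)) {
                return true;
            }
        }
        return false;
    }
}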
Debug Detection
Debuggability check
The following checks must pass (a rough sketch follows the list):
assert that Context.ctx.getApplicationInfo().flags & ApplicationInfo.FLAG_DEBUGGABLE is false
check the ro.debuggable property, in two ways to ensure consistency
using android.os.SystemProperties.get() (private API)
using the getprop binary
verify that no hooking framework is detected (see specific section below)
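A minimal sketch of the first two checks, assuming a Context is available (the class and method names are ours; the real code is obfuscated and interleaves these checks with hooking-framework detection):

import android.content.Context;
import android.content.pm.ApplicationInfo;
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class DebuggableCheck {
    static boolean isDebuggable(Context ctx) throws Exception {
        // 1) manifest flag
        boolean flag = (ctx.getApplicationInfo().flags & ApplicationInfo.FLAG_DEBUGGABLE) != 0;

        // 2a) ro.debuggable via the private SystemProperties API (reflection)
        Class<?> sp = Class.forName("android.os.SystemProperties");
        String v1 = (String) sp.getMethod("get", String.class).invoke(null, "ro.debuggable");

        // 2b) ro.debuggable via the getprop binary, for consistency
        Process p = Runtime.getRuntime().exec(new String[] { "getprop", "ro.debuggable" });
        String v2 = new BufferedReader(new InputStreamReader(p.getInputStream())).readLine();

        return flag || "1".equals(v1) || (v2 != null && "1".equals(v2.trim()));
    }
}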
Debugging session check
The following checks must pass:
assert that android.os.Debug.isDebuggerConnected() is false
verify no tracer process: the TracerPid entry in /proc/<pid>/status must be <= 0
verify that no hooking framework is detected (see specific section below)
Debug key signing
enumerate the app’s signatures via PackageInfo.signatures
use getSubjectX500Principal() to verify that no certificate has a subject distinguished name (DN) equal to "CN=Android Debug,O=Android,C=US", which is the standard DN for debug certificates generated by the SDK tools (sketched below)
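A sketch of that debug-key check, assuming the standard GET_SIGNATURES retrieval shown earlier (the class and method names are ours):

import android.content.Context;
import android.content.pm.PackageInfo;
import android.content.pm.PackageManager;
import android.content.pm.Signature;
import java.io.ByteArrayInputStream;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

public class DebugKeyCheck {
    static boolean signedWithDebugKey(Context ctx) throws Exception {
        PackageInfo pi = ctx.getPackageManager().getPackageInfo(
                ctx.getPackageName(), PackageManager.GET_SIGNATURES);
        CertificateFactory cf = CertificateFactory.getInstance("X.509");
        for (Signature sig : pi.signatures) {
            X509Certificate cert = (X509Certificate) cf.generateCertificate(
                    new ByteArrayInputStream(sig.toByteArray()));
            // standard DN of certificates generated by the SDK's debug keystore
            if ("CN=Android Debug,O=Android,C=US".equals(
                    cert.getSubjectX500Principal().getName())) {
                return true;
            }
        }
        return false;
    }
}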
Emulator Detection
Emulator detection is done by checking any of the below.
1) All properties defined in system/build.prop are retrieved, hashed, and matched against a small set of hard-coded hashes:
Unfortunately, we were not able to reverse those hashes back to known property strings – however, it was tried only on AOSP emulator images. If anybody wants to help and run the below on other build.prop files, feel free to let us know what property strings those hashes match to. Here is the hash verification source, to be run on build.prop files.
4) Check for the presence of wired network interfaces: (via NetworkInterface.getNetworkInterfaces)
eth0
eth1
5) If the app has the permission READ_PHONE_STATE, telephony information is verified; an emulator is detected if any of the below matches (standard emulator image settings):
/proc/ioports: entry "0ff :" (unknown port, likely used by some emulators)
/proc/self/maps: entry "gralloc.goldfish.so" (GF: older emulator kernel name)
7) Property checks (done in multiple ways with consistency checks, as explained earlier); the check fails if any entry is found that starts with one of the provided values:
Class loading is done in different ways in an attempt to circumvent hooking itself, using Class.forName with a variety of class loaders, custom class loaders and ClassLoader.getLoadedClass, as well as lower-level private methods, such as Class.classForName.
4) Stack frame verification: an exception is generated in order to retrieve a stack frame. The callers are hashed and compared to an expected hard-coded value.
5) Native code checks. This will be detailed in another blog, if time allows.
Root Detection
While root detection overlaps with most of the above, it is still another layer of security a determined attacker would have to jump over (or walk around) in order to get protected apps to run on unusual systems. Checks are plenty, and as is the case for all the code described here, heavily obfuscated. If you are analyzing such files, keeping the Deobfuscators enabled and providing guard0/guard1 hints is key to a smooth analysis.
Static initializer of the principal root detection class. Most artifacts indicative of a rooted device are searched for by hash.
Build.prop checks. As was described in emulator detection.
su execution. Attempt to execute su, and verify whether su -c id == root
su presence. su is looked up in the following locations:
Magisk detection through mount. Check whether mount can be executed and its output contains databases/su.db (indicative of Magisk), or whether /proc/mounts contains references to databases/su.db.
Read-only system partitions. Check if any system partition is mounted as read-write (when it should be read-only). The result of mount is examined for any of the following entries marked rw:
Verify installed apps in the hope of finding one whose package name hashes to the hard-coded value:
0x9E6AE9309DBE9ECFL
Unfortunately, that value was not reversed; let us know if you find which package name generates this hash – see the algorithm below:
public static long hashstring(String str) {
    long h = 0L;
    for(int i = 0; i < str.length(); i++) {
        int c = str.charAt(i);
        h = h << 5 ^ (0xFFFFFFFFF8000000L & h) >> 27 ^ ((long)c);
    }
    return h;
}
NOTE: App enumeration is performed in two ways to maximize chances of evading partial hooks (the more convoluted way is sketched below).
More convoluted: iterate over all known MAIN intents: PackageManager.queryIntentActivities(new Intent("android.intent.action.MAIN")), derive the package name from the intent via ResolveInfo.activityInfo.packageName
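Putting the pieces together, a hedged sketch of that convoluted lookup (class and method names are ours; hashstring is the routine listed above):

import android.content.Context;
import android.content.Intent;
import android.content.pm.ResolveInfo;

public class PackageHashLookup {
    static final long TARGET = 0x9E6AE9309DBE9ECFL; // unresolved hard-coded value

    static boolean blacklistedAppInstalled(Context ctx) {
        // enumerate apps indirectly, by resolving all MAIN activities
        for (ResolveInfo ri : ctx.getPackageManager().queryIntentActivities(
                new Intent(Intent.ACTION_MAIN), 0)) {
            if (hashstring(ri.activityInfo.packageName) == TARGET) {
                return true;
            }
        }
        return false;
    }

    // same algorithm as listed above
    static long hashstring(String str) {
        long h = 0L;
        for (int i = 0; i < str.length(); i++) {
            h = h << 5 ^ (0xFFFFFFFFF8000000L & h) >> 27 ^ ((long) str.charAt(i));
        }
        return h;
    }
}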
SELinux verification. If the file /sys/fs/selinux/policy cannot be read, the check immediately passes. If it is readable, the policy is examined and hints indicative of a rooted device are looked for by hash comparison:
472001035L
-601740789L
The hashing algorithm is extremely simple, see below. For each byte of the file, the crc is updated and compared to hard-coded values.
long h = 0L;
//for each byte:
h = (h << 5 ^ ((long)(((char)b)))) & 0x3FFFFFFFL;
// check h against known list
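For completeness, a runnable version of that check might look like the following. This is a sketch: the class name and the exact comparison points are assumptions; the hash update and the hard-coded values come from the snippet and list above.

import java.io.FileInputStream;
import java.io.InputStream;

public class SelinuxPolicyCheck {
    // hard-coded values taken from the list above
    static final long[] KNOWN = { 472001035L, -601740789L };

    static boolean policyLooksRooted() {
        try (InputStream in = new FileInputStream("/sys/fs/selinux/policy")) {
            long h = 0L;
            int b;
            while ((b = in.read()) != -1) {
                // rolling hash updated for each byte, checked against the known values
                h = (h << 5 ^ ((long) ((char) b))) & 0x3FFFFFFFL;
                for (long k : KNOWN) {
                    if (h == k) {
                        return true;
                    }
                }
            }
        } catch (Exception e) {
            // policy not readable: the check immediately passes
            return false;
        }
        return false;
    }
}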
Running processes checks. All running processes and their command-lines are enumerated and hashed, and specific values are indirectly looked up by comparing against hard-coded lists.
APK Check
This verifier parses compressed entries in the APK (zip) file and compares them against well-known, hard-coded CRC values.
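In practice this amounts to reading the CRC-32 stored in the zip headers, something along these lines (a sketch; expectedCrc stands for the protector's hard-coded values, and the APK path would typically be Context.getPackageCodePath()):

import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ApkCrcCheck {
    static boolean entryTampered(String apkPath, String entryName, long expectedCrc)
            throws Exception {
        try (ZipFile zip = new ZipFile(apkPath)) {
            ZipEntry e = zip.getEntry(entryName);
            // the CRC-32 of the uncompressed entry data is stored in the zip headers
            return e == null || e.getCrc() != expectedCrc;
        }
    }
}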
Manifest Check
Consistency checks on the application Manifest consist of enumerating the entries in two different ways and comparing the results; discrepancies are reported (see the sketch below).
Open the archive’s MANIFEST.MF file via Context.getAssets(), parse manually
Use JarFile(Context.getPackageCodePath()).getManifest().getEntries()
Discrepancies in the Manifest could indicate system hooks attempting to conceal files added to the application.
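A sketch of the second enumeration path (the JarFile route); the first path, manual parsing of META-INF/MANIFEST.MF, is omitted here, and the helper name is ours:

import android.content.Context;
import java.util.Set;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class ManifestConsistencyCheck {
    static Set<String> entriesViaJarFile(Context ctx) throws Exception {
        try (JarFile jar = new JarFile(ctx.getPackageCodePath())) {
            Manifest mf = jar.getManifest();
            // per-entry attributes of META-INF/MANIFEST.MF, keyed by entry name
            return mf.getEntries().keySet();
        }
    }
    // The protector builds a second set by parsing the manifest manually and
    // reports any difference between the two sets.
}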
Permissions Check
This routine checks for permission discrepancies between what is declared by the app and what the system grants the app (sketched below).
Set A: App permission gathering: all permissions requested and defined by the app, as well as all permissions offered by the system, plus the INTERACT_ACROSS_USERS and INTERACT_ACROSS_USERS_FULL permissions,
Set B: Retrieve all permissions that exist on the system
Define set C = B – A
For every permission in C, use checkCallingOrSelfPermission (API 22-) or checkSelfPermission (API 23+) to verify that the permission is not granted.
Permission discrepancies can reveal system hooks or unorthodox execution environments.
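A minimal sketch of the grant check; sets A and B are passed in (collecting them is left out), and the class and method names are ours:

import android.content.Context;
import android.content.pm.PackageManager;
import android.os.Build;
import java.util.HashSet;
import java.util.Set;

public class PermissionDiscrepancyCheck {
    static boolean discrepancyFound(Context ctx, Set<String> setA, Set<String> setB) {
        Set<String> c = new HashSet<>(setB);
        c.removeAll(setA); // C = B - A: permissions the app never requested or defined
        for (String perm : c) {
            int res = Build.VERSION.SDK_INT >= 23
                    ? ctx.checkSelfPermission(perm)
                    : ctx.checkCallingOrSelfPermission(perm);
            // none of these should be granted; a grant hints at a tampered environment
            if (res == PackageManager.PERMISSION_GRANTED) {
                return true;
            }
        }
        return false;
    }
}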
Note the “X & -(A+1) | ~X & A” checks. Several opaque arithmetic/binary expressions attempt to complicate the control flow. Since -(A+1) is ~A, the expression reduces to X ^ A; here, it is never equal to v2, and therefore the if-check will always fail. JEB 3.20 does not clean all those artifacts.
Miscellaneous
Other runtime components include library code to perform SSL certificate pinning, as well as obfuscated wrappers around web view clients. None of those are of particular interest.
Wrapper for android.webkit.WebViewClient. Make sure to enable deobfuscators and provide guardX hints. When this is done, most methods will be crystal clear. In fact, the majority of them are simple forwarders.
Conclusion
That’s it for the obfuscation and runtime protection facility. Key take-aways when analyzing such protected code:
Keep the deobfuscators enabled
Locate the opaque integers, rename them to guard0/guard1 to give JEB a hint on where control flow deobfuscation should be performed, and redecompile the class
The second part in the series presents bytecode encryption and assets encryption.
VM in VM, repeat ad nauseam – something not new to code protection systems, it’s existed on x86 for more than a decade, but new on Android, and other players in this field, commercial and otherwise, seem to be implementing similar solutions. ↩
So-called “RASP”, a relatively new acronym for Runtime Application Self-Protection↩
Decryption and unreflection are generic processes of dexdec (the DEX Decompiler plugin); there is nothing specific to this protector here. The vast majority of encrypted data, regardless of the protection system in place, will be decrypted. ↩
A note about 2020 Q1 updates (versions 3.10 to 3.16) regarding the DEX/Dalvik decompiler modules:
Generic String Decryption
Lambda Recovery
Unreflecting Code
Decompiling Java Bytecode
Auto-Rename All
Generic String Decryption
JEB ships with a generic deobfuscator that can perform on-the-fly string decryption and other complex optimizations. Although this optimizer performs safe (i.e., guaranteed) optimizations in most cases, it is unsafe in the general case and therefore may be disabled in the options. Refer to the Engines options .parsers.dcmp_dex.EnableDeobfuscators and .parsers.dcmp_dex.EmulationSupport.
Many code protectors offer options to replace immediate string constants by method invocations that perform on-the-fly decryption.
A variety of techniques exist, ranging from simple one-off trivial decryptor methods, to complex schemes involving object(s) creation, complicated decryptors injected into third-party packages, non-trivial logic, junk code meant to slow down analyzers, use of opaque predicates, etc. They are implemented in countless ways. JEB’s generic deobfuscator can perform quick, safe emulation of the intermediate representation to provide a replacement. It may sometimes fail or bail out for several reasons, such as performance constraints or pitfalls like anti-emulation and anti-sandboxing techniques.
Example 1
The string decryptor is a static method reading encrypted string data in a class byte array. Also note that code reflection is used.
The above code (blue box) ends up being deobfuscated to:
Example 2:
The decryption methods were injected into library packages, e.g. Gson’s
The above code is deobfuscated to:
With the generic string decryptor optimizer ON
Below, a decryptor that had been injected into the com.google.gson.Gson() class:
The concat(String, int) method is not part of standard Gson, of course. It was injected by the protector and is used by (some) code to perform on-the-fly string decryption.
Example 3:
One last example, which was involuntarily – yet, quite timely! – provided by a user:
Static fields initialized with decrypted strings. After decryption.
Decrypting all strings: The decryptor kicks in when decompiling methods only. At the moment, if a string happens to be successfully decrypted, the optimizer does not attempt to recover all similarly encrypted strings in the code, although it is most certainly an addition that will make it into a future software update.
Rendering: You may quickly identify decrypted strings in the client as they are rendered using a special color associated with the itemId STRING_GENERATED, by default rendered in a flashy pink color in light and dark themes. Hovering over such items will bring up a pop-up with additional origin information, like the underlying code that would have generated that string:
Auto-decrypted strings in pink; overlays with source/origin information.
API:
– From a DEX perspective: generated strings are artificial, therefore IDexString.isArtificial() would return true.
– From a Java/AST perspective: IJavaConstant objects that embed origin information do so using the “origin” tag. Use IJavaConstant.getTags().get("origin") to retrieve it.
Lambda Recovery
JEB attempts to perform Java 8 style lambda recovery and reconstruction.
Desugared Lambdas
Recovery and reconstruction does not rely on any type of metadata 1, such as special prefixes -$$Lambda$ for classes and methods implementing desugared lambdas in dex 37-.
Options: Lambda reconstruction can be disabled in the options (Edit, Options, Engines, …). Lambda rendering can also be disabled in the options, as well as on-demand by right-clicking a decompiled view, Rendering Options….
Lambdas options
API Note: In the above cases, the underlying Java AST may be a IJavaNew or IJavaStaticField node. This is not the case for real (not desugared) lambdas, which map to an IJavaCall node – see below.
This DEX file contains real lambdas implemented via invoke-custom
API Note: Such lambdas map to an IJavaCall node for which isCustomCall() will return true.
Unreflecting Code
Many code protectors make heavy use of reflection – combined with string encryption, as we’ll see below – to obfuscate code. In practice, reflection is limited to method invocation (static and virtual), static and non-static field setting and getting, and new instance creation. A few examples:
v = Class.forName("java.lang.Integer").getMethod("valueOf", String.class).invoke(null, str);
// instead of
v = Integer.valueOf(str);

Class.forName("SomeClassName").getField("b").setInt(x, 4);
// instead of
x.b = 4;

Class.forName("java.lang.String").getConstructor(byte[].class).newInstance(val);
// instead of
new String(val);
Such code is generally protected by a catch-all handler that forwards the cause of any exception raised by a reflection issue:
By default, JEB will attempt to unreflect code. This deobfuscator is potentially unsafe and may be disabled in the options. Note that you always have the ability to choose, for a particular decompilation, whether some options should be temporarily enabled or disabled, by pressing CTRL+TAB (or COMMAND+TAB on macOS) to decompile (same as menu Action, Decompile with options…).
Unsafe deobfuscators can be globally disabled.
So, in a nutshell, code normally decompiled to:
Reflection, not cleaned (malware was obfuscated)
will be decompiled to:
With reflection cleaned
Technical Note: This optimizer works on the Intermediate Representation manipulated by the decompiler, not to be confused with the AST rendered as its output. (The AST cleaner that was described in an older post is more limited than this IR optimizer.)
Last-step failures: Successfully unreflecting code eventually depends on being able to find the intended target method or field matching the provided description (method parameter types or field type). Failure to do so will generate a log like "A candidate field/method/constructor for unreflection was not found".
Decompiling Java Bytecode
JEB supports JLS bytecode decompilation for *.class files and jar-like archives (jar, war, ear, etc.). The Java bytecode is converted to Dalvik using Android’s dx by default. Users may choose to use d8 instead (not recommended for now) by selecting it in the Options.
The resulting DEX file(s) are processed as usual.
You may use this to decompile Android Library files (*.aar files) in JEB.
Examining the android-arch-core-runtime library
Auto-Rename All
JEB 3.13 introduced a new generic action, Auto-Rename All. Its implementation is at the discretion of code plugins. The DEX plugin implements it, therefore users may execute Action, Auto-Rename All… at any time (generally after processing an obfuscated file) in order to rename code items such as field, method, or class names, to something more easily processable for our -limited- human brains.
Look at this horrendous obfuscation scheme below. It’s using right-to-left unicode characters to seriously mess up rendering:
Obfuscated name using RTL Arabic characters
Let’s run Action, Auto-Rename All… on this file:
Auto-Renaming capabilities are provided (optionally) by plugins. After auto-renaming code items in the above file. Not clearer in terms of meaning, but at least, it’s something we can start working on.
As usual, feel free to join us on Slack, message us on Twitter, or email us privately at support@pnfsoftware.com.
Until next time!
–
Relying on metadata leads to false negatives in the best case – e.g., when the code has been minified by something like ProGuard; it leads to false positives in the worst case – e.g. forged metadata to incite the decompiler to generate inaccurate or wrong code. ↩
In part 1 of this series, we gave an overview of the Intermediate Representation used by JEB’s Native Analysis Pipeline, as well as a simple Python script demonstrating how to use the API to access and print out IR-CFG of decompiled routines.
In part 2, we continue our exploration of JEB IR. We will show how to write a custom IR optimizer plugin to clean-up a custom obfuscation used in a piece of code. The resulting decompiled C code will end up very readable as well.
Before you proceed, make sure to update JEB Pro to version 3.1.1+.
Obfuscated Crypto-stealer Code
The sample we are going to look at monitors the Windows clipboard for cryptocurrency-looking wallet addresses, and replaces them with a desired target address. The sample is specifically targeting Ethereum wallet addresses. It is a neutered final-stage payload – the recipient address has been scrambled to render the code ineffective.
PE characteristics of file 1.exe
Although the payload is unpacked, what is interesting is that one of its key routines is obfuscated: custom garbage code was inserted.
Junk (useless) assignments
The garbage code is easy to go through: a bit of manual analysis shows that junk instructions are assigning pseudo-random values to an array whose bytes are never used. Two types of assembly patterns are present:
1- mov dword ptr [edi + offset], junk_value    ; edi previously init. to
                                               ; junkarray address

2- push junk_value
   pop dword ptr [junkarray_address + offset]
If we decompile that code and look at the final IR (as shown below), we can see that those instructions ended up being converted and optimized to the following type of assignment:
Assign(Mem(mem_address), Imm(junk_value))
Currently, the decompiled code looks like the following, hard-to-digest blob:
Snippet of decompiled code (obfuscated)
Although quite painful to read, we can follow the program’s logic by abstracting away the junk assignments. (Essentially, the win32 functions OpenClipboard, GetClipboardData, and SetClipboardData are used to retrieve, check, and replace copy-pasted Ascii and Unicode text, if they match the following pattern: “/0x(..){20}/”. The replacement string is the target wallet address, previously decrypted by sub_401000.)
Cleaning the Intermediate Representation
Recall that the native analysis pipeline can be simplified as the workflow below:
CodeObject (*)
-> Reconstructed Routines & Data
-> Conversion to IR (low-level, non-optimized)
-> IR Optimizations    <--- this is where we'll work
-> Final IR (higher-level, optimized, typed)
-> Generation of AST
-> AST Optimizations
-> Final AST (final, cleaned)
-> High-level output (eg, C variant)
Our custom IR optimizer will look for junk assignments and remove them. The important criteria are: what are the junk array start and end addresses? Is the array common to all routines in the binary, or is there one array per routine? Those questions may be hard to answer in the general case. However, for our specific sample file, we can assert with a high degree of certainty that the junk array:
– starts at address 0x415882
– is at most 256 bytes long
– is used solely by sub_401171, the routine we want to analyze
Because of the above restrictions, the IR optimizer we are going to write should be qualified as a custom or ad-hoc IR optimizer. Chances are, we won’t be able to reuse it as-is in other programs without some amount of tweaking.
Let’s get started, we will:
– create an Eclipse project with scaffold code for a Java back-end plugin
– write and test a custom IR optimizer with a headless client
– deploy the plugin and make it usable and accessible from the UI desktop client
Creating a Plugin Project
Before we proceed, make sure to:
Define an environment variable JEB_HOME, that points to your JEB installation folder
Open Eclipse and import the newly-created project into your Workspace (File, Import, Existing Projects into the Workspace, select the cloned repository folder, proceed)
Importing an existing project into Eclipse
Debugging the Obfuscation
Now that your project is imported in Eclipse, you should be able to see two source files in src’s default package:
Tester.java
EOptExample1.java
EOptExample1 is the IR optimizer plugin we will be working on. (Note that several classes of plugins exist; this one is a native IR optimizer, and therefore inherits from AbstractEOptimizer or one of its subclasses.)
Tester creates a headless JEB instance that loads the plugin EOptExample1.
Package Explorer view of the newly-created project
Then, create a JEB project and load the artifact file samples/1.exe (IMPORTANT: unzip 1.zip to 1.exe first – password: password)
Analyze the artifact
Retrieve a handle on the native decompiler
Retrieve a handle on the to-be-analyzed routine sub_401171
Perform a full decompilation of that routine
Let’s have a preliminary look at EOptExample1: This IR optimizer type is set to STANDARD, which is not ideal when you use custom optimizers tailored for specific code. A better IR optimizer class for those is ON_DEMAND: those optimizers are to be manually invoked, e.g. from JEB UI (menu: File, Advanced Unit Options). However, during development, since we are focusing on a particular file and routine, STANDARD type may be fine. Standard optimizers are called during regular IR optimization phases of the decompilation pipeline.
public class EOptExample1 extends AbstractEOptimizer {
    public EOptExample1() {
        super(DataChainsUpdatePolicy.UPDATE_IF_OPTIMIZED);
        getPluginInformation().setName("Sample IR Optimizer #1");
        getPluginInformation().setDescription("Remove IR-statements reduced to \"*(&garbage + delta) = xxx\"");
        getPluginInformation().setVersion(Version.create(1, 0, 0));

        // Standard optimizers are normally run, as part of the IR optimization stages in the decompilation pipeline
        setType(OptimizerType.STANDARD);
    }

    // replace all IR statements previously reduced to EMem ("[junk_address] = xxx") to ENop
    @Override
    public int perform(boolean updateDFA) {
        logger.info("IR-CFG before running custom optimizer \"%s\":\n%s", getName(),
                DecompilerUtil.formatIRCFGWithContext(2, cfg, ectx));
        // ...
        // optimizer code
    }
}
Note the plugin’s data-chains update policy, set to UPDATE_IF_OPTIMIZED. Optimizations that specify this flag tell their runner, aka the master optimizer that orchestrates them, that identifiers may be modified – hence, if optimizations occurred, a data flow analysis (DFA) pass needs to take place again. DFA update policies are a topic for another article.
Lines 3-5 are plugin metadata information, such as name and description, authorship, version numbers (including minimum/maximum JEB back-end versions), etc.
Before we deep-dive into perform(), let’s first set a breakpoint on line 15, where logger.info(…) is called. Then, start a debugging session for Tester: menu Run, command Debug (hotkey: F11.)
After a few seconds of analysis, your breakpoint should be hit; it corresponds to the first-time invocation of your custom optimizer. The logger prints out the IR-CFG that’s about to be optimized. Let’s have a look at it:
IR-CFG before running custom optimizer "Sample IR Optimizer #1":
>> IN(@0): ecx={@D} esp={@0} ebp={@1} ss={@1,@C,@18,@1D,@21,@24,@25,@27,@30,@35,@38,@3B,@3E,@3F,@41,@43,@46,@4F,@51,@54,@56,@59,@5C,@5D,@5F,@6B,@77,@81,@84,@9B,@9E,@A0,@AC,@B8,@BA,@BD,@BF,@C3,@C5,@C7,@CB,@CD,@D1,@D2,@D4,@E0,@E9,@EF,@F1,@F5,@F7,@FB,@FC,@FE,@100,@103,@106,@107,@109,@10C,@10E,@112,@114,@116,@11A,@11C,@11F,@122,@123,@12E,@131,@133,@137,@139,@13D,@13E,@140,@143,@145,@149,@14B,@14F,@150,@152,@15D,@173,@176,@179,@17C,@17D,@17F,@181,@18A,@18C,@18F,@191,@194,@196,@19A,@19D,@19E,@1A0,@1B3,@1BF,@1C2,@1D9,@1DC,@1E0,@1EC,@1EF,@1F1,@1F3,@1F7,@1F9,@1FC,@1FF,@202,@203,@205,@211,@21D,@220,@222,@226,@227,@229,@22B,@22E,@231,@232,@234,@237,@23A,@23C,@23E,@242,@244,@246,@24A,@24C,@250,@251,@25C,@25F,@262,@263,@265,@268,@26A,@26E,@271,@272,@27A,@27F,@295,@298,@29A,@29D,@2A4,@2A9,@2AD,@2B0,@2B2,@2B6,@2B7,@2BA,@2BC,@2C0,@2C2,@2C6,@2C7,@2CA,@2CD,@2D2,@2DE,@2E4,@2E7,@2E8,@2EA} ds={@F,@11,@19,@1E,@22,@28,@31,@36,@39,@3C,@40,@44,@4C,@4E,@52,@55,@57,@5A,@60,@6C,@74,@78,@7B,@82,@85,@86,@87,@8E,@90,@92,@9C,@A1,@A3,@A4,@AD,@B5,@B7,@BB,@C0,@C4,@C8,@CE,@D5,@E1,@EA,@ED,@F2,@F8,@FD,@FF,@101,@104,@10A,@10F,@113,@117,@11B,@11D,@120,@124,@12F,@134,@13A,@13F,@141,@146,@14C,@153,@15A,@15C,@163,@165,@166,@167,@170,@171,@174,@177,@17A,@17E,@180,@187,@189,@18D,@192,@197,@19B,@1A1,@1AB,@1B4,@1B7,@1B9,@1C0,@1C3,@1C4,@1C5,@1CC,@1CE,@1D0,@1DA,@1DD,@1DE,@1E1,@1E9,@1ED,@1F0,@1F4,@1FA,@1FD,@200,@206,@212,@219,@21B,@21E,@223,@228,@22A,@22C,@22F,@235,@238,@23B,@23F,@243,@247,@24D,@252,@25B,@25D,@260,@264,@266,@26B,@26F,@273,@27B,@27E,@285,@287,@288,@289,@292,@293,@296,@29B,@2A5,@2AA,@2AE,@2B3,@2B8,@2BD,@2C3,@2C8,@2CE,@2D0,@2D3,@2DF,@2E1,@2E2,@2E5,@2EB} OpenClipboard={@25} GetClipboardData={@3F,@17D} GlobalAlloc={@FC,@227} GlobalLock={@107,@232} GlobalUnlock={@13E,@263} SetClipboardData={@150,@272,@2B7,@2C7} CloseClipboard={@2CB} Sleep={@2E8} sub_401000={@D} sub_405010={@5D} sub_404F80={@D2} sub_4024E0={@123,@251} sub_404E54={@19E} sub_404E14={@203}
0000/1> s32:_esp = (s32:_esp - i32:00000004h) DU: esp={@1,@2,@B} | UD: esp={}
0001/1: 32<s16:_ss>[s32:_esp] = s32:_ebp DU: | UD: esp={@0} ebp={} ss={}
0002/9: s32:_ebp = s32:_esp DU: ebp={@38,@41,@46,@4F,@54,@56,@84,@9E,@B8,@BD,@C5,@FE,@100,@10C,@114,@11C,@131,@140,@15D,@176,@17F,@181,@18A,@18F,@194,@1C2,@1DC,@1EF,@1F1,@1FC,@229,@22B,@237,@23C,@244,@25C,@265,@27F,@298,@29D} | UD: esp={@0}
000B/1: s32:_esp = (s32:_esp - i32:0000002Ch) DU: esp={@C,@D,@17} | UD: esp={@0}
000C/1: 32<s16:_ss>[s32:_esp] = i32:0040117Ch DU: | UD: esp={@B} ss={}
000D/1: call s32:_sub_401000(s32:_ecx)->(s32:_eax){32[s32:_esp]} DU: eax={} | UD: ecx={} esp={@B} sub_401000={}
000E/1+ s32:_edi = i32:00415882h DU: edi={} | UD:
000F/1: 32<s16:_ds>[i32:00415944h] = i32:E2E60682h DU: | UD: ds={}
0010/1: s32:_eax = i32:00000001h DU: eax={} | UD:
0011/6: 32<s16:_ds>[i32:00415904h] = i32:7C64C0E4h DU: | UD: ds={}
0017/1: s32:_esp = (s32:_esp - i32:00000004h) DU: esp={@18,@1A} | UD: esp={@B,@2EC}
0018/1: 32<s16:_ss>[s32:_esp] = i32:E87A1612h DU: | UD: esp={@17} ss={}
0019/1: 32<s16:_ds>[i32:004158DDh] = i32:E87A1612h DU: | UD: ds={}
001A/1: s32:_esp = (s32:_esp + i32:00000004h) DU: esp={@1C} | UD: esp={@17}
001B/1: nop DU: | UD:
001C/1+ s32:_esp = (s32:_esp - i32:00000004h) DU: esp={@1D,@20} | UD: esp={@1A}
001D/1: 32<s16:_ss>[s32:_esp] = i32:CCA4A4A0h DU: | UD: esp={@1C} ss={}
001E/2: 32<s16:_ds>[i32:004158CAh] = i32:CCA4A4A0h DU: | UD: ds={}
0020/1: s32:_esp = s32:_esp DU: esp={@21,@23} | UD: esp={@1C}
0021/1: 32<s16:_ss>[s32:_esp] = i32:00000000h DU: | UD: esp={@20} ss={}
0022/1: 32<s16:_ds>[i32:00415951h] = i32:249E4228h DU: | UD: ds={}
0023/1: s32:_esp = (s32:_esp - i32:00000004h) DU: esp={@24,@25,@26} | UD: esp={@20}
0024/1: 32<s16:_ss>[s32:_esp] = i32:004011CAh DU: | UD: esp={@23} ss={}
0025/1: call s32:_OpenClipboard(32<s16:_ss>[(s32:_esp + i32:00000004h)])->(s32:_eax){32[s32:_esp]} DU: eax={@33} | UD: esp={@23} ss={} OpenClipboard={}
...
... (trimmed)
...
The above IR listing is a human-friendly representation of IR statements. The general format of this listing is:
- offset: IR statement offset
- length: IR statement length (generally, 1)
- C: indicates whether the instruction is:
  - the entry-point instruction (>)
  - the first of a basic-block (+)
  - any other instruction (:)
- insn: IR statement instruction (refer to Part 1 of this blog series)
- DU/UD: routine def-use and use-def chains
- IN: live input variables at the entry-point
- OUT: reaching output variables at a given exit point
We set a breakpoint on logger.info() and single-stepped one line. The output can be seen in the console view. It may be better (depending on how large your console buffer is) to examine the full output dumped to jeb-plugin-tester.log in your Temp folder.
The IR listing is relatively readable, although quite verbose at this early stage of optimization (roughly, the first pass in tier 1 of the analysis pipeline). The important idioms to look at here are:
Preliminary conversion of low-level junk inserts
a/ The first one is an Assign(Mem(Imm), Imm), which corresponds to optimized “mov [edi + offset], value”, where the value of edi was determined, propagated further, and the addition folded and converted to an immediate address.
b/ The second one is a partially optimized “push value / pop [address]”. Later optimizations phases will find and remove esp updates or esp-based operations, as was shown in the pseudo-code earlier. What we need to focus on here is the Assign(Mem(Imm), Imm), like the one in a/.
Those are the bits we will look for and modify: Assuming those assignments are useless, we will simply replace them by Nop statements.
Writing the Optimizer
At this point, our preliminary understanding of the obfuscation is enough to start writing the clean-up optimizer. Its code is extremely simple, for two main reasons:
– The obfuscation scheme itself is relatively trivial
– Other built-in JEB optimizers are giving us clean IR assignments to work on
Let’s look at the code of perform():
@Override
public int perform(boolean updateDFA) {
    final long garbageStart = 0x415882;
    final long garbageEnd = garbageStart + 0x100;
    int cnt = 0;

    for(int iblk = 0; iblk < cfg.size(); iblk++) {
        BasicBlock<IEStatement> b = cfg.get(iblk);
        for(int i = 0; i < b.size(); i++) {
            IEStatement stm = b.get(i);
            if(!(stm instanceof IEAssign)) {
                continue;
            }
            IEAssign asg = (IEAssign)stm;
            if(!(asg.getLeftOperand() instanceof IEMem)) {
                continue;
            }
            IEMem target = (IEMem)asg.getLeftOperand();
            if(!(target.getReference() instanceof IEImm)) {
                continue;
            }
            IEImm wraddr = (IEImm)target.getReference();
            if(!wraddr.canReadAsAddress()) {
                continue;
            }
            long addr = wraddr.getValueAsAddress();
            if(addr < garbageStart || addr >= garbageEnd) {
                continue;
            }
            b.set(i, ectx.createNop(stm));
            cnt++;
        }
    }
    return postPerform(updateDFA, cnt);
}
This optimizer inherits from AbstractEOptimizer. Therefore, the perform() method works on an IR-CFG. (Not all optimizers may choose to do so; it is sometimes easier to work directly on statements or expressions.)
perform() goes through all statements of every basic block of the IR-CFG. Using the instanceof operator, we check that the statement is an assignment such as: Mem(address) = Imm. The address is retrieved, and we make sure that it falls within the junk array. If those checks succeed, we replace the assignment by a Nop.
And that is it. Clean and simple – although, not quite portable, since the junk array address and size are hard-coded into the code! But that is not the point of this blog, and neither is portability a first-class goal when writing optimizers for custom code.
Next up, let’s see how to use the plugin in an interactive session using the desktop client.
Building, Deploying, Interactive Use
In order to use the optimizer within the JEB desktop client, we either:
Register the plugin as a development plugin;
Or build the plugin as a Jar and drop it in JEB’s coreplugins/ folder.
Development Plugin
This is the easiest option. You may consider it as an intermediate step between prototyping with the headless client, as demonstrated above, and a full-blown, deployed Jar plugin.
Open the Options panel, Development tab, tick the option “Development Mode”, add the bin/ folder of your plugin’s project to the classpath, and add the classname of your plugin entry-point:
Setting up a development plugin in JEB UI
Press OK and restart JEB. Your plugin will be loaded and ready to use. You may now skip to the section “Using the IR optimizer plugin”.
Building a Jar plugin
The alternative is to run build.cmd (on Windows) or build.sh (on Linux/macOS), which calls an Ant script in the scripts/ folder; therefore, make sure to have Ant installed on your system first. You may also customize the plugin name and version before building.
The resulting Jar plugin file will be generated in your project’s out/ folder. Copy it to your JEB coreplugins/ folder and start the JEB client. Your plugin will be automatically loaded, along with the other plugins.
Using the IR Optimizer Plugin
If your plugin has the type STANDARD (default), then, as explained earlier, it will be invoked by the optimizations’ orchestrator automatically, at various times during the decompilation pipeline. If that’s the mode you’d like to choose, make sure that your plugin is generic enough to handle all types of input routines, else you’re in for some strange surprises if you ever forget to remove it from your coreplugins/ folder.
An alternative is to convert it to an on-demand plugin:
public EOptExample1() {
    super(DataChainsUpdatePolicy.UPDATE_IF_OPTIMIZED);
    getPluginInformation().setName("Sample IR Optimizer #1");
    getPluginInformation().setDescription("Remove IR-statements reduced to \"*(&garbage + delta) = xxx\"");
    getPluginInformation().setVersion(Version.create(1, 0, 0));

    // Standard optimizers are normally run, as part of the IR optimization stages in the decompilation pipeline
    //setType(OptimizerType.STANDARD);

    // alternative (better for production / in UI use):
    setType(OptimizerType.ON_DEMAND);
    setPreferredExecutionStage(-NativeDecompilationStage.LIFTING_COMPLETED.getId());
    setPostProcessingActionFlags(PPA_OPTIMIZATION_PASS_FULL);
}
– Line 11 makes the optimizer on-demand: users must manually activate it, on specific code.
– Line 12 is recommended for on-demand optimizers: we specify at which point in the pipeline the plugin should be called.
– Finally, we set some post-processing flags, specifying that a full round of standard optimizations must be performed after our custom optimizer has run: this will allow cleaning up code remnants, and optimize our IR-CFG further – something made possible after running an optimization pass like this one.
On-demand optimizer plugins show up in the File, Advanced Unit Options dialog box, that you may bring up when a decompiled routine has the focus:
List of on-demand optimizers managed by a given decompiler instance
Tick the optimizer box, press OK. The routine will be re-decompiled.
Clean Code
Regardless of which method you choose, once cleaned up, the IR will allow for better downstream pipeline phases, including typing, AST generation, AST optimizations, etc.
The pseudo-C code has become quite readable:
The same decompiled method, after deobfuscation by the custom plugin.
Conclusion
That is it for part 2. We scratched the surface of IR optimizers (which themselves are a relatively small – albeit important – part of the overall decompilation pipeline 2) but it’s a good start. I strongly encourage you to experiment and ask your questions on our Slack channel. One ongoing effort right now is to bring the API documentation up to speed in terms of contents and sample code.
In part 3, we will continue exploring IR optimizers. Later on in the series, we will show how to write AST optimizers 3, how to write decompilation modules, and how existing decompilers can be customized further. Stay tuned!
JEB must have been previously run, at least once: EULA accepted, license key generated, etc. ↩
The decompilation pipeline is one component of the native analysis pipeline, which is one module, among tens, of the JEB back-end: the public API is worth exploring if you’re into advanced use cases. ↩
AST generation is one of the very final decompilation phases – working on the syntax tree serves different purposes than working on the IR ↩
The latest JEB release ships with our all-new Android resources (ARSC) decoder, designed to reliably handle tweaked, obfuscated, and sometimes malformed resource files.
As it appears that optimizing resources for space (eg, the WeChat team has made their compressor/refactoring module publicly available, etc.) or complexity (eg, commercial app protectors have been doing it for some time now) is becoming more and more commonplace, we hope that our users will come to appreciate this new module.
Here are the key points, followed by examples of what to expect from the new module.
ARSC Decoder Workflow
In terms of workflow, nothing changes: starting with JEB 2.3.10, the new Android Resources decoder module is enabled by default.
If you ever need to switch back to the legacy module, simply open the Options, Advanced panel, filter on AndroidResourcesDecoderSelector and set the value to 1 (instead of 2).
ARSC Decoder Output
In terms of output, users should see improvements in at least three areas:
First, the module can deal with obfuscated resources and malformed files better, resulting in lower failure rates. Ideally, we’d like to get as close as possible to zero failures, so please report issues!
Second, flattened, renamed, or generally refactored resources are handled as well, and the original res/ folder will be reconstructed, resulting in a readable Resources sub-tree.
Finally, the module can generate an aapt2-like text output to cope with the limitations of AOSP’s aapt/aapt2 (eg, crashes); the output can be quite large, so currently, aapt2-like output generation is disabled by default. To enable it, go to the Options, Advanced panel, filter on AndroidResourcesGenerateAapt2LikeOutput and set the value to true. The output will be visible as an additional fragment of the APK unit view:
aapt2-like output on a file that failed aapt2
Additional Input (APK Frameworks)
By default, the latest Android framework (currently API 27) is dropped by JEB in [HOME_FOLDER]/.jeb-android-frameworks/1.apk.
If an app you are analyzing requires additional framework libraries, drop them as [package_id].apk in that folder, and you should be good to go.
Example 1: flattened resources in a banking app
Here’s a sample that demonstrates what the output looks like with an app found on VirusTotal. The app is called itsme, the apk is protected by resources refactoring (res/ folder flattening) and trimming (renaming of files, name-less resource objects, etc.).
Have a look at the APK contents:
Protected app contents
aapt2 fails on it (resource id overlap):
error: trying to add resource 'be.bmid.itsme:attr/' with ID 0x7f010001 but resource already has ID 0x7f010000.
apktool 2.3.1 cannot reconstruct the resource tree either. Resources are moved to an unknown/ folder; on non-Linux systems, resource manipulation also fails due to illegal character names.
JEB does its best to rebuild the resources tree, and consistently renames illegally-named resources across the Resources base:
A rebuilt resources tree, originally obfuscated by GuardSquare (?)
Example 2: tweaked xml
The second file is a version of the Xapo Bitcoin wallet app 1, also found on VirusTotal. This app does not fail aapt2; however, it does fail other tools, including apktool 2.3.1:
I: Using Apktool 2.3.1 on 96cbabe2fb11c78a283348b2f759dc742f18368e0d65c5d0a15aefb4e0bdc645
I: Loading resource table...
I: Decoding AndroidManifest.xml with resources...
I: Loading resource table from file: [...]/1.apk
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 8601
The resources are flattened and renamed; the XML resources are oddly structured and stretch the XML specifications as well.
JEB handles things smoothly.
Conclusion
There are many more examples of “stretched” resources in APKs we’ve come across; however, we cannot share them at the moment.
If you come across unsupported scenarios or bugs, feel free to issue a report; we’ll happily investigate and update the module.