JEB Android Updates – Generic String Decryption, Lambda Recovery, Unreflecting Code, and More

Updated on March 11.

A note about 2020 Q1 updates (versions 3.10 to 3.16) regarding the DEX/Dalvik decompiler modules:

  • Generic String Decryption
  • Lambda Recovery
  • Unreflecting Code
  • Decompiling Java Bytecode
  • Auto-Rename All

Generic String Decryption

JEB ships with a generic deobfuscator that can perform on-the-fly string decryption and other complex optimizations. Although this optimizer performs safe (i.e., guaranteed) optimizations in most cases, it is unsafe in the general case case and therefore, may be disabled in the options. Refer to the Engines options .parsers.dcmp_dex.EnableDeobfuscators and .parsers.dcmp_dex.EmulationSupport.

Many code protectors offer options to replace immediate string constants by method invocations that perform on-the-fly decryption.

A variety of techniques exist, ranging from simple one-off trivial decryptor methods, to complex schemes involving object(s) creation, complicated decryptors injected in third-party packages, non-trivial logic, junk code meant to slow down analyzers, use of opaque predicates, etc. They are implemented in an infinite number of ways. JEB’s generic deobfuscator can perform quick, safe emulation of the intermediate representation to provide a replacement. It may sometimes fail or bail out due to several reasons, such as performance or pitfalls like anti-emulation and anti-sandboxing techniques.

Example 1

The string decryptor is a static method reading encrypted string data in a class byte array. Also note that code reflection is used.

The above code (blue box) ends up being deobfuscated to:

Example 2:

The decryption methods were injected into library packages, e.g. Gson’s

The above code is deobfuscated to:

With the generic string decryptor optimizer ON

Below, a decryptor that had been injected into the com.google.gson.Gson() class:

The concat(String, int) method is not part of standard Gson, of course. It was injected by the protector and is used by (some) code to perform on-the-fly string decryption.

Example 3:

One last example, which was involuntarily – yet, quite timely! – provided by a user:

Static fields initialized with decrypted strings.
After decryption.

Decrypting all strings: The decryptor kicks in when decompiling methods only. At the moment, if a string happens to be successfully decrypted, the optimizer does not attempt to recover all similarly encrypted strings in the code, although it is most certainly an addition that will make it in a future software update.

Rendering: You may quickly identify decrypted strings in the client as they are rendered using a special color associated with the itemId STRING_GENERATED, by default rendered in a flashy pink color in light and dark themes. Hovering over such items will bring up a pop-up with additional origin information, like the underlying code that would have generated that string:

Auto-decrypted strings in pink; overlays with source/origin information.

API:
– From a DEX perspective: Generated strings are artificial. Therefore, IDexString.isArtificial() would return true.
– From a Java/AST perspective: IJavaConstant objects that embed origin information do so using the “origin” tag. Use IJavaConstant.getTags().get("origin") to retrieve it.

Lambda Recovery

JEB attempts to perform Java 8 style lambda recovery and reconstruction.

Desugared Lambdas

Recovery and reconstruction does not rely on any type of metadata 1, such as special prefixes -$$Lambda$ for classes and methods implementing desugared lambdas in dex 37-.

You may therefore see constructs like this:

This DEX file contains desugared, non-obfuscated lambdas.
This DEX file contained desugared, obfuscated lambdas

Options: Lambda reconstruction can be disabled in the options (Edit, Options, Engines, …). Lambda rendering can also be disabled in the options, as well as on-demand by right-clicking a decompiled view, Rendering Options….

Lambdas options

API Note: In the above cases, the underlying Java AST may be a IJavaNew or IJavaStaticField node. This is not the case for real (not desugared) lambdas, which map to an IJavaCall node – see below.

Real Lambdas

Lambda reconstruction also takes place when the code has not been desugared (which is rare!), i.e. code relying on dex38’s invoke-custom and invoke-polymorphic.

This DEX file contains real lambdas implemented via invoke-custom

API Note: Such lambdas map to an IJavaCall node for which isCustomCall() will return true.

Unreflecting Code

Many code protectors make heavy use of reflection – combined with string encryption, as we’ll see below – to obfuscate code. In practice, reflection is limited to method invocation (static and virtual), static and non-static field setting and getting, and new instance creation. A few examples:

v = Class.forName("java.lang.Integer").getMethod("valueOf",
        String.class).invoke(null, str);
// instead of 
v = Integer.valueOf(str);
Class.forName("SomeClassName").getField("b").setInt(x, 4);
// instead of 
x.b = 4;
Class.forName("java.lang.String").getConstructor(byte[].class)
        .newInstance(val);
// instead of
new String(arg6);

Such code is generally protected by a catch-all handler that forwards the cause of any exception raised by a reflection issue:

try {
    // ...
}
catch(Throwable e) {
    throw e.getCause();
}

By default, JEB will attempt to unreflect code. This deobfuscator is potentially unsafe and may be disabled in the options. Note that you always have the ability to choose, for a particular decompilation, whether some options should be temporarily enabled or disabled, by pressing CTRL+TAB (or COMMAND+TAB on macOS) to decompile (same as menu Action, Decompile with options…).

Unsafe deobfuscators can be globally disabled.

So, in a nutshell, code normally decompiled to:

Reflection, not cleaned (malware was obfuscated)

will be decompiled to:

With reflection cleaned

Technical Note: This optimizer works on the Intermediate Representation manipulated by the decompiler, not to be confused with the AST rendered as its output. (The AST cleaner that was described in an older post is more limited than this IR optimizer.)

Last-step failures: Successfully unreflecting code eventually depends on being able to find the intended target method or field matching the provided description (method parameter types or field type). Failure to do so will generate a log like "A candidate field/method/constructor for unreflection was not found".

Decompiling Java Bytecode

JEB supports JLS bytecode decompilation for *.class files and jar-like archives (jar, war, ear, etc.). The Java bytecode is converted to Dalvik using Android’s dx by default. Users may choose to use d8 (not recommended for now) instead by selecting so in the Options.

The resulting DEX file(s) are processed as usual.

You may use this to decompile Android Library files (*.aar files) in JEB.

Examining the android-arch-core-runtime library

Auto-Rename All

JEB 3.13 introduced a new generic action, Auto-Rename All. Its implementation is at the discretion of code plugins. The DEX plugin implements it, therefore users may execute Action, Auto-Rename All… at any time (generally after processing an obfuscated file) in order to rename code items such as field, method, or class names, to something more easily processable for our -limited- human brains.

Look at this horrendous obfuscation scheme below. It’s using right-to-left unicode characters to seriously mess up rendering:

Obfuscated name using RTL Arabic characters

Let’s run Action, Auto-Rename All… on this file:

Auto-Renaming capabilities are provided (optionally) by plugins.
After auto-renaming code items in the above file. Not clearer in terms of meaning, but at least, it’s something we can start working on.

As usual, feel free to join us on Slack, message us on Twitter, or email us privately at support@pnfsoftware.com.

Until next time!

  1. Relying on metadata leads to false negatives in the best case – e.g., when the code has been minified by something like ProGuard; it leads to false positives in the worst case – e.g. forged metadata to incite the decompiler to generate inaccurate or wrong code.