Language Translation Contribution in Python; VirusTotal Hash Check Plugin in Java.

This post is geared toward power-users who would like to take advantage of API additions that shipped with the latest JEB update.1

TL;DR: see below for a language translation contribution in Python, and a VirusTotal hash check plugin in Java.

Contributions

With JEB 2.3.6, users can now write their own unit contribution plugins in Python (or Java, of course).

First, let’s recap: JEB extensions consist of back-end plugins, and front-end scripts. Front-end scripts are written in Python and execute in the context  of a client (generally, the UI client, but it could also be a script executed by a headless, command-line JEB client). Back-end plugins form a more diverse realm: they consist of parser plugins (eg, disassemblers, decompilers, decoders, etc.), generic engines plugins, and contribution plugins.  They are mostly written in Java – although that is slowly changing as we are adding program-wide support for JEB extensions in Python.

Contribution plugins can enhance the output produced by parser plugins. A concrete example: an interactive disassembly or other text output (eg, a decompiled piece of Java or C code) is made of text items; a contribution can provide additional information to a client about a given item, when the client requests it. When it comes to the main JEB UI client, that information can be requested when a user hovers its mouse over an interactive text item.

Several contributions are already built-in, such as those providing live variable and register values when debugging a program; or the Javadoc contribution that displays API documentation on Java disassembly. Users may also write their own contributions.

  • Contributions extend IUnitContribution;
  • They can target any type of unit;
  • They can be written in Java or in Python;
  • They are plugins,  and as such, should be dropped into the JEB’s coreplugins/ folder (Python contributions will need a Jython package in that folder as well);
  • A Python contribution must be named exactly like the contribution class name (in the above below, SampleContribution.py)

The skeleton of a Python contribution that would enhance all code units would look like:

class SampleContributionPlugin(IUnitContribution):

  def __init__(self):
    pass
  
  def getPluginInformation(self):
    return PluginInformation(...)

  def isTarget(self, unit):
    return isinstance(unit, ICodeUnit)

  def setPrimaryTarget(self, unit):
    self.target = unit

  def getPrimaryTarget(self):
    return self.target

  def getItemInformation(self, targetUnit, itemId, itemText):
    # provide info about an item or a bit of text

  def getLocationInformation(self, targetUnit, location):
    return None

We uploaded a sample contribution plugin that works for text documents produced by any type of parser plugin (eg, disassembly, decompiled code, etc.). The contribution uses Google to provide real-time translations of the text snippet your mouse pointer is currently on:

The translation contribution translates foreign language text items to English when the user hovers their mouse over them; here, an Arabic string found in a malware sample of Mirai is being translated.

Note that you do not need a Google API key for it to work: the plugin scrapes Google search out; as such it is quite brittle and will almost certainly break in the future, but keep in mind this is a demo/sample to get you started for your own contributions.

VirusTotal Report Plugin

On a side-note, JEB 2.3.6 also ships with a VirusTotal hash checker plugin (disabled by default). This plugin automatically checks the hash of top-level units against the VirusTotal database.

We open-sourced it on GitHub (VirusTotalReportPlugin.java).

To set it up, run File, Plugins, Execute an Engines Plugin, VT Report Plugin:

To set up the VT plugin, you will need a VT API key.

Then, enter your VirusTotal API key; you’re good to go. Newly processed files will be automatically checked against VT and a log message as well as a notification will be stored to let you know the outcome.

The notification produced by the JEB VT plugin: here, the file looks bad (28 anti-virus products marked it as such)

That’s it for today — until next time!

Defeating an Android application protector

AppSolid is a cloud-based service designed to protect Android apps against reverse-engineering. According to the editor’s website, the app protector is both a vulnerability scanner as well as a protector and metrics tracker.

This blog shows how to retrieve the original bytecode of a protected application. Grab the latest version of JEB (2.2.5, released today) if you’d like to try this yourself.

Bytecode Component

Once protected, the Android app is wrapped around a custom DEX and set of SO native files. The manifest is transformed as follows:

Manifest of a protected app. The red boxes indicate additions. The red line indicates removal of the LAUNCHER category from the original starting activity
  • The package name remains unchanged
  • The application entry is augmented with a name attribute; the name attribute references an android.app.Application class that is called when the app is initialized (that is, before activities’ onCreate)
  • The activity list also remain the same, with the exception of the MAIN category-filtered activity (the one triggered when a user opens the app from a launcher)
  • A couple of app protector-specific activity are added, mainly the com.seworks.medusah.MainActivity, filtered as the MAIN one

Note that the app is not debuggable, but JEB handles that just fine on ARM architectures (both for the bytecode and the native code components). You will need a rooted phone though.

The app structure itself changes quite a bit. Most notably, the original DEX code is gone.

Structure of the protected app. The fake PNG file contains encrypted assets of the original app, which are handled by libmd.so.
  • An native library was inserted and is responsible for retrieving and extracting the original DEX file. It also performs various anti-debugging tricks designed to thwart debuggers (JEB is equipped to deal with those)
  • A fake PNG image file contains an encrypted copy of the original DEX file; that file will be pulled out and swapped in the app process during the unwrapping process
Snippet of high_rezolution.png – 0xDEADCODE

Upon starting the protected app, a com.seworks.medusah.app object is instantiated. The first method executed is not onCreate(), but attachBaseContext(), which is overloaded by the wrapper. There, libmd is initialized and loadDexWithFixedkeyInThread() is called to process the encrypted resources. (Other methods and classes refer to more decryption routines, but they are unused remnants in our test app. 1)

Application.attachBaseContext() override is the true entry-point of a protected app
Unwrapper thread
Calling into the native file for decryption and swapping

The rest of the “app” object are simple proxy overrides for the Application object. The overrides will call into the original application’s Application object, if there was one to begin with (which was not the case for our test app.)

Proxy stubs to the original Application’s object

The remaining components of the DEX file are:

  • Setters and getters via reflection to retrieve system data such as package information, as well as stitch back the original app after it’s been swapped in to memory by the native component.
  • The main activity com.seworks.medusah.MainActivity, used to start the original app main activity and collect errors reported by the native component.

Native Component

The protected app shipped with 3 native libraries, compiled for ARM and ARM v7. (That means the app cannot run on systems not supporting these ABIs.) We will focus on the core decryption methods only.

As seen above, the decryption code is called via:

m = new MedusahDex().LoadDexWithFixedkeyInThread(
    getApplicationInfo(), getAssets(),
    getClassLoader(), getBaseContext(),
    getPackageName(), mHandler);

Briefly, this routine does the following:

  • Retrieve the “high_resolution.png” asset using the native Assets manager
  • Decrypt and generate paths relative to the application
    • Permission bits are modified in an attempt to prevent debuggers and other tools (such as run-as) to access the application folder in /data/data
  • Decrypt and decompress the original application’s DEX file resource
    • The encryption scheme is the well-known RC4 algorithm
    • The compression method is the lesser-known, but lightning fast LZ4
    • More about the decryption key below
  • The original DEX file is then dumped to disk, before the next stage takes place (dex2oat’ing, irrelevant in the context of this post)
  • The DEX file is eventually discarded from disk

Retrieving the decryption key statically appears to be quite difficult, as it is derived from the hash of various inputs, application-specific data bits, as well as a hard-coded string within libmd.so. It is unclear if this string is randomly inserted during the APK protection process, on the server side; verifying this would require multiple protected versions of the same app, which we do not have.

ARM breakpoint on DecryptFileWithFixedKey(), step over, and destination file in r6

A dynamic approach is better suited. Using JEB, we can simply set a breakpoint right after the decryption routine, retrieve the original DEX file from disk, and terminate the app.

The native code is fairly standard. A couple of routines have been flattened (control-flow graph flattening) using llvm-obfuscator. Nothing notable, aside from their unconventional use of an asymmetric cipher to further obscure the creation of various small strings. See below for more details, or skip to the demo video directly.

Technical note: a simple example of white-box cryptography

The md library makes use of various encryption routines. A relatively interesting custom encryption routine uses RSA in an unconventional way. Given phi(n) [abbreviated phi] and the public exponent e, the method brute-forces the private exponent d, given that:
d x e = 1 (mod phi)

phi is picked small (20) making the discovery of d easy (3).

Find d given phi and e, then decrypt using (d, n)
Decryption (p = c^d (mod n)) … with a twist

The above is a simple example of white-box cryptography, in which the decryption keys are obscured and the algorithm customized and used unconventionally. At the end of the day, none of it matters though: if the original application’s code is not protected, it – or part of it – will exist in plain text at some point during the process lifetime.

Demo

The following short video shows how to use the Dalvik and ARM debuggers to retrieve the original DEX file.

This task can be easily automated by using the JEB debuggers API. Have a look at this other blog post to get started on using the API.

Conclusion

The Jar file aj_s.jar contains the original DEX file with a couple of additions, neatly stored in a separate package, meant to monitor the app while it is running – those have not been thoroughly investigated.

Overall, while the techniques used (anti-debugging tricks, white box cryptography) can delay reverse engineering, the original bytecode could be recovered. Due to the limited scope of this post, focusing on a single simple application, we cannot definitively assert that the protector is broken. However, the nature of the protector itself points to its fundamental weakness: a wrapper, even a sophisticated one, remains a wrapper.

  1. The protector’s bytecode and native components could use a serious clean-up though, debugging symbols and unused code were left out at the time of this analysis.

Crypto Monitoring with the Android Debuggers API

Updated on May 4: JEB 2.2.3 is out. All users can now use the Android debugger modules.

In this short post, we will show how the debuggers API can be used to monitor an app execution, hook into various key methods and classes of the standard Java cryptography SPI, and extract input and output data, as they flow in and out encryption/decryption routines.

Very handy to retrieve encrypted data used within an app or exchanged with a remote server. 1 Check out the following video to see what we are talking about:

The sample code of the AndroidCryptoHook plugin can be found on our public GitHub repository.

This simple plugin does the following:

  • It looks for an active Dalvik debugging session
  • It sets up a debugger listener, which will listen for BREAKPOINT and BREAKPOINT_FUNCTION_EXIT events
  • It currently “hooks” 3 methods of the javax.crypto.Cipher abstract class:
    • byte[] doFinal(byte[] input)
    • int doFinal (byte[] output, int outputOffset)
    • int update(byte[] input, int inputOffset, int inputLen, byte[] output)
  • When any of the hooked method is called, the associated hook onEntry method is executed, which will dump interesting input parameters
  • When the same hooked method returns, the associated hook onExit method is executed, which will dump interesting exit parameters and return value

The hook here consists of a double breakpoint, one triggered when a method is entered, another one, when it exits.

A hook on doFinal() capturing plain text data just before it gets encrypted

The code for that Java plugin is fairly simple. More hooks could be easily added, and hooks in native libraries could be set up in a similar fashion. Lastly, always keep in mind that the API in general (and this plugin in particular) can be leveraged by UI or headless clients. Automate things away if you need to.

The one and only entry-point for developer resources is our Developer Portal. Do not hesitate to reach out, publicly or privately, if you have issues or pointed questions. Thank you.

  1. Dynamic execution monitoring can be achieved in several ways. Debugging a target is one of them.

Deobfuscating Android Triada malware

The Triada malware has received a lot of news coverage recently. Kaspersky was one of the first firm to publish an analysis of this Trojan earlier last week.

The code is obfuscated, and most strings are encrypted. The string encryption algorithm is trivial, but ever-changing across classes: bytes are incremented or decremented by constant values, either stored in a default decryptor method, or retrieved via calls to other methods. The result is something quite annoying to handle if you decide to perform a serious static analysis of the file.

Encrypted string buffers in Triada. Decryption routines can be seen in the decompiled class on the right-hand side.

Our intern Ruoxiao Wang wrote a very handy decryption script for Triada. It needs customizing (the decryption keys are not automatically retrieved) on a per-class basis, but the overall effort is a couple of seconds versus hours spending doing tedious and repetitive semi-manual work.

The script will decrypt the encrypted byte arrays and replace the decompiled Java fields supposedly holding the final strings by their actual value, as seen in the picture below.

Decrypted strings. Comments (in the left-side red box) indicate the string use was not found via xrefs. The right-side red box shows updated String fields after decryption.

The script can also be used as a tutorial on how to use the JEB Java AST API to look for and modify the AST of decompiled code.  (More examples be seen on our GitHub sample script repo.)

Download the Triada decryptor script here:
TriadaStringDecryptor.py

(Specific instructions are located in the script header.)

Changes in JEB 2.1… And a holiday season gift

JEB 2.1 is just around the corner! Users with a valid subscription should expect software updates today or tomorrow. This major update represents the maturation of JEB 2.0, and paves the way for JEB 2.2, which will introduce modules for x86 and ARM.

View the full change log for JEB 2.1

The following is a non-exhaustive list of notable changes.

Navigation bars in text views

The navigation bar is interactive (zoom in and out for finer granularity, visualize currently loaded text area, etc.) and customizable. It is connected to the metadata manager of an interactive unit. Client code (eg, in plugins or scripts) can manipulate this metadata: add, remove, query metadata and metadata groups, etc.

Sample script: JEB2CustomizeMetadata.py

The navigation bar can be seen on the right-hand side

Java AST API

The newly introduced com.pnfsoftware.jeb.core.units.code.java API package allows compliant Java source units to offer direct manipulation of AST code to clients. Units produced by our native Dalvik decompiler are obviously compliant. Plugins and scripts may use this API to implement complex refactoring/code-cleanup operations, as was demonstrated with JEB 1.x in the past.

The IJavaSourceUnit interface is your entry-point to AST elements

Note that the Java AST API has changed significantly relative to JEB1’s. A few missing features will be implemented in future service releases of JEB 2.1.x (eg, tagging) but overall, it is more powerful than JEB1’s: most objects can be constructed and modified, AST elements that were not offered by the older API, such as Annotations, are now accessible, etc. – not to mention, those units are now persisted! (See our next section.)

Sample scripts:

Improved persistence

Semi-related to the above paragraph, we are glad to announce that the decompiled Java code (and all the modifications applied to it via the Java AST API) are now persisted to the JDB2 file when saving a project to file.

The persistence mechanism has undergone significant changes and fixes, and therefore, some JDB2 generated by JEB 2.0 might not be compatible with JEB 2.1.

The newer version of the PDF plugin also supports persistence.

API for UI scripting

The IGraphicalClientContext has been augmented to support more operations, such as enumerating views and fragments, setting the focus on a specific view, setting or retrieving the active address and active items, etc.

Sample scripts:

A sample UI script showing how to start an asynchronous, interruptible task

The official RCP client implements the UI-API. We are planning to add more UI primitives in the upcoming maintenance releases.

API changes

Here is an incomplete list of API changes that took place between the latest 2.0 and the initial 2.1 releases:

  • Most protected members of the AbstractUnit hierarchy of skeleton classes have been privatized.  The current guidelines is to use the provided getters, setters, and other accessor methods.
    • Side-note: As always, we encourage plugin developers to use abstract implementations instead of implementing interfaces from scratch, when abstracts are available.
  • Units offer a way to persist or not-persist their children units when saving a project to disk. Changes took place in IUnit.addChild() and co. methods.
  • We added a notification manager (for all units) and introduced a metadata manager (for interactive units only).
  • The formatter (aka, the unit output manager) was also revamped: it can now yield transient and persisted documents. Write-access is also permitted, which means that plugins and scripts can add documents (such as texts, tables, or trees), and request that they’re persisted in the JDB2 database. We uploaded sample scripts here:
  • The client notifications are now called ClientNotification, to avoid potential confusion with another type of notification used within JEB (the unit notifications).

Next up

We are planning a few maintenance updates before the release of JEB 2.2. The currently planned release date for JEB 2.2 is early February 2016.

We will keep you posted on this very blog. Stay tuned, and a happy holiday season to all.

PS: as an early Christmas gift, we have uploaded a new third-party plugin on our public repository. Check out the Antivirus Quarantines File Extractor plugin. We currently support Kaspersky KLQ quarantine files only, but are planning to add more soon. If you’d like to contribute, please send us an email.

Scanning PDF Files using JEB2

Update (9/13/2017): we open-sourced the PDF plugin. A compiled JAR binary is also available.

Update: Feb. 27: Slides – Automation How-To
Update: Dec. 3: List of notifications

In this blog post, we show how JEB2 can be used as a building block of a file analysis system. We will show how to use the Core API to create a headless client. That client will scan PDF files using the JEB2 PDF Analysis Module. Basics of the IUnit and co. interfaces is also demonstrated.

Source code on GitHub.

Sample execution output produced by the PDF Scanner

As this slide deck shows, the back-end and front-end components of JEB2 are separated. The official RCP desktop client uses the JEB2 Core API; other front-ends, like the PDF scanner, can be built using that same API.

JEB2 HL Architecture Diagram

Creating an Eclipse project

Let’s get started by creating a new code project. We will show how to do this in Eclipse.

0- Check your license of JEB2. Make sure to use a license that supports third-party client creation and the loading of third-party plugin. If you haven’t done so, download and drop the PDF module in your coreplugins/ sub-directory.

1- Clone our sample code repository: git clone https://github.com/pnfsoftware/jeb2-samplecode.git

2- Create a new Java project. The Java source folder should be rooted in the src/ directory.

3- Add the JEB2 back-end as a JAR dependency. The back-end software is contained in the file bin/cl/jeb.jar located within your installation folder. You may also want to link that JAR to the API documentation, contained in the doc/apidoc.jar file, or online at https://www.pnfsoftware.com/jeb/apidoc

Your Package Explorer view should now look like:

Package explorer view after setting up dependencies

5- Set up the execution options. The required Java properties for execution (jeb.engcfg and jeb.lickey) can be set in the Run Configurations panel (accessible via the Run menu). Example:

Example of a Run configuration

6- Open the com.pnf.pdfscan.PDFScanner source file. You are ready to execute main().

How the scanner works

Now, let’s focus on the scanner source code.

  • The JEB2 back-end is initialized when scanFiles() is called:
    • Use JebCoreService to retrieve an instance to ICoreContext
    • Create an IEnginesContext
    • Load a project within that context (IRuntimeProject)
    • Add artifact(s) and process them (ILiveArtifact)
      • We add a single file artifact per project in this example
    • Retrieve the products (IUnit)
      • We are retrieving the top-most unit only in this example
    • Analyze the unit (see assessPdf())
    • Close the project

[Note: A detailed explanation of the above concepts (core, engines, project, artifacts, units, etc.) is outside the scope of this tutorial. Refer to our Developer Portal for more information.]

Snippet of scanFiles()

The assessPdf() method evaluates PDF units. The evaluation performed by this sample scanner is trivial: we collect the notifications created by the PDF plugin during the analysis of the file, and see if they meet basic criteria.

About the Unit Notifications:

  • Any JEB2 plugin can attach notifications to its units. The PDF plugin does so. Notifications are meant to pin-point noteworthy areas of a unit or artifiact.
  • A notification has a “dangerosity level” ranging from 0 to 100. It also has a description, an optional address to point to which area of the unit the notification is associated with, etc.
  • The API offers standard notification types, ranging from “Interesting area” to “Definitely Malicious”.
Standard notification levels offered in the NotificationType enum

A PDF unit can contain several types of notifications. Example include: corrupt areas in stream; multiple encoding of stream; JavaScript; password-protected stream; invalid/illegal entries in stream; etc.

Link: Complete list of notifications issued by the PDF plugin.

Our simple scanner reports a file as suspicious if it contains at least 2 notifications that have a level >= 70 (POTENTIALLY_HARMFUL). These thresholds can be tweaked in the source code.

The assessPdf() routine

The screenshot below is a sample output produced by the PDF scanner:

Conclusion

The intent of this entry is to shed some light on the process of writing third-party clients for JEB2, as well as what and how to use notifications reported by Units. We encourage you to visit our Developer Portal to find additional documentations as well as the reference Javadoc of the API.

Writing client scripts for JEB2 using Python

The latest release of JEB2, version 2.0.14, introduces a feature familiar to JEB1 users: client scripts written in Python.

Both Standard and Business licenses permit running scripts. They can be written using the Python 2.5 or 2.7 syntax and features, and are executed by Jython. (A Jython stand-alone package is required to run scripts. We recommend version 2.5. Download it and drop it in your JEB2 scripts/ sub-directory.)

Feature-wise, scripts use the standard JEB2 core APIs. They are also using the client API, available in the com.pnfsoftware.jeb.client.api package. As usual, refer to our Developer Portal and Javadoc website for API reference and usage.

A client script implements the IScript interface. Upon execution, the script run() entry-point method is provided an IClientContext or derived object, such as an IGraphicalClientContext for UI clients. (The official RCP desktop client falls in the latter category.)

Here is the simplest of all scripts:

from com.pnfsoftware.jeb.client.api import IScript
class JEB2SampleScript(IScript):
  def run(self, ctx):
    print('Hello, JEB2')

Within the official desktop client, scripts can be executed via the File, Scripts menu item.

Finally, remember that scripts are meant to execute small, light-weight actions. Heavy lifting operations (such as parsing or background event-driven tasks) should be implemented by back-end plugins in Java.

Check out our GitHub repository for more sample scripts.

Developing JEB2 parsers and plugins

Update (11/2): parts 7 and 8 are available.

Our tutorials are available on the JEB2 developer portal, which aggregates all resources for API developers:

JEB2 Developer Portal

  1. Getting Started with Parsers
  2. Creating a Simple Parser
  3. Documents and Delegation
  4. Tables and Trees
  5. Development Tips
  6. Releasing a Plugin
  7. Interactivity
  8. Interactivity, Part 2
  9. Persistence (To be published)