Scanning PDF Files using JEB2

Update (9/13/2017): we open-sourced the PDF plugin. A compiled JAR binary is also available.

Update: Feb. 27: Slides – Automation How-To
Update: Dec. 3: List of notifications

In this blog post, we show how JEB2 can be used as a building block of a file analysis system. We will show how to use the Core API to create a headless client. That client will scan PDF files using the JEB2 PDF Analysis Module. The basics of IUnit and related interfaces are also demonstrated.

Source code on GitHub.

Sample execution output produced by the PDF Scanner

As this slide deck shows, the back-end and front-end components of JEB2 are separated. The official RCP desktop client uses the JEB2 Core API; other front-ends, like the PDF scanner, can be built using that same API.

JEB2 HL Architecture Diagram

Creating an Eclipse project

Let’s get started by creating a new code project. We will show how to do this in Eclipse.

0- Check your JEB2 license. Make sure to use a license that supports third-party client creation and the loading of third-party plugins. If you haven’t done so already, download the PDF module and drop it in your coreplugins/ sub-directory.

1- Clone our sample code repository: git clone https://github.com/pnfsoftware/jeb2-samplecode.git

2- Create a new Java project. The Java source folder should be rooted in the src/ directory.

3- Add the JEB2 back-end as a JAR dependency. The back-end software is contained in the file bin/cl/jeb.jar located within your installation folder. You may also want to link that JAR to the API documentation, contained in the doc/apidoc.jar file, or online at https://www.pnfsoftware.com/jeb2/apidoc

Your Package Explorer view should now look like:

Package explorer view after setting up dependencies

5- Set up the execution options. The required Java properties for execution (jeb.engcfg and jeb.lickey) can be set in the Run Configurations panel (accessible via the Run menu). Example:

Example of a Run configuration
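
As a minimal sketch of how a headless client might check that those two properties are present before starting the back-end: the class and method names below are hypothetical and are not taken from the sample scanner, which may read the properties differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper: reports which of the required JVM properties are unset. */
final class JebLaunchProperties {
    // Property names as given in step 5.
    static final String ENGINES_CONFIG = "jeb.engcfg";
    static final String LICENSE_KEY = "jeb.lickey";

    /** Returns the names of the required properties that are absent or blank. */
    static List<String> missing(Properties props) {
        List<String> result = new ArrayList<>();
        for (String name : new String[] {ENGINES_CONFIG, LICENSE_KEY}) {
            String value = props.getProperty(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Checks the properties of the running JVM and prints a status line.
        List<String> missing = missing(System.getProperties());
        if (missing.isEmpty()) {
            System.out.println("Both properties are set.");
        } else {
            System.out.println("Missing JVM properties: " + missing);
        }
    }
}
```

In Eclipse, JVM properties of this kind are typically passed as VM arguments of the form -Djeb.engcfg=... -Djeb.lickey=..., with the values left here as placeholders.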

6- Open the com.pnf.pdfscan.PDFScanner source file. You are ready to execute main().

How the scanner works

Now, let’s focus on the scanner source code.

  • The JEB2 back-end is initialized when scanFiles() is called:
    • Use JebCoreService to retrieve an instance to ICoreContext
    • Create an IEnginesContext
    • Load a project within that context (IRuntimeProject)
    • Add artifact(s) and process them (ILiveArtifact)
      • We add a single file artifact per project in this example
    • Retrieve the products (IUnit)
      • We are retrieving the top-most unit only in this example
    • Analyze the unit (see assessPdf())
    • Close the project

[Note: A detailed explanation of the above concepts (core, engines, project, artifacts, units, etc.) is outside the scope of this tutorial. Refer to our Developer Portal for more information.]

Snippet of scanFiles()

The assessPdf() method evaluates PDF units. The evaluation performed by this sample scanner is trivial: we collect the notifications created by the PDF plugin during the analysis of the file, and see if they meet basic criteria.

About the Unit Notifications:

  • Any JEB2 plugin can attach notifications to its units. The PDF plugin does so. Notifications are meant to pinpoint noteworthy areas of a unit or artifact.
  • A notification has a “dangerosity level” ranging from 0 to 100. It also has a description, an optional address to point to which area of the unit the notification is associated with, etc.
  • The API offers standard notification types, ranging from “Interesting area” to “Definitely Malicious”.
Standard notification levels offered in the NotificationType enum

A PDF unit can contain several types of notifications. Examples include: corrupt areas in stream; multiple encoding of stream; JavaScript; password-protected stream; invalid/illegal entries in stream; etc.

Link: Complete list of notifications issued by the PDF plugin.

Our simple scanner reports a file as suspicious if it contains at least 2 notifications that have a level >= 70 (POTENTIALLY_HARMFUL). These thresholds can be tweaked in the source code.
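
The rule can be sketched in isolation as follows. This is not the scanner's actual assessPdf() code: the Note class is a hypothetical stand-in for a unit notification (it is not a JEB2 API type), and the levels used in main() are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of the "at least 2 notifications with a level >= 70" rule. */
final class PdfVerdict {
    /** Hypothetical stand-in for a unit notification; not the JEB2 API type. */
    static final class Note {
        final int level;           // "dangerosity level", from 0 to 100
        final String description;

        Note(int level, String description) {
            this.level = level;
            this.description = description;
        }
    }

    // Thresholds quoted in the text; tweak them as needed.
    static final int MIN_LEVEL = 70; // POTENTIALLY_HARMFUL
    static final int MIN_COUNT = 2;

    /** True if enough notifications reach the minimum level. */
    static boolean isSuspicious(List<Note> notes) {
        int hits = 0;
        for (Note n : notes) {
            if (n.level >= MIN_LEVEL) {
                hits++;
            }
        }
        return hits >= MIN_COUNT;
    }

    public static void main(String[] args) {
        // Made-up notifications: two of the three reach the threshold.
        List<Note> notes = Arrays.asList(
                new Note(70, "first example notification"),
                new Note(30, "second example notification"),
                new Note(85, "third example notification"));
        System.out.println(isSuspicious(notes) ? "suspicious" : "not flagged");
    }
}
```

Keeping the two thresholds as named constants makes it easy to experiment with stricter or looser policies without touching the counting logic.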

The assessPdf() routine

The screenshot below is a sample output produced by the PDF scanner:

Conclusion

The intent of this entry is to shed some light on the process of writing third-party clients for JEB2, as well as on what unit notifications are and how to use them. We encourage you to visit our Developer Portal to find additional documentation as well as the reference Javadoc of the API.

Red October Malware for Android

Blue Coat Systems recently released a paper about the Inception APT (also dubbed Cloud Atlas; it may be connected to the Red October APT). One component of this APT is an Android trojan, masquerading as a WhatsApp update package. It is able to record audio calls, as well as gather, encrypt and exfiltrate user information.

The 4 strings partially written in Hindi that have been speculated on are these:

redoctober-android-img1

redoctober-android-img2

redoctober-android-img3

For researchers wanting to have a peek inside the APK, we are providing JEB decompiled Java code for one such sample.

Download is here: cloudatlas-android-malware-decompiled.zip

FinFisher FinSpy Mobile app for Android decompiled

The fully decompiled code and assets of 421and.apk can be found here: FinSpyMobileAndroid-decompiled.zip (no password).

This particular APK, although not the latest, is not obfuscated and easily reveals most capabilities of the malware:

  • Location tracker
  • Information stealer (calendar, contact list, text messages, Whatsapp databases, etc.)
  • Remotely controlled through encrypted communication over SMS and data

A great recap of the full story can be read on Netzpolitik. Real time updates are on Twitter.

Decompiled Java code for Android MisoSMS

Yesterday was eventful on the Android malware front. After Mouabad reported by Lookout, FireEye reported MisoSMS. It might also have been reported by Sophos at roughly the same time.

The malicious application is used in several campaigns to steal SMS and send them to China, according to FireEye’s blog post.

Many of you would like to examine and study its code, which is why I uploaded an archive with the source code decompiled by JEB 1.4, as well as a cleaned-up manifest. Link: MisoSMS_JEB_decomp_20131217

misosms_mainact

Decompiling Android Mouabad

Lookout has an interesting article about Android Mouabad. Yet another Korean SMS malware!

The APK fully decompiled by JEB 1.4 can be found here: mouabad_JEB_decomp_20131217.zip. I haven’t refactored or commented the code; these are raw decompiled classes.

mouabad_sms_receiver

Sample MD5 68DF97CD5FB2A54B135B5A5071AE11CF is available on Contagio.

Android OBad decompiled sources

This threat was initially reported by Kaspersky and seems to have gathered quite a bit of attention lately.

I’m attaching the sources decompiled by JEB 1.2 to this post, for analysts who’d like to take a look at it. One particularity of OBad: it’s been protected by the commercial obfuscator DexGuard (which I briefly mentioned here and there). DexGuard employs generalized string encryption and calls methods through reflection, which makes the raw source code a bit difficult to read. The sources have not been refactored or marked up, so their usefulness is very limited: you’ve been warned.

Archive: obad_decomp.zip

Sample MD5: f7be25e4f19a3a82d2e206de8ac979c8

Edit (June 11)

Jurriaan Bremer was kind enough to provide oboy.dex, a version of OBad with encrypted strings replaced by their decoded versions. JEB handles the file okay, despite the multiple string duplicates (which is one of the reasons why it could not be loaded on a phone.) I’ve attached the decompiled sources for his version below. (I also disabled the try-catch support for improved clarity.)

Archive: oboy_decompiled.zip

JEB’s decompiled sources for Android/BadNews.A

This Android malware was referenced in a recent blog post (http://www.xchg.info/?p=46) and I thought analysts would be interested in checking out the decompiled bytecode.

Java files for live.photo.savanna and com.androways.advsystem: BadNewsA_Src.zip

Sample MD5: 98cfa989d78eb85b86c497ae5ce8ca19

SMS Spy ZertSecurity with decompiled, analyzed Java sources

The implementation of this Android malware is strong and clean. Nothing really innovative though.


Here’s a short summary:

  • Masquerades as a German certificate installer app. The fake login/pin is used to uniquely identify the phone (on top of the phone number and IMEI.)
  • Achieves SMS interception, most likely to break 2-factor authentication. There are 3 modes of operation: do nothing, intercept and proceed, intercept and cancel SMS broadcast.
  • Multiple command-and-control servers. Two hard-coded domains are hxxp://app-smartsystem.(com|net)/sms/d_m009.php
  • The C&C URLs can be updated, either by contacting the existing C&C, or via SMS sent by the master. Such SMS contain the string “&Sign28tepXXX”.
  • The communication with the C&C is encrypted using AES-ECB, with the static key “0523850789a8cfed”. The server also base64-encodes its payloads. (The client does not.)
  • The malware will try to start at boot time (BOOT_COMPLETED receiver), and also registers a 15-minute timer to query the server for updated C&C URLs.
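
For analysts who want to decode captured traffic, the encryption scheme described above can be sketched as follows. This is not the malware's own code. Several details are assumptions: the key string is used as 16 ASCII bytes (AES-128), the padding is PKCS5, the plaintext is UTF-8, and the server's base64 layer wraps the ciphertext. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper for decoding AES-ECB traffic with the static key. */
final class ZertC2Codec {
    // Static key quoted in the summary; assumed to be used as 16 ASCII bytes.
    private static final byte[] KEY =
            "0523850789a8cfed".getBytes(StandardCharsets.US_ASCII);
    // ECB mode is from the summary; the padding scheme is an assumption.
    private static final String TRANSFORMATION = "AES/ECB/PKCS5Padding";

    private static byte[] run(int mode, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(mode, new SecretKeySpec(KEY, "AES"));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES operation failed", e);
        }
    }

    /** Client direction: encrypted only, no base64 layer. */
    static byte[] encodeClientMessage(String plaintext) {
        return run(Cipher.ENCRYPT_MODE, plaintext.getBytes(StandardCharsets.UTF_8));
    }

    /** Server direction: base64-decode first (assumed order), then decrypt. */
    static String decodeServerPayload(String base64Payload) {
        byte[] ciphertext = Base64.getDecoder().decode(base64Payload);
        return new String(run(Cipher.DECRYPT_MODE, ciphertext), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Round trip on made-up data; no real C&C traffic is involved.
        String fake = Base64.getEncoder()
                .encodeToString(encodeClientMessage("example payload"));
        System.out.println(decodeServerPayload(fake));
    }
}
```

If decryption of a real capture fails with a padding error, the padding assumption is the first thing to revisit.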

 


The APK was run through a name obfuscator. I’m attaching the sources decompiled using JEB 1.1, and with most of the refactoring/renaming/commenting work done. For JEB users, here is the JDB file. Enjoy.

Sample MD5: 1cf41bdc0fdd409774eb755031a6f49d

Japanese Contact Stealer

Let’s have a quick look at a variant of Android.Uracto, an app that steals (and potentially spams) contacts from Android devices. There is nothing particularly interesting about this piece of malware, but it’s the occasion to demonstrate some of JEB’s lesser-known and forthcoming abilities.

Upon startup, the app displays the following spinner, which translates to “Reading the articles…”:


The onCreate() method for the main activity displays the above spinner, and also starts a new Thread, that will create and run a Progress object. Here it is:


The run() method will call postMailList(). This method gets the ContentResolver for the app, and enumerates all entries having the “vnd.android.cursor.item/name” MIME type. According to the documentation, these entries represent “contacts’ proper names”.


A buffer representing the data1, data2, and data3 fields (respectively, Display Name, Given Name, and Family Name) is dynamically created.

[JEB specific]

Notice the optimizations that allowed the creation of that compact construct:

  • for-loop optimization
  • string concatenation
  • aggressive variable substitution

Some of these optimizations are already present in 1.0.x, others will be included in the 1.1 versions.
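
To give a rough idea of what these optimizations buy, here is a hand-written illustration (not actual JEB output, and not code from the sample). Both methods are hypothetical and behave identically: the first mimics a literal, instruction-by-instruction rendering, the second the compact form obtained after for-loop recovery, string concatenation folding, and variable substitution.

```java
/** Hand-written illustration of the three optimizations listed above. */
final class ConcatForms {
    /** Literal rendering: explicit temporaries, a raw loop, StringBuilder chains. */
    static String unoptimized(String[] fields) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (true) {
            if (i >= fields.length) {
                break;
            }
            StringBuilder tmp = new StringBuilder();
            tmp.append(fields[i]);
            tmp.append(",");
            String piece = tmp.toString();
            sb.append(piece);
            i = i + 1;
        }
        return sb.toString();
    }

    /** Compact rendering: for-loop, "+" concatenation, temporaries substituted away. */
    static String optimized(String[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            sb.append(fields[i] + ",");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] sample = {"display", "given", "family"};
        // Both forms produce the same string.
        System.out.println(unoptimized(sample).equals(optimized(sample)));
    }
}
```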

[/JEB specific]

The final data (“contact1, contact2, …”) is dumped on the external memory storage, encoded and POST’ed to hxxp://jap2012.com/data/main.php.


Find the decompiled activity here: solution.newsandroid.MainviewActivity

Sample MD5: ba73e96caa95999321c1cdd766bdf58b