Automatic Identification of Mirai Original Code

Context

One of the major threat on embedded devices — the so-called “Internet of things” –, is the infamous Mirai malicious software, whose source code was made public in September 2016. This malware has the ability to infect devices by brute-forcing Telnet credentials, and is primarily used to launch distributed denial-of-service attacks.

Since the source code release, numerous Mirai variants have been deployed in the wild by miscreants, like the one we documented in a recent post.

In this blog we will first take a quick look at another Mirai-based malware, quite original in its own way, to then introduce our novel signature system that can identify Mirai original code in executables.

Yet Another Mirai Variant

On May 18th, ESET’s Michal Malík mentioned on Twitter a Mirai-based sample for MIPS that grabbed our attention. Michal pointed out new functionalities like a custom update mechanism, and some strange debug routines, so we decided to take a look with our brand new MIPS decompiler. It should be noted that this sample comes with the debug symbols, which explains the names present in the decompiler output.

The malware logic starts in its main() routine, which is shown below as decompiled by JEB.

Briefly summarized, this routine first sets up a few signal handlers, in particular to create a core file in case of segmentation fault. It then calls a homemade panic() function — not to be confused with the standard Linux panic() routine. The panic() function code is shown below, as seen in JEB.

While the routine native code — seen on the left side — can be pretty dry to read, the decompiled code on the right side is fairly straightforward: a file named file.txt is opened and a given error message is written to it, accompanied by a custom system footprint built by the footprint12() routine.

Finally, main() calls the kill_run_mobile1() function, which first kills any application listening on TCP port 18899 (likely others instances of the same malware),  and then creates a thread on the mobile_loop1() function, which is shown below.

The new thread will listen for incoming connections and process them through a custom command handler. As can be seen from the numerous debug messages in the decompiled code, the code is still in a development stage.

To summarize, this sample appears to be an attempt to repackage Mirai source code with a different update mechanism, and is still in development, as can be seen from the presence of debug routines, and the fact that plenty of code remains unused.

While the technical quality of this sample is dubious, it illustrates one of the major consequence of Mirai source code public release: it has lowered the bar of entry for malicious software developers. In particular, we can expect the strain of Mirai-based malicious software to continue to grow in the following months.

Native Code Signatures

In a context where numerous Mirai-based malware are deployed in the wild, having the ability to identify original Mirai code becomes particularly useful, as it allows the analyst to focus only on the new functionalities in each sample.

Of course, most of Mirai-based samples do not come with symbols, and hence we need a proper mechanism to identify Mirai original code. That is the purpose of the native signature system released with JEB 2.3, which can actually identify code for all native architectures supported by JEB (x86, ARM, MIPS and the associated variants).

The objective of this signature system is to identify native routines with a minimal number of false positives. In others words, we want to fully trust a successful identification, while we may miss some known routines.

To realize this low false positives goal, our signatures are primarily based on two features:

  • A custom hash computed on the binary code of the unknown routine. During this computation, we remove from the native instructions the addresses and offsets that may vary depending on where the routine is located in a binary. Hence the same routine located at a different place will have the same hash. Interestingly, as our algorithm uses the generic JEB interface on native instructions (IInstruction), the hash computation is done on all architectures in the same way.
  • The names of the routines called by the unknown routine, e.g. API routines, system calls, or already identified routines. This feature allows to distinguish wrappers that have exactly the same binary code but call a different routine.  

The whole signature process can be summarized in two steps — which will be described in details in a separate documentation:

  1. Signatures are generated from a reference file. This file can be a native file with symbols, or a JEB database with some routines renamed by the user. For each named routine, a signature containing the routine features and information is created. Signatures are then grouped into packages for each platform.
  2. When JEB analyzes an unknown routine, it tries to match it with the signatures. If there is a match, the information of the original routine are imported, e.g. the matched unknown routine is renamed as the original routine.

Due to its strict reliance on the binary code, this identification process does not offer a resistance to minor changes, like the ones introduced by compilation with a different compiler version or with different optimizations. We intend to develop others signature systems in JEB, which will be more resistant to such variations, in particular by using JEB intermediate representation.

Still, it is particularly suitable in the case of Mirai, where the public source code comes with compilation instructions, such that many samples are compiled in the same way and share the exact same binary code. Therefore, JEB 2.3 comes with a set of signatures created from a non-stripped executable created from Mirai public source code.

These signatures are automatically applied when a MIPS binary is loaded in JEB. For example, here is an extract of the initial routines list after loading in JEB a stripped Mirai sample deployed last year (SHA1: 03ecd3b49aa19589599c64e4e7a51206a592b4ef).

On the 204 routines contained in the sample, 120 are automatically identified and renamed by JEB, allowing the user to focus on the unknown routines. It should be noticed that not all recognized routines belong to Mirai specific code, some of them belong to the C library used by Mirai (uClibc).

Conclusion

The JEB native signature system is still in development, but its results are encouraging and we provide a set of signatures for Mirai on MIPS platform, and for the standard C library shipped with Microsoft Visual Studio 2013 on the x86 platform. We encourage users to try it through our demo version, and report any comments to support@pnfsoftware.com.

In the following weeks, not only will the number of signatures rapidly grow — through a specific update mechanism –, but we also intend to let users generate their own signatures with JEB public API.

Acknowledgement

The malicious software analysis presented in this post was done by our intern Hugo Genesse.

One thought on “Automatic Identification of Mirai Original Code”

Leave a Reply

Your email address will not be published. Required fields are marked *

*