Emotet's return is imminent

Hornetsecurity analyzes the malware's new obfuscation techniques

Emotet is probably the most prolific of the recent malware distribution operations. The malware is modified frequently to ensure that no antivirus software detects it. Although the Emotet botnet is currently on a "spam break", recent changes to one component of the malware prompted the Hornetsecurity Security Lab to take a close look at the latest version of Emotet in order to be prepared for its next steps. Emotet has added new code obfuscation techniques such as control flow flattening and junk code. The Security Lab therefore explains how it can still be analyzed.

We note up front that the updates to the Emotet loader do not affect Hornetsecurity's filters, because the Emotet loader is never attached directly to an email. However, the following analysis and the downloadable Ghidra scripts can help other researchers get started with reverse engineering Emotet despite the additional obfuscation.

 

Background

The malware commonly known today as Emotet was first observed in 2014. At that time it was a banking trojan that stole banking data from its victims' systems. Emotet has since evolved and now offers its malware distribution services to other cybercriminals as Malware-as-a-Service (MaaS). When a system is infected with Emotet, it becomes part of the Emotet botnet.

Further information on Emotet, its infection chain, and the Emotet botnet.

 

(Recent) developments

While we could dig deep into Emotet's convoluted history, its origins, its shared code with Feodo, or its relationship to Cridex and Dridex, we focus instead on the malware's more recent history.

As is probably well known, the Emotet botnet is not sending any spam at present (at the time of writing). The botnet pauses fairly often, but then returns with updates that make it more dangerous than before. This timeline shows the most recent "spam breaks":

Since 2018-10-31 Emotet has had a new email stealer module. With the return on 2019-09-16, Epoch 3 was split off from Epoch 1. More recently, TrickBot, a malware that is often distributed via Emotet, was observed dropping the Emotet loader on 2020-04-29. There is speculation that this is a favor returned by the operators behind TrickBot, which is normally the malware being dropped by Emotet. Because no Emotet malspam, from which the botnet recruits new bots, is currently being sent while infected systems are being cleaned up, the number of Emotet bots may have fallen to a critically low level.
However, this is only speculation. At this point nothing conclusive is known.

What is certain, however, is that changes to the Epoch 2 Emotet loader have been distributed since 2020-04-20. The question now is when, and with which new tricks, Emotet will return this time.

Because of the analysis-specific technical terms, the analysis itself is presented in English.

 

Technical Analysis

Because there is currently no Emotet spam, we cannot analyze any malicious emails or documents. We therefore present an analysis of the Emotet loader binary.

 

Emotet Loader Binary

We analyzed both the "old" version that was dropped from Epoch 1 on 2020-04-20 (which has not received any notable changes since the 2020-02-05 update), and the "new" version dropped from Epoch 2 from 2020-04-20 onwards.

 

Packer

The metadata of the new version contains fragments of news article text. The File Description, Internal Name, Original Filename, Product Name, and Copyright texts are filled with fragments of news articles:

This is another indicator that Emotet uses the same packer as TrickBot, which was seen with news article text in its metadata shortly before Emotet started featuring such text. Researchers have dubbed this crypter "Exes’r’rus" [JRoosen].

Because malware packers change frequently, this change is not unusual. It has been seen around 2020-02-14, but the Emotet loader payload can still be extracted via generic (and automated) unpacking mechanisms.

 

Obfuscation

Emotet added more obfuscation. The most notable changes are control flow flattening and junk code. They make the binary more tedious to analyze and allow for simpler polymorphic changes. Most of the obfuscation was already added in the 2020-02-05 update.

 

Control Flow Flattening

Control flow flattening is an obfuscation technique. It splits a code sequence into multiple parts and rearranges them in a way that is more difficult to follow. A common approach is to place the code parts into a loop and, on each iteration, execute a different part of the loop body as determined by a state variable.

 

The code sequence:

CODE_1;
CODE_2;
CODE_3;
return;

is turned into:

state = STATE_1;
do
{
	if ( state > MAGIC_VALUE_1 )
	{
		if ( state == STATE_2 )
		{
			CODE_2;
			state = STATE_3;
		}
		if ( state == STATE_1 )
		{
			CODE_1;
			state = STATE_2;
		}
	}
	else if ( state == JUNK_STATE_1 )
	{
		JUNK_CODE_1;
		state = STATE_4;
	}
	else
	{
		if ( state == STATE_3 )
		{
			JUNK_CODE_2;
			state = JUNK_STATE_1;
		}
		if ( state == STATE_4 )
		{
			CODE_3;
			return;
		}
	}
} while(1);

While semantically identical, the second version is harder to follow.
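The transformation can be tried out with a small Python sketch. It mirrors the pseudocode above; the numeric state constants and the magic value are arbitrary values chosen for illustration and are not taken from any Emotet sample.

```python
# Arbitrary illustrative constants (assumption): real samples use
# random-looking values that differ from binary to binary.
STATE_1, STATE_2, STATE_3, STATE_4 = 0x51, 0x7A, 0x13, 0x2C
JUNK_STATE_1 = 0x30
MAGIC_VALUE_1 = 0x40


def straight(log):
    """Original code sequence: CODE_1; CODE_2; CODE_3; return."""
    log.append("CODE_1")
    log.append("CODE_2")
    log.append("CODE_3")
    return log


def flattened(log):
    """Same behavior, but driven by a state variable inside a loop."""
    state = STATE_1
    while True:
        if state > MAGIC_VALUE_1:
            if state == STATE_2:
                log.append("CODE_2")
                state = STATE_3
            elif state == STATE_1:
                log.append("CODE_1")
                state = STATE_2
        elif state == JUNK_STATE_1:
            _ = 1234 ^ 4321          # junk: has no visible effect
            state = STATE_4
        else:
            if state == STATE_3:
                _ = 42 * 7           # junk: has no visible effect
                state = JUNK_STATE_1
            elif state == STATE_4:
                log.append("CODE_3")
                return log


print(straight([]) == flattened([]))
```

Both functions record the code parts in the same order, even though the flattened variant visits two junk states on the way.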

 

In Emotet this looks like:

Emotet uses this to move code blocks around, as can be seen in this example, where the code block setting up the headers of the HTTP C2 communication is near the beginning of the function in one binary and near the end of the function in another binary:

Junk Code

Another technique Emotet uses is called junk code. Here useless code that does not change the semantics of a program is added.

The code sequence:

unsigned int function(unsigned int n) 
{ 
    if (n == 0) 
        return 1; 
    return n * function(n - 1); 
}

is turned into:

unsigned int function(unsigned int n) 
{
    unsigned int a = 1234;
    unsigned int b = 4321;
    a = n + b;
    b = n + a;
    if (n == 0 && a > 10 && b > 20)
    {
        b = n - a;
        n = b;
        return 1;
    }
    a = b + 42;
    return n * function(n - 1); 
}

Here the calculations on the variables a and b are irrelevant code. They do not influence the result of the code at all. However, an analyst does not know which calculations are important and which are not. Hence, while semantically identical the second code is harder to understand.
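One way to gain confidence that such additions are really junk is to compare both variants on a range of inputs. The following is a minimal Python port of the two C functions above, written for illustration only; the function names are made up, and the 32-bit mask merely imitates the wrap-around of unsigned int for the junk variables.

```python
MASK32 = 0xFFFFFFFF  # imitate unsigned 32-bit wrap-around for the junk variables


def factorial_plain(n):
    if n == 0:
        return 1
    return n * factorial_plain(n - 1)


def factorial_junk(n):
    a = 1234
    b = 4321
    a = (n + b) & MASK32
    b = (n + a) & MASK32
    # For n == 0 both a and b are 4321, so the extra conditions are always true.
    if n == 0 and a > 10 and b > 20:
        b = (n - a) & MASK32   # dead store (the C code also overwrites n here)
        return 1
    a = (b + 42) & MASK32      # dead store
    return n * factorial_junk(n - 1)


# Compare both variants on a few inputs.
print(all(factorial_plain(n) == factorial_junk(n) for n in range(12)))
```

Such a comparison is not a proof of equivalence, but it quickly exposes calculations that do influence the result.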

 

Luckily modern analysis software is able to simplify artificially complicated code. So a complex looking function:

is automatically reduced to a return of a static value:

This is used in the next obfuscation method.

 

Opaque Predicates

Opaque predicates are branch conditions whose outcome is known in advance, but which still need to be evaluated at runtime. An example would be code that gets the current time twice. Because time never goes backwards, the first value will not be greater than the second:

time_t a = time(NULL);
time_t b = time(NULL);
if ( a <= b )
    CODE;
else
    JUNK_CODE;

While to a human this is logical, a machine would still need to evaluate what values a and b have to determine whether to take the jump or not.

 

Emotet uses functions with a static return value – see previous junk code example above – and uses their return values as a branch condition:

The branches used for control flow flattening also act as opaque predicates, as the state variable changes predictably but must be evaluated at runtime.
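As a rough illustration of the pattern, the following Python sketch combines a function with a static return value and a branch that depends on it. All names and constants are hypothetical and only model the idea; they are not reconstructed from Emotet code.

```python
def static_value():
    """Looks like a computation, but always returns the same constant."""
    x = 0x1234
    for i in range(8):
        x = (x * 3 + i) & 0xFFFFFFFF  # junk arithmetic, discarded below
    x ^= x                            # x is always 0 from here on
    return x + 0x2A                   # always 0x2A


def guarded(payload):
    # Opaque predicate: the condition is always true, but a tool that does
    # not simplify static_value() has to treat both branches as reachable.
    if static_value() == 0x2A:
        return payload()
    return None                       # junk branch, never taken


print(guarded(lambda: "real code path"))
```

A decompiler that folds static_value() into a constant can discard the junk branch entirely.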

 

Dynamic Library and Function Resolution

All library calls are dynamically resolved by hash. Previous versions stored the resolved functions. Now they are just-in-time resolved before every call.

 

For each library call, first the library is obtained (emotet_get_lib()). Then the address of the desired function is obtained (emotet_get_func()):

Libraries are resolved via the InLoadOrderModuleList reachable from the Process Environment Block (PEB) (FS:[0x30]):

Then the BaseDllName of every loaded module is hashed and compared against the queried hash. If the hash matches, the DllBase is returned. The GET_LIB_XOR_VALUE varies for each binary. This means that library hashes change from sample to sample and cannot be pre-calculated. They must be calculated for each sample individually.

 

Addresses of functions are obtained by manually traversing the export directory data structure. Each exported function name of the previously resolved DLL’s image is iterated over and hashed. If the hash matches, the function’s address is returned:

Emotet still uses the same hashing algorithm as previous versions:

The emotet_get_func() function uses a variation that does not map uppercase characters to lowercase, i.e., its hash is case-sensitive, and that ingests a char string instead of a wchar string. The emotet_hash() function is also heavily laced with junk code, which luckily the decompiler already simplified and/or discarded.
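The effect of the per-sample XOR value and of the two hash variants can be sketched in Python. Note the assumptions: the hash below is an sdbm-style stand-in chosen for illustration and is not taken from this article, and GET_FUNC_XOR_VALUE as well as both constant values are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def stand_in_hash(code_units):
    """sdbm-style 32-bit hash; a stand-in, NOT taken from the article."""
    h = 0
    for c in code_units:
        h = (c + (h << 6) + (h << 16) - h) & MASK32
    return h


def lib_hash(name, xor_value):
    # Library variant: uppercase is mapped to lowercase before hashing.
    return stand_in_hash(ord(ch) for ch in name.lower()) ^ xor_value


def func_hash(name, xor_value):
    # Function variant: case-sensitive, the name is hashed as-is.
    return stand_in_hash(ord(ch) for ch in name) ^ xor_value


def build_lookup(names, hash_fn, xor_value):
    """Precompute hash -> name for ONE sample's XOR value."""
    return {hash_fn(name, xor_value): name for name in names}


# Hypothetical per-sample XOR values (assumption).
GET_LIB_XOR_VALUE = 0x1234ABCD
GET_FUNC_XOR_VALUE = 0x0BADF00D

libs = build_lookup(["kernel32.dll", "ntdll.dll"], lib_hash, GET_LIB_XOR_VALUE)
funcs = build_lookup(["LoadLibraryW", "HttpSendRequestW"], func_hash,
                     GET_FUNC_XOR_VALUE)

# Resolve a hash as it would appear at a call site in the sample.
print(libs[lib_hash("NTDLL.DLL", GET_LIB_XOR_VALUE)])
```

Because the XOR value is part of every hash, such a lookup table has to be rebuilt for each sample from a list of candidate library and function names.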

 

Using the Ghidra analysis scripts emotet_lib_imports.py and emotet_func_imports.py the called library and function names can be reconstructed from their hashes:

The following libraries that are usually not loaded into a process by default are loaded via LoadLibraryW:

shell32.dll
userenv.dll
urlmon.dll
wininet.dll
wtsapi32.dll
advapi32.dll
crypt32.dll
shlwapi.dll

Handles to the libraries are stored in allocated memory but never used. We have found other functions and code paths that are never used. This is likely leftover code that was not removed during code updates.

 

XOR Obfuscation

Strings and the RSA key are (as in previous versions) XOR obfuscated. They use the following structure:

typedef struct emotet_xor_data_t {
    uint32_t xor_key;
    uint32_t len;
    uint32_t data[];
} emotet_xor_data_t;

They are decrypted by first XOR’ing len with xor_key, and then XOR’ing len bytes of data, rounded up to a 4-byte boundary, with xor_key:

The XOR key again changes from binary to binary. But still the decoding process can be automated, e.g. the emotet_string_decode.py script can decode used strings:

They are also decrypted just in time, on demand, into a new memory allocation. The allocation holding the decrypted string is freed again after each use of the string.
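The decoding of this structure can be sketched in a few lines of Python. This is a minimal illustration based on the description above, not the emotet_string_decode.py script itself; it assumes little-endian fields (as on x86), and the encoder and the example key exist only to produce test data.

```python
import struct


def xor_decode(blob):
    """Decode an emotet_xor_data_t blob: xor_key, len ^ xor_key, data[]."""
    key, enc_len = struct.unpack_from("<II", blob, 0)
    length = enc_len ^ key
    n_words = (length + 3) // 4               # data is processed in 4-byte words
    words = struct.unpack_from("<%dI" % n_words, blob, 8)
    plain = struct.pack("<%dI" % n_words, *(w ^ key for w in words))
    return plain[:length]                     # drop the alignment padding


def xor_encode(plain, key):
    """Inverse operation, used here only to build an example blob."""
    padded = plain + b"\x00" * (-len(plain) % 4)
    words = struct.unpack("<%dI" % (len(padded) // 4), padded)
    header = struct.pack("<II", key, len(plain) ^ key)
    return header + struct.pack("<%dI" % len(words), *(w ^ key for w in words))


blob = xor_encode(b"%s\\%s.exe", 0xDEADBEEF)  # hypothetical key
print(xor_decode(blob))
```

Since the key is stored in front of the data, no per-sample configuration is needed once the location of an encoded blob is known.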

 

Static Analysis Tutorial

With the basics of the new obfuscation techniques covered, we quickly outline how Emotet can still be analyzed.

  1. Unpack your Emotet sample, e.g., using the free, open-source, community-developed CAPE sandbox [CAPE].
  2. Import into Ghidra [GHIDRA].
  3. Run Auto Analysis.
  4. Run emotet_lib_imports.py with currentAddress in 2nd function called in entry function (this is what we refer to as the emotet_get_lib function).
  5. Run emotet_func_imports.py (selecting func_names.txt) with currentAddress in 3rd function called in entry function (this is what we refer to as the emotet_get_func function).
  6. Run emotet_string_decode.py with currentAddress in emo_*_LoadLibraryW function.
  7. Run emotet_string_decode.py with currentAddress in 1st function called in emo_*_LoadLibraryW function.

Then you have:

  • Comments for library and function resolution.
  • Two enum types emotet_{lib,func}_hash with enums for the library and function hashes, which you can optionally apply to the emotet_get_{lib,func} functions.
  • Emotet strings decrypted and set as comments, labels as well as searchable bookmarks.

C2 communication is found by searching for usage of the HttpSendRequestW function. The type of request and request headers can be found by backtracking via the first parameter (hRequest) to HttpSendRequestW. Alternatively usage of the HttpOpenRequestW function, the InternetConnect function, the POST string, or the Referer: http://%s/%s ... string will yield the relevant code.

C2 IP storage can be found by searching for the %u.%u.%u.%u string. These octets are filled from data in an allocation we named emotet_c2_data. This allocation is filled from a data location we named emotet_c2_list. If you run emotet_ip_decode.py with currentAddress set to the address of emotet_c2_list, the C2 list is decoded and annotated as a comment.
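To give an idea of what such a decoder does, here is a minimal Python sketch. The entry layout is an assumption made for illustration (8-byte little-endian entries consisting of a 32-bit IPv4 value, a 16-bit port and 16 bits of padding, terminated by an all-zero entry) and should be verified against the sample; it is not the emotet_ip_decode.py script. The addresses in the example come from documentation ranges.

```python
import struct


def decode_c2_list(blob):
    """Decode (ip, port) pairs from an ASSUMED 8-byte entry layout."""
    entries = []
    for off in range(0, len(blob) - 7, 8):
        ip, port = struct.unpack_from("<IH", blob, off)
        if ip == 0:                      # assumed terminator entry
            break
        octets = ((ip >> 24) & 0xFF, (ip >> 16) & 0xFF,
                  (ip >> 8) & 0xFF, ip & 0xFF)
        entries.append(("%u.%u.%u.%u" % octets, port))
    return entries


# Example data built from documentation addresses (not real C2 servers).
example = (struct.pack("<IHH", 0xC0000201, 8080, 0)
           + struct.pack("<IHH", 0xC6336407, 443, 0)
           + bytes(8))
print(decode_c2_list(example))
```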

The C2 HTTP request template looks like:

POST
Referer: http://%s/%s
Content-Type: multipart/form-data; boundary=%s

--%S
Content-Disposition: form-data; name="%s"; filename="%s"
Content-Type: application/octet-stream

 

The RSA key is found by tracing the source of the 3rd parameter (pbEncoded) to the CryptDecodeObjectEx function. It is XOR encoded like the strings. You can run emotet_data_decode.py with currentAddress on the data location passed to the function that returns the pointer that is then passed as the 3rd parameter to the CryptDecodeObjectEx function. You can use openssl asn1parse -inform DER -in emotet_key.bin to check whether the decoded emotet_key.bin is valid. The key must be the exact length, without any appended bytes beyond the ASN.1 DER sequence.
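If the decoded buffer carries trailing bytes, for example alignment padding, the exact length can be derived from the DER header itself. The following Python sketch shows one way to do this; the helper names are made up for illustration, and it assumes the key starts with a DER SEQUENCE tag (0x30).

```python
def der_total_length(buf):
    """Return the size in bytes of the DER SEQUENCE starting at buf[0]."""
    if len(buf) < 2 or buf[0] != 0x30:
        raise ValueError("not a DER SEQUENCE")
    first = buf[1]
    if first < 0x80:                      # short form: length fits in one byte
        return 2 + first
    n = first & 0x7F                      # long form: next n bytes hold the length
    if n == 0 or len(buf) < 2 + n:
        raise ValueError("unsupported or truncated length field")
    return 2 + n + int.from_bytes(buf[2:2 + n], "big")


def trim_der(buf):
    """Cut off any bytes appended after the DER SEQUENCE."""
    total = der_total_length(buf)
    if total > len(buf):
        raise ValueError("buffer is shorter than the encoded length")
    return buf[:total]


# Tiny example: SEQUENCE { INTEGER 5 } followed by three padding bytes.
seq = bytes([0x30, 0x03, 0x02, 0x01, 0x05])
print(trim_der(seq + b"\x00\x00\x00") == seq)
```

The trimmed output can then be written to emotet_key.bin and checked with the openssl command mentioned above.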

Search for functions using the OpenSCManagerW function. One function will contain an assignment to a data location. Emotet uses OpenSCManagerW to check whether it has admin rights. This is where what we call the is_admin flag is set. Use "Auto Create Structure" on the data reference, name the structure emotet_data, and name the field that is assigned the value 1 is_admin.

Search for functions using the string %s\%s.exe. This is where the executable path is constructed. In its vicinity the emotet_data data location is accessed. Determine by their usage as parameters to the _snwprintf function which of the two fields in the emotet_data structure is the exe_path and which is the exe_name.

Then use „References -> Find use of emotet_data_t.exe_name“ to find out how the exe_name is generated. Do the same for the exe_path. The exe_name will also be used to generate the service name. Relevant code is found by searching for CreateServiceW.

Persistence via run keys is found via RegCreateKeyExW and/or the SOFTWARE\Microsoft\Windows\CurrentVersion\Run string.

Searching for usage of the %s:Zone.Identifier string finds where Emotet deletes its Zone.Identifier ADS, which on Windows is used to mark files downloaded from external sites as potentially unsafe.

As an interesting side note, older builds of the new version contained code that removed old Emotet binaries whose names were generated from word lists:

This functionality has been removed from the latest version, indicating that the migration to the new version has been completed for Epoch 2 and there is no need to delete old binaries anymore:

(Either that or the function has moved and has not been picked up by our string deobfuscator script.)

The following video demonstrates how our scripts can be used to jump start your own analysis of the Emotet loader:

 

Download our scripts on GitHub.

 

 

Tier 1 C2 geolocation

Emotet’s tier 1 C2 proxies and servers are geolocated all over the world:

(C2 IP list as observed 2020-04-20.)

 

Conclusion and recommendations

To protect against Emotet, US-CERT recommends to "implement filters at the email gateway to filter out emails with known malspam indicators".

Hornetsecurity’s Spam and Malware Protection is not affected by the updates to the Emotet loader, because the loader is never sent directly by email. Hornetsecurity therefore continues to block all Emotet malspam indicators, such as the macro documents used for infection as well as known Emotet download URLs. Hornetsecurity’s Advanced Threat Protection extends this protection further by also detecting unknown malicious links: potentially dangerous content is dynamically downloaded and executed in a sandbox. Hornetsecurity is therefore also prepared in case not only the Emotet loader but also the distribution tactic changes.

To prevent Emotet attacks, IT security staff can also use publicly available information from the Cryptolaemus team. This group of IT security specialists has joined forces to fight Emotet and publishes new information daily on its website [CryptolaemusWeb]. The latest C2 IP list for finding and/or blocking C2 traffic is available there. For real-time updates, follow their Twitter account [CryptolaemusTwitter].

Of course, the analysis presented here has only scratched the surface of the Emotet malware complex, but when Emotet returns, Hornetsecurity will report on the latest developments with up-to-date analyses.

 

References

Samples Used

 

Hashes

SHA256 Description
cc96711da9ef7b63d5f1749d8866b0149f84506f9d53d79de018dac92e9443a0 Epoch 1 sample („old“ version) 2020-04-20
4c3acc885006faebe59e0bfdd452499056d8fcc9e6a810d7ff93762ce1a061ad Epoch 1 sample unpacked payload 2020-04-20
4250a3ab9044b4a3a5f8319e712306bb06df61b1dee8b608505c026bacba4aa1 Epoch 2 sample („new“ version) 2020-04-20
14f8463a86bbae338925fbbcb709953ae7f1bffdb065fee540b6fa88fbbce70e Epoch 2 sample unpacked payload 2020-04-20