There are many remaining mysteries in the Gauss and Flame stories. For instance, how do people get infected with the malware? Or, what is the purpose of the uniquely named Palida Narrow font that Gauss installs?
Perhaps the most interesting mystery is Gauss's encrypted warhead. Gauss contains a module named Godel that features an encrypted payload. The malware tries to decrypt this payload using several strings from the system and, upon success, executes it. Despite our best efforts, we were unable to break the encryption. So today we are presenting all the available information about the payload in the hope that someone can find a solution and unlock its secrets. We are asking anyone interested in cryptology and mathematics to join us in solving the mystery and extracting the hidden payload.
The containers
Infected USB sticks have two files that contain several encrypted sections. Named System32.dat and System32.bin, they are 32-bit and 64-bit versions of the same code. These files are loaded from infected drives using the well-known LNK exploit introduced by Stuxnet. Their primary goal is to extract a lot of information about the victim system and write it back to a file on the drive named .thumbs.db. Several known versions of the files contain three encrypted sections (one code section, two data sections).
The decryption key for these sections is generated dynamically and depends on the features of the victim system, preventing anyone except the designated target(s) from extracting the contents of the sections.
By the way, the 64-bit version of the module has some debug information left in it. The module contains debug assertion strings and names of the modules:
.loader.cpp
NULL != encSection
Path
NULL != pathVar && curPos < pathVarSize
NULL != progFilesDirs && curPos < progFilesDirsSize
NULL != isExpected
NULL != key
(NULL != result) && (NULL !=str1) && (NULL != str2)
.encryption_funcs.cpp
The data
The mysterious encrypted data is stored in three sections: .exsdat, .exrdat and .exdat.
The files also contain an encrypted resource 100 that seems to be the actual payload, given the relatively small size of the encrypted sections. It is most likely that the section .exsdat contains the code for decrypting the resource and executing its contents.
The algorithm
The code that decrypts the sections is far more complex than the routines we usually find in malware. Here is a brief description of the algorithm:
Validation
1. Make a list of all entries from GetEnvironmentVariableW("Path"), split by the separator ";".
2. Append to the list all entries returned by FindFirstFileW / FindNextFileW for the mask %PROGRAMFILES%\* where cFileName[0] > 0x007A (Unicode "z").
Note: in essence, this means that the specific program installed in %PROGRAMFILES% has a name that either starts with a special character such as "~", as in our example, or is written in a Unicode character set, such as Arabic or Hebrew, where all characters are higher than 0x007A.
3. Make all possible pairs from the entries of the resulting list.
4. For each pair, append the first hard-coded 16-byte salt and calculate the MD5 hash.
[Figure: example of the string pair, second string starting with ~dir, and first salt]
5. Calculate the MD5 hash of the hash (i.e. hash = md5(hash)), 10000 times.
6. Check whether the MD5 hash matches the hard-coded value. If not, exit.
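The validation steps can be sketched in Python as follows. This is a minimal illustration, not the malware's code: the function names are hypothetical, "Unicode" is assumed to mean UTF-16LE without terminators, "all possible pairs" is assumed to mean ordered pairs of distinct entries, and each iteration is assumed to rehash the raw 16-byte digest.

```python
import hashlib
from itertools import permutations

ROUNDS = 10000  # number of hash = md5(hash) iterations given in the description


def stretched_md5(buffer: bytes, rounds: int = ROUNDS) -> bytes:
    """MD5 of the buffer, followed by `rounds` iterations of hash = md5(hash)."""
    digest = hashlib.md5(buffer).digest()
    for _ in range(rounds):
        # Assumption: the raw 16-byte digest (not its hex text) is rehashed.
        digest = hashlib.md5(digest).digest()
    return digest


def find_matching_pair(entries, salt: bytes, reference: bytes):
    """Return the first (first, second) pair whose hash equals `reference`, else None."""
    for first, second in permutations(entries, 2):
        # Strings are concatenated with no separator, then the salt is appended.
        # Assumption: strings are encoded as UTF-16LE.
        buffer = (first + second).encode("utf-16-le") + salt
        if stretched_md5(buffer) == reference:
            return first, second
    return None
```

Here `entries` would hold the Path components plus the qualifying %PROGRAMFILES% names, `salt` the first 16-byte salt of a sample, and `reference` its hard-coded MD5 as raw bytes. A return value of `None` corresponds to the "exit" branch of the last step.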
Decryption
The sections are decrypted in the following order: .exsdat, .exrdat, .exdat
1. Use the PATH/PROGRAMFILES pair that produced the expected MD5 hash in the validation code above.
2. Append to the pair the second hard-coded 16-byte salt and the bytes 0x15, 0x00.
[Figure: example of the string pair, second string starting with ~dir, and salt]
3. Calculate the MD5 hash of the resulting buffer.
4. Calculate the MD5 hash of the hash (i.e. hash = md5(hash)), 10000 times.
5. Derive the RC4 key from the resulting hash using the WinAPI call CryptDeriveKey(hProv, CALG_RC4, hBaseData, 0, &hKey).
6. Decrypt the section (RC4), treating its first DWORD as the length of the buffer to decrypt and the encrypted buffer as starting at offset 4 of the section.
7. Compare the DWORDs at positions 0 and 7 of the decrypted buffer with the magic value 0x20332137. Proceed only if either DWORD matches.
8. Increase the last WORD in the pair+salt buffer (the one initially set to 0x0015) by 1.
9. Decrypt the next section: go to step 3.
After all the sections are decrypted, the function at the beginning of the .exsdat section is called.
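The section handling in the decryption steps can be sketched as below. The RC4 key is taken as an input, because the bytes that CryptDeriveKey produces from the hash depend on the cryptographic provider and are not reproduced here. The function names are hypothetical, DWORDs are assumed to be little-endian, and "positions 0 or 7" is assumed to mean DWORD indexes (byte offsets 0 and 28).

```python
import struct

MAGIC = 0x20332137  # magic value expected in a correctly decrypted section


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def decrypt_section(section: bytes, rc4_key: bytes):
    """Return the decrypted buffer, or None if the magic check fails."""
    (length,) = struct.unpack_from("<I", section, 0)  # first DWORD: buffer length
    plain = rc4(rc4_key, section[4:4 + length])       # ciphertext starts at offset 4
    # Assumption: positions 0 and 7 are DWORD indexes into the decrypted buffer.
    for index in (0, 7):
        offset = index * 4
        if offset + 4 <= len(plain):
            if struct.unpack_from("<I", plain, offset)[0] == MAGIC:
                return plain
    return None
```

Because the magic value sits within the first 32 bytes of the plaintext, the 32-byte section prefixes published with this article are enough to test a candidate key against the .exsdat and .exrdat sections.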
Sample data for validating the algorithm:
The string pair is created by concatenating the strings. The strings and the salt buffer are not separated by any character.
Sample test strings, Unicode (without quotes):
- “C:\Documents and Settings\john\Local Settings\Application Data\Google\Chrome\Application”
- “~dir1”
First salt, hex dump: 97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51
MD5 at validation step 6: 76405ce7f4e75e352c1cd4d9aeb6be41
Second salt, hex dump: BB 49 4E 77 F9 25 EE C0 3B 89 FC ED C2 22 4A 21
MD5 at decryption step 5: 00916031b3e9513044436ee42b6aa273
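To check an implementation against the sample data, the two buffers can be built as in the sketch below and the printed hashes compared with the published values. Several details are assumptions rather than facts from the analysis: the helper names are hypothetical, the path components are assumed to be separated by backslashes, strings are assumed to be UTF-16LE, and the trailing WORD is assumed to be little-endian. If the output differs from the published hashes, one of these assumptions is the first thing to revisit.

```python
import hashlib
import struct

# Assumption: the sample path uses backslash separators.
FIRST = ("C:\\Documents and Settings\\john\\Local Settings"
         "\\Application Data\\Google\\Chrome\\Application")
SECOND = "~dir1"
SALT1 = bytes.fromhex("97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51")
SALT2 = bytes.fromhex("BB 49 4E 77 F9 25 EE C0 3B 89 FC ED C2 22 4A 21")


def stretched_md5_hex(buffer: bytes, rounds: int = 10000) -> str:
    """MD5 of the buffer, then `rounds` iterations of hash = md5(hash), as hex."""
    digest = hashlib.md5(buffer).digest()
    for _ in range(rounds):
        digest = hashlib.md5(digest).digest()
    return digest.hex()


def validation_buffer(first: str, second: str, salt: bytes) -> bytes:
    """Concatenated strings (no separator) followed by the salt."""
    return (first + second).encode("utf-16-le") + salt


def key_buffer(first: str, second: str, salt: bytes, counter: int = 0x15) -> bytes:
    """Same layout plus the trailing WORD that is incremented for each section."""
    return validation_buffer(first, second, salt) + struct.pack("<H", counter)


if __name__ == "__main__":
    print("validation hash:", stretched_md5_hex(validation_buffer(FIRST, SECOND, SALT1)))
    print("published value: 76405ce7f4e75e352c1cd4d9aeb6be41")
    print("decryption hash:", stretched_md5_hex(key_buffer(FIRST, SECOND, SALT2)))
    print("published value: 00916031b3e9513044436ee42b6aa273")
```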
Join the quest
We have tried millions of combinations of known names in %PROGRAMFILES% and Path, without success. The check for the first character of the folder in %PROGRAMFILES% indicates that the attackers are looking for a very specific program with the name written in an extended character set, such as Arabic or Hebrew, or one that starts with a special symbol such as “~”.
It is clearly not feasible to break the encryption with a simple brute-force attack. We are asking anyone interested in breaking the code and figuring out the mysterious payload to join us.
The resource section is big enough to contain Stuxnet-like attack code targeting SCADA systems, and all the precautions taken by the authors indicate that the target is indeed high-profile.
We are providing the first 32 bytes of encrypted data and hashes from known variants of the modules. If you are a world-class cryptographer or if you can help us with decrypting them, please contact us by e-mail: theflame@kaspersky.com.
Source data
We are providing up to 32 bytes from the beginning of each encrypted section, skipping the DWORD that contains the length of the encrypted buffer. Please contact us by e-mail at theflame@kaspersky.com if you need more encrypted data.
Sample | 56e4fb972828fafbbdc11158a1b5fa72 |
Salt 1 | 97 48 6C AA 22 5F E8 77 C0 35 CC 03 73 23 6D 51 |
Reference MD5 | 758EA09A147DCBCAD6BD558BE30774DE |
Salt 2 | BB 49 4E 77 F9 25 EE C0 3B 89 FC ED C2 22 4A 21 |
Exsdat | 4C CC BA E2 E0 BA 2E 44 C7 60 17 9A 72 F4 2F 27 DD FD DB 11 03 94 E3 4B 0A 16 66 F3 36 97 6C D8 |
Exrdat | C9 27 BE 67 4D 3B 39 36 AB 14 44 32 88 60 7A 64 B0 92 9B 3A A1 5B C5 21 A7 6E 09 0C F8 71 84 87 |
Exdat | B8 EB 6D 61 2B 4F 70 65 75 A2 1C 03 1C DF 26 2F |
Sample | 695056ffacef1fdaa326d7c8bb0f88ba |
Salt 1 | 6E E3 47 2C 06 A5 C8 59 BD 16 42 D1 D4 F5 BB 3E |
Reference MD5 | EB2F172398261ED94C8D05216650919B |
Salt 2 | 8F 42 B5 87 E8 9A B2 32 C8 1C 1A EC B5 2D 55 19 |
Exsdat | CE 31 D0 5D 7D CB 57 9A 83 06 09 8D 42 2B 44 34 24 13 B2 39 22 48 8F F3 76 E5 9C DA 87 8F BC 42 |
Exrdat | 50 1F F8 BA 18 1B 3E 36 23 9D 95 DC 5A 07 E4 EC 76 38 78 79 BA 84 A5 4E 24 BA 0E 27 94 63 F7 3D |
Exdat | 9D 5B B8 3B B2 17 00 DC 76 81 1D 4E 54 80 9B 31 |
Sample | 089d45e4c3bb60388211aa669deab26a |
Salt 1 | 0E A5 01 D1 24 71 CD CD 0E 9E AC 6E 48 5A F9 32 |
Reference MD5 | 52DD4D6B792D84C422E6A08E4272ACB8 |
Salt 2 | 38 F9 A6 5B 82 08 E7 61 1D 10 73 53 50 BC B4 F0 |
Exsdat | D3 CA 9D 9F 87 FB 25 43 7E C6 57 7C D9 06 10 8D D2 5B B2 88 18 6E FD B4 C4 30 12 2E 1E EC E0 64 |
Exrdat | B4 43 8F B8 0A 67 7D 88 C1 CD F3 E8 D9 61 1B E9 5A 8A 41 16 8B 8A 18 AD 25 5A 81 87 8F 8D 1A 40 |
Exdat | F6 C9 81 C9 86 27 16 0C B7 33 93 AB 3E 71 5B E2 |
Sample | 8d90e3c68030fbb91ad5b920d5e17b32 |
Salt 1 | C3 23 4D 51 5D 52 A5 8E 81 46 FA 8A 6D 93 DF 7D |
Reference MD5 | 53B3FAEA53CC1B90AA2C5FCF831EF9E2 |
Salt 2 | 21 9D 04 35 7B 96 74 53 B0 9C CD 7F 2F E6 63 AA |
Exsdat | AB 01 6A 8E 42 F0 F2 92 1D F1 4A 42 01 63 72 78 D6 F7 A5 0C 54 37 21 2C B8 59 6A D0 7E 68 19 2D |
Exrdat | 6C 2D D7 E4 F6 08 15 C0 69 D9 9E FF EA 68 63 4F 56 59 DA 28 E5 2E A1 EF 21 FB F9 2B C2 BC E7 CE |
Exdat | 55 A7 F3 93 E0 AF 5B 7E 17 22 7E 82 8A 6F 25 21 3D 64 D7 E8 |
The Mystery of the Encrypted Gauss Payload