Identifying a botnet is not always an easy task, especially when one gets lost among its different components: droppers, infectors and other bad stuff. Some two weeks ago, Jose Nazario from Arbor Networks pointed me to a new varmint that appears to be another peer-to-peer bot. When executed, the program installs tons of stuff, including a number of goodies, for example:
- an executable hidden in an Alternate Data Stream,
- three Bitcoin miners: the Ufasoft miner, the RCP miner and the Phoenix miner,
- a file with geo-location information for IP address ranges.
However, we leave these aside for now and focus on the botnet's architecture instead, which is really just a channel for pushing software to infected machines. Scrabbling about in the installed programs finally brought up the actual bot, which we detect as Trojan.Win32.Miner.h. The binary has some layers of obfuscation to make analysis harder, but eventually it writes a UPX-packed executable into a memory section from which the original binary can be restored.
One of the first things that catches the eye is a list of 1953 hard-coded IP address strings in the binary. The bot contacts these addresses during its bootstrapping phase in order to join the peer-to-peer network.
To verify whether a remote host is really part of the botnet, it is first probed on port 62999/tcp. After that, all subsequent communication with that host takes place over HTTP connections on port 8080/tcp. If a bot wants to receive a piece of information from the botnet, it sends a GET request for the URL /search=[resource] to another peer (the GET line in the example below). The response (starting at the status line that follows) contains the requested data. In the example below the bot asks if a file named ip_list_2 exists.
GET /search=ip_list_2.txt HTTP/1.1
HTTP/1.1 200 OK
Date: Thu, 28 Jul 2011 1:46:30 PM GMT
Last-Modified: Thu, 28 Jul 2011 1:46:30 PM GMT
Expires: Thu, 28 Jul 2011 1:46:30 PM GMT
The remote peer confirms the existence of the file by sending back an MD5 hash of its content. A non-existing file or otherwise invalid request would have been indicated by the string null. To actually download the searched file, the .txt suffix is left out in the request:
GET /search=ip_list_2 HTTP/1.1
HTTP/1.1 200 OK
Date: Thu, 28 Jul 2011 1:46:32 PM GMT
Last-Modified: Thu, 28 Jul 2011 1:46:32 PM GMT
Expires: Thu, 28 Jul 2011 1:46:32 PM GMT
The response contains a list of IP addresses belonging to other peers in the botnet. This information is sufficient to recursively enumerate the peer-to-peer network, or at least the part of it that lives on public IP addresses. We crawled a part of the network and recorded the IP addresses we got back. A graph plot of the resulting data shows a highly interconnected network. The graph quickly becomes too big to render in reasonable time, so we terminated our crawler after a few seconds. As a result, some nodes link to 'dead ends', peers that have not been followed further.
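The recursive enumeration described above is essentially a breadth-first traversal of the peer graph. The following is only a minimal sketch of that idea, not the crawler we actually ran. The function names crawl and fetch_peers and the max_hosts cut-off are assumptions made for illustration; how a peer's response is fetched and parsed is deliberately left to the caller, and the toy graph uses documentation addresses rather than real peers.

```python
from collections import deque


def crawl(seeds, fetch_peers, max_hosts=1000):
    """Breadth-first enumeration of a peer-to-peer network.

    fetch_peers(host) is a caller-supplied (hypothetical) function that
    returns the IP addresses listed by `host`, or an empty list if the
    host does not answer. max_hosts limits how many hosts are queried.
    """
    seen = set(seeds)          # every address discovered so far
    queue = deque(seeds)       # discovered but not yet queried
    edges = []                 # (host, peer) pairs for a graph plot
    queried = 0
    while queue and queried < max_hosts:
        host = queue.popleft()
        queried += 1
        for peer in fetch_peers(host):
            edges.append((host, peer))
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    # Whatever is left in the queue was discovered but never followed:
    # these are the 'dead ends' of an early-terminated crawl.
    return seen, edges, list(queue)


# Toy peer graph with documentation addresses (not real botnet peers).
toy_graph = {
    "192.0.2.1": ["192.0.2.2", "192.0.2.3"],
    "192.0.2.2": ["192.0.2.1", "192.0.2.4"],
}
hosts, links, dead_ends = crawl(["192.0.2.1"], lambda h: toy_graph.get(h, []), max_hosts=2)
```

Stopping after a fixed number of queried hosts, rather than after a fixed time, keeps the sketch deterministic; either way, the hosts remaining in the queue are the ones that show up as dead ends in the plot.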
There are three separate host lists: ip_list, ip_list_2 and ip_list_3. During our tests, however, the last one was always empty. A run in which we crawled the networks corresponding to the first two lists for seven hours resulted in 9,141 hosts for ip_list and 28,675 hosts for ip_list_2, with only 57 hosts present in both lists. That makes a total of almost 38,000 different public IP addresses. Taking into account that most machines are behind network address translation or some gateway nowadays, the real number of infected machines can easily be orders of magnitude bigger.
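The total follows from counting the hosts in both lists only once. As a quick check of the arithmetic, using the three figures from our crawl (the function name is just illustrative):

```python
def distinct_hosts(size_a, size_b, in_both):
    """Size of the union of two host lists, given the size of their overlap."""
    return size_a + size_b - in_both


total = distinct_hosts(9141, 28675, 57)
print(total)  # 37759
```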
A bot may retrieve its Internet-facing IP address via /search=get_my_ip and check if it can be reached from the outside with /search=listen_test. Another interesting thing is the request for /search=soft_list, a list of executables:
GET /search=soft_list HTTP/1.1
HTTP/1.1 200 OK
Date: Thu, 28 Jul 2011 21:54:04 GMT
Last-Modified: Thu, 28 Jul 2011 21:54:04 GMT
Expires: Thu, 28 Jul 2011 21:54:04 GMT
This list contains a number of files that the bot will download from the peer-to-peer network and run. Again, it requests them by sending the file name as a parameter for a /search= request. Each file has a unique ID, the number before the first dash. The number string in the next field has been shortened in the above example for better readability. Its purpose is not known yet. Newer files seem to have higher ID numbers. For example, the files client_3.exe, client_4.exe, client_6.exe and client_7.exe, which are not present in the software list but are still available for download, have ascending ID numbers:
- client_3.exe: 1555
- client_4.exe: 1596
- client_6.exe: 1607
- client_7.exe: 1611
- client_8.exe: 1864
Checking if new software is being distributed in the botnet is easy: all we have to do is download the software list and look for new IDs. We will continue to do so and add signatures for new bad things to our detection as they show up.
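Such a check could look roughly like the sketch below. It relies only on what is stated above, namely that the ID is the number before the first dash of an entry. It assumes one entry per line, and the function names and the PLACEHOLDER remainder of each sample entry are made up for illustration; they do not reflect the real layout of the other fields.

```python
def entry_id(entry):
    """Return the numeric ID of a software list entry (the number before the first dash)."""
    return int(entry.strip().split("-", 1)[0])


def new_entries(entries, known_ids):
    """Return the entries whose ID has not been seen before, skipping blank lines."""
    return [e for e in entries if e.strip() and entry_id(e) not in known_ids]


# IDs already seen in an earlier download of the list.
known = {1555, 1596, 1607, 1611}
# Hypothetical entries: only the leading ID is meaningful here.
current = ["1611-PLACEHOLDER", "1864-PLACEHOLDER"]
fresh = new_entries(current, known)
```

After each run, the IDs of the returned entries would be added to the known set so that only genuinely new files are reported next time.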