Lxj616 2014/07/28 11:45
from:https://www.virusbtn.com/virusbulletin/archive/2014/07/vb201407-Mayhem
0x01 Introduction
It is increasingly common for websites, or even entire servers, to become infected. Such infections are usually used to intercept communications, for black hat SEO, leech downloads, and so on. In the vast majority of cases the malware consists of relatively simple PHP scripts, but in the last two years many more sophisticated malware families have been discovered. Mayhem is a multi-purpose modular bot for web servers. Our team studied the bot to gain an understanding not only of its client side but also of some of its C&C server commands, which allowed us to gather some statistics. This article should be considered a supplement to the earlier article [1] by the Malware Must Die team. We encountered the Mayhem bot in April 2014, and this article is the result of our independent research; [2] is the only other publication about Mayhem that we found. In our research we also discovered that Mayhem is a continuation of the larger ‘Fort Disco’ brute-force campaign (revealed in [3]).
0x02 Malware Representation
In its first stage, the malware takes the form of a PHP script. We analyzed a PHP dropper with the SHA256 hash 728b0379dd07a0a035b3cc1aa3259cd934f56937e6371f270c23edf96d2c0801. The results of scanning this script with VirusTotal are presented in Table 1.
Date | VirusTotal results |
---|---|
2014-06-17 | 3/54 |
2014-06-05 | 3/51 |
2014-06-03 | 3/52 |
2014-04-06 | 1/51 |
2014-03-18 | 1/49 |
Table 1. VirusTotal detection results for the PHP dropper
After execution, the script kills all ‘/usr/bin/host’ processes, identifies the system architecture (x64 or x86) and the system type (Linux or FreeBSD), and drops a malicious shared object named ‘libworker.so’. The script also defines a variable ‘AU’ that holds the full URL of the script being executed. The first part of the PHP dropper is shown in Figure 1.
Figure 1. First part of the PHP dropper
After that, the PHP dropper creates a shell script named ‘1.sh’, the contents of which are shown in Figure 2. The dropper also creates the environment variable ‘AU’, which holds the same value as the variable defined in the PHP script.
Figure 2. Contents of the ‘1.sh’ script
The PHP dropper then executes the shell script by running the command ‘at now -f 1.sh’, which adds a scheduled job. The dropper then waits up to five seconds before deleting the scheduled job. If the ‘at’ command fails, the dropper simply runs the ‘1.sh’ script directly. The code for this part of the PHP dropper is shown in Figure 3.
Figure 3. Final part of the PHP dropper
0x03 Dynamic Link Library Initialization
The LD_PRELOAD technique allows a shared object to be loaded before all others, which makes it easy to hook library functions: if the preloaded object redefines a standard library function, it intercepts all calls to that function. The malicious sample contains its own implementation of the ‘exit’ function, so this implementation is called instead of the original when ‘/usr/bin/host’ calls ‘exit’. An additional initialization function is called during the execution of the hooked ‘exit’ function; its workflow is shown in Figure 4. During this initialization, the following steps are performed:
• An ELF file containing only the ‘exit’ function is dropped
• The process forks; the child process runs the ELF file and finishes its execution
• The parent process continues the initialization: it tries to connect to the Google DNS service (IP address 8.8.8.8), decrypts and parses the configuration, and retrieves various system parameters
Figure 4. Workflow of the initialization function
Once initialization is complete, the shared object file is deleted from disk. The malware then tries to open the file that holds its hidden file system and maps it into memory, where the hidden file system is initialized. Then the process forks, the parent exits, and the child continues. A high-level workflow of the hooked ‘exit’ function is shown in Figure 5, where the successful execution path is marked in red. As the flow chart shows, the execution path is neither purely that of the parent nor purely that of the child. We assume that this is an anti-debugging technique aimed at debuggers that are set up to follow only the child or only the parent after a fork.
Figure 5. High-level workflow of the hooked ‘exit’ function
After these steps, the child process (the only one still alive) runs the main loop of the malware. The malware waits for the time set in the configuration and then runs the function that does the actual work.
0x04 Main loop function
This function first opens a socket to communicate with the C&C server, and then checks whether information about the infected host has already been sent to the C&C server during the current session, that is, since the malware was started. If this information has already been delivered, the malware sends a ping packet, then receives and executes a C&C command.
If the information has not yet been delivered, the malware prepares an HTTP packet containing the output of the ‘uname -a’ command, the architecture of the infected system, and information about the privileges of the user under which the process is running. After the packet is sent, the malware reads the C&C response and exits the function if an error occurs. If all is well, the malware updates the flag and tries to read and execute the commands in further C&C responses. A high-level workflow of the main loop function is shown in Figure 6.
Figure 6. High-level workflow of the main loop function of the shared object
While running, the malware maintains four lists and two queues. One queue holds input strings (strings received from the C&C server), and the other holds output strings (strings that will be sent to the C&C server). The first list stores the addresses of the plug-in worker functions. The second list stores the addresses of functions that process data before it is written to the socket used to transfer data to the C&C server. The third list stores the addresses of functions that process data after it is read from that socket (data received from the C&C server). The fourth list stores the addresses of functions that process strings from the string queue. Figure 7 shows how these queues and lists are used in the malware workflow.
Figure 7. Workflow as data is read from the C&C server
Figure 8 shows the workflow of the malware as it processes a task.
Figure 8. String-processing workflow of the plug-in
0x05 C&C commands
Seven different commands are used in communication between the C&C server and the malware. These commands can be divided into two groups: input commands (C&C to bot) and output commands (bot to C&C). All of these commands are sent in HTTP POST requests and responses; that is, input commands are sent in HTTP POST requests and output commands are sent in HTTP responses to POST requests.
‘R’ command (output)
The malware sends this command to notify the C&C server that it has loaded successfully and is ready to work. If the web server is running as root, the ‘R’ command sent to the C&C server looks like this:
R,20130826,<system architecture: 64 or 32>,<EI_OSABI value from the ELF header of ‘/usr/bin/host’>,ROOT,<output of the ‘uname -a’ command>
If the web server is running with restricted privileges, the command is the same except that ‘ROOT’ is replaced by the output of getenv(‘AU’), the URL of the PHP script that launched the malware. If all is well, the C&C server returns ‘R,200’.
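For analysts who see this traffic in captures or logs, the ‘R’ report can be split into its fields. The following is a minimal Python sketch, not code from the malware: the names `RReport` and `parse_r_report` are hypothetical, and it assumes that only the trailing ‘uname -a’ field may itself contain commas.

```python
from typing import NamedTuple


class RReport(NamedTuple):
    """Fields of an 'R' report, kept as strings (their encoding is an assumption)."""
    version: str      # e.g. "20130826"
    arch_bits: str    # "64" or "32"
    osabi: str        # EI_OSABI value of /usr/bin/host
    identity: str     # "ROOT" or the value of the AU variable
    uname: str        # output of `uname -a`


def parse_r_report(line: str) -> RReport:
    # Split at most five times so that commas inside the uname output
    # (an assumption: they may occur) stay in the last field.
    parts = line.strip().split(",", 5)
    if len(parts) != 6 or parts[0] != "R":
        raise ValueError("not an 'R' report")
    return RReport(*parts[1:])
```

A report whose identity field is not ‘ROOT’ carries the dropper URL, which tells an investigator which script on the site was used to launch the bot.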
‘G’ command (input)
This command is sent to the malware by the C&C server. It has the following format:
G,<task ID>
If the current task ID is not equal to the received ID, the malware finishes the currently running task and starts a number of new worker threads. The number of worker threads is set by the ‘L’ command.
‘F’ command (output)
This command is used to request files from the server. If the malware wants to request a new file, it sends the following command:
F,<file name>,0
If the malware wants to check whether a newer version of a previously received file is available, it sends:
F,<file name>,<CRC32 checksum of the file>
If the file is not found on the C&C server, the server responds:
F,404,<file name>
If the file has not changed since it was last received, the C&C server responds:
F,304,-
If a new or updated file is found, the server responds:
F,200,<file name>,<Base64-encoded file data>
After receiving a response that carries data, the malware decodes the Base64 data and writes it to the hidden file system on disk. It then tries to determine whether the received file is a plug-in. If it is, the malware verifies the CRC32 checksum of the plug-in, which is stored in an unused ELF header field, and loads the plug-in into memory.
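The ‘F’ exchange resembles a conditional HTTP fetch keyed on a checksum. The sketch below models it in Python for traffic analysis. The function names are hypothetical, and two details are assumptions rather than findings: that the checksum is written in decimal, and that it is the standard CRC-32 computed by `zlib.crc32`.

```python
import base64
import zlib
from typing import Optional, Tuple


def build_f_request(filename: str, cached: Optional[bytes] = None) -> str:
    """Build an 'F' request; 0 means 'no cached copy' as described above."""
    crc = zlib.crc32(cached) & 0xFFFFFFFF if cached is not None else 0
    return f"F,{filename},{crc}"  # decimal formatting is an assumption


def parse_f_response(line: str) -> Tuple[str, Optional[str], Optional[bytes]]:
    """Return (status, file name, decoded data) for the three documented replies."""
    parts = line.strip().split(",", 3)
    if len(parts) < 3 or parts[0] != "F":
        raise ValueError("not an 'F' response")
    status = parts[1]
    if status == "404":
        return "not_found", parts[2], None
    if status == "304":
        return "unchanged", None, None
    if status == "200" and len(parts) == 4:
        return "ok", parts[2], base64.b64decode(parts[3])
    raise ValueError("unexpected 'F' response")
```

Decoding the ‘F,200’ payload from a capture in this way yields the plug-in or word-list file exactly as the bot would store it.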
‘L’ command (input)
The ‘L’ command is used by the C&C server to configure the malware and to make it load a plug-in. If the C&C server wants to configure the core module of the malware, it sends:
L,core,<number of worker threads>,<sleep timeout>,<socket timeout>
After receiving this command, the malware finishes all worker threads and then updates the number of worker threads, the sleep timeout and the socket timeout. If the C&C server wants the malware to load a plug-in, it sends:
L,<plug-in file name>,<comma-separated plug-in parameters>
If the malware receives this command while another plug-in is running, the running plug-in is terminated and the new plug-in is looked up in the hidden file system. If the lookup fails, the plug-in file is requested from the C&C server via the ‘F’ command. The plug-in is then loaded, initialized and run.
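The two forms of the ‘L’ command can be told apart by their second field. A small Python sketch of that distinction follows; `parse_l_command` is a hypothetical helper, and it assumes the three core parameters are decimal integers.

```python
from typing import Any, Dict


def parse_l_command(line: str) -> Dict[str, Any]:
    """Classify an 'L' command as core configuration or plug-in load."""
    parts = line.strip().split(",")
    if len(parts) < 2 or parts[0] != "L":
        raise ValueError("not an 'L' command")
    if parts[1] == "core":
        if len(parts) != 5:
            raise ValueError("core configuration needs three parameters")
        # Assumption: the parameters are plain decimal numbers.
        threads, sleep_timeout, socket_timeout = (int(p) for p in parts[2:])
        return {
            "kind": "core",
            "threads": threads,
            "sleep_timeout": sleep_timeout,
            "socket_timeout": socket_timeout,
        }
    # Anything else names a plug-in file followed by its parameters.
    return {"kind": "plugin", "name": parts[1], "params": parts[2:]}
```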
‘Q’ command (input and output)
This command is used to transfer working data from the C&C server to the malware and vice versa. If the C&C server wants to add a string to the processing queue of the malware, it sends:
Q,<string>
All of these strings are added to the input queue of the malware and are processed by the running plug-in. If the malware wants to upload the results of its work, it sends:
Q,<plug-in name>,<result string>
It then removes the strings from its output queue.
‘P’ command (output)
This command is used by the malware to send its current state to the C&C server. The format of this command is:
P,<task running flag>,,<worker thread count>,<read/write requests to the server per second>,<total read/write operations with the server since the counter was reset to 0>
‘S’ command (input)
If the malware receives this command, it finishes all currently working threads, empties the input and output queues, and releases other system resources. After that, it is ready to take on a new task.
To sum up, the commands are as follows:
Output commands:
• R – report readiness
• F – request file
• Q – send data
• P – report status
Input commands:
• G – run new task
• L – load plug-in
• Q – send data
• S – terminate current task
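The command summary can be captured as a pair of lookup tables, which is convenient when annotating captured traffic. This is an illustrative Python sketch: the names are hypothetical, and the descriptions paraphrase the summary above.

```python
# Commands sent by the bot to the C&C server (output commands).
OUTPUT_COMMANDS = {
    "R": "report readiness",
    "F": "request file",
    "Q": "send data",
    "P": "report status",
}

# Commands sent by the C&C server to the bot (input commands).
INPUT_COMMANDS = {
    "G": "run new task",
    "L": "load plug-in",
    "Q": "send data",
    "S": "terminate current task",
}


def describe(message: str, from_bot: bool) -> str:
    """Label a raw message by its leading command letter and direction."""
    letter = message.split(",", 1)[0]
    table = OUTPUT_COMMANDS if from_bot else INPUT_COMMANDS
    return table.get(letter, "unknown")
```

Note that ‘Q’ appears in both tables, so the direction of the message is needed to interpret it.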
0x06 Configuration
The shared object stores its configuration in encrypted form in its data segment. The decryption key is also stored in the data segment. At first, only the first eight bytes are decrypted, and the malware checks whether the second four bytes are equal to 0xDEADBEEF. If so, the first four bytes represent the length of the encrypted data, and the rest of the ciphertext can then be decrypted. Figure 9 shows the pseudocode of the decryption algorithm.
Figure 9. Decryption algorithm used by the malware
We analyzed the code of this algorithm and found that it is an implementation of the XTEA [4] encryption algorithm with 32 rounds [5], operating in ECB mode [6, 7]. Figure 10 shows a sample of the decrypted configuration.
Figure 10. Sample of a decrypted configuration
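The decryption procedure described above can be sketched in Python as follows. This is a reconstruction from the prose, not the code shown in Figure 9. It uses the public XTEA reference algorithm with 32 rounds in ECB mode, and it assumes little-endian 32-bit words, a 16-byte key, and a block count that includes the first eight-byte block. The function names are hypothetical, and `xtea_encrypt_block` is included only so the sketch can be checked with a round trip.

```python
import struct

DELTA = 0x9E3779B9
MASK = 0xFFFFFFFF
MARKER = 0xDEADBEEF


def _mix(v: int) -> int:
    """XTEA mixing term, truncated to 32 bits."""
    return ((((v << 4) & MASK) ^ (v >> 5)) + v) & MASK


def xtea_decrypt_block(block: bytes, key: bytes, rounds: int = 32) -> bytes:
    """Decrypt one 8-byte block (little-endian words are an assumption)."""
    v0, v1 = struct.unpack("<2I", block)
    k = struct.unpack("<4I", key)
    total = (DELTA * rounds) & MASK
    for _ in range(rounds):
        v1 = (v1 - (_mix(v0) ^ ((total + k[(total >> 11) & 3]) & MASK))) & MASK
        total = (total - DELTA) & MASK
        v0 = (v0 - (_mix(v1) ^ ((total + k[total & 3]) & MASK))) & MASK
    return struct.pack("<2I", v0, v1)


def xtea_encrypt_block(block: bytes, key: bytes, rounds: int = 32) -> bytes:
    """Inverse of xtea_decrypt_block; present only for the round-trip check."""
    v0, v1 = struct.unpack("<2I", block)
    k = struct.unpack("<4I", key)
    total = 0
    for _ in range(rounds):
        v0 = (v0 + (_mix(v1) ^ ((total + k[total & 3]) & MASK))) & MASK
        total = (total + DELTA) & MASK
        v1 = (v1 + (_mix(v0) ^ ((total + k[(total >> 11) & 3]) & MASK))) & MASK
    return struct.pack("<2I", v0, v1)


def decrypt_config(blob: bytes, key: bytes) -> bytes:
    """Decrypt the header block, check the marker, then decrypt the rest (ECB)."""
    if len(blob) < 8:
        raise ValueError("configuration blob is shorter than one block")
    head = xtea_decrypt_block(blob[:8], key)
    n_blocks, marker = struct.unpack("<2I", head)
    if marker != MARKER:
        raise ValueError("0xDEADBEEF marker not found (wrong key or offset?)")
    size = n_blocks * 8  # assumption: the count includes the header block
    if n_blocks < 1 or size > len(blob):
        raise ValueError("block count does not fit the supplied data")
    body = b"".join(
        xtea_decrypt_block(blob[off:off + 8], key) for off in range(8, size, 8)
    )
    return head + body
```

Because ECB mode decrypts every block independently, the header can be checked for the marker before any further work is done, which is also a quick way to test candidate keys and offsets against a sample.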
All the samples we analyzed had the same configuration format: the first part of the configuration contains special markers and offsets into the rest of the configuration data. The format of the decrypted configuration is shown in Table 2.
Offset | Size in bytes | Description |
---|---|---|
0 | 4 | Number of eight-byte blocks in the configuration, in other words the length of the configuration in eight-byte blocks |
4 | 4 | Special marker 0xDEADBEEF |
8 | 4 | Offset to the C&C URL |
12 | 4 | Sleep time between executions of the main loop function of the malware |
16 | 4 | Size of file mapping for the hidden file system |
20 | 4 | Offset to the name of the file that contains the hidden file system |
Table 2. Format of the decrypted malware configuration
As Table 2 shows, the C&C address is defined directly in the malware configuration; no DGA is used.
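The fixed part of the layout in Table 2 maps naturally onto six 32-bit fields. The Python sketch below reads them from an already decrypted configuration. The names `MayhemConfig` and `parse_config` are hypothetical, and three details are assumptions: little-endian byte order, offsets measured from the start of the configuration, and NUL-terminated strings.

```python
import struct
from typing import NamedTuple


class MayhemConfig(NamedTuple):
    n_blocks: int     # length of the configuration in eight-byte blocks
    cnc_url: str      # string found at the "offset to the C&C URL"
    sleep_time: int   # sleep time between main loop runs (units not stated)
    fs_map_size: int  # size of the file mapping for the hidden file system
    fs_filename: str  # name of the file holding the hidden file system


def _cstring(buf: bytes, offset: int) -> str:
    """Read a NUL-terminated string (termination is an assumption)."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode("ascii", "replace")


def parse_config(plain: bytes) -> MayhemConfig:
    """Parse the six header fields of Table 2 and resolve the two offsets."""
    n_blocks, marker, url_off, sleep_time, map_size, name_off = struct.unpack_from(
        "<6I", plain, 0
    )
    if marker != 0xDEADBEEF:
        raise ValueError("0xDEADBEEF marker missing; data is not a configuration")
    return MayhemConfig(
        n_blocks=n_blocks,
        cnc_url=_cstring(plain, url_off),
        sleep_time=sleep_time,
        fs_map_size=map_size,
        fs_filename=_cstring(plain, name_off),
    )
```

Since the C&C address is stored in the configuration itself, extracting it from a sample in this way directly yields an indicator that can be used for blocking or for correlating infections.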
0x07 Hidden file system
As discussed earlier, the malware uses a hidden file system to store its files. The file system is held in a single file that is created during initialization. The name of this file is defined in the configuration, but it is usually ‘.sd0’. An open source ‘FAT 16/32 File System Library’ is used to work with this file. However, the library is not used in its original form: some functions have been modified to support encryption. Each block is encrypted with 32 rounds of XTEA in ECB mode, and the encryption key differs from block to block. The hidden file system is used to store plug-ins and files containing strings to be processed: lists of URLs, usernames, passwords, and so on. The contents of one such file system are shown in Figure 11:
Figure 11. Contents of a file system instance
We developed a simple tool based on open source libraries that can decrypt and extract files from such file systems
0x08 Plug-in analysis
As mentioned earlier, the malware can use plug-ins. In our research we found eight different plug-ins for the bot. Plug-ins and their configuration files are stored in the hidden file system. All of the plug-ins described here were found in the wild.
Plug-in interface
Each plug-in exports a structure containing two special markers, pointers to useful plug-in functions, and a string containing the plug-in name. Each plug-in contains at least two such pointers: one to the plug-in initialization function and one to the function that performs deinitialization. The two markers in this structure are constants: 0xDEADBEEF and 20130826, which we assume is the plug-in version. An example of such a structure is shown in Figure 12:
Figure 12. Example of a plug-in structure
Because all plug-ins are stored in the hidden file system, none of them are detected by any anti-virus vendor on VirusTotal.
rfiscan.so
SHA256 hash sum: 9efed12a67e5835c73df5882321c4cd2dd23e4a571e5f99ccd7ec13176ab12cb
This plug-in is used to find websites that have a remote file inclusion (RFI) vulnerability. During initialization, the plug-in downloads a list of patterns and a list of websites to check. It then sends a special HTTP request to each site, trying to include ‘http://www.google.com/humans.txt’, and analyzes the corresponding HTTP response. If the response contains the substring ‘we can shake’, the plug-in concludes that the site has a remote file inclusion vulnerability. A portion of the pattern list is shown in Figure 13.
Figure 13. Some of the patterns used by ‘rfiscan.so’ to find RFI websites
The results are sent to the C&C server using the ‘Q’ command. The meanings of these commands are shown in Table 3.
Command | Description |
---|---|
Q,rfiscan,, | An RFI vulnerability has successfully been found |
Q,rfiscan,,- | No RFI vulnerability has been found |
Table 3. ‘Q’ commands of the ‘rfiscan.so’ plug-in
wpenum.so
SHA256 hash sum: 9707e7682dd4f2c7850fdff0b0b33a3f499e93513f025174451b503eaeadea88
This plug-in is used to enumerate the usernames of a WordPress site. The worker function of the plug-in takes a URL, transforms it, and sends an HTTP request built from the following query template: <initial query with the last part removed>/?author=<user ID>
The user ID ranges from 0 to 5. If the corresponding HTTP response contains the substring ‘Location:’ and the destination URL contains the substring ‘/author/’, the username is extracted from the destination URL. The first username found is sent to the C&C server with the ‘Q’ command. The meanings of these commands are shown in Table 4.
Command | Description |
---|---|
Q,wpenum,,, | Username has successfully been found |
Q,wpenum,,,no_matches | No username has been found |
Q,wpenum,,- | Connection failed |
Table 4. ‘Q’ commands of the ‘wpenum.so’ plug-in
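Site owners can use the same redirect check to see whether their own WordPress installation discloses usernames. The Python sketch below only inspects a response header block that has already been captured; it sends no requests. The function name is hypothetical, and it assumes the username is the path segment that follows ‘/author/’.

```python
from typing import Optional
from urllib.parse import urlsplit


def username_from_redirect(raw_headers: str) -> Optional[str]:
    """Return the name after '/author/' in a Location header, if there is one."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "location":
            continue
        path = urlsplit(value.strip()).path
        marker = "/author/"
        idx = path.find(marker)
        if idx == -1:
            continue
        # Assumption: the username is the first path segment after the marker.
        user = path[idx + len(marker):].split("/", 1)[0]
        return user or None
    return None
```

If this function returns a name for the response to ‘/?author=1’ on a site, that site reveals at least one login name to anyone who asks, which is half of what a password brute-force attack needs.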
cmsurls.so
SHA256 hash sum: 84725fb3f68bde780a6349d0419bec39b03c85591e4337c6a02dcaa87b2e4ea3
The worker function of the plug-in receives a hostname, constructs an HTTP GET request for ‘/wp-login.php’, and then looks for the substring ‘name="log"’ in the corresponding response. In other words, this plug-in looks for the user login page on sites based on the WordPress CMS. The result is sent to the C&C server with the ‘Q’ command. The meanings of these commands are shown in Table 5.
Command | Description |
---|---|
Q,cmsurls,, | URL for login page has successfully been found |
Q,cmsurls, | URL for login page has not been found |
Q,cmsurls,,- | Connection failed |
Table 5. ‘Q’ commands of the ‘cmsurls.so’ plug-in
bruteforce.so
SHA256 hash sum: 6f96d63ab5288a38e8893043feee668eb6cee7fd7af8ecfed16314fdba4d32a6
This plug-in is used to brute force passwords for sites based on the WordPress and Joomla CMSs. It does not support HTTPS. During our research we found a dictionary of passwords used by the plug-in. The dictionary contains 17,911 passwords, each between 1 and 32 characters long.
bruteforceng.so
SHA256 hash sum: 992c36b2fcc59117cf7285fa39a89386c62a56fe4f0a192a05a379e7a6dcdea6
This plug-in is also used to brute force site passwords but, unlike ‘bruteforce.so’, it supports HTTPS and regular expressions, and can be configured to brute force any login page. An example of such a configuration is shown in Figure 14.
Figure 14. Example configuration for the ‘bruteforceng.so’ plug-in
We analyzed other configurations of the plug-in and found that it was also used to brute force credentials for the DirectAdmin control panel.
ftpbrute.so
SHA256 hash sum: 38ee32e644cb8421a89cbcba9c844a5b482b4524d51f5c10dcb582c3c4ed8101
This plugin is used to brute force FTP accounts
crawlerng.so
SHA256 hash sum: d9d3d93c190e52cc0860f389f9554a86c8c67d56d2f4283356ca7cf5cda178a0
This plug-in is used to crawl web pages and extract useful information. It obtains a list of sites to crawl from the C&C server, along with other parameters such as the crawl depth. The plug-in supports HTTPS and uses the SLRE [10] library to handle regular expressions. It is very flexible; a configuration file for the plug-in is shown in Figure 15. In this example, the plug-in is used to find and collect drug-related web pages.
Figure 15. Configuration file for a ‘crawlerng.so’ plug-in
crawlerip.so
SHA256 hash sum: 1fc6a6a98bf854421054254bd504f0b596f01fcb9118a3e525c16049a26e3e11
This plugin is the same as the ‘crawlerng.so’ plugin, except that it uses an IP list instead of a URL list
0x09 Analysis of C&C
In our research we found that three C&C servers were used to manage the botnets. We managed to gain access to two of them and gather some statistics. An overview of the C&C administration panel is shown in Figure 16. The interface that allows users to add tasks for the bots is shown in Figure 17.
Figure 16. Bot list displayed in the C&C administration panel
Figure 17. Task interface in the C&C panel
Together, the two C&C servers controlled about 1,400 bots. The first botnet contained about 1,100 bots, the second about 300. At the time of the analysis, the bots were being used to brute force WordPress passwords. A brute-force task is shown in Figure 18, and the results of such tasks are shown in Figure 19.
Figure 18. Brute-force task in the control panel of the larger botnet
Figure 19. Some results of the brute-force tasks
The geographic distribution of infected servers in the larger botnet is shown in Figure 20. The countries with the most infections are the United States, Russia, Germany and Canada.
Figure 20. Geographic distribution of infected servers in the larger botnet (the darker the blue, the more infected servers)
The third C&C server was also located by the Malware Must Die [1] team, and it had been shut down by the time of our analysis. We analyzed the two C&C servers that were still running. In addition to the main page, the source code contains two further PHP scripts: config.php and update.php. The first script contains configuration data: database credentials, the MD5 hash of the password for the admin panel, the maximum time allowed for a task, the bot wake-up time, and so on. A portion of this script is shown in Figure 21.
Figure 21. C&C configuration data
The update.php script is used to wake up bots. It contacts an idle bot and runs the PHP script described in the section ‘Malware Representation’. We also found that the C&C server supports a number of plug-ins that we have not found in the wild. For example, one plug-in exploits the recently disclosed Heartbleed vulnerability and collects information from vulnerable servers. A code snippet listing all the available plug-ins is shown in Figure 22.
Figure 22. Code listing a number of plug-ins that we have not found in the wild
The C&C server uses MySQL and memcached (if available) for data storage, but plug-ins are stored on disk. We also found that the C&C script code contains some security issues, but describing these vulnerabilities is beyond the scope of this article.
0x10 Comparison with other malware families
In our analysis, we found some features that Mayhem shares with other *nix malware. The malware is similar to ‘Trololo_mod’ and ‘Effusion’ [11], two injectors for Apache and Nginx servers respectively. All three malware families have the following in common:
• The configuration uses the same format
• Encryption uses the XTEA algorithm in ECB mode
• The 0xDEADBEEF marker is widely used in configurations and other parts of the code
• The ELF headers of the shared objects are corrupted in the same way
Despite the lack of hard evidence, we suspect that all three malware families were developed by the same group. After completing this study, we can safely say that botnets built from *nix web servers are becoming more and more popular, in line with the general trend of malware modernization. Why is that? We believe the reasons are as follows:
• Web server botnets provide a unique model for monetization through traffic redirection, leech downloads, black hat SEO, and the like
• Web servers have good uptime, network connectivity and better performance than regular personal computers
• Automatic update technology is not widely used in the *nix world, especially in comparison with desktops and smartphones. Most webmasters and system administrators have to upgrade their software manually and test that their infrastructure still works. For ordinary sites, professional maintenance is expensive and webmasters often cannot afford it. This means that it is easy for attackers to find such vulnerable web servers and add them to their botnets.
• Anti-virus technology is not widely used in the *nix world. Many hosting providers do not offer proactive defense mechanisms or process memory checking modules. Besides, a typical webmaster usually does not want to spend time reading the documentation of such software and working out the performance issues that might arise.
Mayhem is a very interesting and sophisticated piece of malware with a flexible and complex architecture. We hope our research helps the security community to combat such threats.
0x11 Acknowledgements
We wish to thank Fraser Howard and Charles McCathie Nevile for their comments and suggestions that helped us improve this article.
0x12 References
[1] http://blog.malwaremustdie.org/2014/05/elf-shared-so-dynamic-library-malware.html.
[2] http://sysadminblog.net/2013/11/fake-wordpress-plug-ins/.
[3] Fort Disco Bruteforce Campaign. http://www.arbornetworks.com/asert/2013/08/fort-disco-bruteforce-campaign/.
[4] Wheeler, D.; Needham, R. Correction to XTEA. http://www.movable-type.co.uk/scripts/xxtea.pdf.
[5] http://en.wikipedia.org/w/index.PHP?title=XTEA&oldid=558387953.
[6] Wikipedia. Block cipher mode of operation. http://en.wikipedia.org/w/index.PHP?title=Block_cipher_mode_of_operation&oldid=582012907.
[7] Schneier, B. Applied Cryptography. John Wiley & Sons, 1996.
[8] http://ultra-embedded.com/fat_filelib.
[9] https://github.com/freeoks/SD0_reader.
[10] http://slre.sourceforge.net/.
[11] Effusion – a new sophisticated injector for Nginx web servers. https://www.virusbtn.com/virusbulletin/archive/2014/01/vb201401-Effusion.
[12] http://www.linuxjournal.com/article/7795.
0x13 Notes
Bot: each such compromised device, known as a “bot”, is created when a computer is penetrated by software from a malware (malicious software) distribution (from http://en.wikipedia.org/wiki/Botnet).
C&C: this server is known as the command-and-control (C&C) server (from http://en.wikipedia.org/wiki/Botnet).
DGA: domain generation algorithm (from http://en.wikipedia.org/wiki/Domain_generation_algorithm).