A year ago, Apple computer users were mostly design and DTP professionals, photographers and musicians. Last year, however, was a breakthrough year for the Mac in many ways. After Apple announced plans to manufacture computers with Intel processors, many began to look into Apple computers and consider them for home use. Software developers also noticed the growing popularity of Mac OS X and began to sell their own products for the new platform.
Nevertheless, Mac OS X is still poorly understood and remains something of a mystery to both users and IT security experts. This article aims to help readers better understand the features of Mac OS X that are critical when researching malicious programs designed for this operating system.
It should be noted that Mac OS X is a Unix-type operating system and has many of the features found in other Unix-type systems. This article will therefore be more accessible to those with experience with systems like Linux or FreeBSD. Some experience in researching programs for any kind of operating system will also help in understanding this article.
Key features of Mac OS X
It always helps to know the key features of an operating system when analyzing programs, including malicious ones. Many features of Mac OS X stem from its origins: Mac OS X is based on Unix, which is evident in its design and overall system principles. The operating system inherited interprocess communication from Mach and its network stack from BSD.
Support for system calls from Mach and BSD systems
Xnu - the Mac OS X kernel - is based on the Mach and FreeBSD kernels but also includes features from MkLinux, NetBSD, OpenBSD and several Mach development projects. Because it draws on both Mach and FreeBSD, xnu contains two system call tables (Mach and BSD) and supports the APIs of both systems.
In order to provide at least some legacy support for previous operating systems, Mac OS X has a runtime environment with three components:
- Dyld Runtime Environment, based on the dyld dynamic loader.
- CFM Runtime Environment. This OS 9 legacy provides support for applications that cannot be launched by dyld but which use Mac OS X capabilities. This is implemented in the Carbon library.
- Classic Runtime Environment is used for launching older applications for OS 9 in OS X.
A range of applications can therefore be launched in Mac OS X, including those written for older versions of the Mac operating system.
The Mach-O executable file format
In Mac OS X, almost all files that contain executable code, including applications, libraries, and kernel modules, are in Mach-O file format.
The Mach-O format was not originally developed by Apple; it was designed by the Open Software Foundation for the OSF/1 operating system (which is based on Mach) and adapted by Apple for the x86 architecture as part of the OpenStep project.
The Mach-O file format and Application Binary Interface (ABI) specifications describe how an executable should be loaded and launched by the kernel. They pass the following information to the operating system:
- how the dynamic loader works,
- how to load separate libraries,
- how to organize a process's address space,
- where to find the entry point.
Since Mach-O is the main format for executable files in Mac OS X, let's take a more detailed look at its structure.
Mach-O files can be roughly divided into three parts: the header, the load commands, and the segments, each of which may comprise several sections. The header and the load commands describe a file's main features, and the segments contain the bytes that the load commands refer to.
The Header. The first four bytes of the header hold the so-called magic number, which identifies the file as either 32-bit or 64-bit and also indicates the byte order used by the target CPU. The header also specifies the architecture for which the file was compiled, which lets the kernel guarantee that a file is launched only on the platform it was built for. Some binary files contain code for more than one architecture. These are known as Universal Binaries, and such a file starts with a fat header.
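As a rough illustration of how the magic number can be interpreted, here is a minimal Python sketch. The magic values and CPU type numbers are the standard Mach-O constants; the function names describe_header and cpu_name are made up for this example, and only the first bytes of the file are examined.

```python
import struct

# Magic numbers as they appear when the first four bytes are read big-endian.
MAGICS = {
    0xFEEDFACE: (32, "big"),     # MH_MAGIC: 32-bit, big-endian file (e.g. PowerPC)
    0xCEFAEDFE: (32, "little"),  # MH_CIGAM: 32-bit, little-endian file (e.g. x86)
    0xFEEDFACF: (64, "big"),     # MH_MAGIC_64
    0xCFFAEDFE: (64, "little"),  # MH_CIGAM_64
}
FAT_MAGIC = 0xCAFEBABE           # fat header; its fields are always big-endian

CPU_ARCH_ABI64 = 0x01000000      # flag bit marking a 64-bit CPU type
CPU_NAMES = {7: "x86", 18: "PowerPC"}


def cpu_name(cputype):
    """Translate a Mach-O cputype value into a readable name."""
    base = cputype & ~CPU_ARCH_ABI64
    name = CPU_NAMES.get(base, "cpu type %d" % base)
    return name + " (64-bit)" if cputype & CPU_ARCH_ABI64 else name


def describe_header(data):
    """Classify a file from its leading bytes (hypothetical helper)."""
    (magic,) = struct.unpack_from(">I", data, 0)
    if magic == FAT_MAGIC:
        # Fat header: magic, number of architectures, then 20-byte fat_arch
        # records (cputype, cpusubtype, offset, size, align).
        (count,) = struct.unpack_from(">I", data, 4)
        archs = []
        for i in range(count):
            cputype, _sub, offset, size, _align = struct.unpack_from(
                ">5I", data, 8 + 20 * i)
            archs.append((cpu_name(cputype), offset, size))
        return {"kind": "universal", "archs": archs}
    if magic not in MAGICS:
        return {"kind": "not Mach-O"}
    bits, order = MAGICS[magic]
    # The cputype field follows the magic and uses the file's byte order.
    (cputype,) = struct.unpack_from(">I" if order == "big" else "<I", data, 4)
    return {"kind": "thin", "bits": bits, "byte_order": order,
            "cpu": cpu_name(cputype)}
```

Note that 0xCAFEBABE is also used by Java class files, so a real tool would need an additional sanity check on the architecture count before trusting a fat header.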
Load commands. The load commands area contains a list of commands that tell the kernel how to load different file segments. These commands describe how each segment is aligned in memory, what access rights it has and where it is located in memory.
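The load commands area can be walked with very little code, because every command starts with its type and its size. The following Python sketch assumes a thin (non-universal) Mach-O file held in memory; the function name list_load_commands is invented for this example, and only a handful of command types are given readable names.

```python
import struct

# A few well-known load command types; unknown ones are shown in hex.
LC_NAMES = {
    0x1: "LC_SEGMENT",
    0x2: "LC_SYMTAB",
    0x4: "LC_THREAD",
    0x5: "LC_UNIXTHREAD",
    0xB: "LC_DYSYMTAB",
    0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER",
    0x19: "LC_SEGMENT_64",
}

# Magic (read big-endian) -> (struct byte-order prefix, header size in bytes).
LAYOUTS = {
    0xFEEDFACE: (">", 28),
    0xCEFAEDFE: ("<", 28),
    0xFEEDFACF: (">", 32),
    0xCFFAEDFE: ("<", 32),
}


def list_load_commands(data):
    """Return (file offset, command name, command size) for each load command."""
    (magic,) = struct.unpack_from(">I", data, 0)
    if magic not in LAYOUTS:
        raise ValueError("not a thin Mach-O file")
    endian, header_size = LAYOUTS[magic]
    # ncmds and sizeofcmds are the fifth and sixth 32-bit header fields.
    ncmds, _sizeofcmds = struct.unpack_from(endian + "2I", data, 16)
    offset = header_size
    commands = []
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from(endian + "2I", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command at offset %d" % offset)
        commands.append((offset, LC_NAMES.get(cmd, hex(cmd)), cmdsize))
        offset += cmdsize  # the next command starts right after this one
    return commands
```

This yields roughly the same overview as the list of commands printed by otool -l, without decoding the body of each command.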
Segments and sections. Mach-O format executable files usually have 5 segments:
- __PAGEZERO is located at virtual address zero and has no access rights, so it can be neither read, written nor executed. This segment does not occupy any space in the file on disk.
- __TEXT contains data which can only be accessed for reading or execution.
- __DATA contains data which can be written to. This segment is marked as copy-on-write.
- __OBJC contains data used by the Objective-C runtime environment.
- __LINKEDIT contains data used for dynamic linking.
The __TEXT and __DATA segments contain zero or more sections. Each section contains a certain kind of data, for example executable code, constants, strings, etc. That way, executable code and non-executable data are stored within the same segment, but separately from each other.
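To make the relationship between segments and sections more concrete, the sketch below lists the segments of a 32-bit Mach-O file together with their initial protection and the names of their sections. It relies on the standard layout of the 32-bit segment and section structures (56 and 68 bytes respectively); the function names are invented for this example, and 64-bit and universal files are deliberately not handled.

```python
import struct

LC_SEGMENT = 0x1
VM_PROT_BITS = ((1, "r"), (2, "w"), (4, "x"))


def prot_string(prot):
    """Render a VM protection value such as 5 as 'r-x'."""
    return "".join(ch if prot & bit else "-" for bit, ch in VM_PROT_BITS)


def c_string(raw):
    """Decode a fixed-width, NUL-padded name field."""
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def list_segments(data):
    """Return one dict per LC_SEGMENT command of a 32-bit thin Mach-O file."""
    (magic,) = struct.unpack_from(">I", data, 0)
    if magic == 0xFEEDFACE:
        endian = ">"
    elif magic == 0xCEFAEDFE:
        endian = "<"
    else:
        raise ValueError("not a 32-bit thin Mach-O file")
    (ncmds,) = struct.unpack_from(endian + "I", data, 16)
    offset = 28  # size of the 32-bit Mach-O header
    segments = []
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from(endian + "2I", data, offset)
        if cmd == LC_SEGMENT:
            (segname, vmaddr, vmsize, fileoff, filesize,
             _maxprot, initprot, nsects, _flags) = struct.unpack_from(
                endian + "16s8I", data, offset + 8)
            # Section headers (68 bytes each) follow the 56-byte segment command.
            sections = [
                c_string(struct.unpack_from("16s", data, offset + 56 + 68 * i)[0])
                for i in range(nsects)
            ]
            segments.append({
                "name": c_string(segname),
                "vmaddr": vmaddr,
                "vmsize": vmsize,
                "fileoff": fileoff,
                "filesize": filesize,
                "initprot": prot_string(initprot),
                "sections": sections,
            })
        offset += cmdsize
    return segments
```

For an ordinary executable, the __PAGEZERO entry would be expected to show a file size of zero and the protection string "---".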
Program analysis utilities
There are two main approaches when analyzing programs: dynamic and static. Dynamic analysis involves launching program code within a debugger or in a virtual environment and analyzing its behavior. Static analysis uses a disassembler to examine the code without launching it.
Which approach is best depends on the individual situation. The methods are not mutually exclusive and are often used to complement each other.
Dynamic analysis utilities
As with most Unix systems, Mac OS X offers a number of utilities that can help in dynamic analysis of applications and system diagnosis. Many of them came to the Mac from Unix, but there are also programs that were designed exclusively for Mac OS X. Below are brief descriptions of several utilities that can be installed from the Mac OS X installation media.
The utilities can be divided into two categories.
- Utilities used to examine processes:
  - fs_usage -- provides information about system calls pertaining to file system activity;
  - heap -- lists all memory blocks dynamically allocated by a given process;
  - lsof -- displays files opened by various processes;
  - top -- displays usage statistics for different system resources;
  - vm_stat -- displays virtual memory usage statistics;
  - gdb -- a debugger that makes it possible to remotely debug programs;
  - ddb -- a kernel debugger that requires connection via a serial port;
  - ktrace -- tracks information about system events at the kernel level for specified processes;
  - kdump -- displays information generated by the ktrace program;
  - sc_usage -- displays statistics for a specified process, such as use of processor time, use of system calls, etc.
- Network utilities, which are well known in the Unix world:
  - netstat -- provides a range of data relating to the network subsystem;
  - tcpdump -- displays network traffic.
Many other network tools that are well known to Unix users, such as nmap and Wireshark, can also be used on Mac OS X.
Readers should note that most Unix-based programs with open source code can easily be adapted to run on Mac OS X. An experienced Unix user will be able to create a work environment that is nearly identical to the familiar Unix environment.
Static analysis utilities
Quite often there is no opportunity to actually launch the program which is being analyzed, and sometimes it is best to avoid launching a program for security reasons. In such cases, the file is disassembled and analyzed statically instead.
Since the executable file format for Mac OS X is Mach-O, there are some things that need to be taken into account in relation to static analysis in this operating system.
The primary utility for analyzing Mach-O files is the otool program, which can be used to get information on the file header, load commands and the entry point. It can even be used to disassemble the contents of sections that contain executable code. The following utilities are useful for static analysis:
- file -- determines file type;
- otool -- used to analyze Mach-O files;
- xxd -- makes it possible to translate binary files into hexadecimal format and vice versa;
- IDA Pro -- a disassembler.
With the growing popularity of Mac OS X running on Intel, many software manufacturers have released versions of their applications for this platform. The new 5.1 version of IDA Pro includes Mac OS X in its list of supported operating systems.
This will make life much easier for users migrating from Windows. IDA Pro sometimes makes it possible to perform certain tasks more quickly and easily than the standard Mac OS X tools.
Analyzing malicious code: examples
In order to provide some examples, we will analyze IM-Worm.OSX.Leap and Virus.OSX.Macarena using some of the utilities named above. Note that the malicious programs we examine below are proof of concept only; they do not have any malicious payload and do not represent a serious threat. Their primary purpose is to demonstrate that it is possible to create such a program.
Leap cannot spread across the Internet by itself. Instead, it uses iChat to spread. At the first stage, Leap spreads through the iChat application as a link to a file on RapidShare, with a claim that the file contains screenshots of Leopard, the latest version of Mac OS X. For the victim machine to become infected, the recipient has to click on the link, confirm the file download, unpack the file and then open it. Once the computer is infected, the file will send itself (without any modifications) to the user's entire contact list via Bonjour.
Leap spreads as a file named latestpics.tgz. When this file is unpacked in Finder, it will appear to be a JPEG file.
Since Leap uses Spotlight, it will run only under Tiger (Mac OS X version 10.4.x). In order to launch, Leap requires InputManager, although InputManager does not work on x86 systems. Furthermore, the binary file contains code only for PowerPC. As a result, Leap works only on PowerPC-based computers.
To start the analysis, we need to determine the actual format of the latestpics file. First, we run the file utility with latestpics as its argument: # file latestpics. The result shows that this file is actually a Mach-O file.
Then we use otool to view the header of the binary file: # otool -h latestpics.
After that, we can look for Leap's entry point. The entry point of a Mach-O format file can be found using # otool -l latestpics, which displays the load commands. In this case, the command of interest is LC_UNIXTHREAD, which describes the initial state of the main thread of the process. On a PowerPC, we look for the contents of the srr0 register, which is the entry point.
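The same lookup can be sketched in Python for readers who want to automate it. The sketch assumes a thin, big-endian, 32-bit PowerPC Mach-O file and relies on the fact that srr0 is the first register in the PowerPC thread state stored inside LC_UNIXTHREAD; the function name ppc_entry_point is invented for this example.

```python
import struct

LC_UNIXTHREAD = 0x5
PPC_THREAD_STATE = 1  # thread state flavor for 32-bit PowerPC registers


def ppc_entry_point(data):
    """Return the srr0 value from LC_UNIXTHREAD, or None if there is none."""
    (magic,) = struct.unpack_from(">I", data, 0)
    if magic != 0xFEEDFACE:
        raise ValueError("not a big-endian 32-bit Mach-O file")
    (ncmds,) = struct.unpack_from(">I", data, 16)
    offset = 28  # load commands start right after the 28-byte header
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from(">2I", data, offset)
        if cmd == LC_UNIXTHREAD:
            # Layout: cmd, cmdsize, flavor, count, then 'count' 32-bit registers.
            flavor, count = struct.unpack_from(">2I", data, offset + 8)
            if flavor == PPC_THREAD_STATE and count >= 1:
                (srr0,) = struct.unpack_from(">I", data, offset + 16)
                return srr0
        offset += cmdsize
    return None
```

For an x86 file the idea is the same, but the thread state has a different layout and the entry point is held in the eip register instead.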
Next, we use nm - a utility familiar to all Unix users - to view a list of all symbols in the binary file, including the functions listed below. Their names speak for themselves and confirm that this is a potentially malicious file:
Now we can take a closer look at the code, this time using otool -vt, which allows us to view the contents of the section in the __TEXT segment where the executable code of the latestpics file is located:
Strings are passed to system functions, but they are encrypted with the _xor function:
After decrypting the strings, we get the following result:
/bin/rm -rf /Library/InputManagers/apphook
/bin/mv -f /tmp/apphook /Library/InputManagers
/bin/rm -rf ~/Library/InputManagers/apphook
/bin/mv -f /tmp/apphook ~/Library/InputManagers
%s/Contents/MacOS/%s
/bin/cp '%s' '%s/..namedfork/rsrc'
/bin/cp -f '%s' '%s'
(kMDItemKind == 'Application') (kMDItemLastUsedDate >= $time.this_month)
/usr/bin/ditto %s /tmp/latestpics
/usr/bin/gzip -f -q /tmp/latestpics
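The decryption step that produces strings like these can be approximated with a few lines of Python. This is only a sketch under a strong assumption: it treats the obfuscation as a single-byte XOR, which is a guess for illustration and not a confirmed description of Leap's _xor routine. The key value in the usage note and the function names are likewise hypothetical. Because many of the decoded strings are shell commands, a known prefix such as "/bin/" can be used to recover the key.

```python
def xor_bytes(data, key):
    """XOR every byte with a single-byte key (assumed scheme, see above)."""
    return bytes(b ^ key for b in data)


def guess_key(ciphertext, known_prefix=b"/bin/"):
    """Find the single-byte key that makes the data start with known_prefix."""
    for key in range(256):
        if xor_bytes(ciphertext[:len(known_prefix)], key) == known_prefix:
            return key
    return None  # no single-byte key produces the expected prefix
```

For example, if a string had been obfuscated with the hypothetical key 0x5A, guess_key would return 0x5A and xor_bytes(ciphertext, 0x5A) would restore the original command. XOR is its own inverse, so the same function serves for both directions.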
An analysis of these strings tells us what IM-Worm.OSX.Leap actually does:
- It copies itself to /tmp as latestpics;
- It then creates a tgz file;
- It extracts an Input Manager called "apphook.bundle" and copies it to /tmp;
- If the uid is 0, then the directory /Library/InputManagers/ is created; any existing apphooks are deleted, and the new apphook is copied from /tmp;
- If the uid is not 0, then ~/Library/InputManagers/ is created;
- Now when any Mac OS X application is launched, the new apphook will be loaded into its address space;
- An attempt will then be made to send latestpics.tgz via iChat each time an application is launched.
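Based on the steps above, a simple check for traces of Leap is to look for an apphook entry in the two InputManagers directories. The following Python sketch does only that; the function name find_apphook and its parameters are invented for this example, and matching on the "apphook" name prefix is an assumption derived from the paths in the decrypted strings.

```python
import os


def find_apphook(root="/", home=None):
    """Return paths of entries named apphook* in the InputManagers folders."""
    if home is None:
        home = os.path.expanduser("~")
    candidates = [
        os.path.join(root, "Library", "InputManagers"),  # used when uid is 0
        os.path.join(home, "Library", "InputManagers"),  # used otherwise
    ]
    hits = []
    for directory in candidates:
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if name.startswith("apphook"):
                hits.append(os.path.join(directory, name))
    return hits
```

Calling find_apphook() with no arguments checks the live system; the root and home parameters make it possible to point the check at a mounted disk image instead. An empty list only means that no matching entry was found in these two locations.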
To start the analysis of Virus.OSX.Macarena, the file format has to be identified. As above, we will use the file utility.
The results show that this is a Mach-O format file.
We then use otool to view the file header and determine the entry point:
Careful examination reveals an unusual entry point at a zero address. The next step should be an analysis of the code, starting from the entry point. But now we have a bit of a problem. We cannot disassemble this part of the Mach-O file using otool, since otool can only be used to analyze code in the text section of the __TEXT segment.
In this situation, we can use IDA Pro. But first we need to load the file into IDA as a binary file. Then the file can be disassembled.
Macarena is the first virus that actually infects Mach-O format files; it targets files in the current directory.
When the infected file is analyzed, the following points are noticeable.
The virus changes the entry point to a zero address. This is where the __PAGEZERO segment loads in the Mach-O format. As noted above in the description of the Mach-O file structure, __PAGEZERO does not occupy any space in the file on disk. That is why the code writes itself to the end of the file on disk. This technique may have an unexpected effect: applications like gdb, IDA and otool won't display the virus code.
Macarena is a relatively simple virus. When it is run, it iterates over the files in the current folder and infects Mach-O format files for the x86 architecture. There are newer versions of this virus that also infect PowerPC files, but these variants differ very little from the original.
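The two anomalies described here lend themselves to a simple heuristic. The Python sketch below works on values that have already been extracted from the load commands (for example with otool -l) and flags a __PAGEZERO segment that is backed by file data or that contains the entry point. The function name, the dictionary layout and the wording of the findings are made up for this example, and the first check assumes that an infected file maps the appended code through __PAGEZERO, which the text above suggests but does not spell out.

```python
def pagezero_anomalies(segments, entry_point):
    """Report suspicious properties of __PAGEZERO.

    segments: list of dicts with 'name', 'vmaddr', 'vmsize' and 'filesize'
    entry_point: initial program counter taken from LC_UNIXTHREAD
    """
    findings = []
    for seg in segments:
        if seg["name"] != "__PAGEZERO":
            continue
        if seg["filesize"] != 0:
            # A normal __PAGEZERO occupies no space in the file on disk.
            findings.append("__PAGEZERO is backed by file data")
        if seg["vmaddr"] <= entry_point < seg["vmaddr"] + seg["vmsize"]:
            # Code should never start inside the inaccessible zero page.
            findings.append("entry point lies inside __PAGEZERO")
    return findings
```

An empty result does not prove that a file is clean; it only means that these two specific indicators are absent.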
This virus does not have any other payload.
Mac OS X is continuing to win over consumers. Although the only malicious programs for the operating system are proof of concept, malicious users will increasingly focus on Mac OS X as the number of users grows. This means it will be necessary to analyze malicious programs for Mac OS X on a more frequent basis.
Happily, Mac OS X has many tools which can be used both to analyze other programs and for system diagnosis in general. Furthermore, more third-party programs are emerging which can be used by IT experts and amateur researchers alike.