Working on an efficient generic shellcode detection engine and verifying results with randomly generated input, I’ve effectively ended up fuzzing different open source disassembler libraries. The disassembler library of choice for my current project is libdasm because of its comparatively long history and public domain license. But writing a sound and complete x86 disassembler is obviously not a trivial task due to the complex nature of the x86 instruction set.
libdasm used to have issues correctly disassembling certain floating point instructions, but this was simply caused by an off-by-three error in the opcode lookup tables (three NULL rows missing), so the fix was comparatively easy.
What I stumbled across today seems not to be an opcode-specific issue but rather a general bug in instruction decoding. When libdasm disassembles instructions with a 16-bit address size prefix, it decodes the address immediate incorrectly:
    [~] Verifying shellcode candidate offset 8eb0f0
    008fe0f0[ 67a02232e830] > mov al,[0x30e83222]
    008fe0f6[ 61] > popa
    008fe0f7[ f9] > stc
    008fe0f8[ ff4038] > inc [eax+0x38]
    008fe0fb[ b269] > mov dl,0x69
    008fe0fd[ 52] > push edx
    008fe0fe[ 3f] > aas
    008fe0ff[ 5e] > pop esi
    008fe100[ 1a3dc31168aa] > sbb bh,[0xaa6811c3]
    008fe106[ 59] > pop ecx
    008fe107[ 9c] > pushf
    008fe108[................] <
The instruction at the virtualized guest’s memory address 008fe0f0 is not decoded correctly:
- 67 is the previously mentioned 16-bit address size prefix
- a0 is the opcode for mov al, moffs8
- 2232 is the 16-bit address that should be interpreted as the operand
- e830 does not belong to this instruction
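The byte breakdown above can be sketched in a few lines of Python. This is only a minimal illustration, assuming 32-bit mode and handling nothing but the a0 opcode with an optional 67 prefix; the function name `decode_mov_al_moffs` is made up for this example and is not part of libdasm.

```python
import struct

def decode_mov_al_moffs(code: bytes):
    """Decode `mov al, moffs8` (opcode a0) in 32-bit mode.

    Returns (instruction_length, address). Only the optional 0x67
    address-size prefix is handled; everything else is out of scope.
    """
    pos = 0
    addr_size = 4                  # default address size in 32-bit mode
    if code[pos] == 0x67:          # address-size override prefix
        addr_size = 2              # switches moffs to a 16-bit offset
        pos += 1
    if code[pos] != 0xA0:
        raise ValueError("not a mov al, moffs8 instruction")
    pos += 1
    fmt = "<H" if addr_size == 2 else "<I"   # offset is little-endian
    (address,) = struct.unpack_from(fmt, code, pos)
    return pos + addr_size, address

raw = bytes.fromhex("67a02232e830")
length, address = decode_mov_al_moffs(raw)
print(length, hex(address))        # 4 0x3222

# Ignoring the prefix and reading 4 bytes reproduces the wrong operand.
(wrong,) = struct.unpack_from("<I", raw, 2)
print(hex(wrong))                  # 0x30e83222
```

With the prefix honored, the instruction is 4 bytes long and e830 is left over for the next instruction.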
Just like you should always consult a second doctor about exotic diseases, I gave udis86, a different disassembler library, a shot:
    $ udcli -noff -32 -s `python -c 'print 0x8eb0f0'` -c 10 shellcode/urandom.bin
    67a02232 a16 mov al, [0x3222]
    e83061f9ff call 0xfffffffffff96139
    40 inc eax
Nice, the mov instruction got disassembled correctly this time. And since e830 is no longer interpreted as part of mov’s immediate, it now correctly disassembles as the start of a call rel32 instruction. Unfortunately, udis86 is an x86-64 aware disassembler and internally sign-extends the operand to call to 64 bits, yet again giving incorrect disassembly.
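The difference between the two call targets is just a matter of where the address arithmetic wraps. Here is a rough sketch of that computation; `call_rel32_target` is a hypothetical helper, and the assumption that udcli counts instruction offsets from 0 (so the call sits at offset 4) is mine.

```python
def call_rel32_target(insn_addr: int, rel32: bytes, bits: int = 32) -> int:
    """Target of `call rel32` (e8 + 4-byte displacement) at insn_addr.

    The displacement is signed and relative to the next instruction;
    the result wraps at the given address width.
    """
    displacement = int.from_bytes(rel32, "little", signed=True)
    next_ip = insn_addr + 5                  # 1 opcode byte + 4 displacement bytes
    return (next_ip + displacement) & ((1 << bits) - 1)

rel = bytes.fromhex("3061f9ff")              # operand bytes of e83061f9ff

# 64-bit wraparound at offset 4 reproduces the udis86 output.
print(hex(call_rel32_target(0x4, rel, bits=64)))   # 0xfffffffffff96139

# 32-bit wraparound gives the target a 32-bit CPU would actually use.
print(hex(call_rel32_target(0x4, rel)))            # 0xfff96139
```

The same helper can be fed any load address, as long as the width matches the mode the code runs in.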
So what does my CPU actually see and execute? Since this is part of virtualization / emulation code anyway, we can simply add a cc (int3) breakpoint to the block’s prologue and step through it with gdb (omitting some junk):
    Program received signal SIGTRAP, Trace/breakpoint trap.
    (gdb) disas $eip, $eip+5
    => 0x0804b0c1: jmp 0x804b134
    (gdb) si
    (gdb) disas $eip, $eip+10
    Dump of assembler code from 0x804b134 to 0x804b13e:
    => 0x0804b134: addr16 mov 0x3222,%al
       0x0804b138: call 0x7fe126d
       0x0804b13d: inc %eax
    End of assembler dump.
    (gdb) si
    (gdb) si
    (gdb) disas $eip, $eip+10
    Dump of assembler code from 0x7fe126d to 0x7fe1277:
    => 0x07fe126d: Cannot access memory at address 0x7fe126d
So the CPU really sees a call instruction and tries to execute it. In this particular case, this would have been a devastating scenario: a privilege escalation vulnerability that lets arbitrary user input, likely shellcode, break out of the virtualization isolation. For this specific approach to work correctly, all control flow modifying instructions like call must be emulated in software. If, however, we do not see such an instruction in the disassembly, we cannot handle it correctly.
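To make the consequence concrete, here is a toy linear sweep over the bytes in question. It is only a sketch: the length table covers just the handful of opcodes in this stream, the names `insn_length`, `sweep` and `STREAM` are invented for the example, and it is not how libdasm is implemented.

```python
STREAM = bytes.fromhex("67a02232e83061f9ff4038")

def insn_length(code: bytes, pos: int, honor_addr_prefix: bool) -> int:
    """Length of the instruction at pos (toy decoder, 32-bit mode)."""
    start = pos
    addr_size = 4
    if code[pos] == 0x67:                  # address-size override prefix
        if honor_addr_prefix:
            addr_size = 2
        pos += 1
    opcode = code[pos]
    pos += 1
    if opcode == 0xA0:                     # mov al, moffs8
        pos += addr_size
    elif opcode == 0xE8:                   # call rel32
        pos += 4
    elif opcode not in (0x40, 0x61, 0xF9):  # inc eax, popa, stc
        raise ValueError(f"opcode {opcode:#04x} not in toy table")
    return pos - start

def sweep(code: bytes, honor_addr_prefix: bool, max_insns: int = 3):
    """Return the first max_insns instructions as hex strings."""
    pos, out = 0, []
    while pos < len(code) and len(out) < max_insns:
        n = insn_length(code, pos, honor_addr_prefix)
        out.append(code[pos:pos + n].hex())
        pos += n
    return out

print(sweep(STREAM, honor_addr_prefix=False))  # ['67a02232e830', '61', 'f9']
print(sweep(STREAM, honor_addr_prefix=True))   # ['67a02232', 'e83061f9ff', '40']
```

Only the second sweep ever yields an instruction starting with e8, so only there would an emulator get the chance to intercept the call.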
After patching libdasm (which turned out to ignore address size prefixes for operand parsing entirely), the disassembly is correct:
    [*] 543 shellcode candidate offsets
    [~] Verifying shellcode candidate offset 8eb0f0
    008fe0f0[ 67a02232] > mov al,[0x3222]
    008fe0f4[................] <
    Emulating 008fe0f4: call 0x894229
    Emulating CALL instruction from 8fe0f9.
Lessons learned today:
- Fuzzing your software with random input as part of your testing process is always a good idea and, as in this case, can reveal interesting vulnerabilities. Exploiting this particular case would still have been very hard, since the code segment descriptor and the data segment descriptors were pointing to different base addresses, but a skilled attacker could have succeeded nevertheless.
- The public version of libdasm incorrectly disassembles all instructions with an address size override prefix. This will result in interesting attack vectors against some projects using libdasm. Look out for a patch for libdasm!
Different x86 Bytecode Interpretations