How we developed our simple Harbour decompiler

Every once in a while we get a request that leaves us scratching our heads. With these types of requests, existing tools are usually not enough and we have to create our own custom tooling to solve the “problem”. One such request dropped onto our desk at the beginning of 2018, when one of our customers – a financial institution – asked us to analyze a sample. This in itself is nothing unusual – we receive requests like that all the time. But what was unusual about this particular request was that the sample was written in ‘Harbour’. For those of you who don’t know what Harbour is (just like us), you can take a look here, or carry on reading.

Harbour is a programming language originally designed by Antonio Linares that saw its first release in 1999. It’s based on the old Clipper system and is primarily used for the creation of database programs.

Understanding Harbour’s internals

Let us take a “Hello, world” example from the Harbour repository tests – hello.prg.

Figure 1: Harbour version of “Hello, world!”

It prints the “Hello, world!” message to the terminal. For that, you need to build a Harbour binary to execute the code (there are also other ways to run it without building a binary, but we chose this path because the received sample was an executable).

Compiling is as simple as calling:

hbmk2.exe hello.prg

This command will generate C code and compile the C code into the final executable. The generated C code for hello.prg can be found in Figure 2.

Figure 2: Generated C code for hello.prg

You can see that the MAIN function starts the Harbour virtual machine (HVM) function hb_vmExecute with two parameters: the pcode, precompiled Harbour code; and the symbols, which are defined by a different macro above the MAIN function. As you can imagine, the pcode (portable code) contains the instructions that are interpreted by the HVM. You can find the official explanation of the pcode in the link below.

After the C program has been compiled (by MINGW in our case), we find almost the same structures inside it: symbol table and pcode (Figure 3 and Figure 4 respectively).

Figure 3: Harbour symbols table of hello.exe

Figure 3: Harbour symbols table of hello.exe

Figure 4: Harbour pcode of hello.exe

Figure 4: Harbour pcode of hello.exe


Back to our sample. We now know that we need to find the pcodes and symbols in the executable and see which opcodes belong to which instruction. If we do this, we can get a fair grasp of how the program works. As you can probably guess, there were no readily available tools to do this. So, we wrote our own.

The aim of our decompiler is to make the bytecode readable by a human. We chose to mix the resulting pseudocode in assembler with C (it’s still hard to read in some places, but was fine for our purposes).

Figure 5 – The decompiled output of the hello.prg program

Figure 5 also shows that Harbour is a stack-based compiler. The first push argument is the function name, after which the variables are pushed, followed by the call 1 command, where 1 is the number of function parameters to be popped. The above lines can thus be interpreted in C pseudocode as:

We hope this decompiler makes analyzing samples written in Harbour a little bit easier for others as well.

How we developed our simple Harbour decompiler

Your email address will not be published. Required fields are marked *



LuminousMoth APT: Sweeping attacks for the chosen few

We recently came across unusual APT activity that was detected in high volumes, albeit most likely aimed at a few targets of interest. Further analysis revealed that the actor, which we dubbed LuminousMoth, shows an affinity to the HoneyMyte group, otherwise known as Mustang Panda.

WildPressure targets the macOS platform

We found new malware samples used in WildPressure campaigns: newer version of the C++ Milum Trojan, a corresponding VBScript variant with the same version number, and a Python script working on both Windows and macOS.

Ferocious Kitten: 6 years of covert surveillance in Iran

Ferocious Kitten is an APT group that has been targeting Persian-speaking individuals in Iran. Some of the TTPs used by this threat actor are reminiscent of other groups, such as Domestic Kitten and Rampant Kitten. In this report we aim to provide more details on these findings.

Andariel evolves to target South Korea with ransomware

In April 2021, we observed a suspicious Word document with a Korean file name and decoy. It revealed a novel infection scheme and an unfamiliar payload. After a deep analysis, we came to a conclusion: the Andariel group was behind these attacks.

Subscribe to our weekly e-mails

The hottest research right in your inbox