Extracting type information from Go binaries

27 Oct 2021

Ivan Kwiatkowski

During the 2021 edition of the SAS conference, I had the pleasure of delivering a workshop focused on reverse-engineering Go binaries. The goal of the workshop was to share basic knowledge that would allow analysts to immediately start looking into malware written in Go. A YouTube version of the workshop was released around the same time. Of course, the drawback of providing entry-level or immediately actionable information is that a few subtleties must be omitted. One particular topic I brushed aside was related to the way that Go creates objects.

In this screenshot taken from IDA Pro, we can see a call to the runtime.newobject function, which receives a structure as an argument (here, in the RDX register, two lines above the call). The malware presented in the workshop (Sunshuttle, from the DarkHalo APT, MD5 5DB340A70CB5D90601516DB89E629E43) is straightforward to the extent that it can be understood without paying too much attention to these objects. In the videos, I recommend ignoring these calls and instead focusing on documented Golang API functions. With the help of a debugger, it is easy to obtain the arguments and mentally reconstruct the original source code of the application.
However, Go malware following different coding practices could be littered with these kinds of objects, to the point where the reverse engineer has no choice but to understand their nature to figure out what the code is supposed to do. Unfortunately, the contents of the structure passed as an argument to runtime.newobject do not immediately appear to contain useful information:

To find out more about this structure, we need to have a look at the Go source code to find the definition for the rtype structure. At the time of writing, its definition for the latest version of Go is as shown below.

type rtype struct {
    size       uintptr
    ptrdata    uintptr // number of bytes in the type that can contain pointers
    hash       uint32  // hash of type; avoids computation in hash tables
    tflag      tflag   // extra type information flags
    align      uint8   // alignment of variable with this type
    fieldAlign uint8   // alignment of struct field with this type
    kind       uint8   // enumeration for C
    // function for comparing objects of this type
    // (ptr to object A, ptr to object B) -> ==?
    equal     func(unsafe.Pointer, unsafe.Pointer) bool
    gcdata    *byte   // garbage collection data
    str       nameOff // string form
    ptrToThis typeOff // type for pointer to this type, may be zero
}
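To make the layout concrete, here is a minimal Python sketch that decodes such a header from raw bytes. It assumes a 64-bit, little-endian target (so uintptr and pointers are 8 bytes, while nameOff and typeOff are 32-bit integers). The names RTYPE_FORMAT, RType and parse_rtype, as well as the sample values, are hypothetical and only serve the illustration.

```python
import struct
from collections import namedtuple

# Field order mirrors the rtype definition above. Sizes assume a 64-bit,
# little-endian target: uintptr and pointers are 8 bytes, nameOff and
# typeOff are signed 32-bit integers.
RTYPE_FORMAT = "<QQIBBBBQQii"
RType = namedtuple(
    "RType",
    "size ptrdata hash tflag align field_align kind equal gcdata str_off ptr_to_this",
)


def parse_rtype(buf, offset=0):
    """Decode one rtype header from buf, starting at the given offset."""
    return RType._make(struct.unpack_from(RTYPE_FORMAT, buf, offset))


# Hypothetical values, packed here only to exercise the parser.
sample = struct.pack(
    RTYPE_FORMAT, 0x60, 0, 0xDEADBEEF, 0, 8, 8, 25, 0x401000, 0x4A0000, 0x1234, 0
)
rtype = parse_rtype(sample)
print(struct.calcsize(RTYPE_FORMAT), rtype.kind, hex(rtype.str_off))
```

Under these assumptions the header is 48 bytes long. A 32-bit binary would need a different format string, since uintptr and the two pointers shrink to 4 bytes.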

There are two fields in this structure that are relevant to us. The first one is “kind”, which is an enum (defined in the same file) representing a sort of base type for the object: Boolean, integers of various lengths, but also arrays, maps, interfaces, etc. The other is “str”, of type nameOff, which is an offset to a string representation of the described type for the purposes of reflection. The latter is extremely useful to reverse engineers, as it immediately tells us what the object is. This structure can itself be contained in specialized ones for interfaces, maps, and so on.
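As an illustration, the kind byte can be turned into a readable name with a small lookup table. The sketch below follows the order of the Kind constants in Go's reflect package as of Go 1.17 and assumes that only the low five bits carry the kind, the remaining bits being flags used by the runtime. KIND_NAMES and kind_name are hypothetical names, not part of any existing script.

```python
# Kind constants in the order used by Go's reflect package (Go 1.17 era).
KIND_NAMES = [
    "Invalid", "Bool", "Int", "Int8", "Int16", "Int32", "Int64",
    "Uint", "Uint8", "Uint16", "Uint32", "Uint64", "Uintptr",
    "Float32", "Float64", "Complex64", "Complex128",
    "Array", "Chan", "Func", "Interface", "Map", "Ptr", "Slice",
    "String", "Struct", "UnsafePointer",
]

# Only the low five bits encode the kind; higher bits are runtime flags.
KIND_MASK = (1 << 5) - 1


def kind_name(kind_byte):
    """Map the raw kind byte of an rtype to a human-readable name."""
    value = kind_byte & KIND_MASK
    if value < len(KIND_NAMES):
        return KIND_NAMES[value]
    return "Unknown(%d)" % value


print(kind_name(25))         # a struct
print(kind_name(25 | 0x20))  # same kind with a flag bit set
print(kind_name(21))         # a map
```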

Alas, the result of creating these structures in IDA Pro and applying the correct one to the newobject argument is somewhat underwhelming:

Where is our human-readable name? It turns out that the offset provided by nameOff is relative to the .rdata section of the PE in the case of Windows programs – this is something you can confirm with a hex editor.

The offset leads us to another structure, which contains some information about the string, including its size, and finally, the string itself. Initially, the size of the string was stored in a fixed-length field (2 bytes), but that appears to have changed in Go 1.17 (it is now varint-encoded). Nonetheless, the coveted information lies here: the object instantiated in our original newobject call was an md5.digest, which we can now look up in the documentation if needed.
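The lookup described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code of any particular tool: rdata is taken to be the raw contents of the .rdata section, the first byte of the name structure is assumed to hold flag bits, and the 2-byte length is assumed to be big-endian. The helper names and the synthetic buffers are hypothetical.

```python
def read_varint(buf, offset):
    """Return (value, bytes_consumed) for a base-128 varint at offset."""
    value = 0
    shift = 0
    consumed = 0
    while True:
        byte = buf[offset + consumed]
        value |= (byte & 0x7F) << shift
        consumed += 1
        if not byte & 0x80:  # high bit clear: this was the last byte
            return value, consumed
        shift += 7


def read_type_name(rdata, name_off, varint_length):
    """Decode the type name stored name_off bytes into the .rdata contents."""
    # Byte 0 holds flag bits; the length field starts right after it.
    if varint_length:  # Go 1.17 and later
        length, width = read_varint(rdata, name_off + 1)
    else:  # earlier versions: fixed 2-byte length, big-endian
        length = int.from_bytes(rdata[name_off + 1:name_off + 3], "big")
        width = 2
    start = name_off + 1 + width
    return rdata[start:start + length].decode("utf-8", errors="replace")


# Synthetic section contents: 16 bytes of padding, then a name structure.
old_style = b"\x00" * 16 + b"\x00\x00\x0amd5.digest"
new_style = b"\x00" * 16 + b"\x00\x0amd5.digest"

print(read_type_name(old_style, 16, varint_length=False))
print(read_type_name(new_style, 16, varint_length=True))
```

Both calls recover the same name; the only difference is how many bytes the length field occupies before the string starts.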

Go programs may contain hundreds of these calls, and newobject is not the only function that relies on these rtype structures (runtime.makechan and runtime.makemap do too, among others), so it is obviously impractical to manually look up each type using a hex editor. Enter IDA scripting! It is, in fact, possible to entirely automate this operation by writing a few lines of Python.

The script I use in my daily work has been included in SentinelOne’s recently released AlphaGoLang repository, as step 5 of the process. It performs the following actions:

Inspect all the calls to functions such as newobject and look at their arguments to find rtype structures.
Apply the structure shown above to those bytes in IDA to make them easier to read.
Look up the corresponding string representation for the type and add it as a comment wherever it is referred to.

One thing the script struggles with a little is figuring out how the string size is encoded, as I was not able to find an easy way of determining the Go version from a Python script (yet). Should this cause problems, the many comments should allow you to update the script to fit your use case. If you are new to IDA scripting, I would also recommend that you have a look at the source code, as it is a great example of the many things you can do with the Python API! And if you would like to learn even more on the subject with detailed video tutorials, please consider signing up for our online reverse-engineering course on the Xtraining platform.

I hope you find the script useful! Feel free to report any bugs or submit fixes and updates on GitHub!
