Dissecting ELF files


Introduction

You ask me whether your verses are good. And it is me you are asking. Before this, you asked others for their opinion. You sent these verses to journals. You compare them with other poems, and you worry when certain editors reject your attempts. Since you have allowed me to give you some advice, I would ask you to stop all of that. Your gaze is turned outward, and that above all is what you should no longer do. No one can advise you or help you, no one. There is only one way: go into yourself, search for the reason that bids you write; examine whether this reason spreads its roots into the deepest depths of your heart; answer frankly the question of whether you would be condemned to die if you were denied writing. Above all, ask yourself, in the quietest hour of your night: must I write? Dig into yourself in search of a deep answer. And if it should be affirmative, if you are entitled to answer this grave question with a strong and simple "I cannot do otherwise," then build your life according to this necessity.

Rainer Maria Rilke

The Executable and Linkable Format (ELF) [1] is the common binary format on many Unix-like systems.

This article is a hands-on primer on 64-bit ELF. Instead of jumping between readelf, objdump, and documentation, you can explore the same structures in one place: header, program headers, section headers, symbols, and relocations.

Have fun.

ELF header

The format defines separate structures for 32-bit and 64-bit architectures, which is noticeable in the naming convention (e.g. Elf64_Ehdr). The ELF header (Ehdr) is the principal structure and sits at the start of the file.

The architecture (e_machine) we're focusing on in this article is EM_X86_64, and the file types (e_type) that interest us are ET_EXEC (executables) and ET_REL (relocatable objects). Executables provide an entry point (e_entry): the address of the code that runs first.

You might expect it to point to main(), but that'd be wrong. In fact, gcc links against "startup files" that define a function called _start. It calls __libc_start_main, which performs initialization and finally calls main with the famous arguments int argc and char** argv. (You thought they came from the sky?)

_start
 \
  __libc_start_main_impl
   \
    exit(main(argc, argv))
Let's override e_entry with a function called fn. We'll omit the startup objects with -nostartfiles and tell the linker which entry point to use. The function must not "return" like a normal C function: there is no caller to return to, so you should terminate the process explicitly via exit().
#include <stdlib.h>

void fn()
{
    // Perform main
    // logic before
    // terminating.

    exit(EXIT_SUCCESS);
}
gcc -nostartfiles -Wl,-efn entry.c
This primary header also describes where to find the program and section header tables in the file. For the program header table, the file offset is stored in e_phoff; the table contains e_phnum entries, and each entry is e_phentsize bytes long. The section header table follows the same pattern using e_shoff, e_shnum and e_shentsize.

The field e_shstrndx stores the index of the section header that points to the section-name string table (.shstrtab), which is used to resolve section names.
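
As a minimal sketch of how these fields fit together, the function below checks that a buffer starts with a 64-bit ELF header and that both tables it describes fit inside the buffer. The name elf64_header is hypothetical, and the check assumes the buffer is suitably aligned and that offsets are small enough not to overflow.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the ELF header if `buf` starts with a
 * plausible 64-bit ELF header whose program and section header tables
 * fit inside `len` bytes, or NULL otherwise. */
const Elf64_Ehdr *elf64_header(const unsigned char *buf, size_t len)
{
    if (len < sizeof(Elf64_Ehdr))
        return NULL;

    /* Assumes `buf` is aligned well enough for an Elf64_Ehdr. */
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)buf;

    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
        return NULL;
    if (eh->e_ident[EI_CLASS] != ELFCLASS64)
        return NULL;

    /* Each table spans e_*num entries of e_*entsize bytes,
     * starting at file offset e_*off. */
    Elf64_Off ph_end = eh->e_phoff + (Elf64_Off)eh->e_phnum * eh->e_phentsize;
    Elf64_Off sh_end = eh->e_shoff + (Elf64_Off)eh->e_shnum * eh->e_shentsize;
    if (ph_end > len || sh_end > len)
        return NULL;

    return eh;
}
```

Given a file read into memory, a non-NULL result means entry i of the program header table can be found at buf + e_phoff + i * e_phentsize.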

Mental model

To understand the difference between program headers and section headers, it helps to look at the bigger picture. Program headers are required to execute a binary and are used at runtime; section headers are optional and provide symbols and debug information at build time.

If you debug a binary you compiled yourself, it often still contains symbol information. In that case, when you ask gdb to disassemble main, it can locate it by name because the binary includes a symbol table (named .symtab).

Disassembly of the main() function

These symbols make debugging much easier, but they also make it easier for others to understand and reverse-engineer your program. For production releases, you may want to reduce the amount of symbol and debug information in the binary. One common approach is to remove the full symbol table using strip, producing a "stripped" executable.
strip main -o main_stripped
Stripping symbols from a binary

You can see that readelf still prints section names because the section-header string table (.shstrtab) is still present. And while the regular symbol table may be removed, the dynamic symbol table (.dynsym, with its strings in .dynstr) remains because it's needed at runtime to resolve symbols like __libc_start_main, __gmon_start__, and puts.

Let's use radare to patch the ELF file. We'll hide (not remove) the section headers to show they're optional and that the program can run without them. To do that, we'll zero out the three fields e_shoff, e_shnum, and e_shstrndx.

Hiding section headers in an ELF
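
The same patch can be sketched in C instead of radare. This is only an illustration under the assumption that the whole file has already been read into a writable buffer; hide_section_headers is a hypothetical name, not a tool or library function.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zeroes the three ELF header fields that locate
 * the section header table. The table's bytes stay in the file; only
 * the pointers to it are cleared.
 * Returns 0 on success, -1 if `image` is not a 64-bit ELF. */
int hide_section_headers(unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return -1;

    /* Copy the header out so that an unaligned buffer is not a problem. */
    memcpy(&eh, image, sizeof eh);

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    eh.e_shoff = 0;
    eh.e_shnum = 0;
    eh.e_shstrndx = 0;

    memcpy(image, &eh, sizeof eh);
    return 0;
}
```

Writing the patched buffer back to a copy of the file gives a binary whose section headers are hidden but whose program headers are untouched.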

And it runs!

Executable runs even without section headers
"But wait" I hear you say, "didn't we say that .dynsym and .dynstr are important?"
Yes, they're still referenced by a program header of type PT_DYNAMIC.

Note: if you want to remove sections that aren't referenced by the program headers, use the sstrip tool.

Program headers

Thus, we understand that program headers are essential for executing an ELF binary. Each program header describes a segment, loadable or not, and in 64-bit ELF files, the structure is Elf64_Phdr (see elf.h [2]).

The table below helps you explore the program segments. Move between entries using the next and previous controls, and click inspect to open a hex view of the data referenced by the selected header.

Program headers act as instructions for the loader, guiding the operating system as it maps the executable into memory and starts execution. In the following subsections, we'll walk through the different types and explain what each one does.

PT_NULL

The loader ignores PT_NULL entries.

An image of PT_NULL ignored

PT_PHDR

This segment references the program header table itself.

PT_LOAD

Loadable segments are represented as PT_LOAD entries. Each describes where the segment should appear in memory (p_vaddr), its size and alignment (p_memsz, p_align), and its access permissions in p_flags: PF_R (read), PF_W (write) and PF_X (execute).

// Page-align the mapping:
// p_vaddr may start mid-page
// (pages are 4096 bytes
// on x86-64).
size_t page = (size_t)sysconf(_SC_PAGESIZE);
uintptr_t start = phdr->p_vaddr & ~(uintptr_t)(page - 1);
uintptr_t end = (phdr->p_vaddr + phdr->p_memsz + page - 1) & ~(uintptr_t)(page - 1);
size_t size = end - start;

mmap(
    // The page-aligned virtual
    // address at which the
    // segment starts.
    (void *)start,
    size,
    // The protections of
    // the page, at first
    // data needs to be
    // copied.
    PROT_READ | PROT_WRITE,
    // Anonymous memory is
    // zero-filled and needs
    // MAP_PRIVATE or MAP_SHARED.
    MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS,
    -1,
    0);

At load time, the loader maps the segment and copies p_filesz bytes from the file starting at p_offset into memory. If p_memsz is larger than p_filesz, the remaining bytes are zero-filled.
// Seek to the start of
// the segment's data
// in the file.
fseek(fp, (long)phdr->p_offset, SEEK_SET);

fread(
    // Read directly to
    // the mmap()'d
    // region.
    (void *)phdr->p_vaddr,
    // The number of
    // bytes to read
    // from the file.
    phdr->p_filesz, 1,
    // The file pointer
    // to the ELF.
    fp);

int prot = 0;
prot |= (phdr->p_flags & PF_X) ? PROT_EXEC : 0;
prot |= (phdr->p_flags & PF_R) ? PROT_READ : 0;
prot |= (phdr->p_flags & PF_W) ? PROT_WRITE : 0;

// The protection is
// applied after
// reading data, on the
// same page-aligned range.
mprotect((void *)start, size, prot);

PT_INTERP

This executable can't really run in isolation. Due to the complexity of the structures involved, it relies on a dynamic linker/loader, which maps the executable and its shared-library dependencies into memory, resolves external symbols, and applies relocations before transferring control to the program's entry point.

The kernel determines which dynamic linker to invoke from the PT_INTERP program header, which contains the filesystem path to the loader. The kernel maps the loader and transfers control to it; in glibc, its startup routine is _dl_start().
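
To make the lookup concrete, here is a minimal sketch that scans the program header table of an ELF image held in memory and returns the path stored in PT_INTERP. The function name elf64_interp is hypothetical, and the code is an illustration of the idea rather than what the kernel actually does.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the interpreter path stored in the
 * PT_INTERP segment of a 64-bit ELF image, or NULL if there is none
 * or the image looks malformed. */
const char *elf64_interp(const unsigned char *image, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return NULL;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = eh.e_phoff + i * eh.e_phentsize;

        if (off > len || len - off < sizeof ph)
            return NULL;
        memcpy(&ph, image + off, sizeof ph);

        if (ph.p_type != PT_INTERP)
            continue;

        /* The path must lie inside the file and be NUL-terminated. */
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            len - ph.p_offset < ph.p_filesz)
            return NULL;
        if (image[ph.p_offset + ph.p_filesz - 1] != '\0')
            return NULL;

        return (const char *)(image + ph.p_offset);
    }
    return NULL;
}
```

A statically linked executable has no PT_INTERP entry, so the function returns NULL for it.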

PT_DYNAMIC

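This segment provides all information required for dynamic linking at runtime. It's an array of Elf64_Dyn entries, each starting with a tag (d_tag) that tells what its value holds. The tags include DT_NEEDED (a required shared library), DT_STRTAB and DT_SYMTAB (the dynamic string and symbol tables), DT_RELA (the dynamic relocations) and DT_JMPREL (the PLT relocations).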

Even when section headers are stripped, PT_DYNAMIC provides everything needed for loading dependencies, resolving symbols and applying relocations.
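
Walking that array is straightforward, because it ends with a DT_NULL entry. The two helpers below are a sketch with hypothetical names (dyn_find, dyn_count); they only assume a pointer to the first Elf64_Dyn entry.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: returns the first entry carrying `tag`,
 * or NULL if the DT_NULL terminator is reached first. */
const Elf64_Dyn *dyn_find(const Elf64_Dyn *dyn, Elf64_Sxword tag)
{
    for (; dyn->d_tag != DT_NULL; dyn++)
        if (dyn->d_tag == tag)
            return dyn;
    return NULL;
}

/* Hypothetical helper: counts the entries carrying `tag`,
 * e.g. DT_NEEDED to count direct library dependencies. */
size_t dyn_count(const Elf64_Dyn *dyn, Elf64_Sxword tag)
{
    size_t n = 0;

    for (; dyn->d_tag != DT_NULL; dyn++)
        if (dyn->d_tag == tag)
            n++;
    return n;
}
```

In a dynamically linked program built against glibc, link.h declares the running executable's own array as _DYNAMIC, so dyn_count(_DYNAMIC, DT_NEEDED) should report how many libraries it names directly.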

PT_GNU_STACK

PT_GNU_STACK is a GNU extension program header whose p_flags tell the loader what memory protections to use for the stack. In the past, attackers could place injected code on the stack (via a local buffer overflow or crafted environment variables) and redirect control flow to it. Modern defenses, most notably non-executable (NX) stacks together with ASLR, killed classic "stack shellcode" attacks.

case PT_GNU_STACK:
    stack_flags = ph->p_flags;
    break;
gcc -Wl,-z,execstack main.c # Enable executable stack.

PT_GNU_RELRO

Because of defenses like stack canaries, attackers had to find other routes to control the instruction pointer. Their attention turned to overwriting GOT entries (to redirect indirect calls) or function pointers stored in .fini_array (destructors executed at exit). PT_GNU_RELRO counters this by describing a region that the dynamic linker makes read-only once relocations have been applied.

case PT_GNU_RELRO:
    l->l_relro_addr = ph->p_vaddr;
    l->l_relro_size = ph->p_memsz;
    break;
Partial RELRO (enabled by default) protects sensitive sections by making them read-only, while leaving part of .got.plt writable to support lazy binding. In this example, that part is the last sizeof(void*) = 8 bytes, the slot for puts().

Sections affected by partial RELRO

For Full RELRO, all symbols are resolved at startup and the entire GOT becomes read-only.
gcc -Wl,-z,relro,-z,now main.c # Enable Full RELRO.
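
One way to tell the two modes apart from the outside is a heuristic similar to what hardening checkers use: a PT_GNU_RELRO program header means at least partial RELRO, and a bind-now marker in the dynamic array upgrades it to full. The sketch below assumes that heuristic; relro_level and the enum are hypothetical names.

```c
#include <elf.h>
#include <stddef.h>

enum relro { RELRO_NONE, RELRO_PARTIAL, RELRO_FULL };

/* Hypothetical helper: classifies a binary from its program headers
 * and its dynamic array (which may be NULL for a static binary). */
enum relro relro_level(const Elf64_Phdr *ph, size_t phnum,
                       const Elf64_Dyn *dyn)
{
    int has_relro = 0;
    int bind_now = 0;

    for (size_t i = 0; i < phnum; i++)
        if (ph[i].p_type == PT_GNU_RELRO)
            has_relro = 1;

    /* -z now is recorded as DT_BIND_NOW, or as a flag bit in
     * DT_FLAGS / DT_FLAGS_1, depending on the linker. */
    for (; dyn != NULL && dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_BIND_NOW)
            bind_now = 1;
        if (dyn->d_tag == DT_FLAGS && (dyn->d_un.d_val & DF_BIND_NOW))
            bind_now = 1;
        if (dyn->d_tag == DT_FLAGS_1 && (dyn->d_un.d_val & DF_1_NOW))
            bind_now = 1;
    }

    if (!has_relro)
        return RELRO_NONE;
    return bind_now ? RELRO_FULL : RELRO_PARTIAL;
}
```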

Section headers

Section headers are optional metadata used mainly for linking, symbols, and debugging. Each entry in the section header table is an Elf64_Shdr. The sections marked as SHF_ALLOC fall within ranges that are mapped by loadable segments.
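
The relationship between SHF_ALLOC sections and loadable segments can be checked with simple address arithmetic. The function below is a sketch with a hypothetical name (section_in_segment); it compares virtual address ranges only.

```c
#include <elf.h>

/* Hypothetical helper: returns 1 if an SHF_ALLOC section lies entirely
 * within the memory range of a PT_LOAD segment, 0 otherwise. */
int section_in_segment(const Elf64_Shdr *sh, const Elf64_Phdr *ph)
{
    if (!(sh->sh_flags & SHF_ALLOC) || ph->p_type != PT_LOAD)
        return 0;

    return sh->sh_addr >= ph->p_vaddr &&
           sh->sh_addr + sh->sh_size <= ph->p_vaddr + ph->p_memsz;
}
```

Running it over every section/segment pair reproduces the kind of section-to-segment mapping that readelf prints.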

The section header field sh_name is an offset into the section-name string table (.shstrtab), which is typically not mapped at runtime (it lacks SHF_ALLOC). The ELF header field e_shstrndx stores the index of .shstrtab in the section header table.

When you navigate to a SHT_DYNSYM or SHT_SYMTAB section, the symbol table view updates to show its entries. When you navigate to a SHT_RELA section, the relocation table view updates instead. This lets you inspect each section's contents directly as you browse.

String tables

The SHT_STRTAB sections are string tables: sequences of null-terminated strings, each addressed by its byte offset. .shstrtab contains section names, .strtab names the symbols in .symtab (often stripped), and .dynstr names the symbols in .dynsym (used at runtime by the dynamic linker).
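
Looking up a name is therefore just pointer arithmetic plus a bounds check. A minimal sketch, where strtab_get is a hypothetical name and the table is assumed to be already in memory:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the string that starts at `offset` in a
 * string table of `size` bytes, or NULL if the offset is out of range
 * or the string is not terminated inside the table. */
const char *strtab_get(const char *strtab, size_t size, size_t offset)
{
    if (offset >= size)
        return NULL;
    if (memchr(strtab + offset, '\0', size - offset) == NULL)
        return NULL;
    return strtab + offset;
}
```

For a section header sh, the name would be strtab_get(shstrtab, shstrtab_size, sh->sh_name). Because an offset may point into the middle of a string, two names can share a suffix in the table.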

Symbol tables

A symbol table is an array of symbol entries (Elf64_Sym). Each one stores its name as an offset in st_name. That offset is resolved in the string table linked by the symbol table section header (sh_link).


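The st_info field is a single byte that packs two values: the symbol's binding and type. elf.h defines the ELF64_ST_* macros as aliases of the ELF32_ST_* ones shown here, because both classes use the same one-byte encoding.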
#define ELF32_ST_BIND(val) (((unsigned char) (val)) >> 4)
#define ELF32_ST_TYPE(val) ((val) & 0xf)
#define ELF32_ST_INFO(bind, type) (((bind) << 4) + ((type) & 0xf))
The symbol type indicates what it represents. The binding determines the symbol's linkage. If st_shndx is SHN_UNDEF, the symbol is undefined here and must be resolved externally (an import). Otherwise, st_shndx points to the defining section and st_value gives its address or section-relative offset.
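
As a small sketch of decoding these fields, the helpers below turn st_info into readable names and flag imports. The function names are hypothetical; the constants and macros come from elf.h.

```c
#include <elf.h>

/* Hypothetical helper: readable name for the binding in st_info. */
const char *sym_bind_name(unsigned char info)
{
    switch (ELF64_ST_BIND(info)) {
    case STB_LOCAL:  return "LOCAL";
    case STB_GLOBAL: return "GLOBAL";
    case STB_WEAK:   return "WEAK";
    default:         return "OTHER";
    }
}

/* Hypothetical helper: readable name for the type in st_info. */
const char *sym_type_name(unsigned char info)
{
    switch (ELF64_ST_TYPE(info)) {
    case STT_NOTYPE:  return "NOTYPE";
    case STT_OBJECT:  return "OBJECT";
    case STT_FUNC:    return "FUNC";
    case STT_SECTION: return "SECTION";
    case STT_FILE:    return "FILE";
    default:          return "OTHER";
    }
}

/* An import is an undefined symbol with non-local binding. The binding
 * check excludes the reserved all-zero symbol at index 0. */
int sym_is_import(const Elf64_Sym *sym)
{
    return sym->st_shndx == SHN_UNDEF &&
           ELF64_ST_BIND(sym->st_info) != STB_LOCAL;
}
```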

Relocations

Relocation sections are either SHT_REL or SHT_RELA; the two are similar, but SHT_RELA entries carry an extra field called r_addend. Before reaching main, the loader performs a calculation based on the relocation type and stores the result at the address given by r_offset, for each entry in .rela.dyn.

#define ELF64_R_SYM(i) ((i) >> 32)
#define ELF64_R_TYPE(i) ((i) & 0xffffffff)
#define ELF64_R_INFO(sym,type) ((((Elf64_Xword) (sym)) << 32) + (type))
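Relocation types determine the formula that the loader uses to calculate the value to place at the relocation site (r_offset). The variables used in these formulas are:

S    the value of the symbol referenced by the relocation
A    the addend (r_addend)
P    the address of the place being relocated (derived from r_offset)
B    the base address at which the object is loaded
G    the offset of the symbol's entry in the GOT
GOT  the address of the GOT

R_X86_64_64         S + A
R_X86_64_PC32       S + A - P
R_X86_64_GLOB_DAT   S
R_X86_64_JUMP_SLOT  S
R_X86_64_RELATIVE   B + A
R_X86_64_GOTPCRELX  G + GOT + A - P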

Because this file is a fully linked executable (ET_EXEC), the relocation sections you see (.rela.dyn and .rela.plt) contain dynamic relocations. Their sh_link field points to the dynamic symbol table (SHT_DYNSYM). These entries are intended for the runtime loader, and r_offset is the virtual address of the location to patch.
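
To see how a relocation entry turns into a value, here is a sketch that evaluates a few of these formulas as the x86-64 psABI defines them, with R_X86_64_RELATIVE computed as B + A (base address plus addend). The function reloc_value and its parameter names are hypothetical; a real loader also handles field widths, symbol lookup and many more types.

```c
#include <elf.h>
#include <stdint.h>

/* Hypothetical helper: computes the value for one relocation.
 *   S: symbol value, A: addend, P: address being patched,
 *   B: load base address.
 * Returns 0 and stores the result in *out, or -1 for types this
 * sketch does not cover. */
int reloc_value(uint32_t type, uint64_t S, int64_t A, uint64_t P,
                uint64_t B, uint64_t *out)
{
    switch (type) {
    case R_X86_64_64:
        *out = S + (uint64_t)A;
        return 0;
    case R_X86_64_PC32:
        /* Only the low 32 bits are stored at the relocation site. */
        *out = S + (uint64_t)A - P;
        return 0;
    case R_X86_64_GLOB_DAT:
    case R_X86_64_JUMP_SLOT:
        *out = S;
        return 0;
    case R_X86_64_RELATIVE:
        *out = B + (uint64_t)A;
        return 0;
    default:
        return -1;
    }
}
```

The type and symbol index passed in would come from ELF64_R_TYPE(r_info) and ELF64_R_SYM(r_info) of an Elf64_Rela entry.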

Lazy binding

A program can use many external functions, and it'd be expensive to resolve all symbols at once during the loading process, since the code that uses some of them might never be reached.

Lazy binding exists to optimize this process. It relies on the PLT (Procedure Linkage Table), which we'll talk about shortly, together with the relocation entries, the link_map, and a function that performs symbol resolution on request (_dl_runtime_resolve).

The link_map structure holds information about loaded files. It starts from the main executable and, through doubly linked list pointers, references the libraries it depends on (the dynamic linker, VDSO and libc) as well as any that are force-loaded using dlopen().

Most importantly, it has a field named l_ld that references the PT_DYNAMIC segment, which holds all the structures needed at runtime, including string tables, dynamic symbols and relocations. Other fields are l_addr for the base address and l_name for the path of the file, but most of the remaining ones are internal and aren't exposed in the link.h file.

void* handle;
struct link_map* link_map;

// Get a handle
// to current
// executable.
handle = dlopen(NULL, RTLD_LAZY);

// Get its
// link_map.
dlinfo(handle, RTLD_DI_LINKMAP, &link_map);

// Close
// handle.
dlclose(handle);

GOT

The resolution process happens once, and the result must be stored for subsequent calls; this is the role of the Global Offset Table (GOT). For example, a puts("Hello world") requires that puts be resolved upon the first call and that its address be stored in a predefined entry in the GOT.

The GOT is a table of addresses used by position-independent code to access global data and external functions. It introduces two sections. The first, .got, holds symbols resolved before main() is called (e.g. __libc_start_main_impl, __gmon_start__). The second, .got.plt, comes with a header and reserved slots for symbols to be resolved upon their first call.

The linker stores the address of .dynamic in the first header slot. The other two are filled in at startup by the dynamic linker, under _dl_start(), specifically in the inlined function elf_machine_runtime_setup().

.got.plt[0] = .dynamic
.got.plt[1] = struct link_map*
.got.plt[2] = _dl_runtime_resolve_xsave

PLT

The Procedure Linkage Table (PLT) holds call stubs for external functions in a section named .plt. The first instruction of a stub jumps to the address stored in its GOT entry. Initially, that entry points to the instruction right after the jmp: a push $n that pushes the relocation index, followed by a jump to PLT0.

PLT stubs

PLT0 pushes the struct link_map* (from .got.plt[1]) and transfers execution to the resolver _dl_runtime_resolve_xsave (from .got.plt[2]). The resolver looks up the entry in .rela.plt (DT_JMPREL) and performs that relocation at runtime.

puts resolves to PLT0 initially
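
The whole mechanism can be modeled in a few lines of plain C with a function pointer standing in for the GOT slot. This is a toy model for intuition only, not how the dynamic linker is implemented; every name in it (got_slot, plt_stub, real_function, call_via_plt, resolve_count) is made up for the illustration.

```c
#include <stddef.h>

typedef int (*func_t)(int);

/* Stands in for the external function (puts in the article). */
static int real_function(int x)
{
    return 2 * x;
}

/* Counts how many times resolution happened. */
int resolve_count = 0;

static int plt_stub(int x);

/* The "GOT slot": initially it points back into the stub,
 * just like .got.plt points at the push after the jmp. */
func_t got_slot = plt_stub;

/* The "PLT stub + resolver": look the function up, patch the
 * slot, then continue into the resolved function. */
static int plt_stub(int x)
{
    resolve_count++;
    got_slot = real_function;
    return got_slot(x);
}

/* Every call site goes through the slot. */
int call_via_plt(int x)
{
    return got_slot(x);
}
```

The first call_via_plt() goes through the stub and patches the slot; every later call jumps straight to the target, so resolve_count stays at 1.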

Lazy binding is simple to disable with -fno-plt: no PLT stub remains, and the call points directly to puts@GLIBC_2.2.5 instead of puts@plt.
gcc main.c -fno-plt