Relocations, Relocations


June 24, 2012

So, what’s a relocation?

The man entry for elf contains the following:

Relocation is the process of connecting symbolic references with symbolic definitions.

While there are a few different types, the ones I’m interested in here are the runtime variety which enable the compilation of position independent code. In order to use absolute addressing, all the code – the application and all its DSOs – would have to be loaded into virtual memory locations known at compile time. Now while the linker can specify the virtual memory load-address for the executable, it cannot hope to do so for all the DSOs, since many of them will undoubtedly have been compiled separately (e.g. glibc). Even if that were possible, there’s still ASLR to consider – and in some cases even the executable is compiled as position-independent to enable certain security features. There’s therefore no practical alternative to indirect or PC-relative addressing, and in this post I’m going to focus on how to code for them.

An ELF Virtual Memory Image

To discuss how relocations work, it really helps to understand how an ELF binary is laid out in memory. The following graphic was constructed (not to scale) from the virtual memory addresses given by the proc file-system file /proc/self/maps. There are security constraints on which users can read this file, so the usual example given during these sorts of discussions is as follows:

$ cat /proc/self/maps

where self is an OS-supplied synonym for the process’ own PID. I did something a little different, which I’ll get to later on if it’s helpful: the process image below is not for cat but for my own application, which relies on a simple DSO and reads out the contents of its own /proc/self/maps file. That data forms the basis for the graphic.

I suppose the first thing to mention is that I’ve drawn it the other way up from the way /proc/self/maps is displayed. The contents of that file appear below, which to avoid confusion I’ve sorted so that it too has higher memory addresses at the top:

start            end              perm offset  inode   image
ffffffffff600000-ffffffffff601000 r-xp       0       0 [vsyscall]
    7fff103ff000-    7fff10400000 r-xp       0       0 [vdso]
    7fff102cb000-    7fff102e0000 rw-p       0       0 [stack]
    7fda06220000-    7fda06221000 rw-p       0       0
    7fda0621f000-    7fda06220000 rw-p   20000  262258 /lib/
    7fda0621e000-    7fda0621f000 r--p   1f000  262258 /lib/
    7fda0621a000-    7fda0621e000 rw-p       0       0
    7fda06202000-    7fda06205000 rw-p       0       0
    7fda05fff000-    7fda0601f000 r-xp       0  262258 /lib/
    7fda05ffe000-    7fda05fff000 rw-p    1000    1156 /home/michael/lib/
    7fda05ffd000-    7fda05ffe000 r--p       0    1156 /home/michael/lib/
    7fda05dfe000-    7fda05ffd000 ---p    1000    1156 /home/michael/lib/
    7fda05dfd000-    7fda05dfe000 r-xp       0    1156 /home/michael/lib/
    7fda05df8000-    7fda05dfd000 rw-p       0       0
    7fda05df7000-    7fda05df8000 rw-p  17d000  262282 /lib/
    7fda05df3000-    7fda05df7000 r--p  179000  262282 /lib/
    7fda05bf4000-    7fda05df3000 ---p  17a000  262282 /lib/
    7fda05a7a000-    7fda05bf4000 r-xp       0  262282 /lib/
         18f1000-         1912000 rw-p       0       0 [heap]
          601000-          602000 rw-p    1000     664 /home/michael/appself
          600000-          601000 r--p       0     664 /home/michael/appself
          400000-          401000 r-xp       0     664 /home/michael/appself

So, what are we looking at here? The first two columns give the virtual memory start and end addresses for that section of the application image. The third column describes the permissions for the section, and it should be obvious which ones are read-only, read/write or have the execute flag set (it’s worth noting for now that there are two types of read/write data sections: initialised and uninitialised). The p flag denotes “private”, copy-on-write data[1]. The “offset” column describes the offset into the binary image at which the section starts. I’ve omitted the “device” column from the print-out, which would otherwise have appeared between the “offset” and “inode” columns. The “offset” value can be mapped onto the ELF binary by consulting the program and section headers using readelf[2].
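If you’d rather pull those columns apart programmatically, here is a minimal Python sketch. The names parse_maps_line and sorted_high_to_low are my own for illustration, and the sample lines are adapted from the listing above with a placeholder 00:00 put back in the device column I omitted. It assumes the usual "start-end perms offset dev inode pathname" layout.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: str


def parse_maps_line(line: str) -> Mapping:
    """Split one /proc/<pid>/maps line into its columns."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    # Anonymous mappings have no sixth (pathname) column
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(start, end, fields[1], int(fields[2], 16),
                   fields[3], int(fields[4]), path)


def sorted_high_to_low(lines):
    """Parse the lines and put the highest addresses first, as in the post."""
    parsed = (parse_maps_line(l) for l in lines if l.strip())
    return sorted(parsed, key=lambda m: m.start, reverse=True)


# Sample data: device numbers are placeholders, not real values
SAMPLE = [
    "00400000-00401000 r-xp 00000000 00:00 664 /home/michael/appself",
    "018f1000-01912000 rw-p 00000000 00:00 0 [heap]",
    "7fff102cb000-7fff102e0000 rw-p 00000000 00:00 0 [stack]",
]

if __name__ == "__main__":
    for m in sorted_high_to_low(SAMPLE):
        print(f"{m.start:16x}-{m.end:16x} {m.perms} {m.offset:8x} {m.inode:7d} {m.path}")
```

On a Linux box you could feed it open("/proc/self/maps") instead of SAMPLE, in which case you would of course be looking at the Python interpreter’s own image.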

A DSO Virtual Memory Image

To complete the picture, I’ve created an accurately marked-up version from hexdump, although again, the higher addresses are at the top. This image is of the DSO. Note that it shows the two loadable segments, and the permissions associated with each address:

PC-Relative Addressing

You may have noticed that the .rodata section is at a constant offset from the .text section and that consequently you should be able to address its data using a PC-relative offset. You would be right, but PC-relative addressing can be applied to any symbols in the library, not just those in the same loadable segment. The precondition when writing 64-bit assembly is that the symbol must be “local” to the library (i.e. not global), otherwise the linker will complain, a bit like this:

/usr/bin/ld: libpcrel.o: relocation R_X86_64_PC32 against symbol `Lhello' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value

There are ways to subvert this sanity check, of course, the most obvious being that you could have two symbols at the same address with only one of them being declared “global”. However, this might break the C-standard’s symbol-override mechanism – imagine if the data was read/write and a R_X86_64_COPY relocation was applied. You would end up addressing a different value to the rest of the application. Anyway, here’s the example code:

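.section .rodata
Lhello:                                  # local (non-global) label for the string
        .asciz "Hello!"

.section .text
        .globl someRelocations
        .type someRelocations, STT_FUNC
someRelocations:
        leaq Lhello(%rip), %rdi          # PC-relative address of Lhello
        call puts@PLT
        ret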

We can see using objdump how that piece of addressing has been encoded, namely, as 0x53 bytes from the start address of the next instruction:

$ objdump -dj .text

00000000000005cc <someRelocations>:
 5cc: 48 8d 3d 53 00 00 00  lea    0x53(%rip),%rdi        # 626 <Lhello>
 5d3: e8 00 ff ff ff        callq  4d8 <puts@plt>

We can use hexdump this time to verify the contents at that location (which is 0x5d3 + 0x53 = 0x626):

$ hexdump -vCs 0x626 -n 7

00000626  48 65 6c 6c 6f 21 00                              |Hello!.|
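The same sum can be done mechanically from the instruction bytes. This is only a sketch: rip_relative_target is a name I’ve made up, and it assumes the 32-bit displacement occupies the last four bytes of the instruction, which holds for the two encodings in the objdump output above but not for every x86-64 instruction.

```python
def rip_relative_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the address referenced by a RIP-relative instruction.

    Assumes the signed 32-bit displacement is the final four bytes.
    """
    disp = int.from_bytes(insn_bytes[-4:], "little", signed=True)
    next_insn = insn_addr + len(insn_bytes)   # %rip points at the next instruction
    return next_insn + disp


LEA_LHELLO = bytes.fromhex("488d3d53000000")  # lea 0x53(%rip),%rdi at 0x5cc
CALL_PUTS = bytes.fromhex("e800ffffff")       # callq puts@plt at 0x5d3

if __name__ == "__main__":
    print(hex(rip_relative_target(0x5cc, LEA_LHELLO)))  # address of Lhello
    print(hex(rip_relative_target(0x5d3, CALL_PUTS)))   # address of puts@plt
```

The call works the same way as the lea, except that its displacement is negative because the PLT sits below the function in the image.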

The following graphic shows what this means in terms of the DSO image. The load address of the executable segment in this case was 0x7feff54b3000:

The Global Offset Table and Procedure Linkage Table

In a gross generalisation, I think there are two basic types of indirect addressing: by PC-relative offset (i.e. by an offset from the current address of the instruction pointer), or through a pointer. There are two ELF constructs which assist in locating the runtime addresses of symbols (by which I mean function addresses as well): the Global Offset Table (GOT) and the Procedure Linkage Table (PLT). I’m sure you’ve already heard of them.

So what is the GOT? Where is it? The GOT is comprised of two sections, .got and .got.plt, which are in the second loadable segment. Their addresses are as follows:

$ readelf -SW
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [21] .got              PROGBITS        0000000000200fc8 000fc8 000020 08  WA  0   0  8
  [22] .got.plt          PROGBITS        0000000000200fe8 000fe8 000078 08  WA  0   0  8

The two sections contain address data for locating symbols at runtime. They act as a look-up table, and as a layer of indirection between code attempting to reference a symbol and the runtime location of that symbol. The values stored in the DSO’s .got section are adjusted as part of the dynamic loader’s initialisation process. Only some of the values in the .got.plt section are initialised at start-time, with the rest being initialised on-demand. There are some other facets of the initialisation process which I don’t want to get into at the moment, including how symbol definitions can be overridden by values in other libraries or how “common” symbols can be copied to the data section of the executable.

A symbol less important to 64-bit code is _GLOBAL_OFFSET_TABLE_, which for this library is the same as the start-address of the .got.plt section. It can usually be discovered using readelf -s (unless the symbol table has been stripped). In this case it’s still in .symtab:

$ readelf -sW
Symbol table '.symtab' contains 74 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    49: 0000000000200fe8     0 OBJECT  LOCAL  HIDDEN  ABS _GLOBAL_OFFSET_TABLE_

I’m sure that anyone who has written any x86 code on Linux will at some point have come across the strange coding idiom which calculates the current relative offset from _GLOBAL_OFFSET_TABLE_. I’m talking about the following:

 call .label
.label:
 popl %ebx
 addl $_GLOBAL_OFFSET_TABLE_+[.-.label], %ebx
 leal symbol@GOTOFF(%ebx), %eax

You may also have seen this variant, which is an equivalent shorthand for the above, and is described as more instruction-pipeline friendly[3]:

call __i686.get_pc_thunk.bx

While almost all position-independent 32-bit code will reference _GLOBAL_OFFSET_TABLE_ at some point, 64-bit code can use other relocation types to address values in the GOT, so you typically won’t see the symbol being used in code as frequently.

The address of _GLOBAL_OFFSET_TABLE_ is within the read-only segment, in this case at offset 0xfe8, or in other words -0x18 bytes from the end of the read-only segment and the start of the read/write segment. This is interesting, since certain elements of the GOT need to be fixed-up at run-time (i.e. those addresses within the GOT which participate in PLT calculations). So what’s going on? Let’s have another look at the runtime GOT image, and please note that the higher addresses are at the top, the opposite way round to hexdump:

Note that the address of the _GLOBAL_OFFSET_TABLE_ symbol may be located anywhere within the GOT, and that offsets from it may therefore be negative. I wonder whether all incarnations of gcc compile _GLOBAL_OFFSET_TABLE_ to be the same as the start-address of .got.plt. The first three quad-words of the .got.plt section are special and reserved, and act as part of the “trampoline” function in lazy function-linking, which leaves me in no doubt as to the reason why they’re designated as read-only. The operation of the PLT is one of the better-documented areas of an ELF binary, so I won’t repeat it all here. Suffice it to say that on a lazy-linked function’s initial invocation, the instruction pointer jumps using an address in .got.plt to the dynamic linker, which changes the contents at that .got.plt address to be the real address of the function and then jumps again to that target function (all the jumping is important, as the target function’s eventual ret instruction will still find the original return address on the stack). Subsequent invocations no longer detour via the dynamic linker but jump from the PLT directly to the target function. I’ve gone into a bit more detail about this in the footnotes[4] (some of which look more like mini-epics).
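For those who prefer code to prose, here is a deliberately toy model of that lazy-linking dance in Python. Nothing in it is real loader code: ToyProcess, call_via_plt and the dictionary standing in for .got.plt are all inventions of mine, and Python’s len stands in for the "real" puts. It only shows the shape of the mechanism, namely that the slot is patched on the first call and the resolver is never visited again.

```python
class ToyProcess:
    """A toy model of lazy binding: one patchable slot per function."""

    def __init__(self, real_functions):
        self.real = real_functions
        self.resolver_calls = 0
        # Every slot initially leads to the resolver rather than the function
        self.got_plt = {name: self._make_stub(name) for name in real_functions}

    def _make_stub(self, name):
        def stub(*args):
            # First call only: play the part of the dynamic linker
            self.resolver_calls += 1
            target = self.real[name]
            self.got_plt[name] = target   # patch the slot with the real address
            return target(*args)          # then carry on into the real function
        return stub

    def call_via_plt(self, name, *args):
        # The PLT entry is just an indirect jump through the slot
        return self.got_plt[name](*args)


if __name__ == "__main__":
    proc = ToyProcess({"puts": len})
    proc.call_via_plt("puts", "Hello!")
    proc.call_via_plt("puts", "Hello again!")
    print("resolver visits:", proc.resolver_calls)
```

The real thing differs in plenty of ways (the resolver finds the symbol by relocation index, and it jumps rather than calls so the return address is untouched), but the patch-once behaviour is the part worth remembering.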

Relocation Types

Yes, you’re right, this has been a huge amount of preamble. There are several different types of relocation which can be made. A couple of the more common ones would, for example, be written in AT&T/gas syntax as follows:

movq stderr@GOTPCREL(%rip), %rax
call printf@PLT

However, there are several other types, and it was a frustrating lack of documentation on the GNU assembler pages about how to write these that led me to do this investigation in the first place. The AMD64 ABI Draft 0.99.5 – September 3, 2010[5] provides the following example of direct and indirect function calls in the large code model (which you’ll most likely not need, unless your codebase is a few exabytes in size, but it’s interesting to see how it hangs together):

.section .data
Lptr:   .quad 0                              # static void (*ptr) (void);
        .globl foo                           # extern void foo (void);

.section .text
        .type Lbar, STT_FUNC
Lbar:                                        # static void bar (void);
        movabs    $foo@GOT,%r11              # foo();     (R_X86_64_GOTPLT64)
        call      *(%r11,%r15)
        movabs    $Lbar@GOTOFF,%r11          # bar();     (R_X86_64_GOTOFF64)
        leaq      (%r11,%r15),%r11
        call      *%r11

        movabs    $Lptr@GOTOFF,%rax          # ptr = foo; (R_X86_64_GOTOFF64)
        movabs    $foo@PLTOFF,%r11           #            (R_X86_64_PLTOFF64)
        leaq      (%r11,%r15),%r11
        movq      %r11,(%rax,%r15)
        movabs    $Lbar@GOTOFF,%r11          # ptr = bar; (R_X86_64_GOTOFF64)
        leaq      (%r11,%r15),%r11
        movq      %r11,(%rax,%r15)

        movabs    $Lptr@GOTOFF,%r11          # (*ptr)();  (R_X86_64_GOTOFF64)
        call      *(%r11,%r15)

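This example code unsurprisingly contains solely 64-bit relocations, although note that the assembler emits R_X86_64_GOT64 for foo@GOT rather than the R_X86_64_GOTPLT64 annotated in the ABI’s listing: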

$ readelf -r abi64eg.o

Relocation section '.rela.text' at offset 0x3b8 contains 6 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000002  00060000001b R_X86_64_GOT64    0000000000000000 foo + 0
000000000010  000500000019 R_X86_64_GOTOFF64 0000000000000000 Lbar + 0
000000000021  000400000019 R_X86_64_GOTOFF64 0000000000000000 Lptr + 0
00000000002b  00060000001f R_X86_64_PLTOFF64 0000000000000000 foo + 0
00000000003d  000500000019 R_X86_64_GOTOFF64 0000000000000000 Lbar + 0
00000000004f  000400000019 R_X86_64_GOTOFF64 0000000000000000 Lptr + 0

Anyway, let’s have a look at what’s on offer. The following types are listed in the AMD64 PDF, and I’ve taken the list which appears on page 35 in section 3.5.2 and added some detail.


The AMD64 ABI describes the relocation given by name@GOT as follows:

name@GOT: specifies the offset to the GOT entry for the symbol name from the base of the GOT.

Since the GNU assembler is (embarrassingly) short of docs regarding symbol modifiers for x86 assembly, here’s what the YASM docs say about @GOT relocations:

As in elf32, referring to an external or global symbol using wrt causes the linker to build an entry in the GOT containing the address of the symbol, and the reference gives the distance from the beginning of the GOT to the entry; so you can add on the address of the GOT, load from the resulting address, and end up with the address of the symbol.

Succinct, to the point, all we need really. Let’s test-drive that:

.section .rodata
Lhello:
        .asciz "Hello!"

.section .text
        .globl someRelocations
        .type someRelocations, STT_FUNC
someRelocations:

        movq $Lhello@GOT, %rax                  # Store the offset from GOT to its entry for Lhello in RAX
        leaq _GLOBAL_OFFSET_TABLE_(%rip), %rcx  # Store the address of _GLOBAL_OFFSET_TABLE_ in RCX
        addq %rcx, %rax                         # Calculate abs address of GOT entry for Lhello
        movq (%rax), %rdi                       # Resolve absolute address of Lhello (dereference the pointer)
        call puts@PLT
        ret

The above code generates a relocation of type R_X86_64_GOT32. The AMD64 ABI describes this relocation type in table 4.10 as follows:

Name            Value   Field   Calculation
R_X86_64_GOT32  3       word32  G+A

G “represents the offset into the global offset table at which the relocation entry’s symbol will reside during execution” and
A “represents the addend used to compute the value of the relocatable field.”

It’s probably worth noting that the above code can be shortened enormously by using the @GOTPCREL modifier to load the address of Lhello in one instruction:

        movq Lhello@GOTPCREL(%rip), %rdi
        call puts@PLT

While the longer version demonstrates how the modifier operates and what you get for your money, the other options available in AMD64 assembly make it unlikely that you’ll reach for @GOT unless you really need to know the offset.

As you’ve probably noticed, I’m quite keen on diagrams as an aid to understanding how things hang together, so I’ve had a go at drawing the offset given by the name@GOT modifier. I’m not convinced it’s much use, since the dimensions are necessarily so small, but here’s a go using the previous hexdump-style graphic. Please note that the following diagrams represent different examples, and not necessarily the ones we’ve discussed so far.

This is probably a bit clearer:


The AMD64 ABI describes the relocation given by name@GOTPLT as follows:

name@GOTPLT: specifies the offset to the GOT entry for the symbol name from the base of the GOT, implying that there is a corresponding PLT entry.

The docs for this are incredibly scarce and, so far as the GNU assembler is concerned, non-existent for the x86 instruction set (or i386, as the as documentation would have it). The docs for the s390 do contain some information and state the following:

The @gotplt modifier can be used for .long and .quad. A procedure linkage table entry is generated for the symbol and a jump slot for the symbol is added to the GOT. The symbol term is replaced with the offset from the start of the GOT to the jump slot for the symbol.

Given that these relocations target ELF binaries and not an instruction set, it doesn’t seem unreasonable to think that the same may be true for x86_64. It seems that in x86-64 assembler the relocation requires a .quad field, since it’s an 8-byte relocation. This modifier doesn’t appear to be capable of being used in an instruction, although I haven’t exhausted the search space and don’t profess to have tried everything.

Anyway, I’ve cooked up a contrived example of using @GOTPLT. I suppose it’s worth noting that this example calls the target function directly, without going via the PLT/.got.plt trampoline:

.section .rodata
Lhello:
        .asciz "Hello!"

.section .got
LhelloOff:
        .quad sayHello@GOTPLT                    # It's happy in .got, but not .got.plt

.section .text
        .globl someRelocations
        .type someRelocations, STT_FUNC
someRelocations:

        movq LhelloOff(%rip), %rax              # Store the offset from GOT to its entry for sayHello in RAX
        leaq _GLOBAL_OFFSET_TABLE_(%rip), %rcx  # Store the address of _GLOBAL_OFFSET_TABLE_ in RCX
        addq %rcx, %rax                         # Calculate abs address of the GOT entry for sayHello
        call *(%rax)                            # Call through the pointer stored in that entry
        ret

        .globl sayHello
        .type sayHello, STT_FUNC
sayHello:
        movq Lhello@GOTPCREL(%rip), %rdi
        call puts@PLT
        ret

The docs for the s390 say that a PLT entry is created for the symbol, and this is true for x86-64 as well: if you want a PLT entry created for some symbol you can use this modifier, although there are more reasonable ways to get one.

So what’s happened? Well, the following is done by the code:

  1. The value at address LhelloOff is read into %rax. The value in this case is negative 0x18 and already present in the DSO (i.e. written by the static linker).
  2. The value -0x18 is added to the address of _GLOBAL_OFFSET_TABLE_. This yields an offset in this case of 0x200fe8 - 0x18 = 0x200fd0.
  3. At offset 0x200fd0 of the runtime image is the value 0x7f7cf1843650, which given a virtual-memory load-address of 0x7f7cf1843000 gives an offset of 0x650 into the executable segment.
  4. The function sayHello begins at offset 0x650 of the executable segment.
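The four steps above boil down to two subtractions, which the following Python sketch spells out using only the numbers quoted in the list. The constant names are mine. The final check, that the entry lands inside .got rather than .got.plt, assumes the section layout shown earlier by readelf -SW also applies to this build, which I haven’t verified here.

```python
GOT_SYMBOL = 0x200fe8             # _GLOBAL_OFFSET_TABLE_
VALUE_AT_LHELLOOFF = -0x18        # written by the static linker (step 1)
LOAD_BASE = 0x7f7cf1843000        # load address of the executable segment
SLOT_CONTENTS = 0x7f7cf1843650    # what the runtime image holds at the entry

# Step 2: offset of the GOT entry for sayHello
got_entry = GOT_SYMBOL + VALUE_AT_LHELLOOFF

# Steps 3 and 4: where the pointer stored in that entry leads
function_offset = SLOT_CONTENTS - LOAD_BASE

# Assumption: .got spans 0x200fc8..0x200fe8 as in the earlier readelf output
GOT_SECTION = range(0x200fc8, 0x200fc8 + 0x20)

if __name__ == "__main__":
    print(hex(got_entry), hex(function_offset), got_entry in GOT_SECTION)
```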

It’s fairly clear that using sayHello@GOTPLT has created two entries in the GOT: the value containing the offset (-0x18) and also the virtual address of sayHello. The address of sayHello would not otherwise be written to the GOT, only to the trampoline entry in the .got.plt section.

Anyway, here’s something for the more visual among you:


The AMD64 ABI describes the relocation given by name@GOTOFF as follows:

name@GOTOFF: specifies the offset to the location of the symbol name from the base of the GOT.

The @GOTOFF modifier generates an 8-byte relocation of type R_X86_64_GOTOFF64.

.section .rodata
Lhello:
        .asciz "Hello!"

.section .data
Lgoodbye:
        .asciz "Goodbye!"

.section .text
        .globl someRelocations
        .type someRelocations, STT_FUNC
someRelocations:
        movabs $Lhello@GOTOFF, %rdi            # Store 64-bit offset to Lhello from the GOT in RDI
        lea _GLOBAL_OFFSET_TABLE_(%rip), %rdx  # Get absolute address of _GLOBAL_OFFSET_TABLE_
        add %rdx, %rdi                         # Get absolute address of Lhello
        call puts@PLT                          # Print to stdout

        movabs $Lgoodbye@GOTOFF, %rdi          # Demo offset to somewhere in the .data section this time
        lea _GLOBAL_OFFSET_TABLE_(%rip), %rdx
        add %rdx, %rdi
        call puts@PLT
        ret

Using objdump, we can see the relative offsets generated by the linker in the DSO:

00000000000005cc <someRelocations>:
 5cc:   48 bf 5e f6 df ff ff ff ff ff   mov    $0xffffffffffdff65e,%rdi   # Largish negative offset from _G_O_T_ to the .rodata section
 5d6:   48 8d 15 0b 0a 20 00            lea    0x200a0b(%rip),%rdx        # 200fe8 <_GLOBAL_OFFSET_TABLE_>
 5dd:   48 01 d7                        add    %rdx,%rdi
 5e0:   e8 f3 fe ff ff                  callq  4d8 <puts@plt>
 5e5:   48 bf 30 00 00 00 00 00 00 00   mov    $0x30,%rdi                 # Smaller positive offset from _G_O_T_ to .data section
 5ef:   48 8d 15 f2 09 20 00            lea    0x2009f2(%rip),%rdx        # 200fe8 <_GLOBAL_OFFSET_TABLE_>
 5f6:   48 01 d7                        add    %rdx,%rdi
 5f9:   e8 da fe ff ff                  callq  4d8 <puts@plt>
 5fe:   c3                              retq
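To see where those two immediates actually point, they need to be read as signed 64-bit numbers and added to the address of _GLOBAL_OFFSET_TABLE_. Here’s a small Python sketch of that; as_signed64 is just a helper name of my own, and the constants are lifted from the objdump output above.

```python
GOT_SYMBOL = 0x200fe8   # _GLOBAL_OFFSET_TABLE_, per the lea comments in objdump


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit immediate as two's-complement signed."""
    return value - (1 << 64) if value & (1 << 63) else value


hello_gotoff = as_signed64(0xffffffffffdff65e)   # movabs at 0x5cc
goodbye_gotoff = as_signed64(0x30)               # movabs at 0x5e5

hello_addr = GOT_SYMBOL + hello_gotoff           # lands in .rodata
goodbye_addr = GOT_SYMBOL + goodbye_gotoff       # lands in .data

if __name__ == "__main__":
    print(hex(hello_gotoff), hex(hello_addr))
    print(hex(goodbye_gotoff), hex(goodbye_addr))
```

So the "largish negative offset" is -0x2009a2, which reaches back below the GOT into the executable segment, while the positive one steps forward into the read/write data.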

Despite my better judgment, and probably because of a lack of current alternative options, I thought I’d spend the time making you lovely people another graphic (and show how you can cram most of a binary image into a screen-width). There’s a complete one lower down:

Hopefully this shows what the relocation represents, and if you really squint you can see the ASCII codes for “Hello!\0” and “Goodbye!\0”. The relocation modifier doesn’t leave you with the eventual address as you still need the address of the GOT and an add instruction. I suppose there’s a place in nature for everything, despite the existence of GOTPCREL relocations…


The AMD64 ABI describes the relocation given by name@GOTPCREL as follows:

name@GOTPCREL: specifies the offset to the GOT entry for the symbol name from the current code location.

The use of the @GOTPCREL modifier is straightforward:

.section .rodata
Lhello:
        .asciz "Hello!"

.section .text
        movq Lhello@GOTPCREL(%rip), %rdi       # Copy the address of Lhello in the GOT into RDI
        call puts@PLT                          # Print to stdout

The code above generates a relocation in the object file (which is elided by the linker) of type R_X86_64_GOTPCREL. If you wanted to find the address in the GOT which contains the pointer to the string, you could simply use lea instead:

        leaq Lhello@GOTPCREL(%rip), %rdi       # Copy the address in the GOT which contains the pointer to Lhello into RDI

Simple stuff!

The docs for the YASM assembler have the following to say about GOTPCREL relocations, and although the syntax is different the underlying ELF statements still apply:

While RIP-relative addressing allows you to encode an instruction pointer relative data reference to foo with [rel foo], it’s sometimes necessary to encode a RIP-relative reference to a linker-generated symbol pointer for symbol foo; this is done using wrt ..gotpcrel, e.g. [rel foo wrt ..gotpcrel]. Unlike in elf32, this relocation, combined with RIP-relative addressing, makes it possible to load an address from the [Global Offset Table] using a single instruction. Note that since RIP-relative references are limited to a signed 32-bit displacement, the GOT size accessible through this method is limited to 2 GB.

The above graphic hopefully illustrates the following code:

        mov name@GOTPCREL(%rip), %rdi

The executable segment was loaded into memory at address 0x7fb717760000. At offset 0x6d4 into the executable segment the mov instruction appears. It is seven bytes long and encodes a PC-relative relocation of +0x2008e5 from the address of the following instruction. The sum of 0x7fb7177606db (0x7fb717760000 + 0x6d4 + 7) and 0x2008e5 is 0x7fb717960fc0. This appears in the read-only segment whose VM load-address was on this occasion 0x7fb717960000.
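As a cross-check on that arithmetic, the displacement can be recovered by working backwards from the two end points. This is a sketch with constant names of my own choosing, using only the addresses quoted above; the displacement consistent with the quoted result comes out as 0x2008e5.

```python
EXEC_LOAD_ADDR = 0x7fb717760000   # load address of the executable segment
MOV_OFFSET = 0x6d4                # offset of the mov within that segment
MOV_LENGTH = 7                    # bytes in the mov instruction
GOT_ENTRY = 0x7fb717960fc0        # runtime address of the GOT entry
RO_LOAD_ADDR = 0x7fb717960000     # load address of the read-only data segment

# %rip holds the address of the *next* instruction while the mov executes
next_insn = EXEC_LOAD_ADDR + MOV_OFFSET + MOV_LENGTH

# The encoded displacement is whatever bridges %rip and the GOT entry
displacement = GOT_ENTRY - next_insn

# Where the entry sits inside the read-only segment
entry_offset = GOT_ENTRY - RO_LOAD_ADDR

if __name__ == "__main__":
    print(hex(next_insn), hex(displacement), hex(entry_offset))
```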


The AMD64 ABI describes the relocation given by name@PLT as follows:

name@PLT: specifies the offset to the PLT entry of symbol name from the current code location.

The classic use of this is in the call statement:

       call funcName@PLT

The interesting thing about this is that the offset generated is relative to the address of the next instruction (i.e. PC-relative), yet we don’t need to use the %rip register: the call instruction encodes this for us. Here’s the description from the Intel Software Developers Manual:

Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode.

To get at the absolute address of an entry in the PLT we can use the %rip register directly:

        leaq funcName@PLT(%rip), %rax

The above code snippets result in a relocation of type R_X86_64_PLT32.

Take the following simple assembly file:

.section .rodata
Lhello:
        .asciz "Hello!"

.section .text
        .globl someRelocations
        .type someRelocations, STT_FUNC
someRelocations:
        call sayHello@PLT
        ret

        .globl sayHello
        .type sayHello, STT_FUNC
sayHello:
        movq Lhello@GOTPCREL(%rip), %rdi
        call puts@PLT
        ret

Once compiled into a DSO, it provides the following partial objdump output of the .text and .plt sections:

$ objdump -dj .text libplt.s
000000000000062c <someRelocations>:
 62c:   e8 07 ff ff ff          callq  538 <sayHello@plt>
 631:   c3                      retq

0000000000000632 <sayHello>:
 632:   48 8b 3d 8f 09 20 00    mov    0x20098f(%rip),%rdi        # 200fc8 <_DYNAMIC+0x190>
 639:   e8 ea fe ff ff          callq  528 <puts@plt>
 63e:   c3                      retq

$ objdump -dj .plt libplt.s
0000000000000538 <sayHello@plt>:
 538:   ff 25 ca 0a 20 00       jmpq   *0x200aca(%rip)        # 201008 <_GLOBAL_OFFSET_TABLE_+0x20>
 53e:   68 01 00 00 00          pushq  $0x1
 543:   e9 d0 ff ff ff          jmpq   518 <_init+0x18>
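One thing worth pulling out of the .plt listing is how the pushq operand relates to the GOT slot the jmpq goes through. The Python sketch below assumes the conventional x86-64 layout in which the first three quad-words of .got.plt are reserved (as mentioned earlier) and the pushed value is the index of the function’s jump-slot relocation; got_plt_slot and jmp_slot are names I’ve invented for the two ways of arriving at the same address.

```python
GOT_SYMBOL = 0x200fe8    # _GLOBAL_OFFSET_TABLE_, inferred from the objdump comment
RESERVED_SLOTS = 3       # first three quad-words of .got.plt are reserved
SLOT_SIZE = 8


def got_plt_slot(reloc_index: int) -> int:
    """Slot address predicted from the index pushed by the PLT entry."""
    return GOT_SYMBOL + (RESERVED_SLOTS + reloc_index) * SLOT_SIZE


def jmp_slot(entry_addr: int, jmp_bytes: bytes) -> int:
    """Slot address decoded from the 'ff 25 disp32' jmpq at the entry start."""
    disp = int.from_bytes(jmp_bytes[2:6], "little", signed=True)
    return entry_addr + len(jmp_bytes) + disp


SAYHELLO_JMP = bytes.fromhex("ff25ca0a2000")   # jmpq *0x200aca(%rip) at 0x538

if __name__ == "__main__":
    print(hex(jmp_slot(0x538, SAYHELLO_JMP)))  # decoded from the instruction
    print(hex(got_plt_slot(1)))                # predicted from pushq $0x1
```

Both routes arrive at _GLOBAL_OFFSET_TABLE_+0x20, which matches the comment objdump prints next to the jmpq.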

It’s possible to derive the following image illustrating the call sayHello@PLT instruction:


The AMD64 ABI describes the relocation given by name@PLTOFF as follows:

name@PLTOFF: specifies the offset to the PLT entry of symbol name from the base of the GOT.

This relocation results in an entry of type R_X86_64_PLTOFF64.

The offset/value/result of this relocation can be calculated as follows, using readelf:

$ readelf -SW | grep ' \.plt'
  [11] .plt              PROGBITS        0000000000000580 000580 000050 10  AX  0   0  4
$ readelf -sW | grep _GLOBAL_OFFSET_TABLE_
    41: 0000000000200fe8     0 OBJECT  LOCAL  HIDDEN  ABS _GLOBAL_OFFSET_TABLE_

The relocation is from the GOT to the entry in the PLT, so the offset will usually be negative, unless you’ve got a very strange ELF file. In this case, it is 0x580 - 0x200fe8 = -0x200a68. We need to add the offset from the start of the PLT to the relevant entry within it. Given a PLT which looks like the one below, that offset value would be +0x40 for symbol sayGoodbye (0x5c0 - 0x580):

$ objdump -dj .plt     file format elf64-x86-64

Disassembly of section .plt:

0000000000000580 <puts@plt-0x10>:
 580: ff 35 6a 0a 20 00     pushq  0x200a6a(%rip)        # 200ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
 586: ff 25 6c 0a 20 00     jmpq   *0x200a6c(%rip)       # 200ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
 58c: 0f 1f 40 00           nopl   0x0(%rax)
# ...
00000000000005c0 <sayGoodbye@plt>:
 5c0: ff 25 52 0a 20 00     jmpq   *0x200a52(%rip)       # 201018 <_GLOBAL_OFFSET_TABLE_+0x30>
 5c6: 68 03 00 00 00        pushq  $0x3
 5cb: e9 b0 ff ff ff        jmpq   580 <_init+0x18>
# ...

This would yield a final relocation value for sayGoodbye@PLTOFF of negative 0x200a28:

0x580 - 0x200fe8 + offset = -0x200a68 + 0x40 = -0x200a28
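Here is the same calculation as a few lines of Python, together with the two’s-complement form you would expect to find encoded in a movabs immediate. The function names pltoff and to_u64 are mine, and the inputs are the readelf and objdump values quoted above.

```python
PLT_BASE = 0x580          # start of .plt, from readelf -SW
GOT_SYMBOL = 0x200fe8     # _GLOBAL_OFFSET_TABLE_, from readelf -sW


def pltoff(plt_entry_addr: int) -> int:
    """Offset from _GLOBAL_OFFSET_TABLE_ to a PLT entry (name@PLTOFF)."""
    return plt_entry_addr - GOT_SYMBOL


def to_u64(value: int) -> int:
    """Two's-complement encoding of a signed value in 64 bits."""
    return value & 0xFFFFFFFFFFFFFFFF


say_goodbye = pltoff(PLT_BASE + 0x40)   # sayGoodbye@plt is at 0x5c0

if __name__ == "__main__":
    print(hex(say_goodbye), hex(to_u64(say_goodbye)))
```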

Using this value and the address of _GLOBAL_OFFSET_TABLE_, you could inspect the various values in the PLT for symbols of interest. Let’s try this out. The code below can be compiled into a DSO, whose upside-down hexdump image appears more or less in full above. The code uses the PLTOFF declaration to find in the PLT the addresses in the .got.plt section which relate to the two functions sayHello and sayGoodbye. It then overwrites the value for sayHello with that of sayGoodbye, so that calls to sayHello are in fact redirected to sayGoodbye. It then proves this by calling the two functions in turn:

.section .rodata
Lhello:
        .asciz "Hello!"
Lgoodbye:
        .asciz "Goodbye!"

.section .text
        .globl someRelocations
        .type someRelocations, STT_FUNC
someRelocations:
        movabs $sayHello@PLTOFF, %rax        # store offset from GOT in RAX
        movabs $sayGoodbye@PLTOFF, %rbx
        leaq _GLOBAL_OFFSET_TABLE_(%rip), %rcx

        addq %rcx, %rax                      # Calculate abs address of PLT entry for sayHello
        movl 2(%rax), %r9d                   # Skip the jump opcode bytes to find the RIP-addend
        leaq 6(%r9,%rax), %rax               # Add the two numbers to get the GOT trampoline target
                                             # and six bytes to account for the jmp instruction bytes

        addq %rcx, %rbx                      # Do the same for the 'sayGoodbye' function
        movl 2(%rbx), %r11d
        leaq 6(%r11,%rbx), %rbx

        movq (%rbx), %rdx                    # copy the contents of the GOT trampoline for sayGoodbye ...
        movq %rdx, (%rax)                    # into the GOT trampoline for sayHello

        call sayHello@PLT                    # invoke both functions to see what's happening
        call sayGoodbye@PLT                  # if the hackery is sound it should print 'Goodbye!' twice
        ret

        .globl sayHello
        .type sayHello, STT_FUNC
sayHello:
        movq Lhello@GOTPCREL(%rip), %rdi
        call puts@PLT
        ret

        .globl sayGoodbye
        .type sayGoodbye, STT_FUNC
sayGoodbye:
        movq Lgoodbye@GOTPCREL(%rip), %rdi
        call puts@PLT
        ret

This code looks like this when the DSO is run through objdump:

$ objdump -dj .text
000000000000069c <someRelocations>:
 69c:   48 b8 b8 f5 df ff ff    mov    $0xffffffffffdff5b8,%rax   # -0x200a48
 6a3:   ff ff ff 
 6a6:   48 bb d8 f5 df ff ff    mov    $0xffffffffffdff5d8,%rbx   # -0x200a28
 6ad:   ff ff ff 
 6b0:   48 8d 0d 31 09 20 00    lea    0x200931(%rip),%rcx        # 200fe8 <_GLOBAL_OFFSET_TABLE_>
 6b7:   48 01 c8                add    %rcx,%rax
 6ba:   44 8b 48 02             mov    0x2(%rax),%r9d
 6be:   49 8d 44 01 06          lea    0x6(%r9,%rax,1),%rax
 6c3:   48 01 cb                add    %rcx,%rbx
 6c6:   44 8b 5b 02             mov    0x2(%rbx),%r11d
 6ca:   49 8d 5c 1b 06          lea    0x6(%r11,%rbx,1),%rbx
 6cf:   48 8b 13                mov    (%rbx),%rdx
 6d2:   48 89 10                mov    %rdx,(%rax)
 6d5:   e8 c6 fe ff ff          callq  5a0 <sayHello@plt>
 6da:   e8 e1 fe ff ff          callq  5c0 <sayGoodbye@plt>
 6df:   c3                      retq   

$ objdump -dj .plt

00000000000005a0 <sayHello@plt>:
 5a0:   ff 25 62 0a 20 00       jmpq   *0x200a62(%rip)        # 201008 <_GLOBAL_OFFSET_TABLE_+0x20>
 5a6:   68 01 00 00 00          pushq  $0x1
 5ab:   e9 d0 ff ff ff          jmpq   580 <_init+0x18>

To continue this blogopus’ theme of illustrating the memory-layouts and what each relocation actually represents, here’s one showing the relocation $sayHello@PLTOFF. The library was loaded on this occasion at address 0x7f436d118000 and symbol _GLOBAL_OFFSET_TABLE_ is at offset 0x200fe8 therefrom. You can see from the objdump output above that the PLT entry for sayHello is at offset 0x5a0. Once laid out in memory, the offset -0x200a48 points to that PLT entry. Again, if you check, you can see the opcodes at 0x5a0 in the image correspond to the hex-output above.
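If you’d like to replay what someRelocations does without a debugger, the following Python sketch walks through the same three instructions using the load address and objdump bytes quoted above. The variable names are mine, and the six PLT bytes are typed in from the listing rather than read from a live process.

```python
LOAD_BASE = 0x7f436d118000
GOT_SYMBOL = LOAD_BASE + 0x200fe8          # runtime _GLOBAL_OFFSET_TABLE_
SAYHELLO_PLTOFF = -0x200a48                # the movabs immediate at 0x69c

# First instruction of sayHello@plt: jmpq *0x200a62(%rip)
PLT_ENTRY_BYTES = bytes.fromhex("ff25620a2000")

# addq %rcx, %rax : absolute address of the PLT entry
plt_entry = GOT_SYMBOL + SAYHELLO_PLTOFF

# movl 2(%rax), %r9d : the 32-bit displacement after the two opcode bytes
# (movl zero-extends, so it is read here as unsigned)
disp = int.from_bytes(PLT_ENTRY_BYTES[2:6], "little")

# leaq 6(%r9,%rax), %rax : displacement is relative to the end of the jmpq
got_slot = plt_entry + disp + 6

if __name__ == "__main__":
    print(hex(plt_entry - LOAD_BASE))   # offset of the PLT entry in the image
    print(hex(got_slot - LOAD_BASE))    # offset of its .got.plt slot
    print(hex(got_slot - GOT_SYMBOL))   # the same, relative to the GOT symbol
```

The last value agrees with the _GLOBAL_OFFSET_TABLE_+0x20 comment that objdump prints beside the jmpq.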

And that, as they say, is that.


Excerpts from the AMD ABI

Draft version 0.99.5, September 3, 2010
static void (*ptr)(void);
extern void foo (void);
static void bar (void);

Lptr: .quad
      .globl foo
Lbar: ...

foo ();
    movabs    $foo@GOT,%r11       ; R_X86_64_GOTPLT64
    call      *(%r11,%r15)

bar ();
    movabs    $Lbar@GOTOFF,%r11   ; R_X86_64_GOTOFF64
    leaq      (%r11,%r15),%r11
    call      *%r11

ptr = foo;
    movabs    $Lptr@GOTOFF,%rax   ; R_X86_64_GOTOFF64
    movabs    $foo@PLTOFF,%r11    ; R_X86_64_PLTOFF64
    leaq      (%r11,%r15),%r11
    movq      %r11,(%rax,%r15)

ptr = bar;
    movabs    $Lbar@GOTOFF,%r11   ; R_X86_64_GOTOFF64
    leaq      (%r11,%r15),%r11
    movq      %r11,(%rax,%r15)

(*ptr) ();
    movabs    $Lptr@GOTOFF,%r11   ; R_X86_64_GOTOFF64
    call      *(%r11,%r15)

The following notations are used for specifying relocations in table 4.10:

Notes to Table 4.10: Relocation Types
A Represents the addend used to compute the value of the relocatable field.
B Represents the base address at which a shared object has been loaded into memory during execution. Generally, a shared object is built with a 0 base virtual address, but the execution address will be different.
G Represents the offset into the global offset table at which the relocation entry’s symbol will reside during execution.
GOT Represents the address of the global offset table.
L Represents the place (section offset or address) of the Procedure Linkage Table entry for a symbol.
P Represents the place (section offset or address) of the storage unit being relocated (computed using r_offset).
S Represents the value of the symbol whose index resides in the relocation entry.
Z Represents the size of the symbol whose index resides in the relocation entry.

The AMD64 ABI architectures uses only Elf64_Rela relocation entries with explicit addends. The r_addend member serves as the relocation addend.
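To make the Elf64_Rela layout concrete, here is a small Python sketch that unpacks one entry. The three fields (r_offset, r_info, r_addend) and the split of r_info into a symbol index (upper 32 bits) and a relocation type (lower 32 bits) follow the standard ELF64 definitions; the helper name parse_rela and the sample values are made up for the example.

```python
import struct

# Elf64_Rela: r_offset (u64), r_info (u64), r_addend (s64), little-endian.
ELF64_RELA = struct.Struct("<QQq")


def parse_rela(raw):
    """Split one 24-byte Elf64_Rela entry into its fields."""
    r_offset, r_info, r_addend = ELF64_RELA.unpack(raw)
    return {
        "offset": r_offset,
        "sym": r_info >> 32,          # ELF64_R_SYM
        "type": r_info & 0xFFFFFFFF,  # ELF64_R_TYPE
        "addend": r_addend,
    }


# Hypothetical entry: type 8 (R_X86_64_RELATIVE), no symbol, made-up values.
sample = ELF64_RELA.pack(0x201060, 8, 0x201060)
entry = parse_rela(sample)
print(entry)
```

The type number is the "Value" column of Table 4.10, so an entry of type 8 would be processed with the B + A calculation.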

Table 4.10: Relocation Types
Name                      Value  Field     Calculation
R_X86_64_NONE             0      none      none
R_X86_64_64               1      word64    S + A
R_X86_64_PC32             2      word32    S + A - P
R_X86_64_GOT32            3      word32    G + A
R_X86_64_PLT32            4      word32    L + A - P
R_X86_64_COPY             5      none      none
R_X86_64_GLOB_DAT         6      word64    S
R_X86_64_JUMP_SLOT        7      word64    S
R_X86_64_RELATIVE         8      word64    B + A
R_X86_64_GOTPCREL         9      word32    G + GOT + A - P
R_X86_64_32               10     word32    S + A
R_X86_64_32S              11     word32    S + A
R_X86_64_16               12     word16    S + A
R_X86_64_PC16             13     word16    S + A - P
R_X86_64_8                14     word8     S + A
R_X86_64_PC8              15     word8     S + A - P
R_X86_64_DTPMOD64         16     word64
R_X86_64_DTPOFF64         17     word64
R_X86_64_TPOFF64          18     word64
R_X86_64_TLSGD            19     word32
R_X86_64_TLSLD            20     word32
R_X86_64_DTPOFF32         21     word32
R_X86_64_GOTTPOFF         22     word32
R_X86_64_TPOFF32          23     word32
R_X86_64_PC64             24     word64    S + A - P
R_X86_64_GOTOFF64         25     word64    S + A - GOT
R_X86_64_GOTPC32          26     word32    GOT + A - P
R_X86_64_GOT64            27     word64    G + A
R_X86_64_GOTPCREL64       28     word64    G + GOT - P + A
R_X86_64_GOTPC64          29     word64    GOT - P + A
R_X86_64_GOTPLT64         30     word64    G + A
R_X86_64_PLTOFF64         31     word64    L - GOT + A
R_X86_64_SIZE32           32     word32    Z + A
R_X86_64_SIZE64           33     word64    Z + A
R_X86_64_GOTPC32_TLSDESC  34     word32
R_X86_64_TLSDESC_CALL     35     none
R_X86_64_TLSDESC          36     word64x2

The special semantics for most of these relocation types are identical to those used for the Intel386 ABI [2, 3].

2: Even though the AMD64 architecture supports IP-relative addressing modes, a GOT is still required since the offset from a particular instruction to a particular data item cannot be known by the static linker.
3: Note that the AMD64 architecture assumes that offsets into GOT are 32-bit values, not 64-bit values. This choice means that a maximum of 2^32/8 = 2^29 entries can be placed in the GOT. However, that should be more than enough for most programs. In the event that it is not enough, the linker could create multiple GOTs. Because 32-bit offsets are used, loads of global data do not require loading the offset into a displacement register; the base plus immediate displacement addressing form can be used.

The R_X86_64_GOTPCREL relocation has different semantics from the R_X86_64_GOT32 or equivalent i386 R_I386_GOTPC relocation. In particular, because the AMD64 architecture has an addressing mode relative to the instruction pointer, it is possible to load an address from the GOT using a single instruction. The calculation done by the R_X86_64_GOTPCREL relocation gives the difference between the location in the GOT where the symbol’s address is given and the location where the relocation is applied.
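As a worked instance of G + GOT + A - P, the Python sketch below reproduces the displacement of the jmpq in the sayHello@plt listing earlier in the post. Strictly speaking that displacement was filled in by the static linker rather than by an R_X86_64_GOTPCREL relocation, so it is used here only as a known-good number for the same arithmetic. The addend of -4 is an assumption (it accounts for %rip pointing just past the 4-byte field), and the function name is invented.

```python
def gotpcrel(g, got, addend, place):
    """R_X86_64_GOTPCREL: G + GOT + A - P, truncated to a 32-bit field."""
    return (g + got + addend - place) & 0xFFFFFFFF


# Values read off the sayHello@plt listing earlier in the post.
GOT = 0x200FE8   # _GLOBAL_OFFSET_TABLE_
G = 0x20         # slot offset: _GLOBAL_OFFSET_TABLE_+0x20
P = 0x5A0 + 2    # displacement field follows the two opcode bytes ff 25
A = -4           # assumed addend: %rip is 4 bytes past the start of the field

disp = gotpcrel(G, GOT, A, P)
print(hex(disp))  # 0x200a62
```

With G set to zero the formula reduces to GOT + A - P, which is how the lea at 0x6b0 in someRelocations arrives at its displacement of 0x200931.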

The R_X86_64_32 and R_X86_64_32S relocations truncate the computed value to 32-bits. The linker must verify that the generated value for the R_X86_64_32 (R_X86_64_32S) relocation zero-extends (sign-extends) to the original 64-bit value.
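The check described there can be sketched in a few lines of Python. This is an illustration of the rule, not code from any linker, and the function names are invented: a value is acceptable for R_X86_64_32 if zero-extending its low 32 bits gives back the full 64-bit value, and for R_X86_64_32S if sign-extending them does.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
MASK32 = 0xFFFFFFFF


def fits_32(value):
    """R_X86_64_32: the low 32 bits must zero-extend to the 64-bit value."""
    value &= MASK64
    return (value & MASK32) == value


def fits_32s(value):
    """R_X86_64_32S: the low 32 bits must sign-extend to the 64-bit value."""
    value &= MASK64
    low = value & MASK32
    # Sign-extend the low 32 bits back to 64 bits.
    extended = (low | 0xFFFFFFFF00000000) if (low & 0x80000000) else low
    return extended == value


print(fits_32(0x200FE8), fits_32s(0x200FE8))                      # True True
print(fits_32(0xFFFFFFFFFFDFF5B8), fits_32s(0xFFFFFFFFFFDFF5B8))  # False True
print(fits_32(0x80000000), fits_32s(0x80000000))                  # True False
```

The middle case is the -0x200a48 offset from earlier: it would fit a sign-extended 32-bit field but not a zero-extended one.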

A program or object file using R_X86_64_8, R_X86_64_16, R_X86_64_PC16 or R_X86_64_PC8 relocations is not conformant to this ABI, these relocations are only added for documentation purposes. The R_X86_64_16, and R_X86_64_8 relocations truncate the computed value to 16-bits resp. 8-bits.

The relocations R_X86_64_DTPMOD64, R_X86_64_DTPOFF64, R_X86_64_TPOFF64, R_X86_64_TLSGD, R_X86_64_TLSLD, R_X86_64_DTPOFF32, R_X86_64_GOTTPOFF and R_X86_64_TPOFF32 are listed for completeness. They are part of the Thread-Local Storage ABI extensions and are documented in the document called “ELF Handling for Thread-Local Storage”. The relocations R_X86_64_GOTPC32_TLSDESC, R_X86_64_TLSDESC_CALL and R_X86_64_TLSDESC are also used for Thread-Local Storage, but are not documented there as of this writing. A description can be found in the document “Thread-Local Storage Descriptors for IA32 and AMD64/EM64T”.

In order to make this document self-contained, a description of the TLS relocations follows.

R_X86_64_DTPMOD64 resolves to the index of the dynamic thread vector entry that points to the base address of the TLS block corresponding to the module that defines the referenced symbol. R_X86_64_DTPOFF64 and R_X86_64_DTPOFF32 compute the offset from the pointer in that entry to the referenced symbol. The linker generates such relocations in adjacent entries in the GOT, in response to R_X86_64_TLSGD and R_X86_64_TLSLD relocations. If the linker can compute the offset itself, because the referenced symbol binds locally, the relocations R_X86_64_64 and R_X86_64_32 may be used instead. Otherwise, such relocations are always in pairs, such that the R_X86_64_DTPOFF64 relocation applies to the word64 right past the corresponding R_X86_64_DTPMOD64 relocation.

R_X86_64_TPOFF64 and R_X86_64_TPOFF32 resolve to the offset from the thread pointer to a thread-local variable. The former is generated in response to R_X86_64_GOTTPOFF, that resolves to a PC-relative address of a GOT entry containing such a 64-bit offset.

R_X86_64_TLSGD and R_X86_64_TLSLD both resolve to PC-relative offsets to a DTPMOD GOT entry. The difference between them is that, for R_X86_64_TLSGD, the following GOT entry will contain the offset of the referenced symbol into its TLS block, whereas, for R_X86_64_TLSLD, the following GOT entry will contain the offset for the base address of the TLS block. The idea is that adding this offset to the result of R_X86_64_DTPOFF32 for a symbol ought to yield the same as the result of R_X86_64_DTPOFF64 for the same symbol.

R_X86_64_TLSDESC resolves to a pair of word64s, called TLS Descriptor, the first of which is a pointer to a function, followed by an argument. The function is passed a pointer to this pair of entries in %rax and, using the argument in the second entry, it must compute and return in %rax the offset from the thread pointer to the symbol referenced in the relocation, without modifying any registers other than processor flags. R_X86_64_GOTPC32_TLSDESC resolves to the PC-relative address of a TLS descriptor corresponding to the named symbol. R_X86_64_TLSDESC_CALL must annotate the instruction used to call the TLS Descriptor resolver function, so as to enable relaxation of that instruction.

There’s not much on the internet which really deals with relocations in assembly, so I was delighted to find the resource Hacking Haskell, which has this excellent summary of how to do the basics. You can tell they’re functional programmers because of the way they have paid special attention to the tail calls:

# x86_64-linux, -fPIC
# x86_64-darwin is almost the same,
#   .. but with leading underscores and no @PLT suffixes

# get the address of variable bar:
    leaq bar(%rip), %rax
# read a 4-byte-variable bar:
    movl bar(%rip), %eax
# call function foo:
    call foo
# tail-call foo_info:
    jmp foo_info

# get the address of imported symbol xbar:
    movq xbar@GOTPCREL(%rip), %rax
# read a 4-byte-variable xbar:
    movq xbar@GOTPCREL(%rip), %rax
    movl (%rax), %eax
# call imported function xfoo:
    call xfoo@PLT
# tail-call imported xfoo_info:
    jmp *xfoo_info@GOTPCREL(%rip)

Copyright © Michael Guyver 2012, except as otherwise attributed.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Michael Guyver and with appropriate and specific direction to the original content.


  1. The man page for proc says that the p flag denotes “copy on write”. To explain the copy-on-write reference, I’ll refer you to Daniel Bovet’s Understanding The Linux Kernel, in which he says the following:

    Notice that all regions listed are implemented by means of private memory mappings (the letter p in the permissions column). This is not surprising: these memory regions exist only to provide data to a process; while executing instructions, a process may modify the contents of these memory regions but the files on disk associated with them stay unchanged. This is precisely how private memory mappings act.

    There used to be a first edition knocking around online in PDF format somewhere.↩︎

  2. The loadable segments of an ELF file can be displayed with readelf -l:

    $ readelf -lW
    Elf file type is DYN (Shared object file)
    Entry point 0x920
    There are 7 program headers, starting at offset 64
    Program Headers:
    Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
    LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000d54 0x000d54 R E 0x200000
    LOAD           0x000e10 0x0000000000200e10 0x0000000000200e10 0x000258 0x000268 RW  0x200000
    DYNAMIC        0x000e38 0x0000000000200e38 0x0000000000200e38 0x000190 0x000190 RW  0x8
    NOTE           0x0001c8 0x00000000000001c8 0x00000000000001c8 0x000024 0x000024 R   0x4
    GNU_EH_FRAME   0x000ce8 0x0000000000000ce8 0x0000000000000ce8 0x000014 0x000014 R   0x4
    GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x8
    GNU_RELRO      0x000e10 0x0000000000200e10 0x0000000000200e10 0x0001f0 0x0001f0 R   0x1
    Section to Segment mapping:
    Segment Sections...
    00     .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
    01     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
    02     .dynamic
    04     .eh_frame_hdr
    06     .ctors .dtors .jcr .dynamic .got

    Loadable segments have an initial image taken from the ELF binary, described by an offset into the file and an extent, given by “FileSiz”. It’s not immediately clear how the offset value given in /proc/self/maps for the library’s read/write segment correlates with the information in the table above. I suppose the first thing to say about it is that the virtual memory page size on Linux (for both x86 and x86-64 systems) is 4kB, which in hex is (you guessed it) 0x1000, and that makes 0x1000 the minimum size of a mapping. The “Program Headers” section shows that the image for the second loadable segment will be loaded from between offsets 0xe10 and 0x1068 (Offset + FileSiz = 0xe10 + 0x258 = 0x1068) of the file. By using readelf -SW (show sections, wide format), we can see the offsets of individual sections within the image (and there are always a lot more than in this cut-down example):

    $ readelf -SW
    Section Headers:
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [23] .data             PROGBITS        0000000000201060 001060 000008 00  WA  0   0  8
      [24] .bss              NOBITS          0000000000201068 001068 000010 00  WA  0   0  8

    One subtle thing to notice straight away is that the .bss section’s offset is directly after the last byte of the loadable segment. It’s also no coincidence that the .bss section is the final section in the list for the second loadable segment. The .bss section is special in that it contains uninitialised data and occupies no space in the file, but is initialised with zeroed bytes at program start-time. This is indicated by the NOBITS type in the section header. Making the .bss section the final entry in the loadable segment is efficient since the rest of the read/write image can be written in a contiguous manner, with the .bss bytes being zeroed over whatever data happened to follow it.
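The relationship between the program header and the .bss section can be checked with the figures from the two readelf listings above. This is a minimal Python sketch; the variable names are only for illustration.

```python
# Second LOAD program header, from the readelf -lW output above.
OFFSET, VADDR = 0x000E10, 0x200E10
FILESIZ, MEMSIZ = 0x000258, 0x000268

file_end = OFFSET + FILESIZ   # one past the last byte read from the file
zero_fill = MEMSIZ - FILESIZ  # bytes that exist only in memory

# .bss, from the readelf -SW output above.
BSS_OFF, BSS_ADDR, BSS_SIZE = 0x001068, 0x201068, 0x000010

print(hex(file_end), hex(zero_fill))          # 0x1068 0x10
print(file_end == BSS_OFF)                    # True
print(VADDR + MEMSIZ == BSS_ADDR + BSS_SIZE)  # True
```

In other words, the difference between MemSiz and FileSiz is exactly the 0x10 bytes of .bss, and the segment's in-memory extent ends where .bss ends.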

    The data (and its immediate surrounds) for the initialised read/write .data section in section 23 is as follows:

    00001050  06 09 00 00 00 00 00 00  16 09 00 00 00 00 00 00  |................|
    00001060  60 10 20 00 00 00 00 00  47 43 43 3a 20 28 55 62  |`. .....GCC: (Ub|
    00001070  75 6e 74 75 20 34 2e 34  2e 33 2d 34 75 62 75 6e  |untu 4.4.3-4ubun|

    The library not only reads the contents of /proc/self/maps but also the various memory images from /proc/self/mem. It writes them to disk for later inspection using a name composed from the entry in /proc/self/maps. The data we’re interested in is saved to a file named 7fda05ffe000_7fda05fff000_rw-p_1000. Using hexdump, we find the following at offset 0x60:

    00000050  a0 74 b7 05 da 7f 00 00  b0 2e ae 05 da 7f 00 00  |.t..............|
    00000060  60 e0 ff 05 da 7f 00 00  00 00 00 00 00 00 00 00  |`...............|
    00000070  00 00 00 00 00 00 00 00  2e 33 2d 34 75 62 75 6e  |.........3-4ubun|

    The eight bytes which constitute the entirety of the read/write .data section have been changed from their initial image and now contain the section’s own virtual address (0x7fda05ffe060). You can also see the initialisation of the .bss section, in which 16 bytes were zeroed after the end of the initialised read/write data section (it starts at offset 0x1068 and is 0x10 bytes in length). The .bss bytes are the sixteen zero bytes at offsets 0x68 to 0x77 above, and you can see how the zeroing has simply overwritten the data outside the segment’s file extent.
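The change to those eight bytes is consistent with a B + A style fix-up (R_X86_64_RELATIVE in Table 4.10). The Python sketch below decodes both hexdumps and recovers the load base from them. It assumes the addend equals the link-time value stored in the file (0x201060); the dumps alone cannot confirm which relocation type was actually applied.

```python
# First eight bytes at offset 0x60 of each dump, as shown above.
on_disk = bytes.fromhex("60 10 20 00 00 00 00 00")
in_memory = bytes.fromhex("60 e0 ff 05 da 7f 00 00")

# x86-64 is little-endian, so decode both quad-words accordingly.
link_value = int.from_bytes(on_disk, "little")
runtime_value = int.from_bytes(in_memory, "little")

# If the fix-up is B + A with A == link_value, then B falls out directly.
load_base = runtime_value - link_value

print(hex(link_value))     # 0x201060
print(hex(runtime_value))  # 0x7fda05ffe060
print(hex(load_base))      # 0x7fda05dfd000
```

The recovered base, 0x7fda05dfd000, is the same image base address used in the comments of the objdump listing in note 4.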

    And the offset in /proc/self/maps of 0x1000? Well, the .data section is at offset 0x1060 of the file, and at offset 0x60 of the mapped read/write region. You do the maths!↩︎
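For anyone who would rather not do the maths by hand, here is a short Python sketch of the page arithmetic. It assumes the 4kB page size mentioned in note 2, and the helper name is invented.

```python
PAGE_SIZE = 0x1000  # 4kB pages, as noted above


def split_offset(file_offset, page_size=PAGE_SIZE):
    """Return (page-aligned offset, offset within that page)."""
    return file_offset & ~(page_size - 1), file_offset & (page_size - 1)


page_offset, within_page = split_offset(0x1060)  # .data
print(hex(page_offset), hex(within_page))        # 0x1000 0x60

bss_page, bss_within = split_offset(0x1068)      # .bss
print(hex(bss_page), hex(bss_within))            # 0x1000 0x68
```

The page-aligned part is the 0x1000 reported in /proc/self/maps, and the remainder is the 0x60 at which .data appears in the saved memory image.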

  3. Agner Fog discusses the importance of pairing call and ret instructions in his Optimizing Assembly manual, since mismatched call/ret pairs can interfere with prediction in the return stack-buffer. From his manual The microarchitecture of Intel, AMD and VIA CPUs, at section 3.14:

    A Last-In-First-Out buffer, called the return stack buffer, remembers the return address every time a call instruction is executed, and it uses this for predicting where the corresponding return will go. […]

    In order to make this mechanism work, you must make sure that all calls are matched with returns. Never jump out of a subroutine without a return and never use a return as an indirect jump.

  4. After the first three quad-words of the .got.plt, the GOT spills into the read/write segment, and it’s this part which is fixed-up by the dynamic linker. The addresses within the read/write segment of the .got.plt section will eventually point to functions, assuming they’re invoked at all. If they haven’t been invoked yet, they will look something like this (the two comments at the top show the offset of the address into the library’s image):

    $ # Get offset from binary image base address
    $ # _GLOBAL_OFFSET_TABLE_ + 0x20 = 0x7fda05dfd876; 0x7fda05dfd876 - 0x7fda05dfd000 = 0x876
    $ objdump -D
    Disassembly of section .plt:
    0000000000000870 <__fprintf_chk@plt>:
    870:   ff 25 92 07 20 00       jmpq   *0x200792(%rip)        # 201008 <_GLOBAL_OFFSET_TABLE_+0x20>
    876:   68 01 00 00 00          pushq  $0x1
    87b:   e9 d0 ff ff ff          jmpq   850 <_init+0x18>

    The above section shows the way the PLT trampoline works: the second pointer in the read/write part of the library’s .got.plt section references offset 0x876, which is the instruction directly after the jmpq instruction. This shows that the pointer in .got.plt hasn’t yet been fixed-up to point to the target function. I’m sure that’s going to sound confusing unless you already understand how the PLT works. Sorry.
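One way to make that less confusing is to state the test explicitly: a .got.plt slot is still unresolved if it points six bytes into its own PLT stub, that is, at the pushq just after the jmpq. A minimal Python sketch of that test follows, using the addresses from the listing above; the function name is invented and the six-byte length is that of the ff 25 jmpq encoding shown.

```python
PLT_JMP_LEN = 6  # ff 25 plus a 32-bit displacement


def is_unresolved(got_value, load_base, plt_entry):
    """True if the GOT slot still points back into its own PLT stub."""
    return got_value - load_base == plt_entry + PLT_JMP_LEN


LOAD_BASE = 0x7FDA05DFD000  # image base used in the comments above
FPRINTF_CHK_PLT = 0x870     # __fprintf_chk@plt

# Slot at _GLOBAL_OFFSET_TABLE_ + 0x20: still points at 0x876.
print(is_unresolved(0x7FDA05DFD876, LOAD_BASE, FPRINTF_CHK_PLT))  # True
# A slot that has been fixed up points into another library instead.
print(is_unresolved(0x7FDA05AD3440, LOAD_BASE, FPRINTF_CHK_PLT))  # False
```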

    Other pointers in the GOT have been fixed-up, for example the following point into GLibC:

    $ # _GLOBAL_OFFSET_TABLE_ + 0x18 = 0x7fda05ad3440; 0x7fda05ad3440 - 0x7fda05a7a000 = 0x59440
    $ # _GLOBAL_OFFSET_TABLE_ + 0x30 = 0x7fda05b777d0; 0x7fda05b777d0 - 0x7fda05a7a000 = 0xfd7d0
    $ # _GLOBAL_OFFSET_TABLE_ + 0x38 = 0x7fda05ae2580; 0x7fda05ae2580 - 0x7fda05a7a000 = 0x68580
    $ readelf -sW /lib/
    Symbol table '.dynsym' contains 2157 entries:
    Num:    Value          Size Type    Bind   Vis      Ndx Name
    1040: 0000000000059440   375 FUNC    GLOBAL DEFAULT   12 __isoc99_fscanf@@GLIBC_2.7
     113: 00000000000fd7d0   413 FUNC    GLOBAL DEFAULT   12 __printf_chk@@GLIBC_2.3.4
     167: 0000000000068580    10 FUNC    GLOBAL DEFAULT   12 fopen@@GLIBC_2.2.5

    I’m sure you get the idea.↩︎

  5. Draft version 0.99.5 is the one to which I’ve been referring, but there’s a later version, 0.99.6, as of 15th May, 2012.↩︎