diff --git a/README.md b/README.md index a675004..a903944 100644 --- a/README.md +++ b/README.md @@ -62,7 +62,7 @@ Here's a collection of links about the subject, I'm putting these here because people seem to find these useful. * [`elf(5)` manpage](https://linux.die.net/man/5/elf) -* [unofficial ELF docs](https://cs.stevens.edu/%7Ejschauma/631A/elf.html) (has +* [unofficial ELF docs](elf.html) (has more than the manpage, also has extra links) * [glibc internals](http://s.eresi-project.org/inc/articles/elf-rtld.txt) * [stuff about `.gnu.hash`](https://web.archive.org/web/20111022202443/http://blogs.oracle.com/ali/entry/gnu_hash_elf_sections) diff --git a/elf.html b/elf.html new file mode 100644 index 0000000..1adfb9e --- /dev/null +++ b/elf.html @@ -0,0 +1,2503 @@ +
+ + + +This page is a copy of the Archive.org +copy of the no longer available http://www.acsu.buffalo.edu/~charngda/elf.html. +It is kept here online as a reference only.
+ +ABI | Application binary interface |
a.out | Assembler output file format |
BSS | Block started by symbol. The uninitialized data segment containing statically-allocated variables. |
COFF | Common object file format |
DTV | Dynamic thread vector (for TLS) |
DWARF | A standardized debugging data format |
GD | Global Dynamic (dynamic TLS) One of the Thread-Local Storage access models. |
GOT | Global offset table |
IE | Initial Executable (static TLS with assigned offsets) One of the Thread-Local Storage access models. |
LD | Local Dynamic (dynamic TLS of local symbols) One of the Thread-Local Storage access models. |
LE | Local Executable (static TLS) One of the Thread-Local Storage access models. |
Mach-O | Mach object file format |
PC | Program counter. On x86, this is the same as IP (Instruction Pointer) register. |
PE | Portable executable |
PHT | Program header table |
PIC | Position independent code |
PIE | Position independent executable |
PLT | Procedure linkage table |
REL RELA | Relocation |
RVA | Relative virtual address |
SHF | Section header flag |
SHT | Section header table |
SO | Shared object (another name for dynamic link library) |
VMA | Virtual memory area/address |
+System V Application Binary Interface +
+AMD64 System V Application Binary Interface +
+The gen on function calling conventions +
+Section II of Linux Standard Base 4.0 Core Specification +
+Self-Service Linux: Mastering the Art of Problem Determination by Mark Wilding and Dan Behman +
+Solaris Linker and Libraries Guide +
+Linkers and Loaders by John Levine +
+Understanding Linux ELF RTLD internals by mayhem (this article gives +you an idea how the runtime linker ld.so works) +
+Prelink by Jakub Jelinek (and prelink man page) + +
+Usually there is another kind of header called Section Header, which describes the +attributes of an ELF section (e.g. .text, .data, +.bss, etc.). The Section Header is +described by struct Elf32_Shdr/struct Elf64_Shdr in /usr/include/elf.h +
+The Program Headers are used during execution (ELF's "execution view"); they tell the kernel or the runtime linker +ld.so what to load into memory and how to find dynamic linking information. +
+The Section Headers are used during compile-time linking (ELF's "linking view"); they tell the link editor ld +how to resolve symbols, and how to group similar byte streams from different ELF binary +objects. +
+Conceptually, ELF's two "views" are as follows (borrowed from Shaun Clowes's Fixing/Making Holes in Binaries slides): +
+-----------------+ + +----| ELF File Header |----+ + | +-----------------+ | + v v + +-----------------+ +-----------------+ + | Program Headers | | Section Headers | + +-----------------+ +-----------------+ + || || + || || + || || + || +------------------------+ || + +--> | Contents (Byte Stream) |<--+ + +------------------------+ ++
+In reality, the layout of a typical ELF executable binary on a disk file is like this: +
+-------------------------------+ + | ELF File Header | + +-------------------------------+ + | Program Header for segment #1 | + +-------------------------------+ + | Program Header for segment #2 | + +-------------------------------+ + | ... | + +-------------------------------+ + | Contents (Byte Stream) | + | ... | + +-------------------------------+ + | Section Header for section #1 | + +-------------------------------+ + | Section Header for section #2 | + +-------------------------------+ + | ... | + +-------------------------------+ + | ".shstrtab" section | + +-------------------------------+ + | ".symtab" section | + +-------------------------------+ + | ".strtab" section | + +-------------------------------+ ++The ELF File Header contains the file offsets of the first Program Header, +the first Section Header, and .shstrtab section which contains +the section names (a series of NULL-terminated strings) +
+The ELF File Header also contains the number of Program Headers +and the number of Section Headers. +
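The fields mentioned above can be read straight out of the first bytes of a file. The following is a minimal sketch, assuming a 64-bit ELF image already loaded into memory with the same byte order as the host; the struct elf_layout type and the function name read_elf64_layout are made up for illustration. Strictly speaking, the header does not store a file offset for .shstrtab; it stores e_shstrndx, the index of the Section Header that describes it.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary type for this sketch; not part of <elf.h>. */
struct elf_layout {
    Elf64_Off phoff;     /* file offset of the first Program Header */
    Elf64_Off shoff;     /* file offset of the first Section Header */
    unsigned  phnum;     /* number of Program Headers */
    unsigned  shnum;     /* number of Section Headers */
    unsigned  shstrndx;  /* index of the Section Header for .shstrtab */
};

/*
 * Fill *out from a 64-bit ELF image held in buf.
 * Returns 0 on success, -1 if buf is too short or is not a 64-bit ELF file.
 * Assumption: the file's byte order matches the host's.
 */
int read_elf64_layout(const unsigned char *buf, size_t len,
                      struct elf_layout *out)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, buf, sizeof eh);          /* copy to avoid unaligned access */

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                        /* missing "\177ELF" magic */
    if (eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                        /* this sketch handles ELF64 only */

    out->phoff    = eh.e_phoff;
    out->shoff    = eh.e_shoff;
    out->phnum    = eh.e_phnum;
    out->shnum    = eh.e_shnum;
    out->shstrndx = eh.e_shstrndx;
    return 0;
}
```

The values returned should agree with what readelf -h a.out reports for the same file.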
+Each Program Header describes a "segment": it contains the permissions (Readable, Writeable, or Executable), +the offset of the "segment" (which is just a byte stream) into the file, and the size of the +"segment". The following table shows the purposes of special segments. +Some information +can be found in GNU Binutils' source file include/elf/common.h: +
+
ELF Segment | +Purpose | +
---|---|
DYNAMIC | +For dynamic binaries, this segment holds dynamic linking information and is usually + the same as the .dynamic section in ELF's linking view. See paragraph below. + | +
GNU_EH_FRAME | +Frame unwind information (EH = Exception Handling). This segment is usually the same as .eh_frame_hdr section in ELF's linking view. + | +
GNU_RELRO | +This segment indicates the memory region which should be made Read-Only after relocation is done. + This segment usually appears in a dynamic link library and it + contains .ctors, .dtors, .dynamic, .got + sections. See paragraph below. + | +
GNU_STACK | +The permission flag of this segment indicates whether the + stack is executable or not. + This segment does not have any content; it is just an indicator. + | +
INTERP | +For dynamic binaries, this holds the full pathname of the runtime linker ld.so. + This segment is the same as the .interp section in ELF's linking view. + |
+
LOAD | +Loadable program segment. Only segments of this type are loaded into memory during execution. | +
NOTE | +Auxiliary information. For core dumps, this segment contains the status of the process (when the core dump is created), + such as the signal (the process received and caused it to dump core), pending & held signals, + process ID, parent process ID, user ID, nice value, + cumulative user & system time, values of registers (including the program counter!) For more info, see + struct elf_prstatus and struct elf_prpsinfo in Linux kernel source file + include/linux/elfcore.h + and struct user_regs_struct in + arch/x86/include/asm/user_64.h |
+
TLS | +Thread-Local Storage | +
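As a small illustration of the table above, the sketch below maps the p_type and p_flags members of a Program Header to the names and permission letters used in this document. The PT_* and PF_* constants come from /usr/include/elf.h; the three function names are invented for this example.

```c
#include <elf.h>
#include <stdint.h>

/* Map a Program Header's p_type to the segment names used in the table. */
const char *segment_type_name(uint32_t p_type)
{
    switch (p_type) {
    case PT_LOAD:         return "LOAD";
    case PT_DYNAMIC:      return "DYNAMIC";
    case PT_INTERP:       return "INTERP";
    case PT_NOTE:         return "NOTE";
    case PT_TLS:          return "TLS";
    case PT_GNU_EH_FRAME: return "GNU_EH_FRAME";
    case PT_GNU_STACK:    return "GNU_STACK";
    case PT_GNU_RELRO:    return "GNU_RELRO";
    default:              return "OTHER";   /* not covered by this sketch */
    }
}

/* Render p_flags as a three-letter string such as "r-x". */
void segment_perms(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & PF_R) ? 'r' : '-';
    out[1] = (p_flags & PF_W) ? 'w' : '-';
    out[2] = (p_flags & PF_X) ? 'x' : '-';
    out[3] = '\0';
}

/* GNU_STACK carries no content; only its flags matter. */
int stack_is_executable(uint32_t gnu_stack_flags)
{
    return (gnu_stack_flags & PF_X) != 0;
}
```

The same information is shown in the "Program Headers" part of readelf -l a.out.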
+Likewise, each Section Header contains the file offset of its corresponding "content" +and the size of the "content". +The following table shows the purposes of some special sections. Most information +here comes from LSB specification. +Some information can be found in GNU Binutil's source file +bfd/elf.c (look for +bfd_elf_special_section) +and bfd/elflink.c (look for +double-quoted section names such as ".got.plt") +
+
ELF Section | +Purpose | +
---|---|
.bss | +Uninitialized global data ("Block Started by Symbol").
+ Depending on the compilers, uninitialized global variables could + be stored in a nameless section called COMMON (named after + Fortran 77's "common blocks".) To wit, consider + the following code: + int globalVar; + static int globalStaticVar; + void dummy() { + static int localStaticVar; + } ++ Compile with gcc -c, then on x86_64, the resulting object file has the + following structure: + $ objdump -t foo.o + + SYMBOL TABLE: + .... + 0000000000000000 l O .bss 0000000000000004 globalStaticVar + 0000000000000004 l O .bss 0000000000000004 localStaticVar.1619 + .... + 0000000000000004 O *COM* 0000000000000004 globalVar ++ so only the file-scope and local-scope static variables are in + the .bss section. + + If one wants globalVar to reside in the .bss section, + use the -fno-common + compiler command-line option. Using -fno-common + is encouraged, as the following example shows: + $ cat foo.c + int globalVar; + $ cat bar.c + double globalVar; + int main(){} + $ gcc foo.c bar.c ++ Not only is there no error message about redefinition of the same symbol + in both source files (notice we did not use the extern keyword here), + there is no complaint about their different data + types and sizes either. However, if one uses -fno-common, + the linker will complain: + /tmp/ccM71JR7.o:(.bss+0x0): multiple definition of `globalVar' + /tmp/ccIbS5MO.o:(.bss+0x0): first defined here + ld: Warning: size of symbol `globalVar' changed from 8 in /tmp/ccIbS5MO.o to 4 in /tmp/ccM71JR7.o ++ |
+
.comment | +A series of NULL-terminated strings containing compiler information. | +
.ctors | +Pointers to functions which are marked as
+ __attribute__ ((constructor)) as well as static C++ objects' constructors.
+ They will be used by __libc_global_ctors function. + See paragraphs below. + |
+
.data | +Initialized data. | +
.data.rel.ro | +Similar to .data section, but this section + should be made Read-Only after relocation is done. + | +
.debug_XXX | +Debugging information (for the programs which are compiled with -g option)
+ which is in the DWARF 2.0 format.
+ + See here for DWARF debugging format. + |
+
.dtors | +Pointers to functions which are marked as
+ __attribute__ ((destructor)) as well as static C++ objects' destructors.
+ + See paragraphs below. + |
+
.dynamic | +For dynamic binaries, this section holds dynamic linking information used by ld.so. + See paragraphs below. | +
.dynstr | +NULL-terminated strings of names of symbols in .dynsym section.
+ One can use commands such as readelf -p .dynstr a.out to see these strings. + |
+
.dynsym | +Runtime/Dynamic symbol table. For dynamic binaries, this section is the symbol table of
+ globally visible symbols. For example, if a dynamic link library wants to export
+ its symbols, these symbols will be stored here. On the other hand, if
+ a dynamic executable binary uses symbols from a dynamic link library,
+ then these symbols are stored here too.
+ + The symbol names (as NULL-terminated strings) are stored in .dynstr section. + |
+
.eh_frame .eh_frame_hdr |
+ Frame unwind information (EH = Exception Handling).
+ + See here + for details. + To see the content of .eh_frame section, use + readelf --debug-dump=frames-interp a.out+ |
+
.fini | +Code which will be executed when program exits normally. See paragraphs below. | +
.fini_array | +Pointers to functions which will be executed when program exits normally. See paragraphs below. | +
.GCC.command.line | +A series of NULL-terminated strings containing
+ GCC command-line (that is used to compile the code) options. This feature is supported since GCC 4.5 + and the program must be compiled with -frecord-gcc-switches option. + |
+
.gnu.hash | +GNU's extension to hash table for symbols. + See here for its structure and the hash algorithm. + + The link editor ld calls bfd_elf_gnu_hash + in GNU Binutils' source file bfd/elf.c + to compute the hash value. + + The runtime linker ld.so calls do_lookup_x in + elf/dl-lookup.c + to do the symbol look-up. The hash computing function here is dl_new_hash. + |
+
.gnu.linkonceXXX | +GNU's extension. It means only a single copy of the section will be used in linking. + This is used by g++, which emits each template expansion in its own section. + The symbols will be defined as weak, so that multiple definitions + are permitted. + | +
.gnu.version | +Versions of symbols.
+ See here, + here, + here, + and + here + for details of symbol versioning. + |
+
.gnu.version_d | +Version definitions of symbols. | +
.gnu.version_r | +Version references (version needs) of symbols. | +
.got | +For dynamic binaries, this Global Offset Table holds the addresses of variables which are + relocated upon loading. See paragraphs below. + | +
.got.plt | +For dynamic binaries, this Global Offset Table holds the addresses of functions in dynamic libraries. + They are used by trampoline code in .plt section. + If .got.plt section is present, it contains at least three entries, which + have special meanings. See paragraphs below. + | +
.hash | +Hash table for symbols. + See here for its structure and the hash algorithm. + + The link editor ld calls bfd_elf_hash + in GNU Binutils' source file bfd/elf.c + to compute the hash value. + + The runtime linker ld.so calls do_lookup_x in + elf/dl-lookup.c + to do the symbol look-up. The hash computing function here is _dl_elf_hash. + |
+
.init | +Code which will be executed when program initializes. See paragraphs below. | +
.init_array | +Pointers to functions which will be executed when program starts. See paragraphs below. | +
.interp | +For dynamic binaries, this holds the full pathname of runtime linker ld.so | +
.jcr | +Java class registration information. + Like .ctors section, it contains a list of addresses + which will be used by _Jv_RegisterClasses function + in CRT (C Runtime) startup files (see gcc/crtstuff.c + in GCC's source tree) + |
+
.note.ABI-tag | +This Linux-specific section is structured as a note + section in ELF specification. Its content is mandated + here. + | +
.note.gnu.build-id | +A unique build ID. See here and + here + | +
.note.GNU-stack | +See here + | +
.nvFatBinSegment | +This segment contains information of nVidia's CUDA fat binary container. Its format + is described by struct __cudaFatCudaBinaryRec in __cudaFatFormat.h + | +
.plt | +For dynamic binaries, this Procedure Linkage Table holds the trampoline/linkage code. See paragraphs below. | +
.preinit_array | +Similar to .init_array section. See paragraphs below. | +
.rela.dyn | +Runtime/Dynamic relocation table.
+ + For dynamic binaries, this relocation table holds information of variables which + must be relocated upon loading. Each entry in this table is a + struct Elf64_Rela (see /usr/include/elf.h) which + has only three members: +
|
+
.rela.plt | +Runtime/Dynamic relocation table.
+ + This relocation table is similar to the one in .rela.dyn section; + the difference is this one is for functions, not variables. + The relocation type of entries in this table is + R_386_JMP_SLOT or R_X86_64_JUMP_SLOT and + the "offset" refers to memory addresses which are + inside .got.plt section. + Simply put, this table holds information to relocate entries in + .got.plt section. + |
+
.rel.text .rela.text |
+ Compile-time/Static relocation table.
+ For programs compiled with -c option, + this section tells the link editor ld + where and how to "patch" executable code in .text section. + The difference between .rel.text and .rela.text + is that entries in the former do not have an addend member. + (Compare struct Elf64_Rel with struct Elf64_Rela in /usr/include/elf.h) + Instead, the addend is taken from the memory location + described by the offset member. + + Whether to use .rel or .rela is platform-dependent. + For x86_32, it is .rel and for x86_64, .rela + |
+
.rel.XXX .rela.XXX |
+ Compile-time/Static relocation table for other sections. For example, + .rela.init_array is the relocation table for .init_array + section. + | +
.rodata | +Read-only data. | +
.shstrtab | +NULL-terminated strings of section names.
+ One can use commands such as readelf -p .shstrtab a.out to see these strings. + |
+
.strtab | +NULL-terminated strings of names of symbols in .symtab section.
+ One can use commands such as readelf -p .strtab a.out to see these strings. + |
+
.symtab | +Compile-time/Static symbol table.
+ This is the main symbol table used in compile-time linking + or runtime debugging. + + The symbol names (as NULL-terminated strings) are stored in .strtab section. + Both .symtab and .strtab can be stripped away by the strip + command. + |
+
.tbss | +Similar to .bss section, but for Thread-Local data. See paragraphs below. | +
.tdata | +Similar to .data section, but for Thread-Local data. See paragraphs below. | +
.text | +User's executable code | +
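The two symbol hash functions referred to in the .hash and .gnu.hash rows are short enough to write out. This is a sketch of the widely documented algorithms, not a copy of the Binutils or Glibc sources; the function names sysv_elf_hash and gnu_elf_hash are chosen for this example only.

```c
#include <stdint.h>

/* Classic System V hash, the algorithm behind the .hash section. */
uint32_t sysv_elf_hash(const char *name)
{
    uint32_t h = 0, g;

    while (*name) {
        h = (h << 4) + (unsigned char)*name++;
        g = h & 0xf0000000u;     /* top four bits, if any */
        if (g)
            h ^= g >> 24;        /* fold them back into the low bits */
        h &= ~g;                 /* and clear them */
    }
    return h;
}

/* GNU hash, the algorithm behind the .gnu.hash section: h = h * 33 + c. */
uint32_t gnu_elf_hash(const char *name)
{
    uint32_t h = 5381;

    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h;                    /* arithmetic wraps modulo 2^32 */
}
```

For the symbol name printf, the first function gives 0x077905a6 and the second gives 0x156b2bb8.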
+When the shell makes an execve system call to run an executable binary, the Linux kernel responds as +follows (see here and +here for more details) in sequence: +
load_elf_binary also examines + whether the user's executable binary contains an INTERP segment or not. + +
To see this, use command readelf -p .interp a.out +
According to AMD64 System V Application Binary Interface, + the only valid interpreter for programs conforming to AMD64 ABI is /lib/ld64.so.1 + and on Linux, GCC usually uses /lib64/ld-linux-x86-64.so.2 + or /lib/ld-linux-x86-64.so.2 instead: +
$ gcc -dumpspecs +.... + +*link: +... + %{!m32:%{!dynamic-linker:-dynamic-linker %{muclibc:%{mglibc:%e-mglibc and -muclibc used +together}/lib/ld64-uClibc.so.0;:/lib/ld-linux-x86-64.so.2}}}} +... ++
To change the runtime linker, compile the program using something like
gcc foo.c -Wl,-I/my/own/ld.so+
The System V Application Binary Interface + specifies that the operating system, instead of running the user's executable binary, should run this + "interpreter". This interpreter should complete the binding of the user's executable binary + to its dependencies. + +
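The check for an INTERP segment can be sketched in user space as follows. This is only an illustration of the idea, assuming a 64-bit ELF image held in memory with host byte order; it is not how load_elf_binary is written, and the function name find_interp is hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the interpreter path stored in the INTERP segment
 * of the ELF64 image in buf, or NULL if there is none (e.g. a statically
 * linked binary) or the image is malformed.
 */
const char *find_interp(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    size_t i;

    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, buf, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return NULL;
    if (eh.e_phentsize != sizeof ph || eh.e_phoff > len)
        return NULL;

    for (i = 0; i < eh.e_phnum; i++) {
        size_t off = (size_t)eh.e_phoff + i * sizeof ph;

        if (off > len || len - off < sizeof ph)
            return NULL;                 /* header table runs past the image */
        memcpy(&ph, buf + off, sizeof ph);
        if (ph.p_type != PT_INTERP)
            continue;

        /* The path must lie inside the image and be NUL-terminated. */
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            len - ph.p_offset < ph.p_filesz)
            return NULL;
        if (buf[ph.p_offset + ph.p_filesz - 1] != '\0')
            return NULL;
        return (const char *)(buf + ph.p_offset);
    }
    return NULL;
}
```

On a typical dynamically linked x86_64 binary the returned string should match the output of readelf -p .interp a.out.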
+This link +provides general tips for building Glibc. Glibc's own +INSTALL and +FAQ documents +are useful too. +
+To compile Glibc (ld.so cannot be compiled independently) download and unpack Glibc source tarball. +
/scratch/elf/librtld.os: In function `process_envvars': +/tmp/glibc-2.x.y/elf/rtld.c:2718: undefined reference to `__open' +... ++ +
/tmp/glibc-2.x.y/+Then edit /tmp/glibc-2.x.y/Makefile.in: Un-comment the line
# PARALLELMFLAGS = -j 4and +change 4 to an appropriate number.
+
all-subdirs = csu assert ctype locale intl catgets math setjmp signal \ + ... ++and change it to +
all-subdirs = csu elf gmon io misc posix setjmp signal stdlib string time ++ +
$ cd /scratch +$ /tmp/glibc-2.x.y/configure --prefix=/scratch --disable-profile +$ gmake ++ +
+_dl_debug_printf +is not the full-blown printf and has very limited capabilities. +For example, to print the address, one would need to use +
_dl_debug_printf("0x%0*lx\n", (int)sizeof (void*)*2, &foo); ++
(gdb) break _dl_start +Function "_dl_start" not defined. +Make breakpoint pending on future shared library load? (y or [n]) y +Breakpoint 1 (_dl_start) pending. +(gdb) run +Starting program: a.out + +Breakpoint 1, 0x0000003433e00fa0 in _dl_start () from /lib64/ld-linux-x86-64.so.2 +(gdb) bt +#0 0x0000003433e00fa0 in _dl_start () from /lib64/ld-linux-x86-64.so.2 +#1 0x0000003433e00a78 in _start () from /lib64/ld-linux-x86-64.so.2 +#2 0x0000000000000001 in ?? () +#3 0x00007fffffffe4f2 in ?? () +#4 0x0000000000000000 in ?? () +... +(gdb) x/10i $pc + 0x3433e00a70 <_start>: mov %rsp,%rdi + 0x3433e00a73 <_start+3>: callq 0x3433e00fa0 <_dl_start> + 0x3433e00a78 <_dl_start_user>: mov %rax,%r12 + 0x3433e00a7b <_dl_start_user+3>: mov 0x21b30b(%rip),%eax # 0x343401bd8c <_dl_skip_args> +... ++At this breakpoint, we can use pmap to see the memory map of a.out, which would +look like this: +
0000000000400000 8K r-x-- a.out +0000000000601000 4K rw--- a.out +0000003433e00000 112K r-x-- /lib64/ld-2.5.so +000000343401b000 8K rw--- /lib64/ld-2.5.so +00007ffffffea000 84K rw--- [ stack ] +ffffffffff600000 8192K ----- [ anon ] + total 8408K ++The memory segment of /lib64/ld-2.5.so indeed starts at 3433e00000 (page aligned) and +this can be verified by running readelf -t /lib64/ld-2.5.so. +
+If we put another breakpoint at main and continue, then when it stops, the memory +map would change to this: +
0000000000400000 8K r-x-- a.out +0000000000601000 4K rw--- a.out +0000003433e00000 112K r-x-- /lib64/ld-2.5.so +000000343401b000 4K r---- /lib64/ld-2.5.so +000000343401c000 4K rw--- /lib64/ld-2.5.so +0000003434200000 1336K r-x-- /lib64/libc-2.5.so <-- The first "LOAD" segment, which contains .text and .rodata sections +000000343434e000 2044K ----- /lib64/libc-2.5.so <-- "Hole" +000000343454d000 16K r---- /lib64/libc-2.5.so <-- Relocation (GNU_RELRO) info -+---- The second "LOAD" segment +0000003434551000 4K rw--- /lib64/libc-2.5.so <-- .got.plt .data sections -+ +0000003434552000 20K rw--- [ anon ] <-- The remaining zero-filled sections (e.g. .bss) +0000003434e00000 88K r-x-- /lib64/libpthread-2.5.so <-- The first "LOAD" segment, which contains .text and .rodata sections +0000003434e16000 2044K ----- /lib64/libpthread-2.5.so <-- "Hole" +0000003435015000 4K r---- /lib64/libpthread-2.5.so <-- Relocation (GNU_RELRO) info -+---- The second "LOAD" segment +0000003435016000 4K rw--- /lib64/libpthread-2.5.so <-- .got.plt .data sections -+ +0000003435017000 16K rw--- [ anon ] <-- The remaining zero-filled sections (e.g. .bss) +00002aaaaaaab000 4K rw--- [ anon ] +00002aaaaaac6000 12K rw--- [ anon ] +00007ffffffea000 84K rw--- [ stack ] +ffffffffff600000 8192K ----- [ anon ] + total 14000K ++Indeed, ld.so has brought in all the required dynamic libraries.
Note that there +are two memory regions of 2044KB with null permissions. +As mentioned earlier, the ELF's 'execution view' is concerned with how to load an executable +binary into memory. When ld.so brings in the dynamic libraries, it looks at the segments labelled +as LOAD (look at "Program Headers" and "Section to Segment mapping" +from readelf -a xxx.so command.) Usually there are two LOAD segments, and +there is a "hole" between the two segments (look at the VirtAddr and MemSiz of these +two segments), so ld.so will +make this hole inaccessible deliberately: Look for the PROT_NONE symbol in +_dl_map_object_from_fd in elf/dl-load.c +
+Also note that each of +libc-2.5.so and libpthread-2.5.so has a read-only memory region +(at 0x343454d000 and 0x3435015000, respectively). This is for the GNU_RELRO segment (see +elf/dl-reloc.c). +The GNU_RELRO segment is contained in the second LOAD segment, which +contains the following sections (look at "Program Headers" and "Section to Segment mapping" +from readelf -l xxx.so command): +.tdata, .fini_array, .ctors, .dtors, __libc_subfreeres, +__libc_atexit, __libc_thread_subfreeres, .data.rel.ro, .dynamic, +.got, .got.plt, .data, and .bss. Except for +.got.plt, .data, and .bss, all sections in the second LOAD segment +are also in the GNU_RELRO segment, and they are thus made read-only. +
+The two [anon] memory segments at 0x3434552000 and 0x3435017000 are for sections which do not take space in the ELF +binary files. For example, readelf -t xxx.so will show that .bss section +has NOBITS flag, which means that section takes no disk space. When segments +containing NOBITS sections are mapped into memory, ld.so allocates +extra memory pages to accommodate these NOBITS sections. A LOAD +segment is usually structured as a series of contiguous sections, and if +a segment contains NOBITS sections, these NOBITS sections will +be grouped together and placed at the tail of the segment. +
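The arithmetic behind those extra pages can be sketched as below. This is a simplified model of the idea, not Glibc's code: data_end stands for the in-memory address where the file-backed bytes of a LOAD segment stop (load address + p_vaddr + p_filesz) and alloc_end for where the segment stops in memory (load address + p_vaddr + p_memsz). The 4 KiB page size, the struct, and the function name are assumptions made for this example.

```c
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u   /* assumption: 4 KiB pages, as in the pmap output */

/* Hypothetical result type for this sketch. */
struct zero_fill {
    uint64_t memset_start, memset_len;  /* tail of the last file-backed page to clear */
    uint64_t anon_start, anon_len;      /* extra anonymous (zero-filled) mapping */
};

/* Work out how the NOBITS tail [data_end, alloc_end) gets zero-filled. */
struct zero_fill plan_zero_fill(uint64_t data_end, uint64_t alloc_end)
{
    struct zero_fill z = {0, 0, 0, 0};
    uint64_t mask = (uint64_t)PAGE_SIZE_ASSUMED - 1;
    uint64_t zero_page = (data_end + mask) & ~mask;   /* round up to a page */

    if (alloc_end < zero_page)
        zero_page = alloc_end;          /* the tail fits in the last file page */

    if (zero_page > data_end) {         /* clear the rest of the last file page */
        z.memset_start = data_end;
        z.memset_len   = zero_page - data_end;
    }
    if (alloc_end > zero_page) {        /* whole pages beyond the file contents */
        z.anon_start = zero_page;
        z.anon_len   = alloc_end - zero_page;
    }
    return z;
}
```

For example, with alloc_end = 0x343501a370 and a made-up data_end of 0x3435016b80, the sketch yields an anonymous region of 13168 bytes starting at 0x3435017000, the same start and size as the [anon] area that follows libpthread-2.5.so.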
+So what does _dl_start do ? +
+ It calls process_envvars to handle these LD_ prefix environmental + variables such as LD_PRELOAD, LD_LIBRARY_PATH.
+ It examines the NEEDED field(s) in the user executable binary's DYNAMIC segment + section (see below) to determine the dependencies.
+ It calls _dl_init_paths (in elf/dl-load.c) + to initialize the dynamic libraries search paths. + According to ld.so man page + and this page, + the dynamic libraries are searched in the following order: +
+
RPATH can be specified when + the code is compiled with gcc -Wl,-rpath=... +
Use of RPATH is deprecated + because it has an obvious drawback: There is no way to override + it except using LD_PRELOAD environmental variable + or removing it from the DYNAMIC segment. +
Both RPATH and RUNPATH can + contain $ORIGIN + (or equivalently ${ORIGIN}), which will be + expanded to the value of environmental variable LD_ORIGIN_PATH + or the full path of the loaded object + (unless the programs use setuid or setgid) +
+
+ It calls _dl_map_object_from_fd (in elf/dl-load.c) + to load the dynamic libraries, sets up the right read/write/execute permissions for the memory segments, + (within _dl_map_object_from_fd, look at calls to mmap, mprotect and symbols such as + PROT_READ, PROT_WRITE, PROT_EXEC, PROT_NONE), + zeroes out BSS sections of dynamic libraries (inside _dl_map_object_from_fd function, look at calls to memset), + updates the link map, and performs relocations.
+ It calls _dl_relocate_object (in elf/dl-reloc.c) to perform runtime relocations (see details below). +
+ +
Note that _dl_init_internal is defined in elf/dl-init.c as: +
void +internal_function +_dl_init (struct link_map *main_map, int argc, char **argv, char **env) ++ call_init is also in elf/dl-init.c
+ +
_init will do the following things: +
+ For x86_64, .ctors section contains only one function: init_cacheinfo
+
+
+To see ld.so in action, set the environmental +variable LD_DEBUG to all and then run a user program. +
The above debugging information does not show mmap and mprotect calls. +However, we can use strace. If we run the user program again with +
strace -e trace=mmap,mprotect,munmap,open a.outwe should see something like the +following: +
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae62c0d1000 + + .... (a lot of failed attempts to open 'libpthread.so.0' using LD_LIBRARY_PATH) + +open("/etc/ld.so.cache", O_RDONLY) = 3 +mmap(NULL, 104801, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2ae62c0d2000 +open("/lib64/libpthread.so.0", O_RDONLY) = 3 +mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae62c0ec000 +mmap(0x3434e00000, 2204528, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3434e00000 <-- Bring in the first "LOAD" segment +mprotect(0x3434e16000, 2093056, PROT_NONE) = 0 <-- Make the "hole" inaccessible +mmap(0x3435015000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15000) = 0x3435015000 <-- Bring in the second "LOAD" segment +mmap(0x3435017000, 13168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3435017000 + (note: 0x3435017000 is the [anon] part which follows immediately after libpthread-2.5.so) + ... + .... (a lot of failed attempts to open 'libc.so.6' using LD_LIBRARY_PATH) + +open("/lib64/libc.so.6", O_RDONLY) = 3 +mmap(0x3434200000, 3498328, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3434200000 <-- Bring in the first "LOAD" segment +mprotect(0x343434e000, 2093056, PROT_NONE) = 0 <-- Make the "hole" inaccessible +mmap(0x343454d000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14d000) = 0x343454d000 <-- Bring in the second "LOAD" segment +mmap(0x3434552000, 16728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3434552000 + (note: 0x3434552000 is the [anon] part which follows immediately after libc-2.5.so) +mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae62c0ed000 +mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae62c0ee000 +mprotect(0x343454d000, 16384, PROT_READ) = 0 <-- Make the GNU_RELRO segment read-only +mprotect(0x3435015000, 4096, PROT_READ) = 0 <-- Make 
the GNU_RELRO segment read-only +mprotect(0x343401b000, 4096, PROT_READ) = 0 +munmap(0x2ae62c0d2000, 104801)= 0 +mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = 0x40dc7000 +mprotect(0x40dc7000, 4096, PROT_NONE) = 0 +mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaaaaab000 ++ +
4003c0 <printf@plt-0x10>: + 4003c0: push QWORD PTR [RIP+0x2004d2] # 600898 <_GLOBAL_OFFSET_TABLE_+0x8> + 4003c6: jmp QWORD PTR [RIP+0x2004d4] # 6008a0 <_GLOBAL_OFFSET_TABLE_+0x10> + 4003cc: nop DWORD PTR [RAX+0x0] + +4003d0 <printf@plt>: + 4003d0: jmp QWORD PTR [RIP+0x2004d2] # 6008a8 <_GLOBAL_OFFSET_TABLE_+0x18> + 4003d6: push 0 + 4003db: jmp 4003c0 <printf@plt-0x10> + +4003e0 <__libc_start_main@plt>: + 4003e0: jmp QWORD PTR [RIP+0x2004ca] # 6008b0 <_GLOBAL_OFFSET_TABLE_+0x20> + 4003e6: push 1 + 4003eb: jmp 4003c0 <printf@plt-0x10> ++ +The _GLOBAL_OFFSET_TABLE_ (labeled as R_X86_64_JUMP_SLOT and starts at address 0x600890) is located in +.got.plt section (to see this, run the command objdump -h a.out |grep -A 1 600890 +or the command readelf -r a.out) +The data in .got.plt section look like the following during runtime +(use gdb to see them) +
(gdb) b *0x4003d0 +(gdb) run +(gdb) x/6a 0x600890 +0x600890: 0x6006e8 <_DYNAMIC> 0x32696159a8 +0x6008a0: 0x326950aa20 <_dl_runtime_resolve> 0x4003d6 <printf@plt+6> +0x6008b0: 0x326971c3f0 <__libc_start_main> 0x0 ++When printf is called the first time in the user program, the +jump at 4003d0 will jump to 4003d6, which is just the next instruction (push 0). +Then it jumps to 4003c0, which does not have a function name (so it is +shown as <printf@plt-0x10>). At 4003c6, it will jump +to _dl_runtime_resolve. This function (in Glibc's source file +sysdeps/x86_64/dl-trampoline.S) +is a trampoline to _dl_fixup (in Glibc's source file +elf/dl-runtime.c). +_dl_fixup, again, is part of the Glibc runtime linker ld.so. In particular, it will change +the address stored at 6008a8 to the actual +address of printf in libc.so.6. To see this, set up a +hardware watchpoint +
(gdb) watch *0x6008a8 +(gdb) cont +Continuing. +Hardware watchpoint 2: *0x6008a8 + +Old value = 4195286 +New value = 1769244016 +0x000000326950abc2 in fixup () from /lib64/ld-linux-x86-64.so.2 ++If we continue execution, printf will be called, as +expected. When printf is called again in the user program, the +jump at 4003d0 will bounce directly to printf: +
(gdb) x/6a 0x600890 +0x600890: 0x6006e8 <_DYNAMIC> 0x32696159a8 +0x6008a0: 0x326950aa20 <_dl_runtime_resolve> 0x3269748570 <printf> +0x6008b0: 0x326971c3f0 <__libc_start_main> 0x0 ++ +
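The lazy-binding behaviour traced above can be imitated in plain C with a function pointer that plays the role of one .got.plt slot. This is a toy model under that assumption, not how ld.so is implemented; every name in it (got_slot, lazy_stub, real_function, call_via_plt, resolver_calls) is invented for the illustration.

```c
typedef int (*fn_t)(int);

int resolver_calls;                      /* how many times the slow path ran */

/* Stands in for the real function living in a shared library. */
static int real_function(int x)
{
    return x + 1;
}

static int lazy_stub(int x);

/* One ".got.plt"-like slot; it initially points back at the slow path. */
static fn_t got_slot = lazy_stub;

/* Plays the role of _dl_runtime_resolve/_dl_fixup: patch the slot, then
 * transfer control to the real target. */
static int lazy_stub(int x)
{
    resolver_calls++;
    got_slot = real_function;
    return got_slot(x);
}

/* Plays the role of the PLT entry: always an indirect jump through the slot. */
int call_via_plt(int x)
{
    return got_slot(x);
}
```

The first call_via_plt call goes through lazy_stub and patches the slot; later calls reach real_function directly, so resolver_calls stays at 1.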
4003a8 <_init>: + 4003a8: sub RSP, 8 + 4003ac: call call_gmon_start + 4003b1: call frame_dummy + 4003b6: call __do_global_ctors_aux + 4003bb: add RSP, 8 + 4003bf: ret + +400618 <_fini>: + 400618: sub RSP, 8 + 40061c: call __do_global_dtors_aux + 400621: add RSP, 8 + 400625: ret ++ +There is only one function: _init, in .init section, and +likewise, only one function: _fini in .fini section. +Both _init and _fini are synthesized at compile time +by the compiler/linker. Glibc +provides its own prolog and epilog for _init and _fini, but +the compiler is free to choose how to use them and add more code into _init +and _fini. +
+In Glibc, the source file sysdeps/generic/initfini.c +(and some system dependent ones, such as sysdeps/x86_64/elf/initfini.c) +is compiled into two files: /usr/lib64/crti.o for prolog +and /usr/lib64/crtn.o for epilog. +
+For the compiler part, GCC uses different prolog and epilog files, depending +on the compiler command-line options. To see them, execute gcc -dumpspecs, +and one can see +
... + +*endfile: + %{ffast-math|funsafe-math-optimizations:crtfastmath.o%s} + %{mpc32:crtprec32.o%s} + %{mpc64:crtprec64.o%s} + %{mpc80:crtprec80.o%s} + %{shared|pie:crtendS.o%s;:crtend.o%s} + crtn.o%s + +... + +*startfile: + %{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} + crti.o%s + %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s} + +... ++The detailed explanation of GCC spec file is here. +For above snippet, it means, for example, if compiler command-line +option -ffast-math is used, include GCC's crtfastmath.o +file (this file can be found under /usr/lib/gcc/<arch>/<version>/) +at the end of the linking process. Glibc's crtn.o is always +included at the end of linking. The %s means this preceding file is a startup file. (GCC allows +to skip startup files during linking using -nostartfiles compiler option) +
Similarly, if the -shared compiler command-line option is not used, +then always include Glibc's crt1.o at the start of the linking process. +crt1.o contains the function _start in .text section (not .init section!) +_start is the function that is executed before anything else... see below. +Next, include Glibc's crti.o in the linking. Finally, include either +crtbeginT.o, crtbeginS.o, or crtbegin.o (all three are part of GCC, of course), depending on +whether -static or -shared (or neither) is used. +
+So, for example, if a program is compiled using dynamic linking (which is default), no profiling, no fast +math optimizations, then the linking will include the following files in the following order: +
+Now back to the 4003a8 <_init>. +
+call_gmon_start is part of the Glibc prolog /usr/lib64/crti.o. +It initializes gprof related +data structures. +
+frame_dummy is in GCC code gcc/crtstuff.c and it +is used to set up exception handling and Java class registration (JCR) information. +
+The most interesting code is __do_global_ctors_aux (in +GCC's gcc/crtstuff.c and +gcc/gbl-ctors.h). It calls +the functions which are marked as +__attribute__ ((constructor)) (and static C++ objects' constructors) one by one: +
__SIZE_TYPE__ nptrs = (__SIZE_TYPE__) __CTOR_LIST__[0]; + unsigned i; + + if (nptrs == (__SIZE_TYPE__)-1) + for (nptrs = 0; __CTOR_LIST__[nptrs + 1] != 0; nptrs++); + + for (i = nptrs; i >= 1; i--) + __CTOR_LIST__[i] (); ++The array __CTOR_LIST__ is stored in a special section called .ctors. +Suppose a function called foo is marked as __attribute__ ((constructor)), +then the runtime call stack trace would be +
(gdb) break foo +(gdb) run +(gdb) bt +#0 0x00000000004004d8 in foo () +#1 0x0000000000400606 in __do_global_ctors_aux () +#2 0x00000000004003bb in _init () +#3 0x00000000004005a0 in ?? () +#4 0x0000000000400561 in __libc_csu_init () +#5 0x000000326971c46f in __libc_start_main () +#6 0x000000000040041a in _start () ++Similarly, __do_global_dtors_aux in the _fini function +invokes all functions which are marked as +__attribute__ ((destructor)) (and static C++ objects' destructors). The __do_global_dtors_aux code is also +in GCC's source tree at gcc/crtstuff.c. If +a function called foo is marked as __attribute__ ((destructor)), +then the runtime call stack trace would be +
(gdb) bt +#0 0x0000000000400518 in foo () +#1 0x00000000004004ca in __do_global_dtors_aux () +#2 0x0000000000400641 in _fini () +#3 0x00000032699367e8 in ?? () from /lib64/tls/libc.so.6 +#4 0x0000003269730c95 in exit () from /lib64/tls/libc.so.6 +#5 0x000000326971c4d2 in __libc_start_main () from /lib64/tls/libc.so.6 +#6 0x000000000040045a in _start () ++The array __DTOR_LIST__ contains the addresses of these destructors +and it is stored in a special section called .dtors. + +
void __libc_csu_init (int argc, char **argv, char **envp) + { + #ifndef LIBC_NONSHARED + { + const size_t size = __preinit_array_end - __preinit_array_start; + size_t i; + for (i = 0; i < size; i++) + (*__preinit_array_start [i]) (argc, argv, envp); + } + #endif + + _init (); + + const size_t size = __init_array_end - __init_array_start; + for (size_t i = 0; i < size; i++) + (*__init_array_start [i]) (argc, argv, envp); + } ++(Symbols such as __preinit_array_start, __preinit_array_end, __init_array_start, +__init_array_end are defined by the default ld script; +look for PROVIDE +and PROVIDE_HIDDEN keywords in the output of ld -verbose command.) +
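The two loops in __libc_csu_init both walk a range of function pointers from its start symbol to its end symbol. The following is a minimal sketch of that walk, not Glibc's code: run_init_range, demo_first, demo_second and demo_init_array are hypothetical names, and the array merely stands in for the region that the linker script delimits with __init_array_start and __init_array_end.

```c
#include <assert.h>
#include <stddef.h>

/* Signature used for .preinit_array and .init_array entries. */
typedef void (*init_fn)(int argc, char **argv, char **envp);

/* Hypothetical helper: call every pointer in [start, end) in ascending
   order, as the two loops in __libc_csu_init do. Returns the entry count. */
size_t run_init_range(init_fn *start, init_fn *end,
                      int argc, char **argv, char **envp)
{
    assert(start <= end);
    const size_t size = (size_t)(end - start);
    for (size_t i = 0; i < size; i++)
        (*start[i])(argc, argv, envp);
    return size;
}

/* Demo state: each entry appends one letter so the order is observable. */
char init_log[8];
size_t init_calls;

void demo_first(int argc, char **argv, char **envp)
{
    (void)argc; (void)argv; (void)envp;
    init_log[init_calls++] = 'a';
}

void demo_second(int argc, char **argv, char **envp)
{
    (void)argc; (void)argv; (void)envp;
    init_log[init_calls++] = 'b';
}

/* Stand-in for the region between __init_array_start and __init_array_end. */
init_fn demo_init_array[] = { demo_first, demo_second };
```

Calling run_init_range(demo_init_array, demo_init_array + 2, argc, argv, envp) runs demo_first and then demo_second, which is the ascending order the real loop uses.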
+The __libc_csu_fini function has similar code, but which +functions are executed at program exit is actually determined by exit: +
void __libc_csu_fini (void) + { + #ifndef LIBC_NONSHARED + size_t i = __fini_array_end - __fini_array_start; + while (i-- > 0) + (*__fini_array_start [i]) (); + + _fini (); + #endif + } ++
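Note the direction of the loop in __libc_csu_fini: entries are called from the last one down to the first, and the .fini code runs afterwards. A minimal sketch of that order follows; run_fini_range and the demo_* names are hypothetical, and the fini_code parameter is only a stand-in for the call to _fini.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*fini_fn)(void);

/* Hypothetical helper mirroring __libc_csu_fini: walk [start, end) from
   the last entry down to the first, then run the .fini code (if any). */
size_t run_fini_range(fini_fn *start, fini_fn *end, fini_fn fini_code)
{
    assert(start <= end);
    const size_t size = (size_t)(end - start);
    size_t i = size;
    while (i-- > 0)
        (*start[i])();
    if (fini_code != NULL)
        fini_code();
    return size;
}

/* Demo state: each function appends one letter so the order is observable. */
char fini_log[8];
size_t fini_calls;

void demo_fini_x(void)    { fini_log[fini_calls++] = 'x'; }
void demo_fini_y(void)    { fini_log[fini_calls++] = 'y'; }
void demo_fini_code(void) { fini_log[fini_calls++] = 'f'; }

/* Stand-in for the region between __fini_array_start and __fini_array_end. */
fini_fn demo_fini_array[] = { demo_fini_x, demo_fini_y };
```

With this table the recorded order is y, x, f: the array is walked backwards and the .fini stand-in comes last.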
+To see what's going on, consider the following C code example: +
#include <stdio.h> + #include <stdlib.h> + + void preinit(int argc, char **argv, char **envp) { + printf("%s\n", __FUNCTION__); + } + + void init(int argc, char **argv, char **envp) { + printf("%s\n", __FUNCTION__); + } + + void fini() { + printf("%s\n", __FUNCTION__); + } + + __attribute__((section(".init_array"))) typeof(init) *__init = init; + __attribute__((section(".preinit_array"))) typeof(preinit) *__preinit = preinit; + __attribute__((section(".fini_array"))) typeof(fini) *__fini = fini; + + void __attribute__ ((constructor)) constructor() { + printf("%s\n", __FUNCTION__); + } + + void __attribute__ ((destructor)) destructor() { + printf("%s\n", __FUNCTION__); + } + + void my_atexit() { + printf("%s\n", __FUNCTION__); + } + + void my_atexit2() { + printf("%s\n", __FUNCTION__); + } + + int main() { + atexit(my_atexit); + atexit(my_atexit2); + } ++The output will be +
preinit + constructor + init + my_atexit2 + my_atexit + fini + destructor ++The .preinit_array and .init_array sections must contain +function pointers (NOT code!) The prototype of these functions must be
void func(int argc,char** argv,char** envp)+__libc_csu_init executes them in the following order: +
+It is not advisable to put code in .init section, e.g. +
void __attribute__((section(".init"))) foo() { + ... +} ++because doing so will cause __do_global_ctors_aux NOT to be called. The .init +section will now look like this: +
4003a0 <_init>: + 4003a0: sub RSP, 8 + 4003a4: call call_gmon_start + 4003a9: call frame_dummy + +4003ae <foo>: + 4003ae: push RBP + 4003af: mov RBP, RSP + + .... (foo's body) + + 4003b2: leave + 4003b3: ret + 4003b4: call __do_global_ctors_aux + 4003b9: add RSP, 8 + 4003bd: ret ++
+Now .init section contains more than one function, but the +epilog of _init is distorted by the insertion of foo +
+Similarly, it is not advisable to put code in .fini section, +because otherwise the code will look like this: +
4006d8 <_fini>: + 4006d8: sub RSP, 8 + 4006dc: call __do_global_dtors_aux + +4006e1 <foo>: + 4006e1: push RBP + 4006e2: mov RBP, RSP + + .... (foo's body) + + 4006ef: leave + 4006f0: ret + 4006f1: add RSP, 8 + 4006f5: ret ++Now the epilog of _fini is distorted by the insertion of foo, so +the stack pointer will not be adjusted (add RSP, 8 is not executed), +causing a segmentation fault. + +
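Code that must run before main or at exit is safer registered through the constructor and destructor attributes described earlier than placed directly in .init or .fini. The following is a minimal sketch assuming GCC or Clang; setup, teardown, setup_calls and check_setup_ran are hypothetical names.

```c
#include <assert.h>

int setup_calls;      /* incremented each time setup() runs */

/* GCC and Clang record this function's address in the constructor list
   (.ctors or .init_array), so the startup code calls it before main. */
__attribute__((constructor))
void setup(void)
{
    setup_calls++;
}

/* Likewise called during exit processing, after main has returned. */
__attribute__((destructor))
void teardown(void)
{
    setup_calls = 0;
}

/* Callable from main: the constructor must already have run exactly once. */
void check_setup_ran(void)
{
    assert(setup_calls == 1);
}
```

Because only a function pointer is emitted into the special section, the prolog and epilog of _init and _fini stay intact.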
+_start is part of Glibc code, as in sysdeps/x86_64/elf/start.S. +As mentioned earlier, it is compiled as /usr/lib64/crt1.o and is statically linked into the +user's executable binary at link time. To see this, run gcc with the -v option, and +the last line would be something like: +
.../collect2 ... /usr/lib64/crt1.o /usr/lib64/crti.o ... /usr/lib64/crtn.o ++_start is always placed at the beginning of .text section, and +the default ld script specifies +"Entry point address" (in ELF header, use readelf -h ld.so|grep Entry command to see) +to be the address of _start (use ld -verbose | grep ENTRY command to see), so +_start is guaranteed to +be run before anything else. (This can be changed, however: at link time +one can specify a different entry point +with the -e option.) +
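The "Entry point address" that readelf -h prints is the e_entry field of the ELF header. The following is a minimal sketch of reading it from the first bytes of a 64-bit ELF file using the definitions in /usr/include/elf.h. The helper name elf64_entry_point is hypothetical, and the sketch assumes the file has the same byte order as the host.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: given the first `len` bytes of a file, store the
   entry point of a 64-bit ELF image in *entry.
   Returns 0 on success, -1 if the bytes are not a 64-bit ELF header. */
int elf64_entry_point(const unsigned char *image, size_t len, uint64_t *entry)
{
    Elf64_Ehdr ehdr;

    assert(image != NULL && entry != NULL);
    if (len < sizeof ehdr)
        return -1;
    memcpy(&ehdr, image, sizeof ehdr);      /* avoids unaligned access */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                          /* missing "\177ELF" magic */
    if (ehdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                          /* not a 64-bit object */
    *entry = ehdr.e_entry;                  /* address of _start by default */
    return 0;
}
```

To use it on a real binary, read the first sizeof(Elf64_Ehdr) bytes of the file with fread and pass the buffer in.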
+_start does only one thing: It sets up the arguments needed by +__libc_start_main and then calls it. + +__libc_start_main's source code is csu/libc-start.c +and its function prototype is: +
__libc_start_main (int (*main) (int, char **, char **), + int argc, + char **argv, + int (*init) (int, char **, char **), + void (*fini) (void), + void (*rtld_fini) (void), + void *stack_end) ++__libc_start_main does quite a lot of work in +addition to kicking off __libc_csu_init: +
__libc_setup_tls will initialize Thread Control Block + and Dynamic Thread Vector. +
Of course, if +the user program calls exit or abort, then exit +will get called. +
+If one tries to build a program which does not contain main, then one should see the following error: +
/usr/lib/crt1.o: In function `_start': (.text+0x20): undefined reference to `main' +collect2: ld returned 1 exit status ++As mentioned earlier, crt1.o (part of Glibc) contains the function +_start, which will call +__libc_start_main and pass main (a function pointer) as one of the arguments. +If one uses +
nm -u /usr/lib/crt1.o ++then it will show main is an undefined symbol in crt1.o. Now let's disassemble +crt1.o: +
$ objdump -M intel -dj .text /usr/lib/crt1.o + +crt1.o: file format elf64-x86-64 + +Disassembly of section .text: + +0000000000000000 <_start>: + 0: 31 ed xor ebp,ebp + 2: 49 89 d1 mov r9,rdx + 5: 5e pop rsi + 6: 48 89 e2 mov rdx,rsp + 9: 48 83 e4 f0 and rsp,0xfffffffffffffff0 + d: 50 push rax + e: 54 push rsp + f: 49 c7 c0 00 00 00 00 mov r8,0x0 + 16: 48 c7 c1 00 00 00 00 mov rcx,0x0 + 1d: 48 c7 c7 00 00 00 00 mov rdi,0x0 + 24: e8 00 00 00 00 call 29 <_start+0x29> + 29: f4 hlt + ... ++The disassembly above shows that .text+0x20 refers to +the 4-byte immediate operand of a mov instruction (the one at offset 0x1d). This means that during +linking, the address of main must be resolved +and then inserted at the right location: .text+0x20. Now let's cross-reference +the relocation table: +
$ readelf -r /usr/lib/crt1.o + +Relocation section '.rela.text' at offset 0x410 contains 4 entries: + Offset Info Type Sym. Value Sym. Name + Addend +000000000012 00090000000b R_X86_64_32S 0000000000000000 __libc_csu_fini + 0 +000000000019 000b0000000b R_X86_64_32S 0000000000000000 __libc_csu_init + 0 +000000000020 000c0000000b R_X86_64_32S 0000000000000000 main + 0 +000000000025 000f00000002 R_X86_64_PC32 0000000000000000 __libc_start_main - 4 ++Above shows where 0x20 comes from. + +
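The Info column in the table above packs two fields: the upper 32 bits are the symbol table index and the lower 32 bits are the relocation type. As a small sketch, the macros from /usr/include/elf.h can decode the values shown for main and __libc_start_main; the wrapper names rela_symbol_index, rela_type and check_crt1_relocations are hypothetical.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>

/* Upper 32 bits of r_info: index into the symbol table. */
uint32_t rela_symbol_index(uint64_t info)
{
    return (uint32_t)ELF64_R_SYM(info);
}

/* Lower 32 bits of r_info: relocation type (R_X86_64_*). */
uint32_t rela_type(uint64_t info)
{
    return (uint32_t)ELF64_R_TYPE(info);
}

/* Decode the Info values listed for main and __libc_start_main. */
void check_crt1_relocations(void)
{
    /* main: Info = 000c0000000b */
    assert(rela_symbol_index(0x000c0000000bULL) == 0xc);
    assert(rela_type(0x000c0000000bULL) == R_X86_64_32S);

    /* __libc_start_main: Info = 000f00000002 */
    assert(rela_symbol_index(0x000f00000002ULL) == 0xf);
    assert(rela_type(0x000f00000002ULL) == R_X86_64_PC32);
}
```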
+From above analysis, it's possible to find out the address of main (which is +NOT the "Entry point address" seen from the output of readelf -h a.out | grep Entry +command. "Entry point address" is the address of _start) +
+Since the address of main is the first argument to the call +to __libc_start_main, we can extract the value of the first +argument as follows. +
+On 64-bit x86, the calling convention +requires that the first argument +goes to RDI register, so the +address can be extracted by +
objdump -j .text -d a.out | grep -B5 'call.*__libc_start_main' | awk '/mov.*%rdi/ { print $NF }' ++On 32-bit x86, the C calling +convention ("cdecl") is that the first argument +is the last item to be pushed onto the stack +before the call, so the +address can be extracted by +
objdump -j .text -d a.out | grep -B2 'call.*__libc_start_main' | awk '/push.*0x/ { print $NF }' ++ +
+
Relocation type | +Meaning | +Used when | +
---|---|---|
R_X86_64_16 | +Direct 16 bit zero extended | ++ |
R_X86_64_32 | +Direct 32 bit zero extended | ++ |
R_X86_64_32S | +Direct 32 bit + sign extended | ++ |
R_X86_64_64 | +Direct 64 bit | +Large code model | +
R_X86_64_8 | +Direct 8 bit sign extended | ++ |
R_X86_64_COPY | +Copy symbol at runtime | ++ |
R_X86_64_DTPMOD64 | +ID of module containing symbol | +TLS | +
R_X86_64_DTPOFF32 | +Offset in TLS block | +TLS | +
R_X86_64_DTPOFF64 | +Offset in module's TLS block | +TLS | +
R_X86_64_GLOB_DAT | +.got section, which contains addresses to the actual functions in DLL | ++ |
R_X86_64_GOT32 | +32 bit GOT entry | ++ |
R_X86_64_GOT64 | +64-bit GOT entry offset | +PIC & Large code model | +
R_X86_64_GOTOFF64 | +64-bit GOT offset | +PIC & Large code model | +
R_X86_64_GOTPC32 | +32-bit PC relative offset to GOT | ++ |
R_X86_64_GOTPC32_TLSDESC | +32-bit PC relative to TLS descriptor in GOT | +TLS | +
R_X86_64_GOTPC64 | +64-bit PC relative offset to GOT | +PIC & Large code model | +
R_X86_64_GOTPCREL | +32 bit signed PC relative offset to GOT | +PIC | +
R_X86_64_GOTPCREL64 | +64-bit PC relative offset to GOT entry | +PIC & Large code model | +
R_X86_64_GOTPLT64 | +Like GOT64, indicates that PLT entry needed | +PIC & Large code model | +
R_X86_64_GOTTPOFF | +32 bit signed PC relative offset to GOT entry for IE symbol | +TLS | +
R_X86_64_JUMP_SLOT | +.got.plt section, which contains addresses to the actual functions in DLL | +DLL | +
R_X86_64_PC16 | +16 bit sign extended PC relative | ++ |
R_X86_64_PC32 | +PC relative 32 bit signed | ++ |
R_X86_64_PC64 | +64-bit PC relative | +Large code model | +
R_X86_64_PC8 | +8 bit sign extended PC relative | ++ |
R_X86_64_PLT32 | +32 bit PLT address | ++ |
R_X86_64_PLTOFF64 | +64-bit GOT relative offset to PLT entry | +PIC & Large code model | +
R_X86_64_RELATIVE | +Adjust by program base | ++ |
R_X86_64_SIZE32 | ++ | + |
R_X86_64_SIZE64 | ++ | + |
R_X86_64_TLSDESC | +2 by 64-bit TLS descriptor | +TLS | +
R_X86_64_TLSDESC_CALL | +Relaxable call through TLS descriptor | +TLS | +
R_X86_64_TLSGD | +32 bit signed PC relative offset to two GOT entries for GD symbol | +TLS & PIC | +
R_X86_64_TLSLD | +32 bit signed PC relative offset to two GOT entries for LD symbol | +TLS | +
R_X86_64_TPOFF32 | +Offset in initial TLS block | +TLS | +
R_X86_64_TPOFF64 | +Offset in initial TLS block | +TLS & Large code model | +
+According to Chapter 3.5 of AMD64 System V Application Binary Interface, +there are four code models and they differ in addressing modes (absolute versus relative): +
The compiler can encode symbolic references +
This mode is the default mode for most compilers.
+ +
extern int esrc[100]; + int gsrc[100]; +static int ssrc[100]; + +void foo() { + int k; + k = esrc[5]; + k = gsrc[5]; + k = ssrc[5]; +} ++ +
k = esrc[5]; mov EAX, DWORD PTR[RIP+0x0] + mov DWORD PTR[RBP-0x4], EAX +k = gsrc[5]; mov EAX, DWORD PTR[RIP+0x0] + mov DWORD PTR[RBP-0x4], EAX +k = ssrc[5]; mov EAX, DWORD PTR[RIP+0x0] + mov DWORD PTR[RBP-0x4], EAX ++and the relocation table is (use readelf -r foo.o command) +
type Sym. Name + Addend +R_X86_64_PC32 esrc + 10 +R_X86_64_PC32 gsrc + 10 +R_X86_64_PC32 .bss + 10 ++All of the 0x0's in the generated assembly will be filled at link-time +with their relative offsets in respective sections, as indicated by the relocation table. +
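The addend 10 (hexadecimal) in the table above can be derived by hand. An R_X86_64_PC32 field receives S + A - P, where S is the symbol address, A the addend and P the address of the 4-byte field itself. At run time RIP points just past the field, i.e. at P + 4 in these instructions, so reaching esrc[5] (byte offset 20) requires A = 20 - 4 = 0x10. The sketch below reproduces that arithmetic; the function names and the two addresses in check_small_model_example are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Addend recorded for a RIP-relative access to symbol + byte_offset when
   the displacement field ends `field_to_next_insn` bytes before the next
   instruction (4 for the mov instructions shown above). */
int64_t pc32_addend(int64_t byte_offset, int64_t field_to_next_insn)
{
    return byte_offset - field_to_next_insn;
}

/* Value the link editor stores in the field: S + A - P. */
int32_t pc32_value(uint64_t S, int64_t A, uint64_t P)
{
    return (int32_t)(S + (uint64_t)A - P);
}

void check_small_model_example(void)
{
    /* esrc[5] with 4-byte ints: 5 * 4 - 4 = 0x10, as in the table. */
    int64_t A = pc32_addend(5 * 4, 4);
    assert(A == 0x10);

    /* Hypothetical addresses, for illustration only. */
    uint64_t S = 0x600a00;   /* address of esrc */
    uint64_t P = 0x400502;   /* address of the displacement field */
    int32_t disp = pc32_value(S, A, P);
    assert(disp == 0x20050e);

    /* At run time RIP = P + 4, so RIP + disp lands on esrc[5]. */
    assert(P + 4 + (uint64_t)disp == S + 5 * 4);
}
```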
+
k = esrc[5]; mov RAX, 0x0 + mov EAX, DWORD PTR[RAX+0x10] + mov DWORD PTR[RBP-0x4], EAX +k = gsrc[5]; mov RAX, 0x0 + mov EAX, DWORD PTR[RAX+0x10] + mov DWORD PTR[RBP-0x4], EAX +k = ssrc[5]; mov RAX, 0x0 + mov EAX, DWORD PTR[RAX+0x10] + mov DWORD PTR[RBP-0x4], EAX ++and the relocation table is: +
type Sym. Name + Addend +R_X86_64_64 esrc + 0 +R_X86_64_64 gsrc + 0 +R_X86_64_64 .bss + 0 ++All of the 0x0's in the generated assembly will be filled at link-time +with their (64-bit) absolute addresses. +
+
k = esrc[5]; mov RAX, QWORD PTR[RIP+0x0] + mov EAX, DWORD PTR[RAX+0x10] + mov DWORD PTR[RBP-0x4], EAX +k = gsrc[5]; mov RAX, QWORD PTR[RIP+0x0] + mov EAX, DWORD PTR[RAX+0x10] + mov DWORD PTR[RBP-0x4], EAX +k = ssrc[5]; mov EAX, DWORD PTR[RIP+0x0] + mov DWORD PTR[RBP-0x4], EAX ++and the relocation table is: +
type Sym. Name + Addend +R_X86_64_GOTPCREL esrc - 4 +R_X86_64_GOTPCREL gsrc - 4 +R_X86_64_PC32 .bss + 10 ++The first two 0x0's in the generated assembly will be filled with the relative +offset of _GLOBAL_OFFSET_TABLE_ (i.e. the .got.plt section) +
+
lea RBX, [RIP-0x7] + mov R11, 0x0 + add RBX, R11 +k = esrc[5]; mov RAX, 0x0 + mov RAX, QWORD PTR[RBX+RAX*1] + mov EAX, DWORD PTR[RAX+0x10] + mov DWORD PTR[RBP-0x4], EAX +k = gsrc[5]; mov RAX, 0x0 + mov RAX, QWORD PTR[RBX+RAX*1] + mov EAX, DWORD PTR[RAX+0x10] + mov DWORD PTR[RBP-0x4], EAX +k = ssrc[5]; mov RAX, 0x0 + mov RAX, QWORD PTR[RBX+RAX*1] + mov EAX, DWORD PTR[RAX+0x10] + mov DWORD PTR[RBP-0x4], EAX ++The first 0x0 in the generated assembly will be filled with the absolute +address of _GLOBAL_OFFSET_TABLE_ +
(gdb) b *0x4003d0 +(gdb) run +(gdb) x/6a 0x600890 +0x600890: 0x6006e8 <_DYNAMIC> 0x32696159a8 +0x6008a0: 0x326950aa20 <_dl_runtime_resolve> 0x4003d6 <printf@plt+6> +0x6008b0: 0x326971c3f0 <__libc_start_main> 0x0 ++According to Chapter 5.2 of AMD64 System V Application Binary Interface, +the first 3 entries of this table are reserved for special purposes. +The first entry is set up during compilation by the link editor ld. +The second and third entries are set up during runtime by the runtime linker ld.so +(see function _dl_relocate_object in Glibc source file elf/dl-reloc.c +and in particular, notice the ELF_DYNAMIC_RELOCATE macro, +which calls function elf_machine_runtime_setup in sysdeps/x86_64/dl-machine.h) +
+The first entry _DYNAMIC has value 6006e8, and this is exactly +the starting address of .dynamic section (or DYNAMIC segment, in ELF's "execution view".) +The runtime linker ld.so uses this section to find all the +information needed for runtime relocation and dynamic linking. +
+To see DYNAMIC segment's content, use readelf -d a.out command, or +objdump -x a.out, or just use x/50a 0x6006e8 in gdb. +The readelf -d a.out command will show something like this: +
Dynamic section at offset 0x6e8 contains 21 entries: + Tag Type Name/Value + 0x0000000000000001 (NEEDED) Shared library: [libc.so.6] <-- dependent dynamic library name + 0x000000000000000c (INIT) 0x4003a8 <-- address of .init section + 0x000000000000000d (FINI) 0x400618 <-- address of .fini section + 0x0000000000000004 (HASH) 0x400240 <-- address of .hash section + 0x000000006ffffef5 (GNU_HASH) 0x400268 <-- address of .gnu.hash section + 0x0000000000000005 (STRTAB) 0x4002e8 <-- address of .dynstr section + 0x0000000000000006 (SYMTAB) 0x400288 <-- address of .dynsym section + 0x000000000000000a (STRSZ) 63 (bytes) <-- size of .dynstr section + 0x000000000000000b (SYMENT) 24 (bytes) <-- size of an entry in .dynsym section + 0x0000000000000015 (DEBUG) 0x0 <-- see below + 0x0000000000000003 (PLTGOT) 0x600860 <-- address of .got.plt section + 0x0000000000000002 (PLTRELSZ) 48 (bytes) <-- total size of .rela.plt section + 0x0000000000000014 (PLTREL) RELA <-- RELA or REL ? + 0x0000000000000017 (JMPREL) 0x400368 <-- address of .rela.plt section + 0x0000000000000007 (RELA) 0x400350 <-- address of .rela.dyn section + 0x0000000000000008 (RELASZ) 24 (bytes) <-- total size of .rela.dyn section + 0x0000000000000009 (RELAENT) 24 (bytes) <-- size of an entry in .rela.dyn section + 0x000000006ffffffe (VERNEED) 0x400330 <-- address of .gnu.version_r section + 0x000000006fffffff (VERNEEDNUM) 1 <-- number of needed versions + 0x000000006ffffff0 (VERSYM) 0x400328 <-- address of .gnu.version section + 0x0000000000000000 (NULL) 0x0 <-- marks the end of .dynamic section ++Each entry in DYNAMIC segment is a struct of only two members: +"tag" and "value". The NEEDED, INIT ... above +are "tags" (see /usr/include/elf.h). +
Other tags of interest are: +
BIND_NOW The same as BIND_NOW in FLAGS. This has been superseded by + BIND_NOW in FLAGS + +CHECKSUM The checksum value used by prelink. + +DEBUG At runtime ld.so will fill its value with the runtime + address of r_debug structure (see elf/rtld.c) + and this info is used by GDB (see elf_locate_base function + in GDB's source tree). + +FINI Address of .fini section +FINI_ARRAY Address of .fini_array section +FINI_ARRAYSZ Size of .fini_array section + +FLAGS Additional flags, such as BIND_NOW, STATIC_TLS, TEXTREL.. + +FLAGS_1 Additional flags used by Solaris, such as NOW (the same as BIND_NOW), INTERPOSE.. + +GNU_PRELINKED The timestamp string when the binary object is last prelinked. + +INIT Address of .init section +INIT_ARRAY Address of .init_array section +INIT_ARRAYSZ Size of .init_array section + +INTERP Address of .interp section + +PREINIT_ARRAY Address of .preinit_array section +PREINIT_ARRAYSZ Size of .preinit_array section + +RELACOUNT Number of R_X86_64_RELATIVE entries in RELA segment (.rela.dyn + section) + +RPATH Dynamic library search path, which has higher precedence than + LD_LIBRARY_PATH. RPATH is ignored if RUNPATH is present. + + Use of RPATH is deprecated. + + When one uses "gcc -Wl,-rpath=... " to build binaries, the info + is stored here. + +RUNPATH Dynamic library search path, which has lower precedence than + LD_LIBRARY_PATH. + + When one uses "gcc -Wl,-rpath=...,--enable-new-dtags" + to build binaries, the info is stored here. + (See here for details.) + + One can use chrpath + tool to manipulate RPATH and RUNPATH settings. + + +SONAME Shared object (i.e. dynamic library) name. When one uses + "gcc -Wl,-soname=... " to build binaries, the info is + stored here. + +TEXTREL Relocation might modify .text section. + +VERDEF Address of .gnu.version_d section +VERDEFNUM Number of version definitions. ++ +
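Since every DYNAMIC entry is a (tag, value) pair and the table ends with a NULL tag, looking up a tag is a linear scan. The following is a minimal sketch using Elf64_Dyn from /usr/include/elf.h. The helper find_dynamic_entry and the table demo_dynamic are hypothetical; the table holds only a few of the values from the readelf -d output shown earlier.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: return the first entry whose tag matches, or NULL.
   The array must be terminated by a DT_NULL entry. */
const Elf64_Dyn *find_dynamic_entry(const Elf64_Dyn *dyn, Elf64_Sxword tag)
{
    assert(dyn != NULL);
    for (; dyn->d_tag != DT_NULL; dyn++)
        if (dyn->d_tag == tag)
            return dyn;
    return NULL;
}

/* A few entries copied from the readelf -d listing of a.out. */
const Elf64_Dyn demo_dynamic[] = {
    { DT_INIT,   { .d_ptr = 0x4003a8 } },
    { DT_FINI,   { .d_ptr = 0x400618 } },
    { DT_PLTGOT, { .d_ptr = 0x600860 } },
    { DT_RELA,   { .d_ptr = 0x400350 } },
    { DT_NULL,   { .d_val = 0 } },
};
```

Inside a running process the same scan can start at the _DYNAMIC symbol instead of a hand-written table.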
+First, before ld.so loads all dependent libraries of a dynamic executable, +it needs to run its own relocation! Even if ld.so is a statically-linked binary, +it also has a DYNAMIC segment and thus RELA (.rela.dyn section) +and JMPREL (.rela.plt section) tags: +
$ readelf -a `readelf -p .interp /bin/sh | awk '/ld/ {print $3}'` + + .... + +Dynamic section at offset 0x14e18 contains 22 entries: + Tag Type Name/Value + 0x000000000000000e (SONAME) Library soname: [ld-linux-x86-64.so.2] + 0x0000000000000004 (HASH) 0x3269500190 + 0x0000000000000005 (STRTAB) 0x3269500578 + 0x0000000000000006 (SYMTAB) 0x3269500260 + 0x000000000000000a (STRSZ) 388 (bytes) + 0x000000000000000b (SYMENT) 24 (bytes) + 0x0000000000000003 (PLTGOT) 0x3269614f98 + 0x0000000000000002 (PLTRELSZ) 120 (bytes) + 0x0000000000000014 (PLTREL) RELA + 0x0000000000000017 (JMPREL) 0x32695009a0 + 0x0000000000000007 (RELA) 0x32695007c0 + 0x0000000000000008 (RELASZ) 480 (bytes) + 0x0000000000000009 (RELAENT) 24 (bytes) + 0x000000006ffffffc (VERDEF) 0x3269500740 + 0x000000006ffffffd (VERDEFNUM) 4 + 0x0000000000000018 (BIND_NOW) + 0x000000006ffffffb (FLAGS_1) Flags: NOW + 0x000000006ffffff0 (VERSYM) 0x32695006fc + 0x000000006ffffff9 (RELACOUNT) 19 + 0x000000006ffffdf8 (CHECKSUM) 0x4c4e099e + 0x000000006ffffdf5 (GNU_PRELINKED) 2010-08-26T08:13:28 + 0x0000000000000000 (NULL) 0x0 + +Relocation section '.rela.dyn' at offset 0x7c0 contains 20 entries: + Offset Info Type Sym. Value Sym. Name + Addend +003269614cf0 000000000008 R_X86_64_RELATIVE 000000326950dd80 + .... +003269615820 000000000008 R_X86_64_RELATIVE 0000003269501140 +003269614fe0 001e00000006 R_X86_64_GLOB_DAT 0000003269615980 _r_debug + 0 + +Relocation section '.rela.plt' at offset 0x9a0 contains 5 entries: + Offset Info Type Sym. Value Sym. Name + Addend +003269614fb0 000b00000007 R_X86_64_JUMP_SLO 000000326950f1b0 __libc_memalign + 0 +003269614fb8 000c00000007 R_X86_64_JUMP_SLO 000000326950f2b0 malloc + 0 +003269614fc0 001200000007 R_X86_64_JUMP_SLO 000000326950f2c0 calloc + 0 +003269614fc8 001800000007 R_X86_64_JUMP_SLO 000000326950f340 realloc + 0 +003269614fd0 002000000007 R_X86_64_JUMP_SLO 000000326950f300 free + 0 ++Note that the ld.so is prelinked. 
On Fedora and Red Hat Enterprise Linux +(RHEL) systems, prelink is run every two weeks. +To see if your Linux has similar setup, check /etc/sysconfig/prelink +and /etc/prelink.conf +
+What does this prelink do? It changes the base address of a dynamic library +to the actual address in the user program's address space when it is loaded into memory. +Of course, ld.so recognizes the GNU_PRELINKED +tag and will load a dynamic library at this base address (recall that the first argument of +mmap is the preferred address; whether it is honored is up to the operating system.) +
Normally, a dynamic library +is built as position independent code, +i.e. the -fPIC compiler command-line option, and thus the base address is +0. For example, a normal libc.so has ELF program header as follows (readelf -l command): +
Program Headers: + Type Offset VirtAddr PhysAddr + FileSiz MemSiz Flags Align + LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 + 0x0000000000179058 0x0000000000179058 R E 200000 + LOAD 0x0000000000179730 0x0000000000379730 0x0000000000379730 + 0x0000000000004668 0x00000000000090f8 RW 200000 + .... ++And when calling mmap with address 0 (i.e. NULL) +the operating system can choose any address it feels appropriate. +
+A prelinked one, on the other hand, has its ELF program header as follows: +
Program Headers: + Type Offset VirtAddr PhysAddr + FileSiz MemSiz Flags Align + LOAD 0x0000000000000000 0x0000003433e00000 0x0000003433e00000 + 0x000000000001bb80 0x000000000001bb80 R E 200000 + LOAD 0x000000000001bb90 0x000000343401bb90 0x000000343401bb90 + 0x0000000000000f58 0x00000000000010f8 RW 200000 ++ +What is the advantage of prelinking? +ld.so will not process R_X86_64_RELATIVE relocation types +since they are already in the "right" place in user program's address space. +The extra benefit of this is the memory regions which +ld.so would have written to (if R_X86_64_RELATIVE needs +processing) will not incur any Copy-On-Writes and thus can be made Read-Only.
+According to this post, for GUI +programs, which tend to link against dozens of dynamic libraries and use lengthy +C++ demangled names, the speed up can be an order of magnitude. +
+How to disable prelinking at runtime? +Run the user program with the LD_USE_LOAD_BIAS environment +variable set to 0. +
+How does ld.so process its own relocation? +
+The relocation is done by _dl_relocate_object function +in Glibc's elf/dl-reloc.c, which will call +elf_machine_rela function in sysdeps/x86_64/dl-machine.h +to do the majority of work. +
+First to be processed is the .rela.dyn relocation table, +which contains a bunch of R_X86_64_RELATIVE types +and one R_X86_64_GLOB_DAT type (the variable _r_debug) +
+If prelink is used, i.e. ld.so is indeed loaded +to the desired address, then R_X86_64_RELATIVE +relocation types will be ignored. If not, +then the address calculation for R_X86_64_RELATIVE types +is +
Base Address + Value Stored at [Base Address + Offset] ++For example, in ld.so's case, its base address +is 2a95556000 (can be obtained from pmap command; inside ld.so, +it calls elf_machine_load_address function to get this value) +
0000400000 4K r-x-- /tmp/a.out +0000500000 4K rw--- /tmp/a.out +2a95556000 92K r-x-- /lib64/ld.so +2a9556d000 8K rw--- [ anon ] +2a95599000 4K rw--- [ anon ] +2a9566c000 4K r---- /lib64/ld.so +2a9566d000 4K rw--- /lib64/ld.so +3269700000 1216K r-x-- /lib64/libc-2.3.4.so +... ++And ld.so's .rela.dyn relocation table is (not prelinked! +If ld.so is prelinked, the offsets will be at much higher addresses) +
Relocation section '.rela.dyn' at offset 0x7c0 contains 20 entries: + Offset Info Type Sym. Value Sym. Name + Addend +000000116d50 000000000008 R_X86_64_RELATIVE 000000000000e250 +... ++so the relocation for 000000116d50 is processed as +
0x2a95556000 + *(0x116d50+0x2a95556000) ++and this new value is stored at 0x2a9566cd50 (=0x116d50+0x2a95556000) + +
As R_X86_64_RELATIVE types do not require symbol lookups, +they are handled in a tight loop in +elf_machine_rela_relative function in +sysdeps/x86_64/dl-machine.h +
+Any relocation types other than R_X86_64_RELATIVE need to go +through symbol resolution first. +
+So what about R_X86_64_GLOB_DAT relocation type in ld.so ? +First, RESOLVE_MAP (a macro defined within elf/dl-reloc.c) +is called (with r_type = R_X86_64_GLOB_DAT) +to find out which ELF binary (could be the user's program or its dependent +dynamic libraries) +contains this symbol. Then +R_X86_64_GLOB_DAT relocation type is calculated as +
Base Address + Symbol Value + Addend ++where Base Address is the base address +of ELF binary which contains the symbol, and +Symbol Value is the symbol value from +the symbol table of ELF binary which contains the symbol. +
+So for ld.so, +
Relocation section '.rela.dyn' at offset 0x7c0 contains 20 entries: + Offset Info Type Sym. Value Sym. Name + Addend + .... +000000116fe0 001e00000006 R_X86_64_GLOB_DAT 00000000001179c0 _r_debug + 0 ++The relocation for 000000116fe0 is processed as +
0x2a95556000 + 0x1179c0 + 0 ++because ld.so determines _r_debug +can be found from itself. The calculated value is stored at 0x2a9566cfe0 (=0x116fe0+0x2a95556000). +
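The two numbers involved here, the value and the address it is stored at, can be checked with plain arithmetic. The sketch below uses the figures quoted above (base address 0x2a95556000, offset 0x116fe0, symbol value 0x1179c0). The struct reloc_result and the functions glob_dat and check_r_debug_example are hypothetical names, not Glibc code.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t where;   /* runtime address that gets patched */
    uint64_t value;   /* value written there */
} reloc_result;

/* own_base: load address of the object that owns the relocation entry.
   def_base: load address of the object that defines the symbol.
   For _r_debug in ld.so both are the same object. */
reloc_result glob_dat(uint64_t own_base, uint64_t r_offset,
                      uint64_t def_base, uint64_t st_value, int64_t addend)
{
    reloc_result r;
    r.where = own_base + r_offset;
    r.value = def_base + st_value + (uint64_t)addend;
    return r;
}

void check_r_debug_example(void)
{
    const uint64_t base = 0x2a95556000ULL;   /* ld.so, from pmap */
    reloc_result r = glob_dat(base, 0x116fe0, base, 0x1179c0, 0);

    assert(r.where == 0x2a9566cfe0ULL);      /* matches the text above */
    assert(r.value == 0x2a9566d9c0ULL);      /* runtime address of _r_debug */
}
```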
+The next to be processed by ld.so +is its own .rela.plt relocation table, +which contains a bunch of R_X86_64_JUMP_SLOT types. +This relocation type is handled exactly the same way as R_X86_64_GLOB_DAT. +
+After ld.so finishes its own relocation, it loads the user program's +dependent libraries and processes their relocations one by one. +First, ld.so handles libc.so's relocation. +libc.so has two relocation types we have not covered so far: +R_X86_64_64 and R_X86_64_TPOFF64. +
+R_X86_64_64 relocation type is processed by first looking +up the symbol's runtime absolute address, and then +calculating +
Absolute Address + Addend ++And the R_X86_64_TPOFF64 relocation type is calculated as +
Symbol Value + Addend - TLS Offset ++which usually results in a negative value. + +
+To see how R_X86_64_COPY relocation type works, consider the following two source files: +
foo.c + + int foo=4; + + void foo_access() { + foo=5; + } + +bar.c + + #include <stdio.h> + extern int foo; + + int main() { + printf("foo=%d\n",foo); + } ++Now compile them as follows: +
$ gcc -shared -fPIC -Wl,-soname=libfoo.so foo.c -o /tmp/libfoo.so +$ gcc bar.c -o bar -L/tmp -lfoo ++And run them as +
$ LD_PRELOAD=/tmp/libfoo.so ./bar ++Before explaining what happened during runtime, we need to examine +the binaries first. +
+The foo_access in libfoo.so is like this: +
69c <foo_access>: + 69c: push rbp + 69d: mov rbp,rsp + 6a0: mov rax,QWORD PTR [rip+0x100269] # 100910 <_DYNAMIC+0x198> + 6a7: mov DWORD PTR [rax],0x5 + 6ad: leave + 6ae: ret ++So for libfoo.so, the address of variable foo is +in its .got section, not .data section: +
$ readelf -a /tmp/libfoo.so + +Section Headers: + [Nr] Name Type Address Offset + Size EntSize Flags Link Info Align +... + [18] .got PROGBITS 0000000000100908 00000908 + 0000000000000020 0000000000000008 WA 0 0 8 + [19] .got.plt PROGBITS 0000000000100928 00000928 + 0000000000000020 0000000000000008 WA 0 0 8 +... + [20] .data PROGBITS 0000000000100948 00000948 + 0000000000000014 0000000000000000 WA 0 0 8 +... + +Relocation section '.rela.dyn' at offset 0x520 contains 6 entries: + Offset Info Type Sym. Value Sym. Name + Addend +000000100948 000000000008 R_X86_64_RELATIVE 0000000000100948 +000000100950 000000000008 R_X86_64_RELATIVE 0000000000100768 +000000100908 000f00000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0 +000000100910 001100000006 R_X86_64_GLOB_DAT 0000000000100958 foo + 0 +.... ++But what about the address 0x100958 ? This address +is in libfoo.so's .data section! Well, 0x100958 +has the initial value of foo (in our example, 4) At runtime, ld.so +will copy this value to bar's .bss section: +
$ objdump -sj .data libfoo.so + +libfoo.so: file format elf64-x86-64 + +Contents of section .data: + 100948 48091000 00000000 68071000 00000000 H.......h....... + 100958 04000000 .... ++
+Next, disassemble the main function of bar: +
4005f8 <main>: + 4005f8: push rbp + 4005f9: mov rbp,rsp + 4005fc: mov esi,DWORD PTR [rip+0x1003de] # 5009e0 <__bss_start> + 400602: mov edi,0x40070c + 400607: mov eax,0x0 + 40060c: call 400528 <printf@plt> + 400611: leave + 400612: ret ++So the variable foo is indeed located in +bar's .bss section. Let's double check with nm: +
$ nm -n bar | grep 5009e0 + +00000000005009e0 A __bss_start +00000000005009e0 A _edata +00000000005009e0 B foo ++(Symbols such as __bss_start and _edata are defined by the default ld script; +one can search them in the output of ld -verbose command.) +
+The dynamic relocation table of bar is: +
Relocation section '.rela.dyn' at offset 0x490 contains 2 entries: + Offset Info Type Sym. Value Sym. Name + Addend +000000500998 000c00000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0 +0000005009e0 000700000005 R_X86_64_COPY 00000000005009e0 foo + 0 ++Now what happens during runtime is this: After ld.so loads all dependent +dynamic libraries, it starts processing their relocations. +When it sees foo of libfoo.so, it +calls RESOLVE_MAP with r_type = R_X86_64_GLOB_DAT to get +the Base Address, which is 0, and Symbol Value, which is +5009e0. Next it +sees foo of libfoo.so has +R_X86_64_GLOB_DAT relocation type, +so it calculates the new address as 5009e0 = 0 + 5009e0 + 0 (addend) +and stores the result somewhere inside .got section. +
+After ld.so has processed relocations of all +dynamic libraries, it starts processing the relocation table +of bar. When it sees foo of bar, it +calls RESOLVE_MAP again, but with r_type = R_X86_64_COPY. This time, the address returned is +the runtime address of foo in libfoo.so's +.data section. As mentioned earlier, this +address holds the initial value of foo. +Next it sees foo of bar has R_X86_64_COPY +relocation type, so it uses memcpy +to copy data to 5009e0 +(see the Sym. Value of .rela.dyn section of bar above) +from the runtime address of foo in libfoo.so's +.data section (see Glibc source file sysdeps/x86_64/dl-machine.h) +
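The net effect of the two relocations can be imitated in an ordinary C program. The sketch below is only a simulation under stated assumptions: lib_data_foo, exe_bss_foo and lib_got_foo are hypothetical variables standing in for foo in libfoo.so's .data section, the space reserved in bar's .bss section, and libfoo.so's .got slot, and simulate_relocations performs by hand what ld.so does for R_X86_64_GLOB_DAT and R_X86_64_COPY.

```c
#include <assert.h>
#include <string.h>

int lib_data_foo = 4;   /* foo's initial value in libfoo.so's .data */
int exe_bss_foo;        /* space reserved for foo in bar's .bss */
int *lib_got_foo;       /* libfoo.so's .got slot for foo */

void simulate_relocations(void)
{
    /* R_X86_64_GLOB_DAT in libfoo.so: the lookup finds the executable's
       copy of foo, so the library's GOT slot points into bar's .bss. */
    lib_got_foo = &exe_bss_foo;

    /* R_X86_64_COPY in bar: copy the initial value out of the library. */
    memcpy(&exe_bss_foo, &lib_data_foo, sizeof exe_bss_foo);
}

/* What foo_access in libfoo.so does: store through the GOT slot. */
void foo_access(void)
{
    *lib_got_foo = 5;
}

void check_copy_relocation(void)
{
    simulate_relocations();
    assert(exe_bss_foo == 4);    /* bar sees the initial value */

    foo_access();
    assert(exe_bss_foo == 5);    /* library writes reach bar's copy */
    assert(lib_data_foo == 4);   /* the original in .data is left alone */
}
```

After the copy, both the executable and the library use the same object (the one in bar's .bss section), which is why a write through the library's GOT slot is visible in bar.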
+The above example also illustrates the difference +between .got section and .got.plt section. +For the runtime linker ld.so, all it knows is that +entries located by the RELA tag, i.e. .rela.dyn section, +(which corresponds to .got section) +must be resolved/relocated immediately, while entries located by the +JMPREL tag, i.e. .rela.plt section, +(which corresponds to .got.plt section) can use +lazy binding. For x86_64 architecture, the relocation is actually not +needed for R_X86_64_JUMP_SLOT relocation types (albeit the +symbol resolution is still needed). + + +
+What's the difference then?
+
+Consider the following simple code: +
#include <stdio.h>
+int bar;
+
+void foo() {
+    printf("%d\n",bar);
+}
+
+Compile the above code in 32-bit mode with and without -fPIC:
+
$ gcc -shared -m32 foo.c -o nopic.so +$ gcc -shared -m32 -fPIC foo.c -o pic.so ++(If you try to compile the above in 64-bit mode, GCC will +stop and insist you should compile with -fPIC option, i.e. you are going to +see error message such as +relocation R_X86_64_PC32 against symbol `XXXYYY' can not be used when making a shared object; recompile with -fPIC) + +The sections and relocation tables of nopic.so +and pic.so +are shown at left and right hand side, respectively: +
Section Headers:                                                Section Headers:
+[Nr] Name      Type      Addr                                   [Nr] Name      Type      Addr
+[ 0]           NULL      0000                                   [ 0]           NULL      0000
+     ...                                                             ...
+[ 8] .init     PROGBITS  02f8                                   [ 8] .init     PROGBITS  02f0
+[ 9] .plt      PROGBITS  0310                                   [ 9] .plt      PROGBITS  0308
+[10] .text     PROGBITS  0340                                   [10] .text     PROGBITS  0350
+[11] .fini     PROGBITS  0488                                   [11] .fini     PROGBITS  04a8
+[12] .rodata   PROGBITS  04a4                                   [12] .rodata   PROGBITS  04c4
+     ...                                                             ...
+[17] .dynamic  DYNAMIC   14c0                                   [17] .dynamic  DYNAMIC   14e0
+[18] .got      PROGBITS  1590                                   [18] .got      PROGBITS  15a8
+[19] .got.plt  PROGBITS  159c                                   [19] .got.plt  PROGBITS  15b8
+[20] .data     PROGBITS  15b0                                   [20] .data     PROGBITS  15d0
+
+     ...                                                             ...
+
+Relocation section '.rel.dyn' at offset 0x2b0                   Relocation section '.rel.dyn' at offset 0x2b0
+contains 7 entries:                                             contains 5 entries:
+ Offset   Info     Type             Sym.Value  Sym. Name         Offset   Info     Type             Sym.Value  Sym. Name
+00000439  00000008 R_386_RELATIVE                               000015d0  00000008 R_386_RELATIVE
+000015b0  00000008 R_386_RELATIVE                               000015a8  00000106 R_386_GLOB_DAT   000015dc   bar
+00000434  00000101 R_386_32         000015bc   bar                   ...
+00000445  00000602 R_386_PC32       00000000   printf
+     ...
+
+Relocation section '.rel.plt' at offset 0x2e8:                  Relocation section '.rel.plt' at offset 0x2d8
+contains 2 entries:                                             contains 3 entries:
+ Offset   Info     Type             Sym.Value  Sym. Name         Offset   Info     Type             Sym.Value  Sym. Name
+000015a8  00000207 R_386_JUMP_SLOT  00000000   __gmon_start__   000015c4  00000207 R_386_JUMP_SLOT  00000000   __gmon_start__
+000015ac  00000a07 R_386_JUMP_SLOT  00000000   __cxa_finalize   000015c8  00000607 R_386_JUMP_SLOT  00000000   printf
+                                                                     ...
+
+When we compile with -fPIC we can see the variable bar
+has the right relocation type (R_386_GLOB_DAT)
+and the relocation takes place in the right section (.got). The same goes for
+printf.
+
+Without -fPIC, the relocations of the format string "%d\n", bar
+and printf all take place inside the .text section!
+But we know the .text section is in a read-only LOAD
+segment, so what would ld.so do?
+
+As expected, ld.so will make the .text section
+writable, patch the bytes, and make it read-only again. Since the
+relocations of both bar and printf are
+in .rel.dyn, they are performed immediately
+(no lazy binding), so this approach is feasible.
+
+So how does ld.so handle the
+R_386_RELATIVE,
+R_386_32
+and R_386_PC32 relocation types?
+
+Let's look at the disassembly: +
0000042c <foo>:
+ 42c:   55                      push   ebp
+ 42d:   89 e5                   mov    ebp,esp
+ 42f:   83 ec 18                sub    esp,0x18
+ 432:   8b 15 00 00 00 00       mov    edx,DWORD PTR ds:0x0       <-- reference to bar
+ 438:   b8 a4 04 00 00          mov    eax,0x4a4                  <-- reference to "%d\n" format string in .rodata
+ 43d:   89 54 24 04             mov    DWORD PTR [esp+0x4],edx
+ 441:   89 04 24                mov    DWORD PTR [esp],eax
+ 444:   e8 fc ff ff ff          call   445 <foo+0x19>             <-- reference to printf
+ 449:   c9                      leave
+ 44a:   c3                      ret
+
+How would the 4 bytes starting at 445 (R_386_PC32 type)
+be patched? Suppose at runtime, our
+nopic.so is loaded
+into memory with base address 8000, and the 4 bytes
+to be patched are now at 8000 + 445 = 8445.
+Furthermore, suppose ld.so has determined
+the entry address of printf to be 10000, then
+ld.so calculates the relative offset as follows:
+
10000 - 8445 + fffffffc = 7bb7 ++(fffffffc is -4) so ld.so replaces fc ff ff ff +with b7 7b 00 00 +
+Patching the 4 bytes starting at 434 (R_386_32 type) is simpler.
+ld.so simply overwrites the 4 bytes with the runtime absolute
+address of bar.
+
+To patch the 4 bytes starting at 439 (R_386_RELATIVE type),
+ld.so adds the base address (8000, as above) to the value already stored there:
+
8000 + 4a4 = 84a4
+
+so ld.so replaces a4 04 00 00
+with a4 84 00 00
+
+Finally, what about the R_386_RELATIVE relocation at 15b0?
+15b0 is the starting address of the .data section, and the first 4 bytes
+of the .data section store its own address, 15b0. So it has to be
+relocated and patched to 8000 + 15b0 = 95b0.
+
+In conclusion, R_386_RELATIVE means "32-bit relative to base address", +R_386_PC32 means the "32-bit IP-relative offset" +and R_386_32 means the "32-bit absolute." + +
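The three calculations can be checked with a short Python sketch. It uses the base address 8000 and the printf address 10000 assumed in the walk-through; the helper name `apply_r386` and the runtime address chosen for bar (base + 15bc, i.e. assuming bar resolves to nopic.so's own copy) are illustrative assumptions, not something ld.so exposes.

```python
import struct


def apply_r386(kind, image, offset, base, symbol=0):
    """Patch the 4 bytes at `offset` of `image` (a bytearray) in place."""
    # With REL-style relocations the addend is the value already stored there.
    (addend,) = struct.unpack_from("<I", image, offset)
    place = base + offset                  # runtime address of the patched bytes
    if kind == "R_386_RELATIVE":
        value = base + addend              # 32-bit relative to base address
    elif kind == "R_386_32":
        value = symbol + addend            # 32-bit absolute
    elif kind == "R_386_PC32":
        value = symbol + addend - place    # 32-bit IP-relative offset
    else:
        raise ValueError(kind)
    struct.pack_into("<I", image, offset, value & 0xFFFFFFFF)


# Fake image holding only the bytes that get patched in foo().
image = bytearray(0x450)
image[0x434:0x438] = bytes.fromhex("00000000")   # operand of mov edx, ds:0x0
image[0x439:0x43D] = bytes.fromhex("a4040000")   # operand of mov eax, 0x4a4
image[0x445:0x449] = bytes.fromhex("fcffffff")   # operand of call

BASE = 0x8000          # from the walk-through
PRINTF = 0x10000       # from the walk-through
BAR = BASE + 0x15BC    # assumption: bar resolves to nopic.so's own definition

apply_r386("R_386_32", image, 0x434, BASE, BAR)
apply_r386("R_386_RELATIVE", image, 0x439, BASE)
apply_r386("R_386_PC32", image, 0x445, BASE, PRINTF)

print(image[0x434:0x438].hex())   # bc950000
print(image[0x439:0x43D].hex())   # a4840000
print(image[0x445:0x449].hex())   # b77b0000
```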
+This usually happens when the dynamic binary in question is built using a newer version of GCC.
+The solution is to recompile the code with either the -static compiler command-line option
+(to create a static binary), or the following option:
+
-Wl,--hash-style=both ++This tells the link editor ld to create both .gnu.hash and .hash sections. +
According to ld documentation here, +the old-school .hash section is the default, but the compiler can override it. For example, +the GCC (which is version 4.1.2) on RHEL (Red Hat Enterprise Linux) Server release 5.5 has +this line: +
$ gcc -dumpspecs
+....
+*link:
+%{!static:--eh-frame-hdr} %{!m32:-m elf_x86_64} %{m32:-m elf_i386} --hash-style=gnu %{shared:-shared} ....
+...
+
+
+
For more information, see here. + +
$ objdump -x foo | grep 'Version References' -A10
+
+Version References:
+  required from libc.so.6:
+    0x0d696914 0x00 03 GLIBC_2.4
+    0x09691a75 0x00 02 GLIBC_2.2.5
+
+...
+
+The fix is to recompile the code with -static compiler command-line option.
+
+
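As a side note, the first column of each entry in the Version References output is the hash of the version name, computed with the classic System V ELF hash function. A small Python sketch reproduces the two values shown above; the function name `elf_hash` is just a label chosen here.

```python
def elf_hash(name: str) -> int:
    """Classic System V ELF hash, kept within 32 bits."""
    h = 0
    for byte in name.encode("ascii"):
        h = (h << 4) + byte
        high = h & 0xF0000000
        if high:
            h ^= high >> 24
        h &= ~high & 0xFFFFFFFF   # clear the top nibble, stay within 32 bits
    return h


print(hex(elf_hash("GLIBC_2.4")))     # 0xd696914
print(hex(elf_hash("GLIBC_2.2.5")))   # 0x9691a75
```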
+This kernel version check is done by DL_SYSDEP_OSCHECK macro in Glibc's +sysdeps/unix/sysv/linux/dl-osinfo.h +It calls _dl_discover_osversion to get current kernel's version. +
+To wit, run your code (suppose it is not stripped) inside gdb, +
(gdb) run
+Starting program: foo
+FATAL: kernel too old
+
+Program received signal SIGSEGV, Segmentation fault.
+0x00000000004324a9 in ptmalloc_init ()
+(gdb) call _dl_discover_osversion()
+$1 = 132617
+(gdb) p/x $1
+$2 = 0x20609
+(gdb)
+
+Here 0x20609 means the current kernel version is 2.6.9.
+
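The value packs one byte per version component, as the 0x20609 = 2.6.9 reading above shows. A minimal Python sketch of that decoding follows; the function names are made up for illustration.

```python
def decode_osversion(value: int) -> tuple:
    """Split a packed kernel version into (major, minor, patch)."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


def encode_osversion(major: int, minor: int, patch: int) -> int:
    """Inverse of decode_osversion."""
    return (major << 16) | (minor << 8) | patch


print(decode_osversion(132617))          # (2, 6, 9)
print(hex(encode_osversion(2, 6, 9)))    # 0x20609
```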
+The fix (or hack) is to add the following function in your code: +
int _dl_discover_osversion() { return 0xffffff; } ++and compile your code with -static compiler command-line option. + +
+The second member of pthread struct is also +a struct called list_t defined in +nptl/sysdeps/pthread/list.h. +
+The third and fourth members of pthread struct are thread ID and thread +group ID (both are of pid_t type). +
+Other members of pthread struct which are of interest: int cancelhandling for
+cancellation information, int flags for thread attributes,
+start_routine for start position of the code to be executed for the thread,
+void *arg for the argument to start_routine, and
+void *stackblock and size_t stackblock_size for thread-specific
+stack information.
+
+Since pthread struct is opaque, how can one obtain the above information,
+or more precisely, how can one obtain the offsets of these members within the
+pthread struct? We can take values we already know and search for them in
+the memory region pointed to by pthread_t, as in this code snippet.
+
+
+
+
+
+
+
+
+
+
+
diff --git a/masto-thread.md b/masto-thread.md
new file mode 100644
index 0000000..b57b3b9
--- /dev/null
+++ b/masto-thread.md
@@ -0,0 +1,115 @@
+# Rough transcriptions of a thread on Mastodon
+
+Here are some useful parts of posts I made on Mastodon, I haven't cleaned
+them up too much.
+
+## General structure of an ELF file
+
+An ELF file starts with an ELF header (Ehdr), which contains offsets to the
+program headers aka segments (Phdr), and the section headers (Shdr). It also tells
+you the entry point, architecture+bitsize, and which shdr is `.shstrtab`.
+
+Shdrs and phdrs are explained [here](linkers-8.md). Both provide views on the
+ELF file, but for different purposes. Though it's kinda not a good idea, I'll
+give you that.
+
+The stuff pointed to by the shdrs and phdrs is:
+
+* `text`, `data`, `rodata`, `bss`, ... blobs
+* string table blobs (`strtab`, `shstrtab`, `dynstr`)
+* interpreter, comment, etc etc
+* relocation tables (`Rel`, `Rela`) (phdrs don't know about this one)
+* symbol tables (`Sym`) (phdrs don't know about this one)
+* versioning info (`Versym`, `Verdef`, `Verneed`) (phdrs don't know about this one)
+* dynamic table (Dyn), which also has entries for relocations, symbols, versioning
+
+Yes, it's true that phdrs, shdrs, dynamic, symtab, dynsym, ... could've just been
+tables right after the Ehdr, but that was apparently not complicated enough for
+Sun.
+
+## What are all the different sections for?
+
+`.hash`, `.gnu.hash`: hash tables for looking up symbols. Both do the same thing but
+are slightly different in implementation, `.hash` comes from SysV R4 and has
+been deprecated for ages, no clue why it's still there. `.gnu.hash` is made by
+the GNU people because they thought the SysV one wasn't good enough.
+
+`.comment` is just a string the toolchain inserts to tell people it's built with
+the toolchain, for some reason.
+
+`.shstrtab` is the blob that contains the section names (so the actual ".text",
+".data", ...
strings), for some reason (elaborated on later) this is stored
+separately from the other string tables (the '`sh_name`' field of an
+`ElfXX_Shdr` is an offset into this table).
+
+`.rel*` and `.rela*` contain relocation info, used both during
+static/"compile-time" linking and runtime/dynamic linking. Binaries contain
+only runtime relocation info, linkable objects contain only static linking info
+(the linker has to figure out which symbols and relocations need to get turned
+into dynamic ones).
+
+`.gnu.version` and `.gnu.version_r` contain versioning information of symbols,
+glibc uses this a lot, and practically nothing else does.
+
+There's also `.debug*` and `.dwarf*` stuff for debug info, that's yet another
+rabbithole I'm *not* going into this time.
+
+Usually, an ELF binary (not a non-linked object) has two symbol tables,
+`.symtab` and `.dynsym`. The former contains all the 'internal' symbols (the
+part you can strip away), the latter are the imported and exported ones.
+
+`.strtab` contains the symbol name strings, the `st_name` of the `ElfXX_Sym`
+entries in `.symtab` is again an offset, `.dynstr` contains the
+names of the `.dynsym` entries.
+
+However, section headers don't actually have to be present at all in binaries
+(executables *and* libraries), only in linkable object files. You can just, get
+rid of them completely (patch out the shdr-related fields in the ELF header),
+and things still work, which is why and how you can get rid of the `.symtab`,
+`.strtab`, `.shstrtab`, etc (and the shdr table itself), and that's also why all
+the string tables are separate.
+
+But how would ld.so find `.dynsym` etc. if the shdrs that point to them are
+gone?
+
+That's what `.dynamic` is for: it gathers a bunch of only half-related
+things from the file into one table: a list of library dependencies, offsets into
+the `.dynsym`, `.dynstr`, `.gnu.version`, `.rel(a)`, ...
tables, misc flags and
+settings, and so on (the entries are key/value pairs, see `ElfXX_Dyn`).
+
+But then how does ld.so find `.dynamic`?
+
+That's what the phdrs are for (not). Originally, those were meant for the kernel
+to see where in memory an executable needs to be mapped, with offset+address,
+alignment, permission, ... info. But as that's the table the kernel looks at,
+that's also where they added the info about which interpreter should be used for
+the binary, whether the stack should be mapped NX, and so on. There's also one
+containing the offset of the `.dynamic` table. The kernel doesn't touch it, but
+that's how ld.so can reliably find it.
+
+The thing is, you can have most things "gone" by removing all the sections, but
+many of these will still actually be present because they have an entry in the
+`.dynamic` table, which is not very useful. So if you want to get rid of some
+stuff (hash tables, versioning info, ...), you'll first have to remove the
+entries from the dyn table, and only *then* remove the relevant shdrs, as that
+will properly remove it from the binary.
+
+Then you can nuke the shdr table itself using a tool like `sstrip` (usually
+packaged in `elf-kickers` or `elfkickers` or ...), binutils/objcopy won't let
+you do this.
+
+And that's why, if you want a *small* output file, you want to either write the
+ELF headers manually, or use/write a custom linker that doesn't emit all this
+stuff.
+
+## Random notes
+
+* `ld.so` has to be linked with `-static-pie`
+* All symbol tables must start with a zeroed-out entry, because the standard
+  says that symbol index 0 (when referencing a symbol elsewhere) means no
+  symbol, instead of index -1 or so. It's not a sentinel value.
+* ld.so will use the hash tables first to look up symbols that are defined in
+  the binary before resorting to walking the symbol table manually. It probably
+  actually needs at least one of these two to be present in a binary nowadays.
+ `.hash` is provided as a fallback for when ld.so wouldn't know about + `.gnu.hash`, but that practically never happens.
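
To make the "Ehdr points to the phdrs, and a phdr points to `.dynamic`" chain from the sections above a bit more concrete, here's a rough Python sketch. It only handles little-endian ELF64, the name `find_segments` is made up, and the tiny image it parses is synthetic (shdr fields zeroed, like after `sstrip`), so treat it as an illustration and not as a real loader.

```python
import struct

EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, little-endian
PHDR64 = struct.Struct("<IIQQQQQQ")           # Elf64_Phdr, little-endian
PT_DYNAMIC, PT_INTERP = 2, 3


def find_segments(data: bytes) -> dict:
    """Return {p_type: (p_offset, p_filesz)}, using only the Ehdr and phdrs."""
    fields = EHDR64.unpack_from(data, 0)
    ident, e_phoff, e_phentsize, e_phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    segments = {}
    for i in range(e_phnum):
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz, _memsz, _align = \
            PHDR64.unpack_from(data, e_phoff + i * e_phentsize)
        segments.setdefault(p_type, (p_offset, p_filesz))   # keep the first of each type
    return segments


# Synthetic image: Ehdr plus two phdrs, no section headers at all.
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
ehdr = EHDR64.pack(ident, 3, 62, 1, 0, 64, 0, 0, 64, 56, 2, 0, 0, 0)
phdrs = (PHDR64.pack(PT_INTERP, 4, 0x200, 0x200, 0x200, 28, 28, 1)
         + PHDR64.pack(PT_DYNAMIC, 6, 0x300, 0x300, 0x300, 0x100, 0x100, 8))
image = ehdr + phdrs

segments = find_segments(image)
print(segments[PT_DYNAMIC])   # (768, 256)
```

On a real little-endian 64-bit binary, something like `find_segments(open(path, "rb").read())` should give the file offset and size of the dynamic table without ever looking at the shdrs.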