<html><head>
|
||
|
|
||
|
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
|
||
|
|
||
|
<title>Executable and Linkable Format (ELF)</title>
|
||
|
</head>
|
||
|
|
||
|
<body alink="#FF6600" bgcolor="#000000" link="#00FFFF" text="#EEEEEE" vlink="#00FFFF">
|
||
|
<p>This page is a copy of the <a href="https://web.archive.org/web/20201202024834/https://web.archive.org/web/20120922073347/http://www.acsu.buffalo.edu/~charngda/elf.html">Archive.org</a>
|
||
|
copy of the now no longer available <a href="https://web.archive.org/web/20201202024834/http://www.acsu.buffalo.edu/~charngda/elf.html">http://www.acsu.buffalo.edu/~charngda/elf.html</a>.
|
||
|
It is kept here online as a reference only.</p>
|
||
|
|
||
|
<hr>
|
||
|
|
||
|
<h2>Acronyms relevant to Executable and Linkable Format (ELF)</h2>
|
||
|
|
||
|
<table border="">
|
||
|
<tbody><tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Application_binary_interface">ABI</a></td><td>Application binary interface</td></tr>
|
||
|
<tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/A.out">a.out</a></td><td>Assembler output file format</td></tr>
|
||
|
<tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/.bss">BSS</a></td><td>Block started by symbol. The uninitialized data segment containing statically-allocated variables.</td></tr>
|
||
|
<tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/COFF">COFF</a></td><td>Common object file format</td></tr>
|
||
|
<tr><td>DTV</td><td>Dynamic thread vector (for TLS)</td></tr>
|
||
|
<tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/DWARF">DWARF</a></td><td>A standardized debugging data format</td></tr>
|
||
|
<tr><td>GD</td><td>Global Dynamic (dynamic TLS) One of the <a href="https://web.archive.org/web/20201202024834/http://download.oracle.com/docs/cd/E19963-01/html/819-0690/chapter8-1.html">Thread-Local Storage access models</a>.</td></tr>
|
||
|
<tr><td>GOT</td><td>Global offset table</td></tr>
|
||
|
<tr><td>IE</td><td>Initial Executable (static TLS with assigned offsets) One of the <a href="https://web.archive.org/web/20201202024834/http://download.oracle.com/docs/cd/E19963-01/html/819-0690/chapter8-1.html">Thread-Local Storage access models</a>.</td></tr>
|
||
|
<tr><td>LD</td><td>Local Dynamic (dynamic TLS of local symbols) One of the <a href="https://web.archive.org/web/20201202024834/http://download.oracle.com/docs/cd/E19963-01/html/819-0690/chapter8-1.html">Thread-Local Storage access models</a>.</td></tr>
|
||
|
<tr><td>LE</td><td>Local Executable (static TLS) One of the <a href="https://web.archive.org/web/20201202024834/http://download.oracle.com/docs/cd/E19963-01/html/819-0690/chapter8-1.html">Thread-Local Storage access models</a>.</td></tr>
|
||
|
<tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Mach-O">Mach-O</a></td><td>Mach object file format</td></tr>
|
||
|
<tr><td>PC</td><td>Program counter. On x86, this is the same as IP (Instruction Pointer) register.</td></tr>
|
||
|
<tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Portable_Executable">PE</a></td><td>Portable executable</td></tr>
|
||
|
<tr><td>PHT</td><td>Program header table</td></tr>
|
||
|
<tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Position_independent_code">PIC</a></td><td>Position independent code</td></tr>
|
||
|
<tr><td><a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Position_independent_code">PIE</a></td><td>Position independent executable</td></tr>
|
||
|
<tr><td>PLT</td><td>Procedure linkage table</td></tr>
|
||
|
<tr><td>REL<br>RELA</td><td>Relocation</td></tr>
|
||
|
<tr><td>RVA</td><td>Relative virtual address</td></tr>
|
||
|
<tr><td>SHF</td><td>Section header flag</td></tr>
|
||
|
<tr><td>SHT</td><td>Section header table</td></tr>
|
||
|
<tr><td>SO</td><td>Shared object (another name for dynamic link library)</td></tr>
|
||
|
<tr><td>VMA</td><td>Virtual memory area/address</td></tr>
|
||
|
</tbody></table>
|
||
|
|
||
|
<h2>Useful books and references</h2>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://manpages.courier-mta.org/htmlman5/elf.5.html">ELF man page</a><a>
|
||
|
</a><p><a>
|
||
|
</a><a href="https://web.archive.org/web/20201202024834/http://www.sco.com/developers/gabi/latest/contents.html">System V Application Binary Interface</a><a>
|
||
|
</a></p><p><a>
|
||
|
</a><a href="https://web.archive.org/web/20201202024834/http://www.x86-64.org/documentation/abi.pdf">AMD64 System V Application Binary Interface</a>
|
||
|
</p><p>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/function-calling-conventions.html">The gen on function calling conventions</a>
|
||
|
</p><p>
|
||
|
Section II of <a href="https://web.archive.org/web/20201202024834/http://refspecs.freestandards.org/LSB_4.0.0/LSB-Core-generic/LSB-Core-generic/book1.html">Linux Standard Base 4.0 Core Specification</a>
|
||
|
</p><p>
|
||
|
<i>Self-Service Linux: Mastering the Art of Problem Determination</i> by Mark Wilding and Dan Behman
|
||
|
</p><p>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://download.oracle.com/docs/cd/E19963-01/html/819-0690/">Solaris Linker and Libraries Guide</a>
|
||
|
</p><p>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://www.iecc.com/linker">Linkers and Loaders</a> by John Levine
|
||
|
</p><p>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://s.eresi-project.org/inc/articles/elf-rtld.txt">Understanding Linux ELF RTLD internals</a> by mayhem (this article gives
|
||
|
you an idea how the runtime linker <tt>ld.so</tt> works)
|
||
|
</p><p>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://manpages.courier-mta.org/htmlman8/ld.so.8.html"><tt>ld.so</tt> man page</a>
|
||
|
</p><p>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Prelink">Prelink</a> by Jakub Jelinek (and <a href="https://web.archive.org/web/20201202024834/http://linux.die.net/man/8/prelink">prelink man page</a>)
|
||
|
|
||
|
</p><h2>Executable and Linkable Format</h2>
|
||
|
An ELF executable binary contains at least two kinds of headers: ELF file header
|
||
|
(see <tt>struct Elf32_Ehdr</tt>/<tt>struct Elf64_Ehdr</tt> in <tt><a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/elf.h">/usr/include/elf.h</a></tt>)
|
||
|
and one or more Program Headers (see <tt>struct Elf32_Phdr</tt>/<tt>struct Elf64_Phdr</tt> in <tt><a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/elf.h">/usr/include/elf.h</a></tt>)
|
||
|
<p>
|
||
|
Usually there is another kind of header called the Section Header, which describes the
attributes of an ELF section (e.g. <tt>.text</tt>, <tt>.data</tt>,
<tt>.bss</tt>, etc.). The Section Header is
|
||
|
described by <tt>struct Elf32_Shdr</tt>/<tt>struct Elf64_Shdr</tt> in <tt><a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/elf.h">/usr/include/elf.h</a></tt>
|
||
|
</p><p>
|
||
|
The Program Headers are used during execution (ELF's "<b>execution view</b>"); they tell the kernel or the runtime linker
<tt>ld.so</tt> what to load into memory and how to find dynamic linking information.
|
||
|
</p><p>
|
||
|
The Section Headers are used during compile-time linking (ELF's "<b>linking view</b>"); they tell the link editor <tt>ld</tt>
how to resolve symbols and how to group similar byte streams from different ELF binary
objects.
|
||
|
</p><p>
|
||
|
Conceptually, ELF's two "views" are as follows (borrowed from Shaun Clowes's <i>Fixing/Making Holes in Binaries</i> slides):
|
||
|
</p><pre>
                 +-----------------+
            +----| ELF File Header |----+
            |    +-----------------+    |
            v                           v
   +-----------------+         +-----------------+
   | Program Headers |         | Section Headers |
   +-----------------+         +-----------------+
        ||                               ||
        ||                               ||
        ||                               ||
        ||   +------------------------+  ||
        +--> | Contents (Byte Stream) |<--+
             +------------------------+
</pre>
|
||
|
<p>
|
||
|
In reality, the layout of a typical ELF executable binary on a disk file is like this:
|
||
|
</p><pre>
    +-------------------------------+
    |        ELF File Header        |
    +-------------------------------+
    | Program Header for segment #1 |
    +-------------------------------+
    | Program Header for segment #2 |
    +-------------------------------+
    |              ...              |
    +-------------------------------+
    |    Contents (Byte Stream)     |
    |              ...              |
    +-------------------------------+
    | Section Header for section #1 |
    +-------------------------------+
    | Section Header for section #2 |
    +-------------------------------+
    |              ...              |
    +-------------------------------+
    |      ".shstrtab" section      |
    +-------------------------------+
    |       ".symtab" section       |
    +-------------------------------+
    |       ".strtab" section       |
    +-------------------------------+
</pre>
|
||
|
The ELF File Header contains the file offsets of the first Program Header
and the first Section Header, as well as the index of the Section Header that describes the
<tt>.shstrtab</tt> section, which contains the section names (a series of NULL-terminated strings).
|
||
|
<p>
|
||
|
The ELF File Header also contains the number of Program Headers
|
||
|
and the number of Section Headers.
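<p>To make these header fields concrete, here is a minimal Python sketch (not part of the original article) that unpacks the <tt>Elf64_Ehdr</tt> members listed in <tt>elf.h</tt>. It assumes a 64-bit, little-endian file such as an x86_64 binary; the function name <tt>parse_ehdr64</tt> and the returned dictionary are illustrative choices only.</p>

```python
import struct

# Elf64_Ehdr as laid out in elf.h: 16-byte e_ident followed by the fixed fields.
# "<" assumes a little-endian file; a robust reader would check e_ident[5] first.
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_ehdr64(data):
    """Decode the first 64 bytes of an ELF64 file into a dictionary."""
    (ident, _e_type, _e_machine, _e_version, e_entry, e_phoff, e_shoff,
     _e_flags, _e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = EHDR64.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2:  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
        raise ValueError("not a 64-bit ELF file")
    return {
        "e_entry": e_entry,          # virtual address of the entry point
        "e_phoff": e_phoff,          # file offset of the first Program Header
        "e_phentsize": e_phentsize,  # size of one Program Header
        "e_phnum": e_phnum,          # number of Program Headers
        "e_shoff": e_shoff,          # file offset of the first Section Header
        "e_shentsize": e_shentsize,  # size of one Section Header
        "e_shnum": e_shnum,          # number of Section Headers
        "e_shstrndx": e_shstrndx,    # index of the Section Header for .shstrtab
    }
```

<p>Calling <tt>parse_ehdr64</tt> on the first 64 bytes of any ELF64 executable should report the same offsets and counts that <tt>readelf -h</tt> prints.</p>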
|
||
|
</p><p>
|
||
|
Each Program Header describes a "segment": it contains the permissions (Readable, Writeable, or Executable),
the offset of the "segment" (which is just a byte stream) into the file, and the size of the
"segment". The following table shows the purposes of special segments.
Some information
can be found in GNU Binutils' source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=binutils.git;a=blob_plain;f=include/elf/common.h"><tt>include/elf/common.h</tt></a>:
|
||
|
</p><p>
|
||
|
<table border="">
|
||
|
<tbody><tr>
|
||
|
<th>ELF Segment</th>
|
||
|
<th>Purpose</th>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>DYNAMIC</tt></td>
|
||
|
<td>For dynamic binaries, this segment holds dynamic linking information and is usually
the same as the <tt>.dynamic</tt> section in ELF's linking view. See paragraph below.
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>GNU_EH_FRAME</tt></td>
|
||
|
<td>Frame unwind information (EH = Exception Handling). This segment is usually the same as <tt>.eh_frame_hdr</tt> section in ELF's linking view.
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>GNU_RELRO</tt></td>
|
||
|
<td>This segment indicates the memory region which should be made Read-Only after relocation is done.
|
||
|
This segment usually appears in a dynamic link library and it
|
||
|
contains <tt>.ctors</tt>, <tt>.dtors</tt>, <tt>.dynamic</tt>, <tt>.got</tt>
|
||
|
sections. See paragraph below.
|
||
|
</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>GNU_STACK</tt></td>
|
||
|
<td>The permission flag of this segment indicates whether the
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://www.gentoo.org/proj/en/hardened/gnu-stack.xml">stack is executable or not</a>.
|
||
|
This segment does not have any content; it is just an indicator.
|
||
|
</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>INTERP</tt></td>
|
||
|
<td>For dynamic binaries, this holds the full pathname of the runtime linker <tt>ld.so</tt>.<p>
This segment is the same as the <tt>.interp</tt> section in ELF's linking view.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>LOAD</tt></td>
|
||
|
<td><b>Loadable program segment. Only segments of this type are loaded into memory during execution.</b></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>NOTE</tt></td>
|
||
|
<td>Auxiliary information.<p>For core dumps, this segment contains the status of the process (when the core dump is created),
|
||
|
such as the signal (the process received and caused it to dump core), pending & held signals,
|
||
|
process ID, parent process ID, user ID, nice value,
|
||
|
cumulative user & system time, values of registers (including the program counter!)</p><p>For more info, see
|
||
|
<tt>struct elf_prstatus</tt> and <tt>struct elf_prpsinfo</tt> in Linux kernel source file
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://lxr.linux.no/linux/include/linux/elfcore.h"><tt>include/linux/elfcore.h</tt></a>
|
||
|
and <tt>struct user_regs_struct</tt> in
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://lxr.linux.no/linux/arch/x86/include/asm/user_64.h"><tt>arch/x86/include/asm/user_64.h</tt></a></p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>TLS</tt></td>
|
||
|
<td>Thread-Local Storage</td>
|
||
|
</tr>
|
||
|
</tbody></table>
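<p>The segment table above can be explored with a short Python sketch. It is an illustration rather than anything from the original article: it assumes a 64-bit little-endian file, takes the <tt>PT_*</tt> and <tt>PF_*</tt> values from <tt>elf.h</tt>, and uses made-up helper names (<tt>read_segments</tt>, <tt>interp_path</tt>, <tt>stack_is_executable</tt>).</p>

```python
import struct

# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
PHDR64 = struct.Struct("<IIQQQQQQ")
PT_NAMES = {1: "LOAD", 2: "DYNAMIC", 3: "INTERP", 4: "NOTE", 7: "TLS",
            0x6474E550: "GNU_EH_FRAME", 0x6474E551: "GNU_STACK",
            0x6474E552: "GNU_RELRO"}
PF_X, PF_W, PF_R = 1, 2, 4


def read_segments(data):
    """Return (type name, permissions, file offset, file size) per Program Header."""
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    segments = []
    for i in range(e_phnum):
        p_type, p_flags, p_offset, _, _, p_filesz, _, _ = PHDR64.unpack_from(
            data, e_phoff + i * e_phentsize)
        perms = (("r" if p_flags & PF_R else "-")
                 + ("w" if p_flags & PF_W else "-")
                 + ("x" if p_flags & PF_X else "-"))
        segments.append((PT_NAMES.get(p_type, hex(p_type)), perms, p_offset, p_filesz))
    return segments


def interp_path(data):
    """Return the runtime linker path stored in the INTERP segment, if any."""
    for name, _perms, offset, size in read_segments(data):
        if name == "INTERP":
            return data[offset:offset + size].rstrip(b"\x00").decode()
    return None


def stack_is_executable(data):
    """Return the X permission of GNU_STACK, or None if the segment is absent."""
    for name, perms, _offset, _size in read_segments(data):
        if name == "GNU_STACK":
            return "x" in perms
    return None
```

<p>All three helpers take the whole file as a <tt>bytes</tt> object. Returning <tt>None</tt> when <tt>GNU_STACK</tt> is missing is a deliberate choice here, because the default in that case depends on the platform.</p>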
|
||
|
</p><p>
|
||
|
Likewise, each Section Header contains the file offset of its corresponding "content"
|
||
|
and the size of the "content".
|
||
|
The following table shows the purposes of some special sections. Most information
|
||
|
here comes from <a href="https://web.archive.org/web/20201202024834/http://refspecs.freestandards.org/LSB_4.0.0/LSB-Core-generic/LSB-Core-generic/specialsections.html">LSB specification</a>.
|
||
|
Some information can be found in GNU Binutils' source file
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=binutils.git;a=blob_plain;f=bfd/elf.c"><tt>bfd/elf.c</tt></a> (look for
|
||
|
<tt>bfd_elf_special_section</tt>)
|
||
|
and <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=binutils.git;a=blob_plain;f=bfd/elflink.c"><tt>bfd/elflink.c</tt></a> (look for
|
||
|
double-quoted section names such as <tt>".got.plt"</tt>)
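<p>As a rough illustration of how the Section Headers and <tt>.shstrtab</tt> fit together, the following Python sketch lists each section's name, file offset and size. It is not from the original article. It assumes a 64-bit little-endian file whose section count fits in <tt>e_shnum</tt>, and the name <tt>read_sections</tt> is hypothetical.</p>

```python
import struct

# Elf64_Shdr: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size,
#             sh_link, sh_info, sh_addralign, sh_entsize
SHDR64 = struct.Struct("<IIQQQQIIQQ")


def read_sections(data):
    """Return (name, file offset, size) for every Section Header in the file."""
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x3A)
    headers = [SHDR64.unpack_from(data, e_shoff + i * e_shentsize)
               for i in range(e_shnum)]
    if not headers:
        return []
    # sh_offset of the section that e_shstrndx points at, i.e. .shstrtab
    names_offset = headers[e_shstrndx][4]
    sections = []
    for sh_name, _type, _flags, _addr, sh_offset, sh_size, *_rest in headers:
        start = names_offset + sh_name    # sh_name is an offset into .shstrtab
        end = data.index(b"\x00", start)  # names are NULL-terminated
        sections.append((data[start:end].decode(), sh_offset, sh_size))
    return sections
```

<p>The first entry is normally the reserved null section, whose name is the empty string.</p>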
|
||
|
</p><p>
|
||
|
<table border="">
|
||
|
<tbody><tr>
|
||
|
<th>ELF Section</th>
|
||
|
<th>Purpose</th>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.bss</tt></td>
|
||
|
<td>Uninitialized global data ("Block Started by Symbol").
|
||
|
<p>Depending on the compiler, uninitialized global variables could
be placed in a pseudo-section called <tt>COMMON</tt> (named after
Fortran 77's "common blocks") rather than in <tt>.bss</tt>. To wit, consider
the following code:
|
||
|
</p><pre>
    int globalVar;
    static int globalStaticVar;
    void dummy() {
        static int localStaticVar;
    }
</pre>
|
||
|
When this is compiled with <tt>gcc -c</tt> on x86_64, the resulting object file has the
following symbol table:
|
||
|
<pre>
    $ objdump -t foo.o

    SYMBOL TABLE:
    ....
    0000000000000000 l     O .bss   0000000000000004 globalStaticVar
    0000000000000004 l     O .bss   0000000000000004 localStaticVar.1619
    ....
    0000000000000004       O *COM*  0000000000000004 globalVar
</pre>
|
||
|
so only the file-scope and function-scope <tt>static</tt> variables are in
the <tt>.bss</tt> section.
|
||
|
<p>
|
||
|
If one wants <tt>globalVar</tt> to reside in the <tt>.bss</tt> section,
|
||
|
use the <font color="LightGreen"><tt>-fno-common</tt></font>
|
||
|
compiler command-line option. Using <font color="LightGreen"><tt>-fno-common</tt></font>
|
||
|
is encouraged, as the following example shows:
|
||
|
</p><pre>
    $ cat foo.c
    int globalVar;
    $ cat bar.c
    double globalVar;
    int main(){}
    $ gcc foo.c bar.c
</pre>
|
||
|
Not only is there no error message about redefinition of the same symbol
in both source files (notice we did not use the <tt>extern</tt> keyword here),
there is no complaint about their different data
types and sizes either. However, if one uses <font color="LightGreen"><tt>-fno-common</tt></font>,
the link editor will complain:
|
||
|
<pre>
    /tmp/ccM71JR7.o:(.bss+0x0): <font color="Red">multiple definition</font> of `globalVar'
    /tmp/ccIbS5MO.o:(.bss+0x0): first defined here
    ld: Warning: <font color="Red">size of symbol</font> `globalVar' changed from 8 in /tmp/ccIbS5MO.o to 4 in /tmp/ccM71JR7.o
</pre>
|
||
|
</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.comment</tt></td>
|
||
|
<td>A series of NULL-terminated strings containing compiler information.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.ctors</tt></td>
|
||
|
<td><b>Pointers</b> to functions which are marked as
|
||
|
<tt>__attribute__ ((constructor))</tt> as well as static C++ objects' constructors.
|
||
|
They will be used by <tt>__libc_global_ctors</tt> function.<p>
|
||
|
See paragraphs below.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.data</tt></td>
|
||
|
<td>Initialized data.</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.data.rel.ro</tt></td>
|
||
|
<td>Similar to <tt>.data</tt> section, but this section
|
||
|
should be made Read-Only after relocation is done.
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.debug_XXX</tt></td>
|
||
|
<td>Debugging information (for the programs which are compiled with <tt>-g</tt> option)
|
||
|
which is in the DWARF 2.0 format.
|
||
|
<p>
|
||
|
See <a href="https://web.archive.org/web/20201202024834/http://dwarfstd.org/">here</a> for DWARF debugging format.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.dtors</tt></td>
|
||
|
<td><b>Pointers</b> to functions which are marked as
|
||
|
<tt>__attribute__ ((destructor))</tt> as well as static C++ objects' destructors.
|
||
|
<p>
|
||
|
See paragraphs below.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.dynamic</tt></td>
|
||
|
<td>For dynamic binaries, this section holds dynamic linking information used by <tt>ld.so</tt>.
|
||
|
See paragraphs below.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.dynstr</tt></td>
|
||
|
<td>NULL-terminated strings of names of symbols in <tt>.dynsym</tt> section.
|
||
|
<p>One can use commands such as <tt>readelf -p .dynstr a.out</tt> to see these strings.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.dynsym</tt></td>
|
||
|
<td><b>Runtime</b>/Dynamic symbol table. For dynamic binaries, this section is the symbol table of
|
||
|
globally visible symbols. For example, if a dynamic link library wants to export
|
||
|
its symbols, these symbols will be stored here. On the other hand, if
|
||
|
a dynamic executable binary uses symbols from a dynamic link library,
|
||
|
then these symbols are stored here too.
|
||
|
<p>
|
||
|
The symbol names (as NULL-terminated strings) are stored in <tt>.dynstr</tt> section.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.eh_frame</tt><br><tt>.eh_frame_hdr</tt></td>
|
||
|
<td>Frame unwind information (EH = Exception Handling).
|
||
|
<p>
|
||
|
See <a href="https://web.archive.org/web/20201202024834/http://refspecs.freestandards.org/LSB_4.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html">here</a>
|
||
|
for details.
|
||
|
</p><p>To see the content of <tt>.eh_frame</tt> section, use
|
||
|
</p><pre>readelf --debug-dump=frames-interp a.out</pre>
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.fini</tt></td>
|
||
|
<td>Code which will be executed when program exits normally. See paragraphs below.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.fini_array</tt></td>
|
||
|
<td><b>Pointers</b> to functions which will be executed when program exits normally. See paragraphs below.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.GCC.command.line</tt></td>
|
||
|
<td>A series of NULL-terminated strings containing
|
||
|
GCC command-line (that is used to compile the code) options.<p>This feature is supported since GCC 4.5
|
||
|
and the program must be compiled with <tt>-frecord-gcc-switches</tt> option.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.gnu.hash</tt></td>
|
||
|
<td>GNU's extension to hash table for symbols.<p>
|
||
|
See <a href="https://web.archive.org/web/20201202024834/http://blogs.sun.com/ali/entry/gnu_hash_elf_sections">here</a> for its structure and the hash algorithm.
|
||
|
</p><p>
|
||
|
The link editor <tt>ld</tt> calls <tt>bfd_elf_gnu_hash</tt>
in GNU Binutils' source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=binutils.git;a=blob_plain;f=bfd/elf.c"><tt>bfd/elf.c</tt></a>
to compute the hash value.
|
||
|
</p><p>
|
||
|
The runtime linker <tt>ld.so</tt> calls <tt>do_lookup_x</tt> in
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-lookup.c"><tt>elf/dl-lookup.c</tt></a>
|
||
|
to do the symbol look-up. The hash computing function here is <tt>dl_new_hash</tt>.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.gnu.linkonceXXX</tt></td>
|
||
|
<td>GNU's extension. It means only a single copy of the section will be used in linking.
|
||
|
This is used by g++, which emits each template expansion in its own section.
|
||
|
The symbols will be defined as weak, so that multiple definitions
|
||
|
are permitted.
|
||
|
</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.gnu.version</tt></td>
|
||
|
<td>Versions of symbols.
|
||
|
<p>See <a href="https://web.archive.org/web/20201202024834/http://refspecs.freestandards.org/LSB_4.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html">here</a>,
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/version.html">here</a>,
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://download.oracle.com/docs/cd/E19963-01/html/819-0690/appendixb-45356.html">here</a>,
|
||
|
and
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://people.redhat.com/drepper/symbol-versioning">here</a>
|
||
|
for details of symbol versioning.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.gnu.version_d</tt></td>
|
||
|
<td>Version definitions of symbols.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.gnu.version_r</tt></td>
|
||
|
<td>Version references (version needs) of symbols.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.got</tt></td>
|
||
|
<td>For dynamic binaries, this Global Offset Table holds the addresses of variables which are
|
||
|
relocated upon loading. See paragraphs below.
|
||
|
</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.got.plt</tt></td>
|
||
|
<td>For dynamic binaries, this Global Offset Table holds the addresses of functions in dynamic libraries.
|
||
|
They are used by trampoline code in <tt>.plt</tt> section.
|
||
|
If <tt>.got.plt</tt> section is present, it contains at least three entries, which
|
||
|
have special meanings. See paragraphs below.
|
||
|
</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.hash</tt></td>
|
||
|
<td>Hash table for symbols.<p>
|
||
|
See <a href="https://web.archive.org/web/20201202024834/http://www.sco.com/developers/gabi/latest/ch5.dynamic.html#hash">here</a> for its structure and the hash algorithm.
|
||
|
</p><p>
|
||
|
The link editor <tt>ld</tt> calls <tt>bfd_elf_hash</tt>
in GNU Binutils' source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=binutils.git;a=blob_plain;f=bfd/elf.c"><tt>bfd/elf.c</tt></a>
to compute the hash value.
|
||
|
</p><p>
|
||
|
The runtime linker <tt>ld.so</tt> calls <tt>do_lookup_x</tt> in
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-lookup.c"><tt>elf/dl-lookup.c</tt></a>
|
||
|
to do the symbol look-up. The hash computing function here is <tt>_dl_elf_hash</tt>.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.init</tt></td>
|
||
|
<td>Code which will be executed when program initializes. See paragraphs below.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.init_array</tt></td>
|
||
|
<td><b>Pointers</b> to functions which will be executed when program starts. See paragraphs below.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.interp</tt></td>
|
||
|
<td>For dynamic binaries, this holds the full pathname of runtime linker <tt>ld.so</tt></td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.jcr</tt></td>
|
||
|
<td>Java class registration information.<p>
|
||
|
Like <tt>.ctors</tt> section, it contains a list of addresses
|
||
|
which will be used by <tt>_Jv_RegisterClasses</tt> function
|
||
|
in CRT (C Runtime) startup files (see <a href="https://web.archive.org/web/20201202024834/http://gcc.gnu.org/viewcvs/trunk/gcc/crtstuff.c?view=markup"><tt>gcc/crtstuff.c</tt></a>
|
||
|
in GCC's source tree)
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.note.ABI-tag</tt></td>
|
||
|
<td>This Linux-specific section is structured as a <a href="https://web.archive.org/web/20201202024834/http://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section">note</a>
|
||
|
section in ELF specification. Its content is mandated
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://refspecs.freestandards.org/LSB_4.0.0/LSB-Core-generic/LSB-Core-generic/noteabitag.html">here</a>.
|
||
|
</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.note.gnu.build-id</tt></td>
|
||
|
<td>A unique build ID. See <a href="https://web.archive.org/web/20201202024834/http://fedoraproject.org/wiki/RolandMcGrath/BuildID">here</a> and
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://fedoraproject.org/wiki/Releases/FeatureBuildId">here</a>
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.note.GNU-stack</tt></td>
|
||
|
<td>See <a href="https://web.archive.org/web/20201202024834/http://www.airs.com/blog/archives/518">here</a>
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.nvFatBinSegment</tt></td>
|
||
|
<td>This section contains information about NVIDIA's CUDA fat binary container. Its format
|
||
|
is described by <tt>struct __cudaFatCudaBinaryRec</tt> in <tt>__cudaFatFormat.h</tt>
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.plt</tt></td>
|
||
|
<td>For dynamic binaries, this Procedure Linkage Table holds the trampoline/linkage code. See paragraphs below.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.preinit_array</tt></td>
|
||
|
<td>Similar to <tt>.init_array</tt> section. See paragraphs below.</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.rela.dyn</tt></td>
|
||
|
<td><b>Runtime</b>/Dynamic relocation table.
|
||
|
<p>
|
||
|
For dynamic binaries, this relocation table holds information of variables which
|
||
|
must be relocated upon loading. Each entry in this table is a
|
||
|
<tt>struct Elf64_Rela</tt> (see <tt><a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/elf.h">/usr/include/elf.h</a></tt>) which
|
||
|
has only three members:
|
||
|
</p><ul>
|
||
|
<li><tt>offset</tt> (the variable's [usually position-independent] virtual memory address
|
||
|
which holds the "patched" value during the relocation process)
|
||
|
</li><li><tt>info</tt> (Index into <tt>.dynsym</tt> section and Relocation Type)
|
||
|
</li><li><tt>addend</tt>
|
||
|
</li></ul>
|
||
|
See paragraphs below for details about runtime relocation.
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.rela.plt</tt></td>
|
||
|
<td><b>Runtime</b>/Dynamic relocation table.
|
||
|
<p>
|
||
|
This relocation table is similar to the one in <tt>.rela.dyn</tt> section;
|
||
|
the difference is this one is for functions, not variables.
|
||
|
</p><p>The relocation type of entries in this table is
|
||
|
<tt>R_386_JMP_SLOT</tt> or <tt>R_X86_64_JUMP_SLOT</tt> and
|
||
|
the "offset" refers to memory addresses which are
|
||
|
inside <tt>.got.plt</tt> section.
|
||
|
</p><p>Simply put, this table holds information to relocate entries in
|
||
|
<tt>.got.plt</tt> section.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.rel.text</tt><br><tt>.rela.text</tt></td>
|
||
|
<td><b>Compile-time</b>/Static relocation table.
|
||
|
<p>For programs compiled with <tt>-c</tt> option,
|
||
|
this section provides information to the link editor <tt>ld</tt>
|
||
|
where and how to "patch" executable code in <tt>.text</tt> section.
|
||
|
</p><p>The difference between <tt>.rel.text</tt> and <tt>.rela.text</tt>
is that entries in the former do not have an <tt>addend</tt> member.
|
||
|
(Compare <tt>struct Elf64_Rel</tt> with <tt>struct Elf64_Rela</tt> in <tt>/usr/include/elf.h</tt>)
|
||
|
Instead, the addend is taken from the memory location
|
||
|
described by <tt>offset</tt> member.
|
||
|
</p><p>
|
||
|
Whether to use <tt>.rel</tt> or <tt>.rela</tt> is platform-dependent.
|
||
|
For x86_32, it is <tt>.rel</tt> and for x86_64, <tt>.rela</tt>
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.rel.XXX</tt><br><tt>.rela.XXX</tt></td>
|
||
|
<td>Compile-time/Static relocation table for other sections. For example,
|
||
|
<tt>.rela.init_array</tt> is the relocation table for <tt>.init_array</tt>
|
||
|
section.
|
||
|
</td>
|
||
|
</tr>
|
||
|
|
||
|
<tr>
|
||
|
<td><tt>.rodata</tt></td>
|
||
|
<td>Read-only data.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.shstrtab</tt></td>
|
||
|
<td>NULL-terminated strings of section names.
|
||
|
<p>One can use commands such as <tt>readelf -p .shstrtab a.out</tt> to see these strings.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.strtab</tt></td>
|
||
|
<td>NULL-terminated strings of names of symbols in <tt>.symtab</tt> section.
|
||
|
<p>One can use commands such as <tt>readelf -p .strtab a.out</tt> to see these strings.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.symtab</tt></td>
|
||
|
<td><b>Compile-time</b>/Static symbol table.
|
||
|
<p>This is the main symbol table used in compile-time linking
|
||
|
or runtime debugging.
|
||
|
</p><p>
|
||
|
The symbol names (as NULL-terminated strings) are stored in <tt>.strtab</tt> section.
|
||
|
</p><p>Both <tt>.symtab</tt> and <tt>.strtab</tt> can be stripped away by the <tt>strip</tt>
|
||
|
command.
|
||
|
</p></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.tbss</tt></td>
|
||
|
<td>Similar to <tt>.bss</tt> section, but for <i>Thread-Local data</i>. See paragraphs below.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.tdata</tt></td>
|
||
|
<td>Similar to <tt>.data</tt> section, but for <i>Thread-Local data</i>. See paragraphs below.</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>.text</tt></td>
|
||
|
<td>User's executable code</td>
|
||
|
</tr>
|
||
|
|
||
|
</tbody></table>
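<p>The two symbol hash functions mentioned in the <tt>.hash</tt> and <tt>.gnu.hash</tt> rows are small enough to sketch in Python. This is an illustrative re-implementation using 32-bit arithmetic, not the Binutils or Glibc source; the names <tt>sysv_hash</tt> and <tt>gnu_hash</tt> are chosen here for clarity.</p>

```python
def sysv_hash(name):
    """Classic System V ELF hash, as used for the .hash section."""
    h = 0
    for c in name.encode():
        h = ((h << 4) + c) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24           # fold the top nibble back into the low bits
        h &= ~g & 0xFFFFFFFF       # then clear the top nibble
    return h


def gnu_hash(name):
    """GNU hash (h = h * 33 + c, starting at 5381), as used for .gnu.hash."""
    h = 5381
    for c in name.encode():
        h = (h * 33 + c) & 0xFFFFFFFF
    return h
```

<p>In both schemes the hash value modulo the number of buckets selects the bucket in which the symbol lookup starts.</p>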
|
||
|
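<p>The three members of a <tt>struct Elf64_Rela</tt> entry, as described for <tt>.rela.dyn</tt> and <tt>.rela.plt</tt>, can be decoded with a few lines of Python. This is a hedged sketch for little-endian x86_64 only: the relocation type numbers are taken from <tt>elf.h</tt>, while <tt>decode_rela</tt> and the small name table are hypothetical helpers.</p>

```python
import struct

# Elf64_Rela: r_offset (unsigned), r_info (unsigned), r_addend (signed)
RELA64 = struct.Struct("<QQq")

# A few x86_64 relocation types from elf.h; other values are returned as numbers.
R_X86_64_NAMES = {6: "R_X86_64_GLOB_DAT",
                  7: "R_X86_64_JUMP_SLOT",
                  8: "R_X86_64_RELATIVE"}


def decode_rela(entry):
    """Split one 24-byte Elf64_Rela entry into (offset, symbol index, type, addend)."""
    r_offset, r_info, r_addend = RELA64.unpack(entry)
    sym_index = r_info >> 32          # ELF64_R_SYM: index into .dynsym
    rel_type = r_info & 0xFFFFFFFF    # ELF64_R_TYPE: relocation type
    return r_offset, sym_index, R_X86_64_NAMES.get(rel_type, rel_type), r_addend
```

<p>For a <tt>.rela.plt</tt> entry, the decoded offset should fall inside the <tt>.got.plt</tt> section and the type should be <tt>R_X86_64_JUMP_SLOT</tt>.</p>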
|
||
|
</p><h2>How is an executable binary executed in Linux?</h2>
|
||
|
First, the operating system must recognize executable binaries. For example,
|
||
|
<tt>zcat /proc/config.gz | grep <a href="https://web.archive.org/web/20201202024834/http://cateee.net/lkddb/web-lkddb/BINFMT_ELF.html">CONFIG_BINFMT_ELF</a></tt> can show whether the Linux kernel is compiled
|
||
|
to support ELF executable binary format (if <tt>/proc/config.gz</tt> does not exist, try
|
||
|
<tt>/lib/modules/`uname -r`/build/.config</tt>)
|
||
|
<p>
|
||
|
When the shell makes an <tt>execve</tt> system call to run an executable binary, the Linux kernel responds as
|
||
|
follows (see <a href="https://web.archive.org/web/20201202024834/http://asm.sourceforge.net/articles/startup.html">here</a> and
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://s.eresi-project.org/inc/articles/elf-rtld.txt">here</a> for more details) in sequence:
|
||
|
</p><ol>
|
||
|
<li><a href="https://web.archive.org/web/20201202024834/http://lxr.linux.no/linux/arch/x86/kernel/process.c#L301"><tt>sys_execve</tt></a> function (in <tt>arch/x86/kernel/process.c</tt>) handles the <tt>execve</tt> system call
|
||
|
from user space. It calls <tt>do_execve</tt> function.
|
||
|
</li><li><tt>do_execve</tt> function (in <tt>fs/exec.c</tt>) opens the executable binary file and does some preparation.
|
||
|
It calls <tt>search_binary_handler</tt> function.
|
||
|
</li><li><a href="https://web.archive.org/web/20201202024834/http://lxr.linux.no/linux/fs/exec.c#L1240"><tt>search_binary_handler</tt></a> function (in <tt>fs/exec.c</tt>) finds out the type of executable binary
|
||
|
and calls the corresponding handler, which in our case, is <tt>load_elf_binary</tt> function.
|
||
|
</li><li><a href="https://web.archive.org/web/20201202024834/http://lxr.linux.no/linux/fs/binfmt_elf.c#L564"><tt>load_elf_binary</tt></a> (in <tt>fs/binfmt_elf.c</tt>) loads the user's executable binary file into memory.
|
||
|
It allocates memory segments and zeros out the BSS section by calling the <tt>padzero</tt> function.
|
||
|
<p><tt>load_elf_binary</tt> also examines
|
||
|
whether the user's executable binary contains an <tt>INTERP</tt> segment or not.
|
||
|
|
||
|
</p></li><li>If the executable binary is dynamically linked, the linker usually creates an
<tt>INTERP</tt> segment (which is usually the same as the <tt>.interp</tt> section in
ELF's "linking view"). It contains the full pathname of an "interpreter", which is usually
the Glibc runtime linker <a href="https://web.archive.org/web/20201202024834/http://linux.die.net/man/8/ld-linux">ld.so</a>.
|
||
|
<p>To see this, use command <tt>readelf -p .interp a.out</tt>
|
||
|
</p><p>According to <a href="https://web.archive.org/web/20201202024834/http://www.x86-64.org/documentation/abi.pdf">AMD64 System V Application Binary Interface</a>,
|
||
|
the only valid interpreter for programs conforming to the AMD64 ABI is <tt>/lib/ld64.so.1</tt>.
On Linux, however, GCC usually uses <tt>/lib64/ld-linux-x86-64.so.2</tt>
|
||
|
or <tt>/lib/ld-linux-x86-64.so.2</tt> instead:
|
||
|
</p><pre>$ gcc -dumpspecs
|
||
|
....
|
||
|
|
||
|
*link:
|
||
|
...
|
||
|
%{!m32:%{!dynamic-linker:-dynamic-linker %{muclibc:%{mglibc:%e-mglibc and -muclibc used
|
||
|
together}/lib/ld64-uClibc.so.0;:<font color="LightGreen">/lib/ld-linux-x86-64.so.2</font>}}}}
|
||
|
...
|
||
|
</pre>
|
||
|
<p>To change the runtime linker, compile the program using something like </p><pre>gcc foo.c -Wl,-I/my/own/ld.so</pre>
|
||
|
<p>The <a href="https://web.archive.org/web/20201202024834/http://www.sco.com/developers/gabi/latest/ch5.dynamic.html">System V Application Binary Interface</a>
|
||
|
specifies that the operating system, instead of running the user's executable binary, should run this
|
||
|
"interpreter". This interpreter should complete the binding of user's executable binary
|
||
|
to its dependencies.
|
||
|
|
||
|
</p></li><li>Thus, if the ELF executable binary file contains an <tt>INTERP</tt> segment, <tt>load_elf_binary</tt> will
|
||
|
call <a href="https://web.archive.org/web/20201202024834/http://lxr.linux.no/linux/fs/binfmt_elf.c#L383"><tt>load_elf_interp</tt></a> function to load the image of this interpreter as well.
|
||
|
</li><li>Finally, <tt>load_elf_binary</tt> calls <tt>start_thread</tt> (in <tt>arch/x86/kernel/process_64.c</tt>)
|
||
|
and passes control to either the interpreter or the user program.
|
||
|
</li></ol>
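<p>The <tt>INTERP</tt> check in the steps above can be imitated in user space. The following is a minimal sketch, not the kernel's code: it assumes a little-endian ELF64 file, and the helper names <tt>read_le</tt> and <tt>find_interp</tt> are made up for illustration. It walks the program headers and returns the pathname stored in the <tt>INTERP</tt> segment, or <tt>None</tt> for a binary without one (e.g. a statically linked binary).</p>

```python
PT_INTERP = 3  # program header type that names the interpreter


def read_le(image, offset, size):
    """Read an unsigned little-endian integer from the image."""
    return int.from_bytes(image[offset:offset + size], "little")


def find_interp(image):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if image[4] != 2 or image[5] != 1:
        raise ValueError("this sketch handles only little-endian ELF64")
    e_phoff = read_le(image, 32, 8)       # file offset of the program headers
    e_phentsize = read_le(image, 54, 2)   # size of one program header
    e_phnum = read_le(image, 56, 2)       # number of program headers
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        if read_le(image, base, 4) == PT_INTERP:       # p_type
            p_offset = read_le(image, base + 8, 8)
            p_filesz = read_le(image, base + 32, 8)    # includes the trailing NUL
            path = image[p_offset:p_offset + p_filesz]
            return path.rstrip(b"\0").decode()
    return None
```

<p>For a real binary one would call something like <tt>find_interp(open("a.out", "rb").read())</tt>; the result should agree with <tt>readelf -p .interp a.out</tt>.</p>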
|
||
|
|
||
|
<h2>What about <tt>ld.so</tt> ?</h2>
|
||
|
<tt>ld.so</tt> is the runtime linker/loader (the compile-time linker <tt>ld</tt> is formally called "link editor")
|
||
|
for dynamic executables. It provides the <a href="https://web.archive.org/web/20201202024834/http://download.oracle.com/docs/cd/E19963-01/html/819-0690/chapter3-1.html">following services</a>:
|
||
|
<ul>
|
||
|
<li>Analyzes the user's executable binary's <tt>DYNAMIC</tt> segment and determines what
|
||
|
dependencies are required. (See below)
|
||
|
</li><li>Locates and loads these dependencies, analyzes their <tt>DYNAMIC</tt> segments
|
||
|
to determine if more dependencies are required.
|
||
|
</li><li>Performs any necessary relocations to bind these objects.
|
||
|
</li><li>Calls any initialization functions (see below) provided by these dependencies.
|
||
|
</li><li>Passes control to user's executable binary.
|
||
|
</li></ul>
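<p>The first two services amount to a graph walk over <tt>NEEDED</tt> entries. The sketch below assumes a breadth-first traversal in which each object is loaded only once; the function name <tt>load_order</tt> and the dictionary standing in for each object's <tt>DYNAMIC</tt> segment are hypothetical, and the real loader does considerably more bookkeeping.</p>

```python
from collections import deque


def load_order(main, needed):
    """Return objects in the order a breadth-first walk of NEEDED entries visits them.

    `needed` maps an object name to the list of its NEEDED entries
    (a stand-in for reading each object's DYNAMIC segment).
    """
    order = [main]
    seen = {main}
    queue = deque([main])
    while queue:
        obj = queue.popleft()
        for dep in needed.get(obj, []):
            if dep not in seen:      # each object is loaded only once
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order
```

<p>For example, if <tt>a.out</tt> needs <tt>libpthread.so.0</tt> and <tt>libc.so.6</tt>, and both of those need the runtime linker itself, the walk visits the executable, then its two direct dependencies, then the runtime linker exactly once.</p>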
|
||
|
|
||
|
<h2>Compile your own <tt>ld.so</tt> </h2>
|
||
|
The internal workings of <tt>ld.so</tt> are complex, so you might want to compile and experiment with your
|
||
|
own <tt>ld.so</tt>.
|
||
|
The source code of <tt>ld.so</tt> can be found in <font color="lightgreen">Glibc</font>. The main files are
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/rtld.c"><tt>elf/rtld.c</tt></a>,
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-reloc.c"><tt>elf/dl-reloc.c</tt></a>, and
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-machine.h"><tt>sysdeps/x86_64/dl-machine.h</tt></a>.
|
||
|
<p>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://www.linuxfromscratch.org/lfs/view/development/chapter05/glibc.html">This link</a>
|
||
|
provides general tips for building Glibc. Glibc's own
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=INSTALL">INSTALL</a> and
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=FAQ">FAQ</a> documents
|
||
|
are useful too.
|
||
|
</p><p>
|
||
|
To compile Glibc (<font color="lightgreen"><tt>ld.so</tt> cannot be compiled independently</font>), download and unpack the Glibc source tarball.
|
||
|
</p><ul>
|
||
|
<li>Make sure the version of Glibc you downloaded is the same as the system's current one.
|
||
|
</li><li>Make sure the environmental variable <tt>LD_RUN_PATH</tt> is not set.
|
||
|
</li><li>Read the <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=INSTALL">INSTALL</a> and make sure all necessary tool chains (Make, Binutils, etc)
|
||
|
are up-to-date.
|
||
|
</li><li>Make sure the file system on which you are compiling is <font color="lightgreen">case sensitive</font>, or
|
||
|
you will see <a href="https://web.archive.org/web/20201202024834/http://crossgcc.rts-software.org/doku.php?id=i386linuxgccformac">weird errors</a> like
|
||
|
<pre>/scratch/elf/librtld.os: In function `process_envvars':
|
||
|
/tmp/glibc-2.x.y/elf/rtld.c:2718: undefined reference to `__open'
|
||
|
...
|
||
|
</pre>
|
||
|
|
||
|
</li><li><tt>ld.so</tt> should be compiled with the <font color="lightgreen">optimization flag on</font>
|
||
|
(<tt>-O2</tt> is the default). Failing to do so will end up with weird errors (see Question 1.23 in
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=FAQ">FAQ</a>)
|
||
|
</li><li>Suppose Glibc is unpacked at <pre>/tmp/glibc-2.x.y/</pre>
|
||
|
Then edit <tt>/tmp/glibc-2.x.y/Makefile.in</tt>: Un-comment the line <pre># PARALLELMFLAGS = -j 4</pre> and
|
||
|
change 4 to an appropriate number.<p>
|
||
|
</p></li><li>Since we are only interested in <tt>ld.so</tt> and not the whole Glibc,
|
||
|
we only want to build the essential source files needed by <tt>ld.so</tt>.
|
||
|
To do so, edit <tt>/tmp/glibc-2.x.y/Makeconfig</tt>: Find the line starting with
|
||
|
<pre>all-subdirs = csu assert ctype locale intl catgets math setjmp signal \
|
||
|
...
|
||
|
</pre>
|
||
|
and change it to
|
||
|
<pre>all-subdirs = csu elf gmon io misc posix setjmp signal stdlib string time
|
||
|
</pre>
|
||
|
|
||
|
</li><li>Find a scratch directory, say <tt>/scratch</tt>. Then
|
||
|
<pre>$ cd /scratch
|
||
|
$ /tmp/glibc-2.x.y/configure --prefix=/scratch --disable-profile
|
||
|
$ gmake
|
||
|
</pre>
|
||
|
|
||
|
</li><li>Since we are not building the entire Glibc, when the <tt>gmake</tt>
|
||
|
stops (probably with some errors), check if <tt>/scratch/elf/ld.so</tt> exists
|
||
|
or not.
|
||
|
|
||
|
</li><li><tt>ld.so</tt> is a static binary, which means it has its own
|
||
|
implementation of standard C routines (e.g. <tt>memcpy</tt>, <tt>strcmp</tt>, etc).
|
||
|
It has its own <tt>printf</tt>-like routine called <tt>_dl_debug_printf</tt>.
|
||
|
<p>
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-misc.c"><tt>_dl_debug_printf</tt></a>
|
||
|
is not the full-blown <tt>printf</tt> and has very limited capabilities.
|
||
|
For example, to print an address, one would need to use
</p><pre>_dl_debug_printf("0x%0*lx\n", (int)sizeof (void*)*2, (unsigned long int)&foo);
|
||
|
</pre>
|
||
|
</li></ul>
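<p>The <tt>0x%0*lx</tt> format in the last item takes its field width from the argument list, so the address is zero-padded to two hex digits per pointer byte. The same padding rule can be tried outside <tt>ld.so</tt>; this is only an illustration in Python, whose <tt>%</tt> operator also accepts <tt>*</tt> for the width (and ignores the <tt>l</tt> modifier). The names <tt>POINTER_BYTES</tt> and <tt>format_address</tt> are made up here.</p>

```python
import struct

POINTER_BYTES = struct.calcsize("P")   # counterpart of sizeof (void *)


def format_address(addr):
    # '*' takes the field width from the argument list, as in the C call above;
    # Python accepts the 'l' length modifier and ignores it.
    return "0x%0*lx" % (POINTER_BYTES * 2, addr)
```

<p>On a 64-bit system the result is always 18 characters long: the <tt>0x</tt> prefix followed by 16 hex digits.</p>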
|
||
|
|
||
|
<h2>How does <tt>ld.so</tt> work ?</h2>
|
||
|
|
||
|
<tt>ld.so</tt>, by its nature, cannot be a dynamic executable itself. The
|
||
|
entry point of <tt>ld.so</tt> is <tt>_start</tt> defined in
|
||
|
the macro <tt>RTLD_START</tt> (in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-machine.h"><tt>sysdeps/x86_64/dl-machine.h</tt></a>).
|
||
|
<tt>_start</tt> is placed at the beginning of <tt>.text</tt> section, and
|
||
|
the default <tt>ld</tt> script specifies
|
||
|
"Entry point address" (in ELF header, use <tt>readelf -h ld.so|grep Entry</tt> command to see)
|
||
|
to be the address of <tt>_start</tt> (use <tt>ld -verbose | grep ENTRY</tt> command to see; one
can set the entry point to a different address at link time
with the <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/binutils/docs/ld/Entry-Point.html"><tt>-e</tt> option</a>),
so <tt>ld.so</tt> starts executing from here. The very first thing it does is to call <tt>_dl_start</tt> of
|
||
|
<tt>elf/rtld.c</tt>. To see this, run gdb on some ELF executable binary, and do
|
||
|
<pre>(gdb) break _dl_start
|
||
|
Function "_dl_start" not defined.
|
||
|
Make breakpoint pending on future shared library load? (y or [n]) y
|
||
|
Breakpoint 1 (_dl_start) pending.
|
||
|
(gdb) run
|
||
|
Starting program: a.out
|
||
|
|
||
|
Breakpoint 1, 0x0000003433e00fa0 in _dl_start () from /lib64/ld-linux-x86-64.so.2
|
||
|
(gdb) bt
|
||
|
#0 0x0000003433e00fa0 in <font color="lightgreen">_dl_start</font> () from /lib64/ld-linux-x86-64.so.2
|
||
|
#1 0x0000003433e00a78 in <font color="lightgreen">_start</font> () from /lib64/ld-linux-x86-64.so.2
|
||
|
#2 0x0000000000000001 in ?? ()
|
||
|
#3 0x00007fffffffe4f2 in ?? ()
|
||
|
#4 0x0000000000000000 in ?? ()
|
||
|
...
|
||
|
(gdb) x/10i $pc
|
||
|
0x3433e00a70 <_start>: mov %rsp,%rdi
|
||
|
0x3433e00a73 <_start+3>: callq 0x3433e00fa0 <<font color="lightgreen">_dl_start</font>>
|
||
|
0x3433e00a78 <_dl_start_user>: mov %rax,%r12
|
||
|
0x3433e00a7b <_dl_start_user+3>: mov 0x21b30b(%rip),%eax # 0x343401bd8c <_dl_skip_args>
|
||
|
...
|
||
|
</pre>
|
||
|
At this breakpoint, we can use <tt>pmap</tt> to see the memory map of a.out, which would
|
||
|
look like this:
|
||
|
<pre>0000000000400000 8K r-x-- a.out
|
||
|
0000000000601000 4K rw--- a.out
|
||
|
0000003433e00000 112K r-x-- /lib64/ld-2.5.so
|
||
|
000000343401b000 8K rw--- /lib64/ld-2.5.so
|
||
|
00007ffffffea000 84K rw--- [ stack ]
|
||
|
ffffffffff600000 8192K ----- [ anon ]
|
||
|
total 8408K
|
||
|
</pre>
|
||
|
The memory segment of <tt>/lib64/ld-2.5.so</tt> indeed starts at 3433e00000 (page aligned) and
|
||
|
this can be verified by running <tt>readelf -t /lib64/ld-2.5.so</tt>.
|
||
|
<p>
|
||
|
If we put another breakpoint at <tt>main</tt> and continue, then when it stops, the memory
|
||
|
map would change to this:
|
||
|
</p><pre>0000000000400000 8K r-x-- a.out
|
||
|
0000000000601000 4K rw--- a.out
|
||
|
0000003433e00000 112K r-x-- /lib64/ld-2.5.so
|
||
|
000000343401b000 4K r---- /lib64/ld-2.5.so
|
||
|
000000343401c000 4K rw--- /lib64/ld-2.5.so
|
||
|
<font color="lightgreen">0000003434200000 1336K r-x-- /lib64/libc-2.5.so <-- The first "LOAD" segment, which contains .text and .rodata sections
|
||
|
000000343434e000 2044K ----- /lib64/libc-2.5.so <-- "Hole"
|
||
|
000000343454d000 16K r---- /lib64/libc-2.5.so <-- Relocation (GNU_RELRO) info -+---- The second "LOAD" segment
|
||
|
0000003434551000 4K rw--- /lib64/libc-2.5.so <-- .got.plt .data sections -+
|
||
|
0000003434552000 20K rw--- [ anon ] <-- The remaining zero-filled sections (e.g. .bss)
|
||
|
0000003434e00000 88K r-x-- /lib64/libpthread-2.5.so <-- The first "LOAD" segment, which contains .text and .rodata sections
|
||
|
0000003434e16000 2044K ----- /lib64/libpthread-2.5.so <-- "Hole"
|
||
|
0000003435015000 4K r---- /lib64/libpthread-2.5.so <-- Relocation (GNU_RELRO) info -+---- The second "LOAD" segment
|
||
|
0000003435016000 4K rw--- /lib64/libpthread-2.5.so <-- .got.plt .data sections -+
|
||
|
0000003435017000 16K rw--- [ anon ] <-- The remaining zero-filled sections (e.g. .bss)
|
||
|
00002aaaaaaab000 4K rw--- [ anon ]
|
||
|
00002aaaaaac6000 12K rw--- [ anon ]</font>
|
||
|
00007ffffffea000 84K rw--- [ stack ]
|
||
|
ffffffffff600000 8192K ----- [ anon ]
|
||
|
total 14000K
|
||
|
</pre>
|
||
|
Indeed, <tt>ld.so</tt> has brought in all the required dynamic libraries.<p>Note that there
|
||
|
are two memory regions of 2044KB with <font color="lightgreen">null permissions</font>.
|
||
|
As mentioned earlier, the ELF's 'execution view' is concerned with how to load an executable
|
||
|
binary into memory. When <tt>ld.so</tt> brings in the dynamic libraries, it looks at the segments labelled
|
||
|
as <tt>LOAD</tt> (look at "Program Headers" and "Section to Segment mapping"
|
||
|
from <tt>readelf -a xxx.so</tt> command.) Usually there are two <tt>LOAD</tt> segments, and
|
||
|
there is a "hole" between the two segments (look at the VirtAddr and MemSiz of these
|
||
|
two segments), so <tt>ld.so</tt> will
|
||
|
make this hole inaccessible deliberately: Look for the <tt>PROT_NONE</tt> symbol in
|
||
|
<tt>_dl_map_object_from_fd</tt> in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-load.c"><tt>elf/dl-load.c</tt></a>
|
||
|
</p><p>
|
||
|
Also note that each of
|
||
|
<tt>libc-2.5.so</tt> and <tt>libpthread-2.5.so</tt> has a read-only memory region
|
||
|
(at 0x343454d000 and 0x3435015000, respectively). This is the result of the <tt>GNU_RELRO</tt> protection applied in
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-reloc.c"><tt>elf/dl-reloc.c</tt></a>.
The <tt>GNU_RELRO</tt> segment is contained in the second <tt>LOAD</tt> segment, which
|
||
|
contains the following sections (look at "Program Headers" and "Section to Segment mapping"
|
||
|
from <tt>readelf -l xxx.so</tt> command):
|
||
|
<tt>.tdata</tt>, <tt>.fini_array</tt>, <tt>.ctors</tt>, <tt>.dtors</tt>, <tt>__libc_subfreeres</tt>,
|
||
|
<tt>__libc_atexit</tt>, <tt>__libc_thread_subfreeres</tt>, <tt>.data.rel.ro</tt>, <tt>.dynamic</tt>,
|
||
|
<tt>.got</tt>, <tt>.got.plt</tt>, <tt>.data</tt>, and <tt>.bss</tt>. Except for
|
||
|
<tt>.got.plt</tt>, <tt>.data</tt>, and <tt>.bss</tt>, all sections in the second <tt>LOAD</tt> segment
|
||
|
are also in the <tt>GNU_RELRO</tt> segment, and they are thus made read-only.
|
||
|
</p><p>
|
||
|
The two <tt>[anon]</tt> memory segments at 0x3434552000 and 0x3435017000 are for sections which do not take space in the ELF
|
||
|
binary files. For example, <tt>readelf -t xxx.so</tt> will show that <tt>.bss</tt> section
|
||
|
has type <tt>NOBITS</tt>, which means that the section takes no disk space. When segments
containing <tt>NOBITS</tt> sections are mapped into memory, <tt>ld.so</tt> allocates
extra memory pages to accommodate these <tt>NOBITS</tt> sections. A <tt>LOAD</tt>
|
||
|
segment is usually structured as a series of <b>contiguous</b> sections, and if
|
||
|
a segment contains <tt>NOBITS</tt> sections, these <tt>NOBITS</tt> sections will
|
||
|
be grouped together and placed at the tail of the segment.
|
||
|
</p><p>
|
||
|
So what does <tt>_dl_start</tt> do ?
|
||
|
</p><ul>
|
||
|
<li>Allocate the initial TLS block and initialize the Thread Pointer if needed (these are for <tt>ld.so</tt>'s own, not for the user program)
|
||
|
</li><li>Call <tt>_dl_sysdep_start</tt>, which will call <tt>dl_main</tt>
|
||
|
</li><li><tt>dl_main</tt> does the majority of the hard work, for example:<p>
|
||
|
It calls <tt>process_envvars</tt> to handle the <tt>LD_</tt>-prefixed environmental
variables such as <tt>LD_PRELOAD</tt> and <tt>LD_LIBRARY_PATH</tt>.</p><p>
|
||
|
It examines the <tt>NEEDED</tt> field(s) in the user executable binary's <tt>DYNAMIC</tt> segment
(see below) to determine the dependencies.</p><p>
|
||
|
It calls <tt>_dl_init_paths</tt> (in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-load.c"><tt>elf/dl-load.c</tt></a>)
|
||
|
to initialize the dynamic libraries search paths.
|
||
|
According to <a href="https://web.archive.org/web/20201202024834/http://manpages.courier-mta.org/htmlman8/ld.so.8.html"><tt>ld.so</tt> man page</a>
|
||
|
and <a href="https://web.archive.org/web/20201202024834/http://blog.lxgcc.net/?tag=dt_runpath">this page</a>,
|
||
|
the dynamic libraries are searched in the following order:
|
||
|
</p><p>
|
||
|
</p><ol>
|
||
|
<li>The <tt>RPATH</tt> in the <tt>DYNAMIC</tt> segment if there is no
|
||
|
<tt>RUNPATH</tt> in the <tt>DYNAMIC</tt> segment.
|
||
|
<p><tt>RPATH</tt> can be specified when
|
||
|
the code is compiled with <tt>gcc -Wl,-rpath=...</tt>
|
||
|
</p><p><font color="red">Use of <tt>RPATH</tt> is deprecated</font>
|
||
|
because it has an obvious drawback: There is no way to override
|
||
|
it except using <tt>LD_PRELOAD</tt> environmental variable
|
||
|
or removing it from the <tt>DYNAMIC</tt> segment.
|
||
|
</p><p>Both <tt>RPATH</tt> and <tt>RUNPATH</tt> can
|
||
|
contain <font color="LightGreen"><tt>$ORIGIN</tt></font>
|
||
|
(or equivalently <font color="LightGreen"><tt>${ORIGIN}</tt></font>), which will be
|
||
|
expanded to the value of environmental variable <tt>LD_ORIGIN_PATH</tt>
|
||
|
or the full path of the loaded object
|
||
|
(unless the programs use <tt>setuid</tt> or <tt>setgid</tt>)
|
||
|
</p><p>
|
||
|
</p></li><li>The <tt>LD_LIBRARY_PATH</tt> environmental variable (unless
|
||
|
the programs use <tt>setuid</tt> or <tt>setgid</tt>)
|
||
|
</li><li>The <tt>RUNPATH</tt> in the <tt>DYNAMIC</tt> segment.<br><tt>RUNPATH</tt> can be specified when
|
||
|
the code is compiled with <tt>gcc -Wl,-rpath=...<font color="LightGreen">,--enable-new-dtags</font></tt>
|
||
|
<br>
|
||
|
One can use <a href="https://web.archive.org/web/20201202024834/http://linux.die.net/man/1/chrpath">chrpath</a>
|
||
|
tool to manipulate <tt>RPATH</tt> and <tt>RUNPATH</tt> settings.
|
||
|
</li><li><a href="https://web.archive.org/web/20201202024834/http://manpages.courier-mta.org/htmlman8/ldconfig.8.html"><tt>/etc/ld.so.cache</tt></a>
|
||
|
</li><li><tt>/lib</tt>
|
||
|
</li><li><tt>/usr/lib</tt>
|
||
|
</li></ol>
|
||
|
<p>
|
||
|
It calls <tt>_dl_map_object_from_fd</tt> (in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-load.c"><tt>elf/dl-load.c</tt></a>)
|
||
|
to load the dynamic libraries, sets up the right read/write/execute permissions for the memory segments,
|
||
|
(within <tt>_dl_map_object_from_fd</tt>, look at calls to <tt>mmap</tt>, <tt>mprotect</tt> and symbols such as
|
||
|
<tt>PROT_READ</tt>, <tt>PROT_WRITE</tt>, <tt>PROT_EXEC</tt>, <tt>PROT_NONE</tt>),
|
||
|
<b>zeroes out BSS sections of dynamic libraries</b> (inside <tt>_dl_map_object_from_fd</tt> function, look at calls to <tt>memset</tt>),
|
||
|
updates the link map, and performs relocations.</p><p>
|
||
|
It calls <tt>_dl_relocate_object</tt> (in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-reloc.c"><tt>elf/dl-reloc.c</tt></a>) to perform <b>runtime relocations</b> (see details below).
|
||
|
</p><p>
|
||
|
|
||
|
</p></li><li>When <tt>_dl_start</tt> returns, it continues to execute
|
||
|
code in <tt>_dl_start_user</tt> (see <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-machine.h"><tt>sysdeps/x86_64/dl-machine.h</tt></a>)
|
||
|
|
||
|
</li><li><tt>_dl_start_user</tt> will call <tt>_dl_init_internal</tt>, which will call <tt>call_init</tt>
|
||
|
to invoke initialization function of each dynamic library loaded.
|
||
|
<p>Note that <tt>_dl_init_internal</tt> is defined in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-init.c"><tt>elf/dl-init.c</tt></a> as:
|
||
|
</p><pre>void
|
||
|
internal_function
|
||
|
_dl_init (struct link_map *main_map, int argc, char **argv, char **env)
|
||
|
</pre>
|
||
|
<tt>call_init</tt> is also in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-init.c"><tt>elf/dl-init.c</tt></a><p>
|
||
|
|
||
|
</p></li><li>The initialization function of a dynamic library, say <tt>libfoo.so</tt>, is located at the
|
||
|
address marked with type "<tt>INIT</tt>" in the output of <tt>readelf -d libfoo.so</tt>.
|
||
|
<font color="lightgreen">For Glibc, its initialization function is named <tt>_init</tt></font> (not to be confused with the <tt>_init</tt>
|
||
|
inside the user's executable binary) and its source code is in
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/unix/sysv/linux/x86_64/init-first.c"><tt>sysdeps/unix/sysv/linux/x86_64/init-first.c</tt></a>.
|
||
|
<p><tt>_init</tt> will do the following things:
|
||
|
</p><ul>
|
||
|
<li>Save <tt>argc</tt>, <tt>argv</tt>, <tt>envp</tt> to hidden variables
|
||
|
<tt>__libc_argc</tt>, <tt>__libc_argv</tt>, <tt>__environ</tt>
|
||
|
</li><li>Call <tt>VDSO_SETUP</tt> to set up Virtual Dynamic Shared Objects (see <a href="https://web.archive.org/web/20201202024834/http://www.acsu.buffalo.edu/%7Echarngda/x86assembly.html">here</a>).
|
||
|
<tt>VDSO_SETUP</tt> is a platform-dependent macro. For x86_64, this macro is defined as
|
||
|
<tt>_libc_vdso_platform_setup</tt> in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/unix/sysv/linux/x86_64/init-first.c"><tt>sysdeps/unix/sysv/linux/x86_64/init-first.c</tt></a>
|
||
|
</li><li>Call <tt>__init_misc</tt> (in
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=misc/init-misc.c"><tt>misc/init-misc.c</tt></a>) which saves <tt>argv[0]</tt>
|
||
|
to two global variables: <tt>program_invocation_name</tt> and <tt>program_invocation_short_name</tt>
|
||
|
</li><li>Call <tt>__libc_global_ctors</tt> (in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/soinit.c"><tt>elf/soinit.c</tt></a>) to invoke each function listed in
|
||
|
the <tt>.ctors</tt> section (see below).<p>
|
||
|
For x86_64, <tt>.ctors</tt> section contains only one function: <tt>init_cacheinfo</tt></p><p>
|
||
|
</p></li></ul>
|
||
|
|
||
|
</li><li>At the end of <tt>_dl_start_user</tt>, control transfers to the user program's entry point address (use <tt>readelf -h a.out|grep Entry</tt> to see),
which is usually the initial address of the <tt>.text</tt> section and
the entry of a function named <tt>_start</tt>. In this control transfer, the finalizer function
<tt>_dl_fini</tt> is passed as an argument,
and the stack frames are completely clobbered, as if the user program
were run without any <tt>ld.so</tt> intervention. The latter is done by manipulating the stack (see the
<a href="https://web.archive.org/web/20201202024834/http://articles.manugarg.com/aboutelfauxiliaryvectors.html">on-stack auxiliary vector</a> adjustment
code and <tt>HAVE_AUX_VECTOR</tt> in <tt>dl_main</tt>).
|
||
|
</li></ul>
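<p>The library search order listed under <tt>dl_main</tt> above can be written down as a small function. This is a sketch of the list as stated on this page, not of Glibc's actual implementation: the name <tt>search_order</tt> and its parameters are hypothetical, <tt>$ORIGIN</tt> expansion is not modeled, and <tt>secure</tt> stands for a <tt>setuid</tt>/<tt>setgid</tt> program.</p>

```python
def search_order(rpath, runpath, ld_library_path, secure=False):
    """Return the library search locations in the order listed above.

    Each path argument is a colon-separated string or None; `secure` models
    a setuid/setgid program, for which LD_LIBRARY_PATH is ignored.
    """
    def split(value):
        return [p for p in (value or "").split(":") if p]

    order = []
    if not runpath:                      # RPATH counts only when RUNPATH is absent
        order += split(rpath)
    if not secure:
        order += split(ld_library_path)
    order += split(runpath)
    order += ["/etc/ld.so.cache", "/lib", "/usr/lib"]
    return order
```

<p>Note how a binary carrying both <tt>RPATH</tt> and <tt>RUNPATH</tt> ends up with <tt>LD_LIBRARY_PATH</tt> searched first, which is why <tt>RUNPATH</tt> can be overridden from the environment and <tt>RPATH</tt> cannot.</p>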
|
||
|
<p>
|
||
|
</p><center><h1>Here is the <a href="https://web.archive.org/web/20201202024834/http://www.acsu.buffalo.edu/%7Echarngda/code/gdb_callgraph/examples/callgraphLDSO.gif">call graph</a>,
|
||
|
which is worth a thousand words</h1> and see <a href="https://web.archive.org/web/20201202024834/http://www.acsu.buffalo.edu/%7Echarngda/callgraph.html">here</a>
|
||
|
on how it is generated.</center>
|
||
|
<p>
|
||
|
<font color="lightgreen">To see <tt>ld.so</tt> in action, set the environmental
|
||
|
variable <tt>LD_DEBUG</tt> to <tt>all</tt></font> and then run a user program.
|
||
|
</p><p>The above debugging information does not show <tt>mmap</tt> and <tt>mprotect</tt> calls.
|
||
|
However, we can use <tt>strace</tt>. If we run the user program again with
|
||
|
</p><pre>strace -e trace=mmap,mprotect,munmap,open a.out</pre> we should see something like the
|
||
|
following:
|
||
|
<pre>mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae62c0d1000
|
||
|
|
||
|
.... (a lot of failed attempts to open 'libpthread.so.0' using LD_LIBRARY_PATH)
|
||
|
|
||
|
<font color="LightBlue">open("/etc/ld.so.cache", O_RDONLY) = 3
|
||
|
mmap(NULL, 104801, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2ae62c0d2000</font>
|
||
|
<font color="LightCoral">open("/lib64/libpthread.so.0", O_RDONLY) = 3</font>
|
||
|
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae62c0ec000
|
||
|
<font color="LightCoral">mmap(0x3434e00000, 2204528, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3434e00000 <-- Bring in the first "LOAD" segment
|
||
|
mprotect(0x3434e16000, 2093056, PROT_NONE) = 0 <-- Make the "hole" inaccessible
|
||
|
mmap(0x3435015000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15000) = 0x3435015000 <-- Bring in the second "LOAD" segment
|
||
|
mmap(0x3435017000, 13168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3435017000</font>
|
||
|
(note: 0x3435017000 is the [anon] part which follows immediately after libpthread-2.5.so)
|
||
|
...
|
||
|
.... (a lot of failed attempts to open 'libc.so.6' using LD_LIBRARY_PATH)
|
||
|
|
||
|
<font color="Orange">open("/lib64/libc.so.6", O_RDONLY) = 3
|
||
|
mmap(0x3434200000, 3498328, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3434200000 <-- Bring in the first "LOAD" segment
|
||
|
mprotect(0x343434e000, 2093056, PROT_NONE) = 0 <-- Make the "hole" inaccessible
|
||
|
mmap(0x343454d000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14d000) = 0x343454d000 <-- Bring in the second "LOAD" segment
|
||
|
mmap(0x3434552000, 16728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3434552000</font>
|
||
|
(note: 0x3434552000 is the [anon] part which follows immediately after libc-2.5.so)
|
||
|
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae62c0ed000
|
||
|
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae62c0ee000
|
||
|
<font color="Orange">mprotect(0x343454d000, 16384, PROT_READ) = 0</font> <-- Make the GNU_RELRO segment read-only
|
||
|
<font color="LightCoral">mprotect(0x3435015000, 4096, PROT_READ) = 0</font> <-- Make the GNU_RELRO segment read-only
|
||
|
mprotect(0x343401b000, 4096, PROT_READ) = 0
|
||
|
<font color="LightBlue">munmap(0x2ae62c0d2000, 104801)= 0</font>
|
||
|
mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = 0x40dc7000
|
||
|
mprotect(0x40dc7000, 4096, PROT_NONE) = 0
|
||
|
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaaaaab000
|
||
|
</pre>
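<p>The lengths in the <tt>libpthread</tt> part of the trace above follow from page-rounding the <tt>LOAD</tt> segments. The sketch below is a simplified model of that arithmetic and not the code of <tt>_dl_map_object_from_fd</tt>; it leaves out the zero-filling of the partial last file page. The function names are made up, and the segment values at the bottom are hypothetical, chosen only so that the computed lengths match the trace (real values come from <tt>readelf -l</tt>).</p>

```python
PAGE = 4096


def page_down(value):
    return value - value % PAGE


def page_up(value):
    return page_down(value + PAGE - 1)


def plan_mappings(segments):
    """Model the mapping calls for one shared object.

    segments: (vaddr, file_offset, filesz, memsz) per LOAD segment, sorted by
    vaddr. Returns (operation, start, length, file_offset) tuples, with start
    relative to the load base and file_offset None for non-file operations.
    """
    first_start = page_down(segments[0][0])
    last_vaddr, _, _, last_memsz = segments[-1]
    # The first call reserves the whole range, up to the last byte in memory.
    plan = [("mmap file", first_start, last_vaddr + last_memsz - first_start,
             page_down(segments[0][1]))]
    previous_end = None
    for index, (vaddr, offset, filesz, memsz) in enumerate(segments):
        start = page_down(vaddr)
        file_end = page_up(vaddr + filesz)
        if previous_end is not None and start > previous_end:
            # Gap between two LOAD segments: the inaccessible "hole".
            plan.append(("mprotect none", previous_end, start - previous_end, None))
        if index > 0:
            plan.append(("mmap file fixed", start, file_end - start, page_down(offset)))
        if vaddr + memsz > file_end:
            # NOBITS tail (such as .bss) beyond the file-backed pages.
            plan.append(("mmap anon fixed", file_end, vaddr + memsz - file_end, None))
        previous_end = max(file_end, page_up(vaddr + memsz))
    return plan


# Hypothetical segment values, picked to reproduce the libpthread lengths above.
for step in plan_mappings([(0x0, 0x0, 0x15A00, 0x15A00),
                           (0x215B80, 0x15B80, 0x1000, 0x47F0)]):
    print(step)
```

<p>With these inputs the model yields a first mapping of 2204528 bytes, a hole of 2093056 bytes, a second file mapping of 8192 bytes at file offset 0x15000, and an anonymous mapping of 13168 bytes, the same four lengths as in the trace.</p>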
|
||
|
|
||
|
<h2><tt>.plt</tt> section</h2>
|
||
|
This section contains trampolines for functions defined in dynamic libraries.
|
||
|
A sample disassembly (run the command <tt>objdump -M intel -dj .plt a.out</tt>) will show the following:
|
||
|
<pre>4003c0 <<font color="lightgreen">printf@plt-0x10</font>>:
|
||
|
4003c0: push QWORD PTR [RIP+0x2004d2] # 600898 <_GLOBAL_OFFSET_TABLE_+0x8>
|
||
|
4003c6: jmp QWORD PTR [RIP+0x2004d4] # 6008a0 <_GLOBAL_OFFSET_TABLE_+0x10>
|
||
|
4003cc: nop DWORD PTR [RAX+0x0]
|
||
|
|
||
|
4003d0 <printf@plt>:
|
||
|
4003d0: jmp QWORD PTR [RIP+0x2004d2] # <font color="lightgreen">6008a8</font> <_GLOBAL_OFFSET_TABLE_+0x18>
|
||
|
4003d6: push 0
|
||
|
4003db: jmp 4003c0 <<font color="lightgreen">printf@plt-0x10</font>>
|
||
|
|
||
|
4003e0 <__libc_start_main@plt>:
|
||
|
4003e0: jmp QWORD PTR [RIP+0x2004ca] # 6008b0 <_GLOBAL_OFFSET_TABLE_+0x20>
|
||
|
4003e6: push 1
|
||
|
4003eb: jmp 4003c0 <<font color="lightgreen">printf@plt-0x10</font>>
|
||
|
</pre>
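<p>The addresses that <tt>objdump</tt> prints in the comments of the listing above can be checked by hand: a <tt>[RIP+disp]</tt> operand is relative to the address of the next instruction. A small sketch of that arithmetic follows; the function name is made up, and the 6-byte instruction length is read off the listing as the distance to the next instruction.</p>

```python
def rip_relative_target(insn_addr, insn_len, displacement):
    """Address referenced by an instruction that uses [RIP+displacement]."""
    # RIP holds the address of the *next* instruction when the operand is evaluated.
    return insn_addr + insn_len + displacement


# (instruction address, length in bytes, displacement) taken from the listing above.
for addr, length, disp in [(0x4003C0, 6, 0x2004D2), (0x4003C6, 6, 0x2004D4),
                           (0x4003D0, 6, 0x2004D2), (0x4003E0, 6, 0x2004CA)]:
    print(hex(addr), "->", hex(rip_relative_target(addr, length, disp)))
```

<p>The four results are 0x600898, 0x6008a0, 0x6008a8 and 0x6008b0, i.e. the second, third, fourth and fifth 8-byte slots of the table.</p>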
|
||
|
|
||
|
The <tt>_GLOBAL_OFFSET_TABLE_</tt> (which starts at address 0x600890; its function entries are labeled as <tt>R_X86_64_JUMP_SLOT</tt>) is located in the
<tt>.got.plt</tt> section (to see this, run the command <tt>objdump -h a.out |grep -A 1 600890</tt>
or the command <tt>readelf -r a.out</tt>).
The data in the <tt>.got.plt</tt> section look like the following <font color="lightgreen">during runtime</font>
(use gdb to see them):
|
||
|
<pre>(gdb) b *0x4003d0
|
||
|
(gdb) run
|
||
|
(gdb) x/6a 0x600890
|
||
|
0x600890: 0x6006e8 <_DYNAMIC> 0x32696159a8
|
||
|
0x6008a0: 0x326950aa20 <_dl_runtime_resolve> <font color="lightgreen">0x4003d6</font> <printf@plt+6>
|
||
|
0x6008b0: 0x326971c3f0 <__libc_start_main> 0x0
|
||
|
</pre>
|
||
|
When <tt>printf</tt> is called the first time in the user program, the
|
||
|
jump at 4003d0 will jump to <font color="lightgreen">4003d6</font>, which is just the next instruction (<tt>push 0</tt>).
Then it jumps to 4003c0, which does not have a function name (so it is
shown as <tt><printf@plt-0x10></tt>). At 4003c6, it jumps
|
||
|
to <tt>_dl_runtime_resolve</tt>. This function (in Glibc's source file
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-trampoline.S"><tt>sysdeps/x86_64/dl-trampoline.S</tt></a>)
|
||
|
is a trampoline to <tt>_dl_fixup</tt> (in Glibc's source file
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-runtime.c"><tt>elf/dl-runtime.c</tt></a>).
|
||
|
<tt>_dl_fixup</tt>, again, is part of the Glibc runtime linker <tt>ld.so</tt>. In particular, <font color="lightgreen">it will change
|
||
|
the address stored at 6008a8 to the actual
|
||
|
address of <tt>printf</tt> in <tt>libc.so.6</tt></font>. To see this, set up a
|
||
|
hardware watchpoint
|
||
|
<pre>(gdb) watch *0x6008a8
|
||
|
(gdb) cont
|
||
|
Continuing.
|
||
|
Hardware watchpoint 2: *0x6008a8
|
||
|
|
||
|
Old value = 4195286
|
||
|
New value = 1769244016
|
||
|
0x000000326950abc2 in fixup () from /lib64/ld-linux-x86-64.so.2
|
||
|
</pre>
|
||
|
If we continue execution, <tt>printf</tt> will be called, as
|
||
|
expected. When <tt>printf</tt> is called again in the user program, the
|
||
|
jump at 4003d0 will bounce directly to <tt>printf</tt>:
|
||
|
<pre>(gdb) x/6a 0x600890
|
||
|
0x600890: 0x6006e8 <_DYNAMIC> 0x32696159a8
|
||
|
0x6008a0: 0x326950aa20 <_dl_runtime_resolve> <font color="lightgreen">0x3269748570</font> <printf>
|
||
|
0x6008b0: 0x326971c3f0 <__libc_start_main> 0x0
|
||
|
</pre>
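<p>The patch-on-first-call behaviour seen in the gdb session can be mimicked with a toy model. This is only an analogy in Python, not how the PLT is implemented: the class <tt>LazyGot</tt> and its members are invented for illustration, a dictionary entry plays the role of one <tt>.got.plt</tt> slot, and <tt>_resolve</tt> stands in for <tt>_dl_runtime_resolve</tt>/<tt>_dl_fixup</tt>.</p>

```python
class LazyGot:
    """Toy model of .got.plt slots that are patched on first use."""

    def __init__(self, libraries):
        self.libraries = libraries      # name -> callable, stands in for libc.so.6
        self.slots = {}                 # resolved GOT entries
        self.resolver_calls = 0

    def _resolve(self, name):
        # Plays the role of _dl_runtime_resolve/_dl_fixup: look the symbol up
        # and overwrite the GOT slot with the real address.
        self.resolver_calls += 1
        self.slots[name] = self.libraries[name]
        return self.slots[name]

    def call(self, name, *args):
        # Plays the role of the PLT stub: jump through the GOT slot, which
        # falls back to the resolver only while the slot is still unpatched.
        target = self.slots.get(name) or self._resolve(name)
        return target(*args)
```

<p>Calling the same name twice through <tt>call</tt> goes through <tt>_resolve</tt> only the first time; the second call finds the patched slot, just as the second <tt>printf</tt> call jumps straight to the address stored at 6008a8.</p>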
|
||
|
|
||
|
<h2><tt>.init</tt>, <tt>.fini</tt>, <tt>.preinit_array</tt>, <tt>.init_array</tt> and <tt>.fini_array</tt> sections</h2>
|
||
|
<tt>.init</tt> and <tt>.fini</tt> sections contain code to do
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://download.oracle.com/docs/cd/E19963-01/html/819-0690/chapter3-8.html">initialization and termination</a>, as
|
||
|
specified by the <a href="https://web.archive.org/web/20201202024834/http://www.sco.com/developers/gabi/latest/ch4.sheader.html#special_sections">System V Application Binary Interface</a>.
|
||
|
If the code is compiled by GCC, then one will see the following code in
|
||
|
<tt>.init</tt> and <tt>.fini</tt> sections, respectively:
|
||
|
<pre>4003a8 <_init>:
|
||
|
4003a8: sub RSP, 8
|
||
|
4003ac: call call_gmon_start
|
||
|
4003b1: call frame_dummy
|
||
|
4003b6: call __do_global_ctors_aux
|
||
|
4003bb: add RSP, 8
|
||
|
4003bf: ret
|
||
|
|
||
|
400618 <_fini>:
|
||
|
400618: sub RSP, 8
|
||
|
40061c: call __do_global_dtors_aux
|
||
|
400621: add RSP, 8
|
||
|
400625: ret
|
||
|
</pre>
|
||
|
|
||
|
There is only one function: <tt>_init</tt>, in <tt>.init</tt> section, and
|
||
|
likewise, only one function: <tt>_fini</tt> in <tt>.fini</tt> section.
|
||
|
Both <tt>_init</tt> and <tt>_fini</tt> are <b>synthesized</b> at compile time
|
||
|
by the compiler/linker. Glibc
|
||
|
provides its own prolog and epilog for <tt>_init</tt> and <tt>_fini</tt>, but
|
||
|
the compiler is free to choose how to use them and add more code into <tt>_init</tt>
|
||
|
and <tt>_fini</tt>.
|
||
|
<p>
|
||
|
In Glibc, the source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/generic/initfini.c"><tt>sysdeps/generic/initfini.c</tt></a>
|
||
|
(and some system dependent ones, such as <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/elf/initfini.c"><tt>sysdeps/x86_64/elf/initfini.c</tt></a>)
|
||
|
is compiled into two files: <tt>/usr/lib64/crti.o</tt> for prolog
|
||
|
and <tt>/usr/lib64/crtn.o</tt> for epilog.
|
||
|
</p><p>
|
||
|
For the compiler part, GCC uses different prolog and epilog files, depending
|
||
|
on the compiler command-line options. To see them, execute <tt>gcc -dumpspecs</tt>,
|
||
|
and one can see
|
||
|
</p><pre>...
|
||
|
|
||
|
*endfile:
|
||
|
%{ffast-math|funsafe-math-optimizations:crtfastmath.o%s}
|
||
|
%{mpc32:crtprec32.o%s}
|
||
|
%{mpc64:crtprec64.o%s}
|
||
|
%{mpc80:crtprec80.o%s}
|
||
|
%{shared|pie:crtendS.o%s;:crtend.o%s}
|
||
|
crtn.o%s
|
||
|
|
||
|
...
|
||
|
|
||
|
*startfile:
|
||
|
%{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}}
|
||
|
crti.o%s
|
||
|
%{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}
|
||
|
|
||
|
...
|
||
|
</pre>
|
||
|
The detailed explanation of GCC spec file is <a href="https://web.archive.org/web/20201202024834/http://gcc.gnu.org/onlinedocs/gcc/Spec-Files.html">here</a>.
|
||
|
The above snippet means, for example: if the compiler command-line
|
||
|
option <tt>-ffast-math</tt> is used, include GCC's <tt>crtfastmath.o</tt>
|
||
|
file (this file can be found under <tt>/usr/lib/gcc/<arch>/<version>/</tt>)
|
||
|
at the end of the linking process. Glibc's <tt>crtn.o</tt> is always
|
||
|
included at the end of linking. The <tt>%s</tt> means the preceding file is a startup file that GCC searches for in its standard directories. (GCC allows one
|
||
|
to skip startup files during linking with the <tt>-nostartfiles</tt> compiler option.)
|
||
|
<p>Similarly, if <tt>-shared</tt> compiler command-line option is not used,
|
||
|
then include Glibc's <tt>crt1.o</tt> (or <tt>gcrt1.o</tt> when profiling, or <tt>Scrt1.o</tt> for <tt>-pie</tt>) at the start of the linking process.
|
||
|
<tt>crt1.o</tt> contains the function <tt>_start</tt> in <tt>.text</tt> section (not <tt>.init</tt> section!)
|
||
|
<tt>_start</tt> is the <font color="lightgreen">function that is executed before anything else</font>... see below.
|
||
|
Next, include Glibc's <tt>crti.o</tt> in the linking. Finally, include either
|
||
|
<tt>crtbeginT.o</tt>, <tt>crtbeginS.o</tt>, or <tt>crtbegin.o</tt> (all three are part of GCC, of course), depending on
|
||
|
whether <tt>-static</tt> or <tt>-shared</tt> (or neither) is used.
|
||
|
</p><p>
|
||
|
So, for example, if a program is compiled using dynamic linking (which is default), no profiling, no fast
|
||
|
math optimizations, then the linking will include the following files in the following order:
|
||
|
</p><ol>
|
||
|
<li><tt>crt1.o</tt> (part of Glibc)
|
||
|
</li><li><tt>crti.o</tt> (part of Glibc and contributes the code at 4003a8, 4003ac, 400618, and the body of <tt>call_gmon_start</tt>)
|
||
|
</li><li><tt>crtbegin.o</tt> (part of GCC and contributes the code at 4003b1 and 40061c, and the body of <tt>frame_dummy</tt> and <tt>__do_global_dtors_aux</tt>)
|
||
|
</li><li>user's code
|
||
|
</li><li><tt>crtend.o</tt> (part of GCC and contributes the code at 4003b6 and the body of <tt>__do_global_ctors_aux</tt>)
|
||
|
</li><li><tt>crtn.o</tt> (part of Glibc and contributes the code at 4003bb, 4003bf, 400621, 400625)
|
||
|
</li></ol>
|
||
|
Why is <tt>__do_global_ctors_aux</tt> in <tt>crtend*.o</tt> while <tt>__do_global_dtors_aux</tt>
|
||
|
is in <tt>crtbegin*.o</tt>? Recall that the order of invocation of destructors should be the reverse of the order
|
||
|
of invocation of constructors. This placement ensures that <tt>__do_global_ctors_aux</tt> is called
|
||
|
as late as possible in <tt>.init</tt> section and <tt>__do_global_dtors_aux</tt> is called
|
||
|
as early as possible in <tt>.fini</tt> section.
|
||
|
<p>
|
||
|
Now back to the <tt>4003a8 <_init></tt>.
|
||
|
</p><p>
|
||
|
<tt>call_gmon_start</tt> is part of the Glibc prolog <tt>/usr/lib64/crti.o</tt>.
|
||
|
It initializes <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/binutils/docs/gprof/">gprof</a> related
|
||
|
data structures.
|
||
|
</p><p>
|
||
|
<tt>frame_dummy</tt> is in GCC code <a href="https://web.archive.org/web/20201202024834/http://gcc.gnu.org/viewcvs/trunk/gcc/crtstuff.c?view=markup"><tt>gcc/crtstuff.c</tt></a> and it
|
||
|
is used to set up exception handling and Java class registration (JCR) information.
|
||
|
</p><p>
|
||
|
The most interesting code is <tt>__do_global_ctors_aux</tt> (in
|
||
|
GCC's <a href="https://web.archive.org/web/20201202024834/http://gcc.gnu.org/viewcvs/trunk/gcc/crtstuff.c?view=markup"><tt>gcc/crtstuff.c</tt></a> and
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://gcc.gnu.org/viewcvs/trunk/gcc/gbl-ctors.h?view=markup"><tt>gcc/gbl-ctors.h</tt></a>). What it does
|
||
|
is to call functions which are marked as
|
||
|
<tt>__attribute__ ((constructor))</tt> (and static C++ objects' constructors) one by one:
|
||
|
</p><pre> __SIZE_TYPE__ nptrs = (__SIZE_TYPE__) __CTOR_LIST__[0];
|
||
|
unsigned i;
|
||
|
|
||
|
if (nptrs == (__SIZE_TYPE__)-1)
|
||
|
for (nptrs = 0; __CTOR_LIST__[nptrs + 1] != 0; nptrs++);
|
||
|
|
||
|
for (i = nptrs; i >= 1; i--)
|
||
|
__CTOR_LIST__[i] ();
|
||
|
</pre>
|
||
|
The array <tt>__CTOR_LIST__</tt> is stored in a special section called <tt>.ctors</tt>.
|
||
|
Suppose a function called <tt>foo</tt> is marked as <tt>__attribute__ ((constructor))</tt>,
|
||
|
then the runtime call stack trace would be
|
||
|
<pre>(gdb) break foo
|
||
|
(gdb) run
|
||
|
(gdb) bt
|
||
|
#0 0x00000000004004d8 in foo ()
|
||
|
#1 0x0000000000400606 in __do_global_ctors_aux ()
|
||
|
#2 0x00000000004003bb in _init ()
|
||
|
#3 0x00000000004005a0 in ?? ()
|
||
|
#4 0x0000000000400561 in <font color="lightgreen">__libc_csu_init</font> ()
|
||
|
#5 0x000000326971c46f in __libc_start_main ()
|
||
|
#6 0x000000000040041a in _start ()
|
||
|
</pre>
|
||
|
Similarly, the <tt>__do_global_dtors_aux</tt> in <tt>_fini</tt> function
|
||
|
will invoke all functions which are marked as
|
||
|
<tt>__attribute__ ((destructor))</tt>. <tt>__do_global_dtors_aux</tt> code is also
|
||
|
in GCC's source tree at <a href="https://web.archive.org/web/20201202024834/http://gcc.gnu.org/viewcvs/trunk/gcc/crtstuff.c?view=markup"><tt>gcc/crtstuff.c</tt></a>. If
|
||
|
a function called <tt>foo</tt> is marked as <tt>__attribute__ ((destructor))</tt>
|
||
|
(and static C++ objects' destructors), then the runtime call stack trace would be
|
||
|
<pre>(gdb) bt
|
||
|
#0 0x0000000000400518 in foo ()
|
||
|
#1 0x00000000004004ca in __do_global_dtors_aux ()
|
||
|
#2 0x0000000000400641 in _fini ()
|
||
|
#3 0x00000032699367e8 in ?? () from /lib64/tls/libc.so.6
|
||
|
#4 0x0000003269730c95 in exit () from /lib64/tls/libc.so.6
|
||
|
#5 0x000000326971c4d2 in __libc_start_main () from /lib64/tls/libc.so.6
|
||
|
#6 0x000000000040045a in _start ()
|
||
|
</pre>
|
||
|
The array <tt>__DTOR_LIST__</tt> contains the addresses of these destructors
|
||
|
and it is stored in a special section called <tt>.dtors</tt>.
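<p>The list walk in <tt>__do_global_ctors_aux</tt> can be tried out on an ordinary array. The sketch below is an illustration under assumptions, not GCC's code: <tt>fake_ctor_list</tt>, <tt>run_ctors</tt> and the three <tt>ctor_*</tt> functions are made-up names, and the array only mimics the layout of <tt>__CTOR_LIST__</tt> (a leading count or -1, the entries, then a null terminator).</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*func_ptr)(void);

/* Hypothetical log so the call order can be inspected afterwards. */
char call_log[8];
size_t call_count = 0;

static void ctor_a(void) { call_log[call_count++] = 'a'; }
static void ctor_b(void) { call_log[call_count++] = 'b'; }
static void ctor_c(void) { call_log[call_count++] = 'c'; }

/* Same layout as __CTOR_LIST__: [0] is a count or -1, then entries, then NULL. */
func_ptr fake_ctor_list[] = { (func_ptr)-1, ctor_a, ctor_b, ctor_c, NULL };

/* Mirrors the loop quoted above: count the entries if needed, then call them
   from the last entry down to index 1. */
void run_ctors(func_ptr *list) {
    uintptr_t nptrs = (uintptr_t)list[0];
    if (nptrs == (uintptr_t)-1)
        for (nptrs = 0; list[nptrs + 1] != NULL; nptrs++)
            ;
    for (uintptr_t i = nptrs; i >= 1; i--)
        list[i]();
}
```

<p>Calling <tt>run_ctors(fake_ctor_list)</tt> from a <tt>main</tt> of your own leaves <tt>c</tt>, <tt>b</tt>, <tt>a</tt> in <tt>call_log</tt>: the entry placed last in the list runs first.</p>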
|
||
|
|
||
|
<h2>What user functions will be executed before <tt>main</tt> and at program exit? </h2>
|
||
|
As the above call stack trace shows, <tt>_init</tt> is NOT the only function to be called before <tt>main</tt>.
|
||
|
It is <tt>__libc_csu_init</tt> function (in Glibc's source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=csu/elf-init.c"><tt>csu/elf-init.c</tt></a>)
|
||
|
that determines which functions are run before <tt>main</tt>
|
||
|
and the order in which they run. Its code looks like this:
|
||
|
<pre> void __libc_csu_init (int argc, char **argv, char **envp)
|
||
|
{
|
||
|
#ifndef LIBC_NONSHARED
|
||
|
{
|
||
|
const size_t size = __preinit_array_end - __preinit_array_start;
|
||
|
size_t i;
|
||
|
for (i = 0; i < size; i++)
|
||
|
<font color="lightgreen">(*__preinit_array_start [i]) (argc, argv, envp)</font>;
|
||
|
}
|
||
|
#endif
|
||
|
|
||
|
<font color="lightgreen">_init ()</font>;
|
||
|
|
||
|
const size_t size = __init_array_end - __init_array_start;
|
||
|
for (size_t i = 0; i < size; i++)
|
||
|
<font color="lightgreen">(*__init_array_start [i]) (argc, argv, envp)</font>;
|
||
|
}
|
||
|
</pre>
|
||
|
(Symbols such as <tt>__preinit_array_start</tt>, <tt>__preinit_array_end</tt>, <tt>__init_array_start</tt>,
|
||
|
<tt>__init_array_end</tt> are defined by the default <tt>ld</tt> script;
|
||
|
look for <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/binutils/docs/ld/PROVIDE.html"><tt>PROVIDE</tt>
|
||
|
and <tt>PROVIDE_HIDDEN</tt> keywords</a> in the output of <tt>ld -verbose</tt> command.)
|
||
|
<p>
|
||
|
The <tt>__libc_csu_fini</tt> function has similar code, but what
|
||
|
functions are executed at program exit is actually determined by <tt>exit</tt>:
|
||
|
</p><pre> void __libc_csu_fini (void)
|
||
|
{
|
||
|
#ifndef LIBC_NONSHARED
|
||
|
size_t i = __fini_array_end - __fini_array_start;
|
||
|
while (i-- > 0)
|
||
|
(*__fini_array_start [i]) ();
|
||
|
|
||
|
<font color="lightgreen">_fini ()</font>;
|
||
|
#endif
|
||
|
}
|
||
|
</pre>
|
||
|
<p>
|
||
|
To see what's going on, consider the following C code example:
|
||
|
</p><pre> #include <stdio.h>
|
||
|
#include <stdlib.h>
|
||
|
|
||
|
void preinit(<font color="lightgreen">int argc, char **argv, char **envp</font>) {
|
||
|
printf("%s\n", __FUNCTION__);
|
||
|
}
|
||
|
|
||
|
void init(<font color="lightgreen">int argc, char **argv, char **envp</font>) {
|
||
|
printf("%s\n", __FUNCTION__);
|
||
|
}
|
||
|
|
||
|
void fini() {
|
||
|
printf("%s\n", __FUNCTION__);
|
||
|
}
|
||
|
|
||
|
<font color="lightgreen">__attribute__((section(".init_array")))</font> typeof(init) *__init = init;
|
||
|
<font color="lightgreen">__attribute__((section(".preinit_array")))</font> typeof(preinit) *__preinit = preinit;
|
||
|
<font color="lightgreen">__attribute__((section(".fini_array")))</font> typeof(fini) *__fini = fini;
|
||
|
|
||
|
void <font color="lightgreen">__attribute__ ((constructor))</font> constructor() {
|
||
|
printf("%s\n", __FUNCTION__);
|
||
|
}
|
||
|
|
||
|
void <font color="lightgreen">__attribute__ ((destructor))</font> destructor() {
|
||
|
printf("%s\n", __FUNCTION__);
|
||
|
}
|
||
|
|
||
|
void <font color="lightgreen">my_atexit</font>() {
|
||
|
printf("%s\n", __FUNCTION__);
|
||
|
}
|
||
|
|
||
|
void <font color="lightgreen">my_atexit2</font>() {
|
||
|
printf("%s\n", __FUNCTION__);
|
||
|
}
|
||
|
|
||
|
int main() {
|
||
|
<font color="lightgreen">atexit(my_atexit)</font>;
|
||
|
<font color="lightgreen">atexit(my_atexit2)</font>;
|
||
|
}
|
||
|
</pre>
|
||
|
The output will be
|
||
|
<pre> preinit
|
||
|
constructor
|
||
|
init
|
||
|
my_atexit2
|
||
|
my_atexit
|
||
|
fini
|
||
|
destructor
|
||
|
</pre>
|
||
|
The <tt>.preinit_array</tt> and <tt>.init_array</tt> sections must contain
|
||
|
<b>function pointers</b> (NOT code!) The prototype of these functions must be <pre>void func(int argc,char** argv,char** envp)</pre>
|
||
|
<tt>__libc_csu_init</tt> executes them in the following order:
|
||
|
<ol>
|
||
|
<li>Function pointers in <tt>.preinit_array</tt> section
|
||
|
</li><li>Functions marked as <tt>__attribute__ ((constructor))</tt>, via <tt>_init</tt>
|
||
|
</li><li>Function pointers in <tt>.init_array</tt> section
|
||
|
</li></ol>
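<p>A variant of the example that records into a buffer instead of printing makes the pre-<tt>main</tt> activity easy to inspect from a debugger or from <tt>main</tt> itself. This is a sketch assuming GCC or Clang on Linux with Glibc; <tt>startup_log</tt>, <tt>ran_before_main</tt> and the <tt>log_*</tt> functions are hypothetical names. Newer toolchains typically place <tt>__attribute__ ((constructor))</tt> functions in <tt>.init_array</tt> rather than <tt>.ctors</tt>, so the relative order of the constructor and the explicit <tt>.init_array</tt> entry may differ from the list above, and the sketch does not rely on it.</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical log, filled in before main starts. */
const char *startup_log[8];
int startup_count = 0;

static void log_preinit(int argc, char **argv, char **envp) {
    (void)argc; (void)argv; (void)envp;
    startup_log[startup_count++] = "preinit";
}

static void log_init(int argc, char **argv, char **envp) {
    (void)argc; (void)argv; (void)envp;
    startup_log[startup_count++] = "init_array";
}

__attribute__((constructor)) static void log_ctor(void) {
    startup_log[startup_count++] = "constructor";
}

/* The sections hold function pointers, not code. */
__attribute__((section(".preinit_array"), used))
static void (*preinit_entry)(int, char **, char **) = log_preinit;

__attribute__((section(".init_array"), used))
static void (*init_entry)(int, char **, char **) = log_init;

/* Returns 1 if the named step was logged, 0 otherwise. */
int ran_before_main(const char *name) {
    for (int i = 0; i < startup_count; i++)
        if (strcmp(startup_log[i], name) == 0)
            return 1;
    return 0;
}
```

<p>Linked into an executable, <tt>ran_before_main("constructor")</tt> and <tt>ran_before_main("init_array")</tt> should already return 1 on the first line of <tt>main</tt>. Under Glibc the <tt>"preinit"</tt> entry is expected to be logged as well, but C libraries differ in their support for <tt>.preinit_array</tt>.</p>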
|
||
|
|
||
|
The <tt>.fini_array</tt> section must also contain <b>function pointers</b>
|
||
|
and the prototype is like the destructor, i.e. taking no arguments and returning void. If the program exits <b>normally</b>, then
|
||
|
the <tt>exit</tt> function (Glibc source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=stdlib/exit.c"><tt>stdlib/exit.c</tt></a>) is called and it
|
||
|
will do the following:
|
||
|
<ol>
|
||
|
<li>In reverse order, functions registered via <tt>atexit</tt> or <tt>on_exit</tt>
|
||
|
</li><li>Function pointers in <tt>.fini_array</tt> section, via <tt>__libc_csu_fini</tt>
|
||
|
</li><li>Functions marked as <tt>__attribute__ ((destructor))</tt>, via <tt>__libc_csu_fini</tt> (which calls <tt>_fini</tt> after Step 2)
|
||
|
</li><li>stdio cleanup functions
|
||
|
</li></ol>
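<p>The reverse-order rule in Step 1 is a plain last-in, first-out table. The sketch below is a toy model, not Glibc's <tt>stdlib/exit.c</tt>; <tt>toy_atexit</tt>, <tt>toy_run_exit_handlers</tt> and <tt>toy_demo</tt> are invented names. It also suggests why <tt>__libc_csu_fini</tt> runs after the user's <tt>atexit</tt> handlers: it is registered before <tt>main</tt> starts, so it sits at the bottom of the table.</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 32
typedef void (*handler_t)(void);

static handler_t handlers[MAX_HANDLERS];
static size_t handler_count = 0;

/* Register a handler; returns 0 on success, -1 when the table is full. */
int toy_atexit(handler_t fn) {
    if (handler_count == MAX_HANDLERS)
        return -1;
    handlers[handler_count++] = fn;
    return 0;
}

/* Run the handlers last-registered-first and empty the table. */
void toy_run_exit_handlers(void) {
    while (handler_count > 0)
        handlers[--handler_count]();
}

/* Hypothetical log of the order in which the handlers ran. */
char exit_log[8];
size_t exit_log_len = 0;

static void registered_first(void)  { exit_log[exit_log_len++] = '1'; }
static void registered_second(void) { exit_log[exit_log_len++] = '2'; }

void toy_demo(void) {
    toy_atexit(registered_first);
    toy_atexit(registered_second);
    toy_run_exit_handlers();
}
```

<p>After <tt>toy_demo()</tt>, <tt>exit_log</tt> holds <tt>2</tt> then <tt>1</tt>, matching the <tt>my_atexit2</tt>, <tt>my_atexit</tt> order in the program output above.</p>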
|
||
|
<p>
|
||
|
It is <font color="lightgreen">not advisable</font> to put code in the <tt>.init</tt> section, e.g.
|
||
|
</p><pre>void __attribute__((section(".init"))) foo() {
|
||
|
...
|
||
|
}
|
||
|
</pre>
|
||
|
because doing so will cause <tt>__do_global_ctors_aux</tt> NOT to be called. The <tt>.init</tt>
|
||
|
section will now look like this:
|
||
|
<pre>4003a0 <_init>:
|
||
|
4003a0: sub RSP, 8
|
||
|
4003a4: call call_gmon_start
|
||
|
4003a9: call frame_dummy
|
||
|
|
||
|
4003ae <foo>:
|
||
|
4003ae: push RBP
|
||
|
4003af: mov RBP, RSP
|
||
|
|
||
|
.... (foo's body)
|
||
|
|
||
|
4003b2: leave
|
||
|
4003b3: <font color="lightgreen">ret</font>
|
||
|
4003b4: call __do_global_ctors_aux
|
||
|
4003b9: add RSP, 8
|
||
|
4003bd: ret
|
||
|
</pre>
|
||
|
<p>
|
||
|
Now <tt>.init</tt> section contains more than one function, but the
|
||
|
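epilog of <tt>_init</tt> is distorted by the insertion of <tt>foo</tt>: the <tt>ret</tt> of <tt>foo</tt> executes before <tt>call __do_global_ctors_aux</tt> is ever reached.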
|
||
|
</p><p>
|
||
|
Similarly, it is <font color="lightgreen">not advisable</font> to put code in the <tt>.fini</tt> section,
|
||
|
because otherwise the code will look like this:
|
||
|
</p><pre>4006d8 <_fini>:
|
||
|
4006d8: sub RSP, 8
|
||
|
4006dc: call __do_global_dtors_aux
|
||
|
|
||
|
4006e1 <foo>:
|
||
|
4006e1: push RBP
|
||
|
4006e2: mov RBP, RSP
|
||
|
|
||
|
.... (foo's body)
|
||
|
|
||
|
4006ef: leave
|
||
|
4006f0: <font color="lightgreen">ret</font>
|
||
|
4006f1: add RSP, 8
|
||
|
4006f5: ret
|
||
|
</pre>
|
||
|
Now the epilog of <tt>_fini</tt> is distorted by the insertion of <tt>foo</tt>, so
|
||
|
the stack pointer will not be adjusted (<tt>add RSP, 8</tt> is not executed),
|
||
|
causing a segmentation fault.
|
||
|
|
||
|
<h2>What do <tt>_start</tt> and <tt>__libc_start_main</tt> do? </h2>
|
||
|
The above call stack traces show that
|
||
|
<tt>_start</tt> calls <tt>__libc_start_main</tt>, which runs
|
||
|
all of the code before <tt>main</tt>.
|
||
|
<p>
|
||
|
<tt>_start</tt> is part of Glibc code, as in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/elf/start.S"><tt>sysdeps/x86_64/elf/start.S</tt></a>.
|
||
|
As mentioned earlier, it is compiled as <tt>/usr/lib64/crt1.o</tt> and is statically linked to
|
||
|
user's executable binary during compilation. To see this, run gcc with the <tt>-v</tt> option, and
|
||
|
the last line would be something like:
|
||
|
</p><pre>.../collect2 ... /usr/lib64/crt1.o /usr/lib64/crti.o ... /usr/lib64/crtn.o
|
||
|
</pre>
|
||
|
<tt>_start</tt> is always placed <font color="lightgreen">at the beginning of <tt>.text</tt> section, and
|
||
|
the default <tt>ld</tt> script specifies
|
||
|
"Entry point address" (in ELF header, use <tt>readelf -h a.out | grep Entry</tt> command to see)
|
||
|
to be the address of <tt>_start</tt> (use <tt>ld -verbose | grep ENTRY</tt> command to see), so
|
||
|
<tt>_start</tt> is guaranteed to
|
||
|
be run before anything else</font>. (This can be changed, however: at link time
|
||
|
one can specify a different initial address
|
||
|
by <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/binutils/docs/ld/Entry-Point.html"><tt>-e</tt> option</a>)
|
||
|
<p>
|
||
|
<tt>_start</tt> does only one thing: It sets up the arguments needed by
|
||
|
<tt>__libc_start_main</tt> and then calls it.
|
||
|
|
||
|
<tt>__libc_start_main</tt>'s source code is <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=csu/libc-start.c"><tt>csu/libc-start.c</tt></a>
|
||
|
and its function prototype is:
|
||
|
</p><pre>__libc_start_main (int (*main) (int, char **, char **),
|
||
|
int argc,
|
||
|
                   char **argv,
|
||
|
int (*init) (int, char **, char **),
|
||
|
void (*fini) (void),
|
||
|
void (*rtld_fini) (void),
|
||
|
void *stack_end)
|
||
|
|
||
|
</pre>
|
||
|
<tt>__libc_start_main</tt> does quite a lot of work in
|
||
|
addition to kicking off <tt>__libc_csu_init</tt>:
|
||
|
<ol>
|
||
|
<li>Set up <tt>argv</tt> and <tt>envp</tt>
|
||
|
<!--
|
||
|
<li>Set up dynamic linker's flags (<tt>_dl_aux_init</tt> in <tt>elf/dl-support.c</tt>)
|
||
|
-->
|
||
|
</li><li>Initialize the thread local storage by calling <tt>__pthread_initialize_minimal</tt> (which
|
||
|
only calls <tt>__libc_setup_tls</tt>).<p><tt>__libc_setup_tls</tt> will initialize Thread Control Block
|
||
|
and Dynamic Thread Vector.
|
||
|
</p></li><li>Set up the thread stack guard
|
||
|
</li><li>Register the destructor (i.e. the <tt>rtld_fini</tt> argument passed to <tt>__libc_start_main</tt>)
|
||
|
of the dynamic linker (by calling <tt>__cxa_atexit</tt>) if there is any
|
||
|
</li><li>Initialize Glibc itself by calling <tt>__libc_init_first</tt>
|
||
|
</li><li>Register <tt>__libc_csu_fini</tt> (i.e. the <tt>fini</tt> argument passed to <tt>__libc_start_main</tt>)
|
||
|
using <tt>__cxa_atexit</tt>
|
||
|
</li><li><font color="lightgreen">Call <tt>__libc_csu_init</tt></font> (i.e. the <tt>init</tt> argument
|
||
|
passed to <tt>__libc_start_main</tt>)
|
||
|
<ol>
|
||
|
<li>Call function pointers in <tt>.preinit_array</tt> section
|
||
|
</li><li>Execute the code in <tt>.init</tt> section, which is usually <tt>_init</tt> function.
|
||
|
What <tt>_init</tt> function does is <font color="lightgreen">compiler-specific</font>.
|
||
|
For GCC, <tt>_init</tt> executes user functions marked as <tt>__attribute__ ((constructor))</tt>
|
||
|
(in <tt>__do_global_ctors_aux</tt>)
|
||
|
</li><li>Call function pointers in <tt>.init_array</tt> section
|
||
|
</li></ol>
|
||
|
</li><li>Set up data structures needed for thread unwinding/cancellation
|
||
|
</li><li><font color="lightgreen">Call <tt>main</tt></font> of user's program.
|
||
|
</li><li><font color="lightgreen">Call <tt>exit</tt></font>
|
||
|
</li></ol>
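<p>Stripped of TLS, stack-guard and dynamic-linker details, the core of this sequence is "run init, run <tt>main</tt>, hand its return value to <tt>exit</tt>". The sketch below is a simplified model with invented names (<tt>toy_libc_start_main</tt>, <tt>toy_start</tt>, <tt>toy_init</tt>, <tt>toy_fini</tt>, <tt>toy_main</tt>); unlike the real function it calls <tt>fini</tt> directly and returns the status instead of calling <tt>exit</tt>, so that the result can be inspected.</p>

```c
#include <assert.h>
#include <stddef.h>

typedef int  (*main_fn)(int, char **, char **);
typedef void (*init_fn)(int, char **, char **);
typedef void (*fini_fn)(void);

/* Hypothetical log of the order in which the pieces ran. */
char seq_log[8];
size_t seq_len = 0;

static void toy_init(int argc, char **argv, char **envp) {
    (void)argc; (void)argv; (void)envp;
    seq_log[seq_len++] = 'i';
}

static void toy_fini(void) { seq_log[seq_len++] = 'f'; }

static int toy_main(int argc, char **argv, char **envp) {
    (void)argc; (void)argv; (void)envp;
    seq_log[seq_len++] = 'm';
    return 7;
}

/* Simplified stand-in for __libc_start_main. */
int toy_libc_start_main(main_fn user_main, int argc, char **argv,
                        init_fn init, fini_fn fini) {
    char **envp = &argv[argc + 1];   /* envp follows argv's NULL terminator */
    if (init)
        init(argc, argv, envp);      /* the __libc_csu_init step */
    int status = user_main(argc, argv, envp);
    if (fini)
        fini();                      /* the real code lets exit run this */
    return status;                   /* the real code calls exit(status) */
}

/* Stand-in for _start: build the arguments, then make the call. */
int toy_start(void) {
    static char arg0[] = "prog";
    static char *stack_image[] = { arg0, NULL, NULL };  /* argv[0], end of argv, end of envp */
    return toy_libc_start_main(toy_main, 1, stack_image, toy_init, toy_fini);
}
```

<p><tt>toy_start()</tt> returns 7, the value <tt>toy_main</tt> returned, and leaves <tt>i</tt>, <tt>m</tt>, <tt>f</tt> in <tt>seq_log</tt>.</p>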
|
||
|
|
||
|
So if the last line of user program's <tt>main</tt> is <tt>return XX</tt>,
|
||
|
then the <tt>XX</tt> will be passed to <tt>exit</tt> at Step #10 above. If
|
||
|
the last line is not <tt>return XX</tt> or is simply <tt>return</tt>, then
|
||
|
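the value passed to <tt>exit</tt> would be undefined (in C99 and later, reaching the end of <tt>main</tt> without a <tt>return</tt> is equivalent to <tt>return 0</tt>).<p>Of course, if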
|
||
|
the user program calls <tt>exit</tt> directly (unlike <tt>abort</tt>, which terminates the process without running these handlers), then <tt>exit</tt>
|
||
|
will get called.
|
||
|
</p><center><h1>Here is the <a href="https://web.archive.org/web/20201202024834/http://www.acsu.buffalo.edu/%7Echarngda/code/gdb_callgraph/examples/callgraphEmpty.gif">call graph</a>,
|
||
|
which is worth a thousand words</h1> and see <a href="https://web.archive.org/web/20201202024834/http://www.acsu.buffalo.edu/%7Echarngda/callgraph.html">here</a>
|
||
|
on how it is generated.</center>
|
||
|
<p>
|
||
|
If one tries to build a program which does not contain <tt>main</tt>, then one should see the following error:
|
||
|
</p><pre>/usr/lib/crt1.o: In function `_start': (<font color="lightgreen">.text+0x20</font>): undefined reference to `main'
|
||
|
collect2: ld returned 1 exit status
|
||
|
</pre>
|
||
|
As mentioned earlier, <tt>crt1.o</tt> (part of Glibc) contains the function
|
||
|
<tt>_start</tt>, which will call
|
||
|
<tt>__libc_start_main</tt> and pass <tt>main</tt> (a function pointer) as one of the arguments.
|
||
|
If one uses
|
||
|
<pre>nm -u /usr/lib/crt1.o
|
||
|
</pre>
|
||
|
then it will show <tt>main</tt> is an undefined symbol in <tt>crt1.o</tt>. Now let's disassemble
|
||
|
<tt>crt1.o</tt>:
|
||
|
<pre>$ objdump -M intel -dj .text /usr/lib/crt1.o
|
||
|
|
||
|
crt1.o: file format elf64-x86-64
|
||
|
|
||
|
Disassembly of section .text:
|
||
|
|
||
|
0000000000000000 <_start>:
|
||
|
0: 31 ed xor ebp,ebp
|
||
|
2: 49 89 d1 mov r9,rdx
|
||
|
5: 5e pop rsi
|
||
|
6: 48 89 e2 mov rdx,rsp
|
||
|
9: 48 83 e4 f0 and rsp,0xfffffffffffffff0
|
||
|
d: 50 push rax
|
||
|
e: 54 push rsp
|
||
|
f: 49 c7 c0 00 00 00 00 mov r8,0x0
|
||
|
16: 48 c7 c1 00 00 00 00 mov rcx,0x0
|
||
|
1d: 48 c7 c7 <font color="lightgreen">00 00 00 00</font> mov rdi,0x0
|
||
|
24: e8 00 00 00 00 call 29 <_start+0x29>
|
||
|
29: f4 hlt
|
||
|
...
|
||
|
</pre>
|
||
|
Above shows <font color="lightgreen">.text+0x20</font> refers to
|
||
|
the 4-byte immediate operand of a <tt>mov</tt> instruction. This means during the
|
||
|
linking, the address of <tt>main</tt> should be resolved
|
||
|
and then inserted at the right memory location: .text+0x20. Now let's cross reference
|
||
|
the relocation table:
|
||
|
<pre>$ readelf -r /usr/lib/crt1.o
|
||
|
|
||
|
Relocation section '.rela.text' at offset 0x410 contains 4 entries:
|
||
|
Offset Info Type Sym. Value Sym. Name + Addend
|
||
|
000000000012 00090000000b R_X86_64_32S 0000000000000000 __libc_csu_fini + 0
|
||
|
000000000019 000b0000000b R_X86_64_32S 0000000000000000 __libc_csu_init + 0
|
||
|
<font color="lightgreen">000000000020</font> 000c0000000b R_X86_64_32S 0000000000000000 main + 0
|
||
|
000000000025 000f00000002 R_X86_64_PC32 0000000000000000 __libc_start_main - 4
|
||
|
</pre>
|
||
|
Above shows where <font color="lightgreen">0x20</font> comes from.
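<p>What the link-editor does with two of these entries can be sketched in C. The byte array is the <tt>_start</tt> code from the disassembly above; everything else is assumed for illustration: the helper names, the load address 0x400400 for <tt>_start</tt>, 0x4004c4 for <tt>main</tt> and 0x4003e0 for the PLT entry of <tt>__libc_start_main</tt>. <tt>R_X86_64_32S</tt> stores S + A, and <tt>R_X86_64_PC32</tt> stores S + A - P, where S is the symbol value, A the addend and P the address of the patched field.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 0x2a bytes of _start from crt1.o, with all relocated fields still zero. */
unsigned char start_code[] = {
    0x31, 0xed,                               /* 00: xor ebp,ebp  */
    0x49, 0x89, 0xd1,                         /* 02: mov r9,rdx   */
    0x5e,                                     /* 05: pop rsi      */
    0x48, 0x89, 0xe2,                         /* 06: mov rdx,rsp  */
    0x48, 0x83, 0xe4, 0xf0,                   /* 09: and rsp,-16  */
    0x50,                                     /* 0d: push rax     */
    0x54,                                     /* 0e: push rsp     */
    0x49, 0xc7, 0xc0, 0x00, 0x00, 0x00, 0x00, /* 0f: mov r8,imm   */
    0x48, 0xc7, 0xc1, 0x00, 0x00, 0x00, 0x00, /* 16: mov rcx,imm  */
    0x48, 0xc7, 0xc7, 0x00, 0x00, 0x00, 0x00, /* 1d: mov rdi,imm  */
    0xe8, 0x00, 0x00, 0x00, 0x00,             /* 24: call rel32   */
    0xf4                                      /* 29: hlt          */
};

/* Store a 32-bit little-endian value at buf[off..off+3]. */
static void put32(unsigned char *buf, size_t off, uint32_t v) {
    for (int i = 0; i < 4; i++)
        buf[off + i] = (unsigned char)(v >> (8 * i));
}

/* Read a 32-bit little-endian value from buf[off..off+3]. */
uint32_t get32(const unsigned char *buf, size_t off) {
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)buf[off + i] << (8 * i);
    return v;
}

/* R_X86_64_32S: field = S + A, which must fit in a signed 32-bit value. */
int apply_32s(unsigned char *buf, size_t off, uint64_t S, int64_t A) {
    int64_t value = (int64_t)S + A;
    if (value < INT32_MIN || value > INT32_MAX)
        return -1;
    put32(buf, off, (uint32_t)value);
    return 0;
}

/* R_X86_64_PC32: field = S + A - P, with P = load address + offset. */
int apply_pc32(unsigned char *buf, size_t off, uint64_t load_addr,
               uint64_t S, int64_t A) {
    int64_t value = (int64_t)S + A - (int64_t)(load_addr + off);
    if (value < INT32_MIN || value > INT32_MAX)
        return -1;
    put32(buf, off, (uint32_t)value);
    return 0;
}

/* Apply the `main` and `__libc_start_main` entries with the assumed addresses. */
int link_start_code(void) {
    const uint64_t start_addr = 0x400400;  /* assumed address of _start     */
    const uint64_t main_addr  = 0x4004c4;  /* assumed address of main       */
    const uint64_t plt_addr   = 0x4003e0;  /* assumed PLT entry of the call */
    if (apply_32s(start_code, 0x20, main_addr, 0) != 0)
        return -1;
    return apply_pc32(start_code, 0x25, start_addr, plt_addr, -4);
}
```

<p>The addend of -4 on the <tt>R_X86_64_PC32</tt> entry compensates for the fact that the CPU computes the call target relative to the end of the 4-byte field, not its start.</p>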
|
||
|
|
||
|
<h2>How to find the address of <tt>main</tt> of an executable binary ?</h2>
|
||
|
When an ELF executable binary is stripped of symbol information, it
|
||
|
is not clear where the <tt>main</tt> is located.
|
||
|
<p>
|
||
|
From above analysis, it's possible to find out the address of <tt>main</tt> (which is
|
||
|
NOT the "Entry point address" seen from the output of <tt>readelf -h a.out | grep Entry</tt>
|
||
|
command. "Entry point address" is the address of <tt>_start</tt>)
|
||
|
</p><p>
|
||
|
Since the address of <tt>main</tt> is the first argument to the call
|
||
|
to <tt>__libc_start_main</tt>, we can extract the value of the first
|
||
|
argument as follows.
|
||
|
</p><p>
|
||
|
On <font color="LightGreen">64-bit x86</font>, the <a href="https://web.archive.org/web/20201202024834/http://www.acsu.buffalo.edu/%7Echarngda/x86assembly.html">calling convention</a>
|
||
|
requires that the first argument
|
||
|
goes to <font color="LightGreen"><tt>RDI</tt> register</font>, so the
|
||
|
address can be extracted by
|
||
|
</p><pre>objdump -j .text -d a.out | grep -B5 'call.*__libc_start_main' | awk '/mov.*%rdi/ { print $NF }'
|
||
|
</pre>
|
||
|
On <font color="LightGreen">32-bit x86</font>, the C calling
|
||
|
convention ("<a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/X86_calling_conventions#cdecl">cdecl</a>") is that the first argument
|
||
|
is the <font color="LightGreen">last item to be pushed onto the stack</font>
|
||
|
before the call, so the
|
||
|
address can be extracted by
|
||
|
<pre>objdump -j .text -d a.out | grep -B2 'call.*__libc_start_main' | awk '/push.*0x/ { print $NF }'
|
||
|
</pre>
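<p>The 64-bit search can also be done directly on the code bytes. The sketch below assumes the non-PIE <tt>_start</tt> shown earlier, where <tt>main</tt> is loaded by <tt>mov rdi, imm32</tt> (bytes <tt>48 c7 c7</tt>) immediately before the <tt>call</tt> (byte <tt>e8</tt>); position-independent executables usually load the address differently and would need another pattern. <tt>find_main_address</tt>, <tt>sample_start_tail</tt> and <tt>sample_main_address</tt> are hypothetical names, and the sample bytes use an assumed address of 0x4004c4 for <tt>main</tt>.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scan for `48 c7 c7 imm32` followed directly by `e8 rel32`.
   On success store imm32 in *main_addr and return 0; otherwise return -1. */
int find_main_address(const unsigned char *code, size_t len, uint32_t *main_addr) {
    for (size_t i = 0; i + 12 <= len; i++) {
        if (code[i] == 0x48 && code[i + 1] == 0xc7 && code[i + 2] == 0xc7 &&
            code[i + 7] == 0xe8) {
            *main_addr = (uint32_t)code[i + 3]
                       | (uint32_t)code[i + 4] << 8
                       | (uint32_t)code[i + 5] << 16
                       | (uint32_t)code[i + 6] << 24;
            return 0;
        }
    }
    return -1;
}

/* Tail of a linked _start: mov rdi,0x4004c4 ; call rel32 ; hlt (assumed values). */
static const unsigned char sample_start_tail[] = {
    0x48, 0xc7, 0xc7, 0xc4, 0x04, 0x40, 0x00,
    0xe8, 0xb7, 0xff, 0xff, 0xff,
    0xf4
};

/* Returns the address found in the sample bytes, or 0 if the pattern is absent. */
uint32_t sample_main_address(void) {
    uint32_t addr = 0;
    if (find_main_address(sample_start_tail, sizeof sample_start_tail, &addr) != 0)
        return 0;
    return addr;
}
```

<p>In practice the input would be the bytes of the <tt>.text</tt> section starting at the entry point, rather than a hand-written array.</p>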
|
||
|
|
||
|
<h2>PIC, TLS, and AMD64 code models</h2>
|
||
|
Relocation is the process of connecting symbolic references with symbolic definitions.
|
||
|
The runtime relocation is done by <tt>ld.so</tt>, as in <tt>elf_machine_rela</tt> function
|
||
|
in Glibc's source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-machine.h"><tt>sysdeps/x86_64/dl-machine.h</tt></a>.
|
||
|
The link-time relocation is done by the link-editor <tt>ld</tt>, which uses the relocation
|
||
|
table in the object file (<tt>.rela.text</tt> section). Each symbolic reference has an entry
|
||
|
in the relocation table, and
|
||
|
each entry contains three fields: offset, info (relocation type, symbol table index), and addend.
|
||
|
The relocation types are:
|
||
|
<p>
|
||
|
<table border="">
|
||
|
<tbody><tr>
|
||
|
<th>Relocation type</th>
|
||
|
<th>Meaning</th>
|
||
|
<th>Used when</th>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_16</tt></td>
|
||
|
<td>Direct 16 bit zero extended</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_32</tt></td>
|
||
|
<td>Direct 32 bit zero extended</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_32S</tt></td>
|
||
|
<td>Direct 32 bit
|
||
|
sign extended<span></span></td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_64</tt></td>
|
||
|
<td>Direct 64 bit</td>
|
||
|
<td>Large code model</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_8</tt></td>
|
||
|
<td>Direct 8 bit sign extended</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_COPY</tt></td>
|
||
|
<td>Copy symbol at runtime</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_DTPMOD64</tt></td>
|
||
|
<td>ID of module containing symbol</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_DTPOFF32</tt></td>
|
||
|
<td>Offset in TLS block</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_DTPOFF64</tt></td>
|
||
|
<td>Offset in module's TLS block</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GLOB_DAT</tt></td>
|
||
|
<td><tt>.got</tt> section, which contains addresses to the actual functions in DLL</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOT32</tt></td>
|
||
|
<td>32 bit GOT entry</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOT64</tt></td>
|
||
|
<td>64-bit GOT entry offset</td>
|
||
|
<td>PIC & Large code model</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOTOFF64</tt></td>
|
||
|
<td>64-bit GOT offset</td>
|
||
|
<td>PIC & Large code model</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOTPC32</tt></td>
|
||
|
<td>32-bit PC relative offset to GOT</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOTPC32_TLSDESC</tt></td>
|
||
|
<td>32-bit PC relative to TLS descriptor in GOT</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOTPC64</tt></td>
|
||
|
<td>64-bit PC relative offset to GOT</td>
|
||
|
<td>PIC & Large code model</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOTPCREL</tt></td>
|
||
|
<td>32 bit signed PC relative offset to GOT</td>
|
||
|
<td>PIC</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOTPCREL64</tt></td>
|
||
|
<td>64-bit PC relative offset to GOT entry</td>
|
||
|
<td>PIC & Large code model</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOTPLT64</tt></td>
|
||
|
<td>Like GOT64, indicates that PLT entry needed</td>
|
||
|
<td>PIC & Large code model</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_GOTTPOFF</tt></td>
|
||
|
<td>32 bit signed PC relative offset to GOT entry for IE symbol</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_JUMP_SLOT</tt></td>
|
||
|
<td><tt>.got.plt</tt> section, which contains addresses to the actual functions in DLL</td>
|
||
|
<td>DLL</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_PC16</tt></td>
|
||
|
<td>16 bit sign extended PC relative</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_PC32</tt></td>
|
||
|
<td>PC relative 32 bit signed</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_PC64</tt></td>
|
||
|
<td>64-bit PC relative</td>
|
||
|
<td>Large code model</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_PC8</tt></td>
|
||
|
<td>8 bit sign extended PC relative</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_PLT32</tt></td>
|
||
|
<td>32 bit PLT address</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_PLTOFF64</tt></td>
|
||
|
<td>64-bit GOT relative offset to PLT entry</td>
|
||
|
<td>PIC & Large code model</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_RELATIVE</tt></td>
|
||
|
<td>Adjust by program base</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_SIZE32</tt></td>
|
||
|
<td>Size of symbol plus 32-bit addend</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_SIZE64</tt></td>
|
||
|
<td>Size of symbol plus 64-bit addend</td>
|
||
|
<td></td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_TLSDESC</tt></td>
|
||
|
<td>2 by 64-bit TLS descriptor</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_TLSDESC_CALL</tt></td>
|
||
|
<td>Relaxable call through TLS descriptor</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_TLSGD</tt></td>
|
||
|
<td>32 bit signed PC relative offset to two GOT entries for GD symbol</td>
|
||
|
<td>TLS & PIC</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_TLSLD</tt></td>
|
||
|
<td>32 bit signed PC relative offset to two GOT entries for LD symbol</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_TPOFF32</tt></td>
|
||
|
<td>Offset in initial TLS block</td>
|
||
|
<td>TLS</td>
|
||
|
</tr>
|
||
|
<tr>
|
||
|
<td><tt>R_X86_64_TPOFF64</tt></td>
|
||
|
<td>Offset in initial TLS block<span></span></td>
|
||
|
<td>TLS & Large code model</td>
|
||
|
</tr>
|
||
|
</tbody></table>
|
||
|
</p><p>
|
||
|
According to Chapter 3.5 of <a href="https://web.archive.org/web/20201202024834/http://www.x86-64.org/documentation/abi.pdf">AMD64 System V Application Binary Interface</a>,
|
||
|
there are four code models and they differ in addressing modes (absolute versus relative):
|
||
|
</p><ul>
|
||
|
<li><b>Small</b>: All compile- and link-time addresses and symbols are assumed to fit
|
||
|
in 32-bit immediate operands. This model restricts code and global data to the low
|
||
|
2 GB of the address space (to be exact, between 0x0 and 0x7eff ffff, which is 2032 MB)
|
||
|
<p>The compiler can encode symbolic references
|
||
|
</p><ul>
|
||
|
<li>In sign-extended immediate operands for offsets in the range of 0x8000 0000 (-2<sup>31</sup>)
|
||
|
to 0x100 0000 (2<sup>24</sup>)
|
||
|
</li><li>In zero-extended immediate operands for offsets in the range of 0x0
|
||
|
to 0x7f00 0000 (2<sup>31</sup> - 2<sup>24</sup>)
|
||
|
</li><li>In <a href="https://web.archive.org/web/20201202024834/http://www.tortall.net/projects/yasm/manual/html/nasm-effaddr.html">RIP relative addressing mode</a>
|
||
|
for offsets in the range 0xff00 0000 (-2<sup>24</sup> = -16 MB) to 0x100 0000 (2<sup>24</sup> = 16 MB)
|
||
|
</li></ul>
|
||
|
<p>This model is the default for most compilers.</p><p>
|
||
|
|
||
|
</p></li><li><b>Large</b>: No restrictions are placed on the size or placement of code and data.
|
||
|
The max virtual memory space is 48 bits (256 TB).
|
||
|
|
||
|
</li><li><b>Medium</b>: Like the Small code model, except the data sections are split into two parts, e.g.
|
||
|
instead of having just <tt>.data</tt>, <tt>.rodata</tt>, <tt>.bss</tt> sections, there would also be
|
||
|
<tt>.ldata</tt>, <tt>.lrodata</tt>, <tt>.lbss</tt> sections. The smaller <tt>.data</tt>, etc
|
||
|
are still the same as in the Small code model, and the larger <tt>.ldata</tt>, etc
|
||
|
are as in the Large code model.
|
||
|
|
||
|
</li><li><b>Kernel</b>: Like the Small code model, but the 2 GB address space spans
|
||
|
from 0xffff ffff 8000 0000 (2<sup>64</sup>-2<sup>31</sup>)
|
||
|
to 0xffff ffff ff00 0000 (2<sup>64</sup>-2<sup>24</sup>) Because of this, the
|
||
|
offsets which can be encoded using sign-extended and zero-extended immediate operands
|
||
|
change.
|
||
|
</li></ul>
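<p>The address ranges above can be turned into simple range checks. The following is only a sketch: the function names are made up for illustration, and treating the upper bound of the Kernel range as exclusive is an assumption.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v can be encoded as a sign-extended 32-bit immediate. */
bool fits_simm32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Small code model: symbols live in [0, 2^31 - 2^24 - 1] = [0, 0x7effffff]. */
bool in_small_model_range(uint64_t addr)
{
    return addr <= UINT64_C(0x7effffff);
}

/* Kernel code model: symbols live in [2^64 - 2^31, 2^64 - 2^24).
 * The exclusive upper bound is an assumption of this sketch. */
bool in_kernel_model_range(uint64_t addr)
{
    return addr >= UINT64_C(0xffffffff80000000) &&
           addr <  UINT64_C(0xffffffffff000000);
}
```

<p>An address for which both range checks fail cannot be used with the Small or Kernel model and needs the Medium or Large model instead.</p>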
|
||
|
|
||
|
Now consider the following C code
|
||
|
<pre>extern int esrc[100];
|
||
|
int gsrc[100];
|
||
|
static int ssrc[100];
|
||
|
|
||
|
void foo() {
|
||
|
int k;
|
||
|
k = esrc[5];
|
||
|
k = gsrc[5];
|
||
|
k = ssrc[5];
|
||
|
}
|
||
|
</pre>
|
||
|
|
||
|
<ul>
|
||
|
<li><b>Small</b> code model, no PIC (i.e. compiled with just <tt>gcc -c</tt>):
|
||
|
<pre>k = esrc[5]; mov EAX, DWORD PTR[RIP+0x0]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
k = gsrc[5]; mov EAX, DWORD PTR[RIP+0x0]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
k = ssrc[5]; mov EAX, DWORD PTR[RIP+0x0]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
</pre>
|
||
|
and the relocation table is (use <tt>readelf -r foo.o</tt> command)
|
||
|
<pre>type Sym. Name + Addend
|
||
|
R_X86_64_PC32 esrc + 10
|
||
|
R_X86_64_PC32 gsrc + 10
|
||
|
R_X86_64_PC32 .bss + 10
|
||
|
</pre>
|
||
|
All of the 0x0's in the generated assembly will be filled at link time
with the 32-bit offsets of the targets relative to RIP, as indicated by the relocation table.
|
||
|
<p>
|
||
|
</p></li><li><b>Large</b> code model, no PIC (i.e. compiled with <tt>gcc -c -mcmodel=large</tt>)
|
||
|
<pre>k = esrc[5]; mov RAX, 0x0
|
||
|
mov EAX, DWORD PTR[RAX+0x10]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
k = gsrc[5]; mov RAX, 0x0
|
||
|
mov EAX, DWORD PTR[RAX+0x10]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
k = ssrc[5]; mov RAX, 0x0
|
||
|
mov EAX, DWORD PTR[RAX+0x10]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
</pre>
|
||
|
and the relocation table is:
|
||
|
<pre>type Sym. Name + Addend
|
||
|
R_X86_64_64 esrc + 0
|
||
|
R_X86_64_64 gsrc + 0
|
||
|
R_X86_64_64 .bss + 0
|
||
|
</pre>
|
||
|
All of the 0x0's in the generated assembly will be filled at link-time
|
||
|
with their (64-bit) absolute addresses.
|
||
|
<p>
|
||
|
</p></li><li><b>Small</b> code model, PIC (i.e. compiled with <tt>gcc -c -fPIC</tt>. Note that adding <tt>-shared</tt> or not will not affect the generated code)
|
||
|
<pre>k = esrc[5]; mov RAX, QWORD PTR[RIP+0x0]
|
||
|
mov EAX, DWORD PTR[RAX+0x10]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
k = gsrc[5]; mov RAX, QWORD PTR[RIP+0x0]
|
||
|
mov EAX, DWORD PTR[RAX+0x10]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
k = ssrc[5]; mov EAX, DWORD PTR[RIP+0x0]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
</pre>
|
||
|
and the relocation table is:
|
||
|
<pre>type Sym. Name + Addend
|
||
|
R_X86_64_GOTPCREL esrc - 4
|
||
|
R_X86_64_GOTPCREL gsrc - 4
|
||
|
R_X86_64_PC32 .bss + 10
|
||
|
</pre>
|
||
|
The first two 0x0's in the generated assembly will be filled with the RIP-relative
offsets of the <tt>.got</tt> entries for <tt>esrc</tt> and <tt>gsrc</tt>; at runtime each entry holds the symbol's absolute address.
|
||
|
<p>
|
||
|
</p></li><li><b>Large</b> code model, PIC (i.e. compiled with <tt>gcc -c -fPIC -mcmodel=large</tt>)
|
||
|
<pre> lea RBX, [RIP-0x7]
|
||
|
mov R11, 0x0
|
||
|
add RBX, R11
|
||
|
k = esrc[5]; mov RAX, 0x0
|
||
|
mov RAX, QWORD PTR[RBX+RAX*1]
|
||
|
mov EAX, DWORD PTR[RAX+0x10]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
k = gsrc[5]; mov RAX, 0x0
|
||
|
mov RAX, QWORD PTR[RBX+RAX*1]
|
||
|
mov EAX, DWORD PTR[RAX+0x10]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
k = ssrc[5]; mov RAX, 0x0
|
||
|
mov RAX, QWORD PTR[RBX+RAX*1]
|
||
|
mov EAX, DWORD PTR[RAX+0x10]
|
||
|
mov DWORD PTR[RBP-0x4], EAX
|
||
|
</pre>
|
||
|
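The first 0x0 in the generated assembly will be filled with the 64-bit offset from that
point in the code to <tt>_GLOBAL_OFFSET_TABLE_</tt>, so that after the <tt>add</tt> RBX holds the absolute address of <tt>_GLOBAL_OFFSET_TABLE_</tt>.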
|
||
|
</li></ul>
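<p>To see where addends such as <tt>+ 10</tt> (hex) and <tt>- 4</tt> come from, the link-time arithmetic can be sketched in C. Here S is the symbol address, A the addend and P the address of the 4-byte field being patched; the function names and the sample addresses in the usage note are hypothetical.</p>

```c
#include <stdint.h>

/* R_X86_64_64: S + A, a full 64-bit absolute address. */
uint64_t reloc_abs64(uint64_t S, int64_t A)
{
    return S + (uint64_t)A;
}

/* R_X86_64_PC32: S + A - P, kept as a 32-bit displacement.
 * The narrowing cast assumes the result fits in 32 bits. */
int32_t reloc_pc32(uint64_t S, int64_t A, uint64_t P)
{
    return (int32_t)(S + (uint64_t)A - P);
}

/* R_X86_64_GOTPCREL: same arithmetic, but the target is the address
 * of the symbol's GOT slot instead of the symbol itself. */
int32_t reloc_gotpcrel(uint64_t got_slot, int64_t A, uint64_t P)
{
    return (int32_t)(got_slot + (uint64_t)A - P);
}
```

<p>When the instruction executes, RIP points just past the 4-byte field, i.e. at P + 4. For <tt>esrc[5]</tt> the wanted address is S + 20, so the addend is 20 - 4 = 16 = 0x10. With, say, S = 0x601040 and P = 0x400502, <tt>reloc_pc32</tt> gives 0x200b4e, and P + 4 + 0x200b4e = 0x601054 = S + 20. For <tt>R_X86_64_GOTPCREL</tt> the addend of -4 cancels the same 4 bytes, so RIP plus the displacement lands exactly on the GOT slot.</p>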
|
||
|
|
||
|
<h2><tt>_GLOBAL_OFFSET_TABLE_</tt>, <tt>.got.plt</tt> section, and <tt>DYNAMIC</tt> segment</h2>
|
||
|
Earlier we saw that the <tt>_GLOBAL_OFFSET_TABLE_</tt> is located in <tt>.got.plt</tt> section:
|
||
|
<pre>(gdb) b *0x4003d0
|
||
|
(gdb) run
|
||
|
(gdb) x/6a 0x600890
|
||
|
0x600890: 0x6006e8 <_DYNAMIC> 0x32696159a8
|
||
|
0x6008a0: 0x326950aa20 <_dl_runtime_resolve> <font color="lightgreen">0x4003d6</font> <printf@plt+6>
|
||
|
0x6008b0: 0x326971c3f0 <__libc_start_main> 0x0
|
||
|
</pre>
|
||
|
According to Chapter 5.2 of <a href="https://web.archive.org/web/20201202024834/http://www.x86-64.org/documentation/abi.pdf">AMD64 System V Application Binary Interface</a>,
|
||
|
the first 3 entries of this table are reserved for special purposes.
|
||
|
The first entry is set up at link time by the link editor <tt>ld</tt>.
|
||
|
The second and third entries are set up during runtime by the runtime linker <tt>ld.so</tt>
|
||
|
(see function <tt>_dl_relocate_object</tt> in Glibc source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-reloc.c"><tt>elf/dl-reloc.c</tt></a>
|
||
|
and in particular, notice the <tt>ELF_DYNAMIC_RELOCATE</tt> macro,
|
||
|
which calls function <tt>elf_machine_runtime_setup</tt> in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-machine.h"><tt>sysdeps/x86_64/dl-machine.h</tt></a>)
|
||
|
<p>
|
||
|
The first entry <tt>_DYNAMIC</tt> has value 6006e8, and this is exactly
|
||
|
the starting address of <tt>.dynamic</tt> section (or <tt>DYNAMIC</tt> segment, in ELF's "execution view".)
|
||
|
The runtime linker <tt>ld.so</tt> uses this section to find all the
|
||
|
information needed for runtime relocation and dynamic linking.
|
||
|
</p><p>
|
||
|
To see <tt>DYNAMIC</tt> segment's content, use <tt>readelf -d a.out</tt> command, or
|
||
|
<tt>objdump -x a.out</tt>, or just use <tt>x/50a 0x6006e8</tt> in gdb.
|
||
|
The <tt>readelf -d a.out</tt> command will show something like this:
|
||
|
</p><pre>Dynamic section at offset 0x6e8 contains 21 entries:
|
||
|
Tag Type Name/Value
|
||
|
0x0000000000000001 (NEEDED) Shared library: [libc.so.6] <-- dependent dynamic library name
|
||
|
0x000000000000000c (INIT) 0x4003a8 <-- address of .init section
|
||
|
0x000000000000000d (FINI) 0x400618 <-- address of .fini section
|
||
|
0x0000000000000004 (HASH) 0x400240 <-- address of .hash section
|
||
|
0x000000006ffffef5 (GNU_HASH) 0x400268 <-- address of .gnu.hash section
|
||
|
0x0000000000000005 (STRTAB) 0x4002e8 <-- address of .dynstr section
0x0000000000000006 (SYMTAB) 0x400288 <-- address of .dynsym section
0x000000000000000a (STRSZ) 63 (bytes) <-- size of .dynstr section
0x000000000000000b (SYMENT) 24 (bytes) <-- size of an entry in .dynsym section
|
||
|
0x0000000000000015 (DEBUG) 0x0 <-- see below
|
||
|
0x0000000000000003 (PLTGOT) 0x600860 <-- address of .got.plt section
|
||
|
0x0000000000000002 (PLTRELSZ) 48 (bytes) <-- total size of .rela.plt section
|
||
|
0x0000000000000014 (PLTREL) RELA <-- RELA or REL ?
|
||
|
0x0000000000000017 (JMPREL) 0x400368 <-- address of .rela.plt section
|
||
|
0x0000000000000007 (RELA) 0x400350 <-- address of .rela.dyn section
|
||
|
0x0000000000000008 (RELASZ) 24 (bytes) <-- total size of .rela.dyn section
|
||
|
0x0000000000000009 (RELAENT) 24 (bytes) <-- size of an entry in .rela.dyn section
|
||
|
0x000000006ffffffe (VERNEED) 0x400330 <-- address of .gnu.version_r section
|
||
|
0x000000006fffffff (VERNEEDNUM) 1 <-- number of needed versions
|
||
|
0x000000006ffffff0 (VERSYM) 0x400328 <-- address of .gnu.version section
|
||
|
0x0000000000000000 (NULL) 0x0 <-- marks the end of .dynamic section
|
||
|
</pre>
|
||
|
Each entry in <tt>DYNAMIC</tt> segment is a struct of only two members:
|
||
|
"tag" and "value". The <tt>NEEDED</tt>, <tt>INIT</tt> ... above
|
||
|
are "tags" (see <tt><a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/elf.h">/usr/include/elf.h</a></tt>)
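<p>Since each entry is just a tag and a value, scanning the table is a short loop. The sketch below uses a local struct that mirrors <tt>Elf64_Dyn</tt> so that it compiles without <tt>elf.h</tt>; the names <tt>Dyn64</tt>, <tt>TAG_*</tt> and <tt>find_dyn</tt> are made up, and the tag numbers are the ones shown in the <tt>readelf -d</tt> listing above.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of Elf64_Dyn: a tag and a value. */
typedef struct {
    int64_t  d_tag;
    uint64_t d_val;   /* elf.h declares a union of d_val and d_ptr here */
} Dyn64;

enum { TAG_NULL = 0, TAG_NEEDED = 1, TAG_PLTGOT = 3 };

/* Return the first entry carrying the given tag, or NULL if the
 * terminating NULL tag is reached first. */
const Dyn64 *find_dyn(const Dyn64 *dyn, int64_t tag)
{
    for (; dyn->d_tag != TAG_NULL; ++dyn)
        if (dyn->d_tag == tag)
            return dyn;
    return NULL;
}
```

<p>In a real process the table would be reached through the <tt>_DYNAMIC</tt> symbol; here any array that ends with a NULL tag will do.</p>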
|
||
|
<p>Other tags of interest are:
|
||
|
</p><pre>BIND_NOW The same as BIND_NOW in FLAGS. This has been superseded by
|
||
|
BIND_NOW in FLAGS
|
||
|
|
||
|
CHECKSUM The checksum value used by <a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Prelink">prelink</a>.
|
||
|
|
||
|
DEBUG At runtime ld.so will fill its value with the runtime
|
||
|
address of r_debug structure (see elf/rtld.c)
|
||
|
and this info is used by GDB (see elf_locate_base function
|
||
|
in GDB's source tree).
|
||
|
|
||
|
FINI Address of .fini section
|
||
|
FINI_ARRAY Address of .fini_array section
|
||
|
FINI_ARRAYSZ Size of .fini_array section
|
||
|
|
||
|
FLAGS Additional flags, such as BIND_NOW, STATIC_TLS, TEXTREL..
|
||
|
|
||
|
FLAGS_1 Additional flags used by Solaris, such as NOW (the same as BIND_NOW), INTERPOSE..
|
||
|
|
||
|
GNU_PRELINKED The timestamp string when the binary object is last <a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Prelink">prelinked</a>.
|
||
|
|
||
|
INIT Address of .init section
|
||
|
INIT_ARRAY Address of .init_array section
|
||
|
INIT_ARRAYSZ Size of .init_array section
|
||
|
|
||
|
INTERP Address of .interp section
|
||
|
|
||
|
PREINIT_ARRAY Address of .preinit_array section
|
||
|
PREINIT_ARRAYSZ Size of .preinit_array section
|
||
|
|
||
|
RELACOUNT Number of R_X86_64_RELATIVE entries in RELA segment (.rela.dyn
|
||
|
section)
|
||
|
|
||
|
RPATH Dynamic library search path, which has higher precedence than
|
||
|
LD_LIBRARY_PATH. RPATH is ignored if RUNPATH is present.
|
||
|
|
||
|
<font color="red">Use of RPATH is deprecated</font>.
|
||
|
|
||
|
When one uses "gcc -Wl,-rpath=... " to build binaries, the info
|
||
|
is stored here.
|
||
|
|
||
|
RUNPATH Dynamic library search path, which has lower precedence than
|
||
|
LD_LIBRARY_PATH.
|
||
|
|
||
|
When one uses "gcc -Wl,-rpath=...<font color="LightGreen">,--enable-new-dtags</font>"
|
||
|
to build binaries, the info is stored here.
|
||
|
(See <a href="https://web.archive.org/web/20201202024834/http://blog.lxgcc.net/?tag=dt_runpath">here</a> for details.)
|
||
|
|
||
|
One can use <a href="https://web.archive.org/web/20201202024834/http://linux.die.net/man/1/chrpath">chrpath</a>
|
||
|
tool to manipulate RPATH and RUNPATH settings.
|
||
|
|
||
|
|
||
|
SONAME Shared object (i.e. dynamic library) name. When one uses
|
||
|
"gcc -Wl,-soname=... " to build binaries, the info is
|
||
|
stored here.
|
||
|
|
||
|
TEXTREL Relocation might modify .text section.
|
||
|
|
||
|
VERDEF Address of .gnu.version_d section
|
||
|
VERDEFNUM Number of version definitions.
|
||
|
</pre>
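<p>The precedence rules for <tt>RPATH</tt>, <tt>LD_LIBRARY_PATH</tt> and <tt>RUNPATH</tt> described above can be written down as a small function. This is only a sketch of those three rules; the function name is made up, and the remaining places a real runtime linker searches (cache, default directories) are deliberately left out.</p>

```c
#include <stddef.h>

/* Fill order[] with the search-path sources in decreasing precedence
 * and return how many were written. A NULL argument means "not set". */
size_t search_order(const char *rpath, const char *ld_library_path,
                    const char *runpath, const char *order[3])
{
    size_t n = 0;
    if (rpath && !runpath)        /* RPATH is ignored when RUNPATH is present */
        order[n++] = rpath;
    if (ld_library_path)          /* environment variable comes next */
        order[n++] = ld_library_path;
    if (runpath)                  /* RUNPATH ranks below LD_LIBRARY_PATH */
        order[n++] = runpath;
    return n;
}
```

<p>With only <tt>RPATH</tt> set, the order is RPATH then LD_LIBRARY_PATH; once <tt>RUNPATH</tt> is also present, RPATH drops out and the order becomes LD_LIBRARY_PATH then RUNPATH.</p>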
|
||
|
|
||
|
<h2>Runtime Relocation</h2>
|
||
|
After exploring <tt>DYNAMIC</tt> segment, we can explain how <tt>ld.so</tt> performs
|
||
|
runtime relocation.
|
||
|
<p>
|
||
|
First, before <tt>ld.so</tt> loads all dependent libraries of a dynamic executable,
|
||
|
it needs to run its own relocation! Even though <tt>ld.so</tt> is a statically-linked binary,
it still has a <tt>DYNAMIC</tt> segment and thus <tt>RELA</tt> (<tt>.rela.dyn</tt> section)
and <tt>JMPREL</tt> (<tt>.rela.plt</tt> section) tags:
|
||
|
</p><pre>$ readelf -a `readelf -p .interp /bin/sh | awk '/ld/ {print $3}'`
|
||
|
|
||
|
....
|
||
|
|
||
|
Dynamic section at offset 0x14e18 contains 22 entries:
|
||
|
Tag Type Name/Value
|
||
|
0x000000000000000e (SONAME) Library soname: [ld-linux-x86-64.so.2]
|
||
|
0x0000000000000004 (HASH) 0x3269500190
|
||
|
0x0000000000000005 (STRTAB) 0x3269500578
|
||
|
0x0000000000000006 (SYMTAB) 0x3269500260
|
||
|
0x000000000000000a (STRSZ) 388 (bytes)
|
||
|
0x000000000000000b (SYMENT) 24 (bytes)
|
||
|
0x0000000000000003 (PLTGOT) 0x3269614f98
|
||
|
0x0000000000000002 (PLTRELSZ) 120 (bytes)
|
||
|
0x0000000000000014 (PLTREL) RELA
|
||
|
0x0000000000000017 (JMPREL) 0x32695009a0
|
||
|
0x0000000000000007 (RELA) 0x32695007c0
|
||
|
0x0000000000000008 (RELASZ) 480 (bytes)
|
||
|
0x0000000000000009 (RELAENT) 24 (bytes)
|
||
|
0x000000006ffffffc (VERDEF) 0x3269500740
|
||
|
0x000000006ffffffd (VERDEFNUM) 4
|
||
|
0x0000000000000018 (BIND_NOW)
|
||
|
0x000000006ffffffb (FLAGS_1) Flags: NOW
|
||
|
0x000000006ffffff0 (VERSYM) 0x32695006fc
|
||
|
0x000000006ffffff9 (RELACOUNT) 19
|
||
|
0x000000006ffffdf8 (CHECKSUM) 0x4c4e099e
|
||
|
0x000000006ffffdf5 (<font color="Orange">GNU_PRELINKED</font>) 2010-08-26T08:13:28
|
||
|
0x0000000000000000 (NULL) 0x0
|
||
|
|
||
|
Relocation section '.rela.dyn' at offset 0x7c0 contains 20 entries:
|
||
|
Offset Info Type Sym. Value Sym. Name + Addend
|
||
|
003269614cf0 000000000008 R_X86_64_RELATIVE 000000326950dd80
|
||
|
....
|
||
|
003269615820 000000000008 R_X86_64_RELATIVE 0000003269501140
|
||
|
003269614fe0 001e00000006 R_X86_64_GLOB_DAT 0000003269615980 _r_debug + 0
|
||
|
|
||
|
Relocation section '.rela.plt' at offset 0x9a0 contains 5 entries:
|
||
|
Offset Info Type Sym. Value Sym. Name + Addend
|
||
|
003269614fb0 000b00000007 R_X86_64_JUMP_SLO 000000326950f1b0 <font color="LightGreen">__libc_memalign</font> + 0
|
||
|
003269614fb8 000c00000007 R_X86_64_JUMP_SLO 000000326950f2b0 <font color="LightGreen">malloc</font> + 0
|
||
|
003269614fc0 001200000007 R_X86_64_JUMP_SLO 000000326950f2c0 <font color="LightGreen">calloc</font> + 0
|
||
|
003269614fc8 001800000007 R_X86_64_JUMP_SLO 000000326950f340 <font color="LightGreen">realloc</font> + 0
|
||
|
003269614fd0 002000000007 R_X86_64_JUMP_SLO 000000326950f300 <font color="LightGreen">free</font> + 0
|
||
|
</pre>
|
||
|
Note that the <tt>ld.so</tt> is <font color="Orange">prelinked</font>. On Fedora and Red Hat Enterprise Linux
|
||
|
(RHEL) systems, <a href="https://web.archive.org/web/20201202024834/http://lwn.net/Articles/341244/">prelink is run every two weeks</a>.
|
||
|
To see if your Linux has similar setup, check <tt>/etc/sysconfig/prelink</tt>
|
||
|
and <tt>/etc/prelink.conf</tt>
|
||
|
<p>
|
||
|
<b>What does this prelink do</b>? It changes the base address of a dynamic library
|
||
|
to the actual address in the user program's address space when it is loaded into memory.
|
||
|
Of course, <tt>ld.so</tt> recognizes <font color="Orange"><tt>GNU_PRELINKED</tt></font>
|
||
|
tag and will try to load the dynamic library at this base address (recall that the first argument of
|
||
|
<tt>mmap</tt> is the preferred address; of course, this is subject to the operating system.)
|
||
|
</p><p>Normally, a dynamic library
|
||
|
is built as <a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Position_independent_code">position independent code</a>,
|
||
|
i.e. the <tt>-fPIC</tt> compiler command-line option, and thus the base address is
|
||
|
0. For example, a normal libc.so has ELF program header as follows (<tt>readelf -l</tt> command):
|
||
|
</p><pre>Program Headers:
|
||
|
Type Offset VirtAddr PhysAddr
|
||
|
FileSiz MemSiz Flags Align
|
||
|
LOAD 0x0000000000000000 <font color="Orange">0x0000000000000000</font> 0x0000000000000000
|
||
|
0x0000000000179058 0x0000000000179058 R E 200000
|
||
|
LOAD 0x0000000000179730 0x0000000000379730 0x0000000000379730
|
||
|
0x0000000000004668 0x00000000000090f8 RW 200000
|
||
|
....
|
||
|
</pre>
|
||
|
And when calling <tt>mmap</tt> with address 0 (i.e. NULL)
|
||
|
the operating system can choose any address it feels appropriate.
|
||
|
<p>
|
||
|
A prelinked one, on the other hand, has its ELF program header as follows:
|
||
|
</p><pre>Program Headers:
|
||
|
Type Offset VirtAddr PhysAddr
|
||
|
FileSiz MemSiz Flags Align
|
||
|
LOAD 0x0000000000000000 <font color="Orange">0x0000003433e00000</font> 0x0000003433e00000
|
||
|
0x000000000001bb80 0x000000000001bb80 R E 200000
|
||
|
LOAD 0x000000000001bb90 0x000000343401bb90 0x000000343401bb90
|
||
|
0x0000000000000f58 0x00000000000010f8 RW 200000
|
||
|
</pre>
|
||
|
|
||
|
<b>What is the advantage of prelinking</b>?
|
||
|
<tt>ld.so</tt> will not process <tt>R_X86_64_RELATIVE</tt> relocation types
|
||
|
since they are already in the "right" place in user program's address space.
|
||
|
The extra benefit of this is the memory regions which
|
||
|
<tt>ld.so</tt> would have written to (if <tt>R_X86_64_RELATIVE</tt> needs
|
||
|
processing) will not incur any Copy-On-Writes and thus can be made Read-Only.<p>
|
||
|
According to <a href="https://web.archive.org/web/20201202024834/http://lwn.net/Articles/341309/">this post</a>, for GUI
|
||
|
programs, which tend to link against dozens of dynamic libraries and use lengthy
|
||
|
mangled C++ symbol names, the speedup can be an order of magnitude.
|
||
|
</p><p>
|
||
|
<b>How to disable prelinking at runtime</b>?
|
||
|
Run the user program with <tt>LD_USE_LOAD_BIAS</tt> environment
|
||
|
variable set to 0.
|
||
|
</p><p>
|
||
|
<b>How does <tt>ld.so</tt> process its own relocation</b>?
|
||
|
</p><p>
|
||
|
The relocation is done by <tt>_dl_relocate_object</tt> function
|
||
|
in Glibc's <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-reloc.c"><tt>elf/dl-reloc.c</tt></a>, which will call
|
||
|
<tt>elf_machine_rela</tt> function in <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-machine.h"><tt>sysdeps/x86_64/dl-machine.h</tt></a>
|
||
|
to do the majority of work.
|
||
|
</p><p>
|
||
|
First to be processed is the <tt>.rela.dyn</tt> relocation table,
|
||
|
which contains a bunch of <tt>R_X86_64_RELATIVE</tt> types
|
||
|
and one <tt>R_X86_64_GLOB_DAT</tt> type (the variable <tt>_r_debug</tt>)
|
||
|
</p><p>
|
||
|
If prelink is used, i.e. <tt>ld.so</tt> is indeed loaded
|
||
|
to the desired address, then <tt>R_X86_64_RELATIVE</tt>
|
||
|
relocation types will be ignored. If not,
|
||
|
then the value calculated for <tt>R_X86_64_RELATIVE</tt> types
(these are RELA entries, so each one carries its own addend) is
</p><pre>Base Address + Addend (stored at Base Address + Offset)
|
||
|
</pre>
|
||
|
For example, in <tt>ld.so</tt>'s case, its base address
|
||
|
is 2a95556000 (can be obtained from <tt>pmap</tt> command; inside <tt>ld.so</tt>,
|
||
|
it calls <tt>elf_machine_load_address</tt> function to get this value)
|
||
|
<pre>0000400000 4K r-x-- /tmp/a.out
|
||
|
0000500000 4K rw--- /tmp/a.out
|
||
|
<font color="lightgreen">2a95556000</font> 92K r-x-- /lib64/ld.so
|
||
|
2a9556d000 8K rw--- [ anon ]
|
||
|
2a95599000 4K rw--- [ anon ]
|
||
|
2a9566c000 4K r---- /lib64/ld.so
|
||
|
2a9566d000 4K rw--- /lib64/ld.so
|
||
|
3269700000 1216K r-x-- /lib64/libc-2.3.4.so
|
||
|
...
|
||
|
</pre>
|
||
|
And <tt>ld.so</tt>'s <tt>.rela.dyn</tt> relocation table is (<font color="red">not prelinked</font>!
|
||
|
If <tt>ld.so</tt> is prelinked, the offset will be in a much higher address)
|
||
|
<pre>Relocation section '.rela.dyn' at offset 0x7c0 contains 20 entries:
|
||
|
Offset Info Type Sym. Value Sym. Name + Addend
|
||
|
000000116d50 000000000008 R_X86_64_RELATIVE 000000000000e250
|
||
|
...
|
||
|
</pre>
|
||
|
so the relocation for 000000116d50, whose addend is 0xe250, is processed as
<pre><font color="lightgreen">0x2a95556000</font> + 0xe250
</pre>
and this new value (0x2a95564250) is stored at 0x2a9566cd50 (=0x116d50+0x2a95556000)
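<p>A minimal sketch of such a relocation pass, assuming the RELA convention in which every entry carries its own addend and the value written is base + addend. The struct is a local mirror of <tt>Elf64_Rela</tt>, and <tt>Rela64</tt>, <tt>RELOC_RELATIVE</tt> and <tt>apply_relative</tt> are names invented for this example; the type number 8 is the one visible in the Info column above.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of Elf64_Rela. */
typedef struct {
    uint64_t r_offset;  /* where to patch, relative to the load base */
    uint64_t r_info;    /* symbol index << 32 | relocation type */
    int64_t  r_addend;
} Rela64;

enum { RELOC_RELATIVE = 8 };  /* R_X86_64_RELATIVE */

/* Apply every RELATIVE entry to an image loaded at `base`.
 * `image` is the in-memory copy, indexed by link-time offset.
 * Returns the number of entries applied. */
size_t apply_relative(uint8_t *image, uint64_t base,
                      const Rela64 *rela, size_t count)
{
    size_t applied = 0;
    for (size_t i = 0; i < count; ++i) {
        if ((rela[i].r_info & 0xffffffffu) != RELOC_RELATIVE)
            continue;                       /* needs symbol lookup, skip */
        uint64_t value = base + (uint64_t)rela[i].r_addend;
        memcpy(image + rela[i].r_offset, &value, sizeof value);
        ++applied;
    }
    return applied;
}
```

<p>No symbol table is consulted anywhere in the loop, which is why this relocation type is so cheap to process.</p>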
|
||
|
|
||
|
<p>As <tt>R_X86_64_RELATIVE</tt> types do not require symbol lookups,
|
||
|
they are handled in a tight loop in
|
||
|
<tt>elf_machine_rela_relative</tt> function in
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-machine.h"><tt>sysdeps/x86_64/dl-machine.h</tt></a>
|
||
|
</p><p>
|
||
|
<b>Any relocation types other than <tt>R_X86_64_RELATIVE</tt> need to go
|
||
|
through symbol resolution first.</b>
|
||
|
</p><p>
|
||
|
So what about <tt>R_X86_64_GLOB_DAT</tt> relocation type in <tt>ld.so</tt> ?
|
||
|
First, <tt>RESOLVE_MAP</tt> (a macro defined within <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-reloc.c"><tt>elf/dl-reloc.c</tt></a>)
|
||
|
is called (with r_type = <tt>R_X86_64_GLOB_DAT</tt>)
|
||
|
to find out which ELF binary (could be the user's program or its dependent
|
||
|
dynamic libraries)
|
||
|
contains this symbol. Then
|
||
|
<tt>R_X86_64_GLOB_DAT</tt> relocation type is calculated as
|
||
|
</p><pre>Base Address + Symbol Value + Addend
|
||
|
</pre>
|
||
|
where <tt>Base Address</tt> is the base address
|
||
|
of ELF binary which contains the symbol, and
|
||
|
<tt>Symbol Value</tt> is the symbol value from
|
||
|
the symbol table of ELF binary which contains the symbol.
|
||
|
<p>
|
||
|
So for <tt>ld.so</tt>,
|
||
|
</p><pre>Relocation section '.rela.dyn' at offset 0x7c0 contains 20 entries:
|
||
|
Offset Info Type Sym. Value Sym. Name + Addend
|
||
|
....
|
||
|
000000116fe0 001e00000006 R_X86_64_GLOB_DAT <font color="SkyBlue">00000000001179c0</font> _r_debug + <font color="MediumOrchid">0</font>
|
||
|
</pre>
|
||
|
The relocation for 000000116fe0 is processed as
|
||
|
<pre><font color="lightgreen">0x2a95556000</font> + <font color="SkyBlue">0x1179c0</font> + <font color="MediumOrchid">0</font>
|
||
|
</pre>
|
||
|
because <tt>ld.so</tt> determines <tt>_r_debug</tt>
|
||
|
can be found from itself. The calculated value is stored at 0x2a9566cfe0 (=0x116fe0+0x2a95556000).
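<p>The same arithmetic as a sketch in C. The <tt>Resolved</tt> struct stands in for the result of the symbol lookup (which object defines the symbol, and the symbol's value in that object); all names are hypothetical.</p>

```c
#include <stdint.h>

/* Result of symbol lookup: the load base of the defining object and
 * the symbol's value from that object's symbol table. */
typedef struct {
    uint64_t base;
    uint64_t sym_value;
} Resolved;

/* R_X86_64_GLOB_DAT: Base Address + Symbol Value + Addend. */
uint64_t reloc_glob_dat(Resolved r, int64_t addend)
{
    return r.base + r.sym_value + (uint64_t)addend;
}

/* Runtime address of the slot that receives the result. */
uint64_t reloc_target(uint64_t base, uint64_t r_offset)
{
    return base + r_offset;
}
```

<p>Plugging in the numbers from above, 0x2a95556000 + 0x1179c0 + 0 gives 0x2a9566d9c0, which is written to 0x2a9566cfe0.</p>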
|
||
|
<p>
|
||
|
The next to be processed by <tt>ld.so</tt>
|
||
|
is its own <tt>.rela.plt</tt> relocation table,
|
||
|
which contains a bunch of <tt>R_X86_64_JUMP_SLOT</tt> types.
|
||
|
This relocation type is handled exactly the same way as <tt>R_X86_64_GLOB_DAT</tt>.
|
||
|
</p><p>
|
||
|
After <tt>ld.so</tt> finishes its own relocation, it loads user program's
|
||
|
dependent libraries and processes their relocations one by one.
|
||
|
First, <tt>ld.so</tt> handles <tt>libc.so</tt>'s relocation.
|
||
|
<tt>libc.so</tt> has two relocation types we have not covered so far:
|
||
|
<tt>R_X86_64_64</tt> and <tt>R_X86_64_TPOFF64</tt>.
|
||
|
</p><p>
|
||
|
<tt>R_X86_64_64</tt> relocation type is processed by first looking
|
||
|
up the symbol's runtime <b>absolute</b> address, and then
|
||
|
calculating
|
||
|
</p><pre>Absolute Address + Addend
|
||
|
</pre>
|
||
|
And the <tt>R_X86_64_TPOFF64</tt> relocation type is calculated as
|
||
|
<pre>Symbol Value + Addend - TLS Offset
|
||
|
</pre>
|
||
|
which usually results in a negative value.
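<p>Both formulas are one-liners in C. The sketch below uses made-up function names, and the sample numbers in the usage note are hypothetical. The TLS result is signed because, on x86-64, a module's TLS block normally sits below the thread pointer.</p>

```c
#include <stdint.h>

/* R_X86_64_64: the symbol's absolute runtime address plus the addend. */
uint64_t reloc_64(uint64_t abs_addr, int64_t addend)
{
    return abs_addr + (uint64_t)addend;
}

/* R_X86_64_TPOFF64: Symbol Value + Addend - TLS Offset.
 * The result is an offset from the thread pointer, hence signed. */
int64_t reloc_tpoff64(uint64_t sym_value, int64_t addend, uint64_t tls_offset)
{
    return (int64_t)(sym_value + (uint64_t)addend - tls_offset);
}
```

<p>For instance, a TLS variable at offset 0x10 inside a module whose TLS offset is 0x80 ends up at thread pointer - 0x70.</p>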
|
||
|
|
||
|
<h2><tt>R_X86_64_COPY</tt> relocation type</h2>
|
||
|
<tt>R_X86_64_COPY</tt> relocation type is used when a dynamic binary refers
|
||
|
to an <b>initialized</b> global variable (not a function!) defined in a dynamic link library. Unlike
|
||
|
functions, <font color="Yellow">for variables, there is no lazy binding, and
|
||
|
the trampoline trick used in <tt>.plt</tt> section
|
||
|
does not work.</font> Instead, the global variable will actually be allocated
|
||
|
in dynamic binary's <font color="Yellow"><tt>.bss</tt> section</font>.
|
||
|
<p>
|
||
|
To see how <tt>R_X86_64_COPY</tt> relocation type works, consider the following two source files:
|
||
|
</p><pre>foo.c
|
||
|
|
||
|
int foo=4;
|
||
|
|
||
|
void foo_access() {
|
||
|
foo=5;
|
||
|
}
|
||
|
|
||
|
bar.c
|
||
|
|
||
|
#include <stdio.h>
|
||
|
extern int foo;
|
||
|
|
||
|
int main() {
|
||
|
printf("foo=%d\n",foo);
|
||
|
}
|
||
|
</pre>
|
||
|
Now compile them as follows:
|
||
|
<pre>$ gcc -shared -fPIC -Wl,-soname=libfoo.so foo.c -o /tmp/libfoo.so
|
||
|
$ gcc bar.c -o bar -L/tmp -lfoo
|
||
|
</pre>
|
||
|
And run them as
|
||
|
<pre>$ LD_PRELOAD=/tmp/libfoo.so ./bar
|
||
|
</pre>
|
||
|
Before explaining what happened during runtime, we need to examine
|
||
|
the binaries first.
|
||
|
<p>
|
||
|
The <tt>foo_access</tt> in <tt>libfoo.so</tt> is like this:
|
||
|
</p><pre>69c <foo_access>:
|
||
|
69c: push rbp
|
||
|
69d: mov rbp,rsp
|
||
|
6a0: mov rax,QWORD PTR [rip+0x100269] # <font color="lightgreen">100910</font> <_DYNAMIC+0x198>
|
||
|
6a7: mov DWORD PTR [rax],0x5
|
||
|
6ad: leave
|
||
|
6ae: ret
|
||
|
</pre>
|
||
|
So for <tt>libfoo.so</tt>, the <b>address</b> of variable <tt>foo</tt> is
|
||
|
in its <tt>.got</tt> section, not <tt>.data</tt> section:
|
||
|
<pre>$ readelf -a /tmp/libfoo.so
|
||
|
|
||
|
Section Headers:
|
||
|
[Nr] Name Type Address Offset
|
||
|
Size EntSize Flags Link Info Align
|
||
|
...
|
||
|
[18] .got PROGBITS <font color="lightgreen">0000000000100908</font> 00000908
|
||
|
0000000000000020 0000000000000008 WA 0 0 8
|
||
|
[19] .got.plt PROGBITS 0000000000100928 00000928
|
||
|
0000000000000020 0000000000000008 WA 0 0 8
|
||
|
...
|
||
|
[20] .data PROGBITS <font color="lightblue">0000000000100948</font> 00000948
|
||
|
0000000000000014 0000000000000000 WA 0 0 8
|
||
|
...
|
||
|
|
||
|
Relocation section '.rela.dyn' at offset 0x520 contains 6 entries:
|
||
|
Offset Info Type Sym. Value Sym. Name + Addend
|
||
|
000000100948 000000000008 R_X86_64_RELATIVE 0000000000100948
|
||
|
000000100950 000000000008 R_X86_64_RELATIVE 0000000000100768
|
||
|
000000100908 000f00000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0
|
||
|
000000100910 001100000006 R_X86_64_GLOB_DAT <font color="lightblue">0000000000100958</font> foo + 0
|
||
|
....
|
||
|
</pre>
|
||
|
But what about the address <font color="lightblue">0x100958</font> ? This address
|
||
|
is in <tt>libfoo.so</tt>'s <tt>.data</tt> section! Well, <font color="lightblue">0x100958</font>
|
||
|
has the initial value of <tt>foo</tt> (in our example, 4). At runtime, <tt>ld.so</tt>
|
||
|
will copy this value to <tt>bar</tt>'s <tt>.bss</tt> section:
|
||
|
<pre>$ objdump -sj .data libfoo.so
|
||
|
|
||
|
libfoo.so: file format elf64-x86-64
|
||
|
|
||
|
Contents of section .data:
|
||
|
100948 48091000 00000000 68071000 00000000 H.......h.......
|
||
|
<font color="lightblue">100958 04000000</font> ....
|
||
|
</pre>
|
||
|
<p>
|
||
|
Next, disassemble the <tt>main</tt> function of <tt>bar</tt>:
|
||
|
</p><pre>4005f8 <main>:
|
||
|
4005f8: push rbp
|
||
|
4005f9: mov rbp,rsp
|
||
|
4005fc: mov esi,DWORD PTR [rip+0x1003de] # <font color="lightgreen">5009e0</font> <__bss_start>
|
||
|
400602: mov edi,0x40070c
|
||
|
400607: mov eax,0x0
|
||
|
40060c: call 400528 <printf@plt>
|
||
|
400611: leave
|
||
|
400612: ret
|
||
|
</pre>
|
||
|
So the variable <tt>foo</tt> is indeed located in
|
||
|
<tt>bar</tt>'s <tt>.bss</tt> section. Let's double check with <tt>nm</tt>:
|
||
|
<pre>$ nm -n bar | grep 5009e0
|
||
|
|
||
|
00000000005009e0 A __bss_start
|
||
|
00000000005009e0 A _edata
|
||
|
00000000005009e0 B <font color="lightgreen">foo</font>
|
||
|
</pre>
|
||
|
(Symbols such as <tt>__bss_start</tt> and <tt>_edata</tt> are defined by the default <tt>ld</tt> script;
|
||
|
one can search them in the output of <tt>ld -verbose</tt> command.)
|
||
|
<p>
|
||
|
The dynamic relocation table of <tt>bar</tt> is:
|
||
|
</p><pre>Relocation section '<font color="lightgreen">.rela.dyn</font>' at offset 0x490 contains 2 entries:
|
||
|
Offset Info Type Sym. Value Sym. Name + Addend
|
||
|
000000500998 000c00000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
|
||
|
0000005009e0 000700000005 <font color="lightgreen">R_X86_64_COPY 00000000005009e0</font> foo + 0
|
||
|
</pre>
|
||
|
<b>Now what happens during runtime is this</b>: After <tt>ld.so</tt> loads all dependent
|
||
|
dynamic libraries, it starts processing their relocations.
|
||
|
When it sees <tt>foo</tt> of <tt>libfoo.so</tt>, it
|
||
|
calls <tt>RESOLVE_MAP</tt> with r_type = <tt>R_X86_64_GLOB_DAT</tt> to get
|
||
|
the Base Address, which is 0, and Symbol Value, which is
|
||
|
<font color="lightgreen">5009e0</font>. Next it
|
||
|
sees <tt>foo</tt> of <tt>libfoo.so</tt> has
|
||
|
<tt>R_X86_64_GLOB_DAT</tt> relocation type,
|
||
|
so it calculates the new address as 5009e0 = 0 + 5009e0 + 0 (addend)
|
||
|
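and stores the result in <tt>libfoo.so</tt>'s <tt>.got</tt> entry for <tt>foo</tt> (link-time offset <font color="lightgreen">100910</font>, the slot that <tt>foo_access</tt> reads).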
|
||
|
<p>
|
||
|
After <tt>ld.so</tt> has processed relocations of all
|
||
|
dynamic libraries, it starts processing the relocation table
|
||
|
of <tt>bar</tt>. When it sees <tt>foo</tt> of <tt>bar</tt>, it
|
||
|
calls <tt>RESOLVE_MAP</tt> again, but with r_type = <tt>R_X86_64_COPY</tt>. This time, the address returned is
|
||
|
the runtime address of <font color="lightblue"><tt>foo</tt> in <tt>libfoo.so</tt>'s
|
||
|
<tt>.data</tt> section</font>. As mentioned earlier, this
|
||
|
address holds the initial value of <tt>foo</tt>.
|
||
|
Next it sees <tt>foo</tt> of <tt>bar</tt> has <font color="lightgreen"><tt>R_X86_64_COPY</tt></font>
|
||
|
relocation type, so it uses <tt>memcpy</tt>
|
||
|
to copy data to <font color="lightgreen">5009e0</font>
|
||
|
(see the <tt>Sym. Value</tt> of <tt>.rela.dyn</tt> section of <tt>bar</tt> above)
|
||
|
from the runtime address of <font color="lightblue"><tt>foo</tt> in <tt>libfoo.so</tt>'s
|
||
|
<tt>.data</tt> section</font> (see Glibc source file <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/dl-machine.h"><tt>sysdeps/x86_64/dl-machine.h</tt></a>)
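The two steps can be imitated inside a single C program. This is only a toy model of the mechanism: three ordinary variables stand in for the library's <tt>.data</tt> copy, the executable's <tt>.bss</tt> copy and the library's GOT slot, and all names are invented.

```c
#include <string.h>

/* Stand-ins for the three places that matter in the example. */
static int  lib_data_foo = 4;   /* foo's initial value in libfoo.so's .data */
static int  exe_bss_foo;        /* space reserved for foo in bar's .bss     */
static int *lib_got_foo;        /* libfoo.so's GOT slot for foo             */

/* Step 1: GLOB_DAT in libfoo.so resolves foo to the executable's copy. */
void apply_glob_dat(void)
{
    lib_got_foo = &exe_bss_foo;
}

/* Step 2: COPY in bar copies the initial value out of the library. */
void apply_copy(void)
{
    memcpy(&exe_bss_foo, &lib_data_foo, sizeof exe_bss_foo);
}

/* What foo_access() does: store through the GOT slot. */
void foo_access(void)
{
    *lib_got_foo = 5;
}

/* What main() in bar does: read its own .bss copy directly. */
int read_foo_from_bar(void)
{
    return exe_bss_foo;
}
```

After both steps the executable reads 4 from its own copy, and a later call to <tt>foo_access</tt> changes that same copy to 5, because the library only ever reaches <tt>foo</tt> through its GOT slot. The original in the library's <tt>.data</tt> section is never used again.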
|
||
|
</p><p>
|
||
|
The above example also illustrates <font color="Yellow">the difference
|
||
|
between <tt>.got</tt> section and <tt>.got.plt</tt> section.</font>
|
||
|
For the runtime linker <tt>ld.so</tt>, all it knows is
|
||
|
entries in <tt>RELA</tt> segment, i.e. <tt>.rela.dyn</tt> section,
|
||
|
(which corresponds to <tt>.got</tt> section)
|
||
|
must be <font color="Yellow">resolved/relocated immediately</font>, while entries in
|
||
|
<tt>JMPREL</tt> segment, i.e. <tt>.rela.plt</tt> section,
|
||
|
(which corresponds to <tt>.got.plt</tt> section) can use
|
||
|
<font color="Yellow">lazy binding</font>. For x86_64 architecture, the relocation is actually not
|
||
|
needed for <tt>R_X86_64_JUMP_SLOT</tt> relocation types (albeit the
|
||
|
symbol resolution is still needed)
|
||
|
|
||
|
|
||
|
</p><h2>PIC or no PIC</h2>
|
||
|
When building a dynamic library, we are told to <b>always</b> compile the code with <tt>-fPIC</tt>
|
||
|
option.
|
||
|
<p>
|
||
|
<b>What's the difference then</b> ?
|
||
|
</p><p>
|
||
|
Consider the following simple code:
|
||
|
</p><pre>#include <stdio.h>
|
||
|
int bar;
|
||
|
|
||
|
void foo() {
|
||
|
printf("%d\n",bar);
|
||
|
}
|
||
|
</pre>
|
||
|
Compile the above code in 32-bit mode with and without <tt>-fPIC</tt>:
|
||
|
<pre>$ gcc -shared -m32 foo.c -o nopic.so
|
||
|
$ gcc -shared -m32 -fPIC foo.c -o pic.so
|
||
|
</pre>
|
||
|
(If you try to build the above in 64-bit mode without <tt>-fPIC</tt>, <font color="red">the link step will
stop and insist you should compile with <tt>-fPIC</tt> option</font>, i.e. you are going to
|
||
|
see error message such as
|
||
|
<tt>relocation R_X86_64_PC32 against symbol `XXXYYY' can not be used when making a shared object; recompile with -fPIC</tt>)
|
||
|
|
||
|
The sections and relocation tables of <font color="Orange"><tt>nopic.so</tt></font>
|
||
|
and <font color="LightCoral"><tt>pic.so</tt></font>
|
||
|
are shown at left and right hand side, respectively:
|
||
|
<pre>Section Headers: Section Headers:
|
||
|
[Nr] Name Type Addr [Nr] Name Type Addr
|
||
|
[ 0] NULL 0000 [ 0] NULL 0000
|
||
|
... ...
|
||
|
[ 8] .init PROGBITS 02f8 [ 8] .init PROGBITS 02f0
|
||
|
[ 9] .plt PROGBITS 0310 [ 9] .plt PROGBITS 0308
|
||
|
<font color="Orange">[10] .text PROGBITS 0340</font> [10] .text PROGBITS 0350
|
||
|
[11] .fini PROGBITS 0488 [11] .fini PROGBITS 04a8
|
||
|
[12] .rodata PROGBITS 04a4 [12] .rodata PROGBITS 04c4
|
||
|
... ...
|
||
|
[17] .dynamic DYNAMIC 14c0 [17] .dynamic DYNAMIC 14e0
|
||
|
[18] .got PROGBITS 1590 <font color="LightCoral">[18] .got PROGBITS 15a8</font>
|
||
|
[19] .got.plt PROGBITS 159c <font color="LightCoral">[19] .got.plt PROGBITS 15b8</font>
|
||
|
<font color="Orange">[20] .data PROGBITS 15b0</font> [20] .data PROGBITS 15d0
|
||
|
|
||
|
... ...
|
||
|
|
||
|
Relocation section '.rel.dyn' at offset 0x2b0 Relocation section '.rel.dyn' at offset 0x2b0
|
||
|
contains 7 entries: contains 5 entries:
|
||
|
Offset Info Type Sym.Value Sym. Name Offset Info Type Sym.Value Sym. Name
|
||
|
<font color="Orange">00000439 00000008 R_386_RELATIVE</font> 000015d0 00000008 R_386_RELATIVE
|
||
|
<font color="Orange">000015b0 00000008 R_386_RELATIVE</font> <font color="LightCoral">000015a8 00000106 R_386_GLOB_DAT 000015dc bar</font>
|
||
|
<font color="Orange">00000434 00000101 R_386_32 000015bc bar</font> ...
|
||
|
<font color="Orange">00000445 00000602 R_386_PC32 00000000 printf</font>
|
||
|
...
|
||
|
|
||
|
Relocation section '.rel.plt' at offset 0x2e8: Relocation section '.rel.plt' at offset 0x2d8
|
||
|
contains 2 entries: contains 3 entries:
|
||
|
Offset Info Type Sym.Value Sym. Name Offset Info Type Sym.Value Sym. Name
|
||
|
000015a8 00000207 R_386_JUMP_SLOT 00000000 __gmon_start__ 000015c4 00000207 R_386_JUMP_SLOT 00000000 __gmon_start__
|
||
|
000015ac 00000a07 R_386_JUMP_SLOT 00000000 __cxa_finalize <font color="LightCoral">000015c8 00000607 R_386_JUMP_SLOT 00000000 printf</font>
|
||
|
...
|
||
|
</pre>
|
||
|
When we compile with <tt>-fPIC</tt>, the variable <tt>bar</tt>
gets the right relocation type (<tt>R_386_GLOB_DAT</tt>),
and the relocation takes place in the right section (<tt>.got</tt>). The same holds for
<tt>printf</tt>, whose <tt>R_386_JUMP_SLOT</tt> relocation lands in <tt>.got.plt</tt>.
|
||
|
<p>
|
||
|
Without <tt>-fPIC</tt>, the relocations of the format string "%d\n", <tt>bar</tt>
and <tt>printf</tt> all take place inside the <tt>.text</tt> section!
But we know the <tt>.text</tt> section is in a read-only <tt>LOAD</tt>
segment, so what does <tt>ld.so</tt> do?
|
||
|
</p><p>
|
||
|
As expected, <tt>ld.so</tt> makes the <tt>.text</tt> section
writable, patches the bytes, and makes it read-only again. Since the
relocations of both <tt>bar</tt> and <tt>printf</tt> are
in <tt>.rel.dyn</tt>, they are performed immediately at load time
(no lazy binding), so this approach is feasible.
|
||
|
</p><p>
|
||
|
So how does <tt>ld.so</tt> handle
|
||
|
<font color="Orange"><tt>R_386_RELATIVE</tt></font>,
|
||
|
<font color="Orange"><tt>R_386_32</tt></font>
|
||
|
and <font color="Orange"><tt>R_386_PC32</tt></font> relocation types ?
|
||
|
</p><p>
|
||
|
Let's look at the disassembly:
|
||
|
</p><pre>0000042c <foo>:
|
||
|
42c: 55 push ebp
|
||
|
42d: 89 e5 mov ebp,esp
|
||
|
42f: 83 ec 18 sub esp,0x18
|
||
|
432: 8b 15 <font color="Orange">00 00 00 00</font> mov edx,DWORD PTR ds:0x0 <-- reference to bar
|
||
|
438: b8 <font color="Orange">a4 04 00 00</font> mov eax,0x4a4 <-- reference to "%d\n" format string in .rodata
|
||
|
43d: 89 54 24 04 mov DWORD PTR [esp+0x4],edx
|
||
|
441: 89 04 24 mov DWORD PTR [esp],eax
|
||
|
444: e8 <font color="Orange">fc ff ff ff</font> call 445 <foo+0x19> <-- reference to printf
|
||
|
449: c9 leave
|
||
|
44a: c3 ret
|
||
|
</pre>
|
||
|
How would the 4 bytes starting at 445 (<tt>R_386_PC32</tt> type)
|
||
|
be patched ? Suppose at runtime, our
|
||
|
<font color="Orange"><tt>nopic.so</tt></font> is loaded
|
||
|
into memory with base address 8000 (all addresses here are hexadecimal), and the 4 bytes
to be patched are now at 8000 + 445 = 8445.
|
||
|
Furthermore, suppose <tt>ld.so</tt> has determined
|
||
|
the entry address of <tt>printf</tt> to be 10000, then
|
||
|
<tt>ld.so</tt> calculates the <b>relative</b> offset as follows:
|
||
|
<pre>10000 - 8445 + fffffffc = 7bb7
|
||
|
</pre>
|
||
|
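(fffffffc is -4, the addend already stored in the instruction: a <tt>call</tt> displacement is measured from the
address of the next instruction, which is 4 bytes past the patched field), so <tt>ld.so</tt> replaces <font color="Orange"><tt>fc ff ff ff</tt></font>
with <font color="Orange"><tt>b7 7b 00 00</tt></font> (little-endian byte order).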
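<p>
The arithmetic above can be sketched in C as follows. This is only an illustration of the
S + A - P rule, not code taken from <tt>ld.so</tt>; the function names <tt>reloc_pc32</tt>
and <tt>store_le32</tt> are made up for this example.
</p>

```c
#include <stdint.h>

/* R_386_PC32: value = S + A - P, computed modulo 2^32.
 *   S = runtime address of the target symbol (printf)
 *   A = addend already stored in the field being patched (-4 here)
 *   P = runtime address of the field being patched
 * (hypothetical helper, named for this sketch only) */
uint32_t reloc_pc32(uint32_t S, uint32_t A, uint32_t P)
{
    return S + A - P;
}

/* Store a 32-bit value in little-endian byte order, as x86 expects. */
void store_le32(uint8_t *field, uint32_t value)
{
    field[0] = (uint8_t)(value & 0xff);
    field[1] = (uint8_t)((value >> 8) & 0xff);
    field[2] = (uint8_t)((value >> 16) & 0xff);
    field[3] = (uint8_t)((value >> 24) & 0xff);
}
```

<p>
With the numbers from the text, <tt>reloc_pc32(0x10000, 0xfffffffc, 0x8445)</tt> yields
<tt>0x7bb7</tt>, and <tt>store_le32</tt> then writes the bytes <tt>b7 7b 00 00</tt>.
</p>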
|
||
|
<p>
|
||
|
Patching the 4 bytes starting at 434 (<tt>R_386_32</tt> type) is simpler.
<tt>ld.so</tt> simply overwrites the 4 bytes with the runtime <b>absolute</b>
address of <tt>bar</tt> (plus the addend stored there, which is 0 here).
|
||
|
</p><p>
|
||
|
To patch the 4 bytes starting at 439 (<tt>R_386_RELATIVE</tt> type),
<tt>ld.so</tt> adds the load base address (8000) to the value stored there:
</p><pre>8000 + 4a4 = 84a4
</pre>
so <tt>ld.so</tt> replaces <font color="Orange"><tt>a4 04 00 00</tt></font>
with <font color="Orange"><tt>a4 84 00 00</tt></font>
|
||
|
<p>
|
||
|
Finally, what about the <tt>R_386_RELATIVE</tt> relocation at <font color="Orange">15b0</font>?
15b0 is the starting address of the <tt>.data</tt> section, and the first 4 bytes
of the <tt>.data</tt> section store their own address, 15b0. So this word has to be
relocated too, and is patched to <tt>95b0</tt> (8000 + 15b0).
|
||
|
</p><p>
|
||
|
In conclusion, <tt>R_386_RELATIVE</tt> means "32-bit value relative to the load base address",
<tt>R_386_PC32</tt> means "32-bit IP-relative offset",
and <tt>R_386_32</tt> means "32-bit absolute address".
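The three rules can be put side by side in a small C sketch. It is a simplified model, not
the real <tt>ld.so</tt> code: the function <tt>apply_rel386</tt>, its parameters and the
<tt>SK_</tt> constant names are invented for this example, and it assumes the load base of
8000 used in the <tt>R_386_PC32</tt> walk-through. The type numbers 1, 2 and 8 are the low
byte of the Info column in the relocation tables.

```c
#include <stdint.h>

/* i386 relocation type numbers (low byte of the Info column). */
enum { SK_R_386_32 = 1, SK_R_386_PC32 = 2, SK_R_386_RELATIVE = 8 };

/* Read/write a little-endian 32-bit word, independent of host byte order. */
static uint32_t sk_load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void sk_store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Apply one REL-style relocation to a loaded image.
 *   image  : bytes of the object; image[0] lives at runtime address `base`
 *   offset : the relocation's Offset column
 *   sym    : runtime address of the referenced symbol (ignored for RELATIVE)
 * Returns 0 on success, -1 for a type this sketch does not handle. */
int apply_rel386(uint8_t *image, uint32_t base, uint32_t offset,
                 int type, uint32_t sym)
{
    uint32_t addend = sk_load_le32(image + offset); /* A is stored in place */
    uint32_t value;

    switch (type) {
    case SK_R_386_RELATIVE: value = base + addend;                  break; /* B + A     */
    case SK_R_386_32:       value = sym + addend;                   break; /* S + A     */
    case SK_R_386_PC32:     value = sym + addend - (base + offset); break; /* S + A - P */
    default:                return -1;
    }
    sk_store_le32(image + offset, value);
    return 0;
}
```

For instance, with <tt>base</tt> = 0x8000, applying <tt>SK_R_386_RELATIVE</tt> at offset 0x439
turns <tt>a4 04 00 00</tt> into <tt>a4 84 00 00</tt>, and applying <tt>SK_R_386_PC32</tt> at
offset 0x445 with <tt>sym</tt> = 0x10000 turns <tt>fc ff ff ff</tt> into <tt>b7 7b 00 00</tt>.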
|
||
|
|
||
|
</p><h2>Troubleshooting <tt>ld.so</tt></h2>
|
||
|
<h3>What is "<font color="red">error while loading shared libraries: requires glibc 2.5 or later dynamic linker</font>" ?</h3>
|
||
|
The cause of this error is that the dynamic binary (or one of its dependent shared libraries)
you want to run has only a <tt>.gnu.hash</tt> section, but the <tt>ld.so</tt> on the target machine
is too old to recognize <tt>.gnu.hash</tt>; it only recognizes the old-school <tt>.hash</tt> section.
|
||
|
<p>
|
||
|
This usually happens when the dynamic binary in question was built with a newer version of GCC.
The solution is to recompile the code with either the <tt>-static</tt> compiler command-line option
(to create a static binary), or the following option:
|
||
|
</p><pre>-Wl,--hash-style=both
|
||
|
</pre>
|
||
|
This tells the link editor <tt>ld</tt> to create both <tt>.gnu.hash</tt> and <tt>.hash</tt> sections.
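<p>
One reason the two sections are not interchangeable is that they are indexed by different
hash functions over the symbol name. The sketch below shows the two commonly documented
functions: the classic System V ELF hash for <tt>.hash</tt> and the Bernstein-style hash
for <tt>.gnu.hash</tt>. The function names <tt>sysv_elf_hash</tt> and <tt>gnu_elf_hash</tt>
are chosen for this example and are not Glibc's internal names.
</p>

```c
#include <stdint.h>

/* Classic System V ELF hash, used to index the .hash section. */
uint32_t sysv_elf_hash(const char *name)
{
    uint32_t h = 0, g;

    while (*name) {
        h = (h << 4) + (unsigned char)*name++;
        g = h & 0xf0000000u;     /* top nibble, folded back in below */
        if (g)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* GNU hash (h * 33 + c, starting at 5381), used to index .gnu.hash. */
uint32_t gnu_elf_hash(const char *name)
{
    uint32_t h = 5381;

    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h;
}
```

<p>
For the symbol name <tt>printf</tt>, the first function gives <tt>0x077905a6</tt> and the
second gives <tt>0x156b2bb8</tt>, so a loader that only knows one scheme cannot use a
table built for the other.
</p>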
|
||
|
<p>According to the <tt>ld</tt> documentation <a href="https://web.archive.org/web/20201202024834/http://sourceware.org/binutils/docs/ld/Options.html">here</a>,
the old-school <tt>.hash</tt> section is the default, but the compiler driver can override it. For example,
GCC 4.1.2 on RHEL (Red Hat Enterprise Linux) Server release 5.5 has
this line in its specs:
|
||
|
</p><pre>$ gcc -dumpspecs
|
||
|
....
|
||
|
*link:
|
||
|
%{!static:--eh-frame-hdr} %{!m32:-m elf_x86_64} %{m32:-m elf_i386} <font color="LightGreen">--hash-style=gnu</font> %{shared:-shared} ....
|
||
|
...
|
||
|
</pre>
|
||
|
|
||
|
<p>For more information, see <a href="https://web.archive.org/web/20201202024834/http://crtags.blogspot.com/2010/11/elf-elf-elf-dont-do-it.html">here</a>.
|
||
|
|
||
|
</p><h3>What is "<font color="red">Floating point exception</font>" ?</h3>
|
||
|
The cause of this error is the same as in the previous question. On certain systems, e.g. RHEL, the old <tt>ld.so</tt>
has been patched (<a href="https://web.archive.org/web/20201202024834/http://www.wikipedia.org/wiki/Backporting">backported</a>) to emit "error while loading shared libraries: requires glibc 2.5 or later dynamic linker", but
this is not always the case. An unpatched old <tt>ld.so</tt> finds no <tt>.hash</tt> section, ends up dividing by a bucket count of zero during symbol lookup, and you see this error instead.
|
||
|
|
||
|
<h3>What is "<font color="red">.../libc.so.6: version `GLIBC_2.4' not found </font>" ?</h3>
|
||
|
As the error message says, some of the symbols the binary references need Glibc version 2.4 or higher. This can also be
seen with:
|
||
|
<pre>$ objdump -x foo | grep 'Version References' -A10
|
||
|
|
||
|
Version References:
|
||
|
required from libc.so.6:
|
||
|
0x0d696914 0x00 03 GLIBC_2.4
|
||
|
0x09691a75 0x00 02 GLIBC_2.2.5
|
||
|
|
||
|
...
|
||
|
</pre>
|
||
|
The fix is to recompile the code with <tt>-static</tt> compiler command-line option.
|
||
|
|
||
|
<h3>What is "<font color="red">FATAL: kernel too old</font>" ?</h3>
|
||
|
Even if you recompile the code with the <tt>-static</tt> compiler command-line option to avoid
any dependency on the dynamic Glibc library, you can still encounter this error
on a machine with an older kernel, and your program will then die with a segmentation fault.
|
||
|
<p>
|
||
|
This kernel version check is done by the <tt>DL_SYSDEP_OSCHECK</tt> macro in Glibc's
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/unix/sysv/linux/dl-osinfo.h"><tt>sysdeps/unix/sysv/linux/dl-osinfo.h</tt></a>.
It calls <tt>_dl_discover_osversion</tt> to get the running kernel's version.
|
||
|
</p><p>
|
||
|
To see this, run your program (assuming it is not stripped) inside gdb:
|
||
|
</p><pre>(gdb) <font color="LightGreen">run</font>
|
||
|
Starting program: foo
|
||
|
FATAL: kernel too old
|
||
|
|
||
|
Program received signal SIGSEGV, Segmentation fault.
|
||
|
0x00000000004324a9 in ptmalloc_init ()
|
||
|
(gdb) <font color="LightGreen">call _dl_discover_osversion()</font>
|
||
|
$1 = 132617
|
||
|
(gdb) <font color="LightGreen">p/x $1</font>
|
||
|
$2 = 0x20609
|
||
|
(gdb)
|
||
|
</pre>
|
||
|
Here <tt>0x20609</tt> means the running kernel version is 2.6.9, with one byte per version component.
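<p>
The encoding suggested by the gdb session can be sketched in C as below. The helper names
<tt>kver_pack</tt> and <tt>kver_unpack</tt> are hypothetical and exist only for this
illustration; they are not Glibc functions.
</p>

```c
/* Pack a kernel version as major<<16 | minor<<8 | patch, the layout
 * implied by the 0x20609 <-> 2.6.9 example (hypothetical helper). */
unsigned kver_pack(unsigned major, unsigned minor, unsigned patch)
{
    return (major << 16) | (minor << 8) | patch;
}

/* Split a packed version back into its three components. */
void kver_unpack(unsigned v, unsigned *major, unsigned *minor, unsigned *patch)
{
    *major = (v >> 16) & 0xff;
    *minor = (v >> 8) & 0xff;
    *patch = v & 0xff;
}
```

<p>
Unpacking the decimal value 132617 printed by gdb gives 2, 6 and 9, and
<tt>kver_pack(2, 6, 9)</tt> gives <tt>0x20609</tt> again.
</p>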
|
||
|
<p>
|
||
|
The fix (or hack) is to add the following function to your code, so that the check sees a very new kernel version (255.255.255):
</p><pre>int _dl_discover_osversion(void) { return 0xffffff; }
</pre>
and compile your code with the <tt>-static</tt> compiler command-line option.
|
||
|
|
||
|
<h2>Exploring Glibc's <tt>pthread_t</tt></h2>
|
||
|
When one creates a thread using the Pthread API, one will get a <tt>pthread_t</tt> object as a handle.
|
||
|
In Glibc, <tt>pthread_t</tt> is actually a pointer pointing to a <tt>pthread</tt>
|
||
|
struct, which is opaque. Its definition can be found in Glibc's source tree at
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=nptl/descr.h"><tt>nptl/descr.h</tt></a>. The first member of <tt>pthread</tt> struct is yet
|
||
|
another struct called <tt>tcbhead_t</tt> defined in
|
||
|
system-dependent header files such as
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=nptl/sysdeps/x86_64/tls.h"><tt>nptl/sysdeps/x86_64/tls.h</tt></a>. It holds TLS related
|
||
|
information. It contains at least an integer member called <tt>multiple_threads</tt> which
|
||
|
indicates if the process is running in multi-thread mode.
|
||
|
<p>
|
||
|
The second member of <tt>pthread</tt> struct is also
|
||
|
a struct called <tt>list_t</tt> defined in
|
||
|
<a href="https://web.archive.org/web/20201202024834/http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=nptl/sysdeps/pthread/list.h"><tt>nptl/sysdeps/pthread/list.h</tt></a>.
|
||
|
</p><p>
|
||
|
The third and fourth members of <tt>pthread</tt> struct are thread ID and thread
|
||
|
group ID (both are of <tt>pid_t</tt> type).
|
||
|
</p><p>
|
||
|
Other members of the <tt>pthread</tt> struct that are of interest: <tt>int cancelhandling</tt> for
cancellation information, <tt>int flags</tt> for thread attributes,
<tt>start_routine</tt> for the entry point of the code to be executed by the thread,
<tt>void *arg</tt> for the argument to <tt>start_routine</tt>, and
<tt>void *stackblock</tt> and <tt>size_t stackblock_size</tt> for thread-specific
stack information.
|
||
|
</p><p>
|
||
|
Since the <tt>pthread</tt> struct is opaque, how can one obtain the above information,
or more precisely, how can one obtain the offsets of these members within the
<tt>pthread</tt> struct? We can take values we already know and search
for them in the memory region pointed to by <tt>pthread_t</tt>, as in this <a href="https://web.archive.org/web/20201202024834/http://www.acsu.buffalo.edu/%7Echarngda/code/tcb.c">code snippet</a>.
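The search idea can be sketched with a small generic helper. This is not the linked
<tt>tcb.c</tt>; the function name <tt>find_word_offset</tt> is made up for this example,
and it only shows the scanning step on an arbitrary memory region.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan `len` bytes starting at `base` for a pointer-sized value `needle`,
 * stepping by the size of a pointer. Returns the byte offset of the first
 * match, or (size_t)-1 if the value is not found. (Hypothetical helper.) */
size_t find_word_offset(const void *base, size_t len, uintptr_t needle)
{
    const unsigned char *p = base;
    size_t off;

    for (off = 0; off + sizeof needle <= len; off += sizeof needle) {
        uintptr_t word;

        memcpy(&word, p + off, sizeof word); /* copy out to avoid aliasing issues */
        if (word == needle)
            return off;
    }
    return (size_t)-1;
}
```

Inside a thread started with a known <tt>arg</tt>, one could try something like
<tt>find_word_offset((void *)pthread_self(), n, (uintptr_t)arg)</tt> to guess the offset of
the <tt>arg</tt> member. That use rests on the assumptions that <tt>pthread_t</tt> really is a
pointer to the <tt>pthread</tt> struct (as described above for Glibc) and that <tt>n</tt> does
not exceed the size of that struct; neither is portable.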
|
||
|
</p></body></html>
|
||
|
<!--
|
||
|
FILE ARCHIVED ON 7:33:47 Sep 22, 2012 AND RETRIEVED FROM THE
|
||
|
INTERNET ARCHIVE ON 14:44:12 Oct 28, 2013.
|
||
|
JAVASCRIPT APPENDED BY WAYBACK MACHINE, COPYRIGHT INTERNET ARCHIVE.
|
||
|
|
||
|
ALL OTHER CONTENT MAY ALSO BE PROTECTED BY COPYRIGHT (17 U.S.C.
|
||
|
SECTION 108(a)(3)).
|
||
|
-->
|
||
|
<!--
|
||
|
FILE ARCHIVED ON 02:48:34 Dec 02, 2020 AND RETRIEVED FROM THE
|
||
|
INTERNET ARCHIVE ON 01:20:04 Feb 03, 2021.
|
||
|
JAVASCRIPT APPENDED BY WAYBACK MACHINE, COPYRIGHT INTERNET ARCHIVE.
|
||
|
|
||
|
ALL OTHER CONTENT MAY ALSO BE PROTECTED BY COPYRIGHT (17 U.S.C.
|
||
|
SECTION 108(a)(3)).
|
||
|
-->
|