128 lines
6.6 KiB
Markdown
128 lines
6.6 KiB
Markdown
# Linkers part 6
|
||
|
||
So many things to talk about. Let’s go back and cover relocations in some more
|
||
detail, with some examples.
|
||
|
||
## Relocations
|
||
|
||
As I said back in part 2, a relocation is a computation to perform on the
|
||
contents. And as I said yesterday, a relocation can also direct the linker to
|
||
take other actions, like creating a PLT or GOT entry. Let’s take a closer look
|
||
at the computation.
|
||
|
||
In general a relocation has a type, a symbol, an offset into the contents, and
|
||
an addend. From the linker’s point of view, the contents are simply an
|
||
uninterpreted series of bytes. A relocation changes those bytes as necessary to
|
||
produce the correct final executable. For example, consider the C code
|
||
`g = 0;` where `g` is a global variable. On the i386, the compiler will turn
|
||
this into an assembly language instruction, which will most likely be
|
||
`movl $0, g` (for position dependent code–position independent code would
|
||
loading the address of `g` from the GOT). Now, the `g` in the C code is a
|
||
global variable, and we all more or less know what that means. The `g` in the
|
||
assembly code is not that variable. It is a symbol which holds the address of
|
||
that variable.
|
||
|
||
The assembler does not know the address of the global variable `g`, which is
|
||
another way of saying that the assembler does not know the value of the symbol
|
||
`g`. It is the linker that is going to pick that address. So the assembler has
|
||
to tell the linker that it needs to use the address of `g` in this instruction.
|
||
The way the assembler does this is to create a relocation. We don’t use a
|
||
separate relocation type for each instruction; instead, each processor will
|
||
have a natural set of relocation types which are appropriate for the machine
|
||
architecture. Each type of relocation expresses a specific computation.
|
||
|
||
In the i386 case, the assembler will generate these bytes:
|
||
|
||
```
|
||
c7 05 00 00 00 00 00 00 00 00
|
||
```
|
||
|
||
The `c7 05` are the instruction (movl constant to address). The first four `00`
|
||
bytes are the 32-bit constant 0. The second four `00` bytes are the address.
|
||
The assembler tells the linker to put the value of the symbol `g` into those
|
||
four bytes by generating (in this case) a `R_386_32` relocation. For this
|
||
relocation the symbol will be `g`, the offset will be to the last four bytes of
|
||
the instruction, the type will be `R_386_32`, and the addend will be 0 (in the
|
||
case of the i386 the addend is stored in the contents rather than in the
|
||
relocation itself, but this is a detail). The type `R_386_32` expresses a
|
||
specific computation, which is: put the 32-bit sum of the value of the symbol
|
||
and the addend into the offset. Since for the i386 the addend is stored in the
|
||
contents, this can also be expressed as: add the value of the symbol to the
|
||
32-bit field at the offset. When the linker performs this computation, the
|
||
address in the instruction will be the address of the global variable g.
|
||
Regardless of the details, the important point to note is that the relocation
|
||
adjusts the contents by applying a specific computation selected by the type.
|
||
|
||
An example of a simple case which does use an addend would be
|
||
|
||
```c
|
||
char a[10]; // A global array.
|
||
char* p = &a[1]; // In a function.
|
||
```
|
||
|
||
The assignment to p will wind up requiring a relocation for the symbol `a`.
|
||
Here the addend will be 1, so that the resulting instruction references `a + 1`
|
||
rather than `a + 0`.
|
||
|
||
To point out how relocations are processor dependent, let’s consider `g = 0;`
|
||
on a RISC processor: the PowerPC (in 32-bit mode). In this case, multiple
|
||
assembly language instructions are required:
|
||
|
||
```asm
|
||
li 1,0 // Set register 1 to 0
|
||
lis 9,g@ha // Load high-adjusted part of g into register 9
|
||
stw 1,g@l(9) // Store register 1 to address in register 9 plus low adjusted part g
|
||
```
|
||
|
||
The `lis` instruction loads a value into the upper 16 bits of register 9,
|
||
setting the lower 16 bits to zero. The `stw` instruction adds a signed 16 bit
|
||
value to register 9 to form an address, and then stores the value of register 1
|
||
at that address. The `@ha` part of the operand directs the assembler to
|
||
generate a `R_PPC_ADDR16_HA` reloc. The `@l` produces a `R_PPC_ADDR16_LO`
|
||
reloc. The goal of these relocs is to compute the value of the symbol `g` and
|
||
use it as the store address.
|
||
|
||
That is enough information to determine the computations performed by these
|
||
relocs. The `R_PPC_ADDR16_HA` reloc computes
|
||
`(SYMBOL >> 16) + ((SYMBOL & 0x8000) ? 1 : 0)`. `The R_PPC_ADDR16_LO` computes
|
||
`SYMBOL & 0xffff`. The extra computation for `R_PPC_ADDR16_HA` is because the
|
||
`stw` instruction adds the signed 16-bit value, which means that if the low 16
|
||
bits appears negative we have to adjust the high 16 bits accordingly. The
|
||
offsets of the relocations are such that the 16-bit resulting values are stored
|
||
into the appropriate parts of the machine instructions.
|
||
|
||
The specific examples of relocations I’ve discussed here are ELF specific, but
|
||
the same sorts of relocations occur for any object file format.
|
||
|
||
The examples I’ve shown are for relocations which appear in an object file. As
|
||
discussed in part 4, these types of relocations may also appear in a shared
|
||
library, if they are copied there by the program linker. In ELF, there are also
|
||
specific relocation types which never appear in object files but only appear in
|
||
shared libraries or executables. These are the `JMP_SLOT`, `GLOB_DAT`, and
|
||
`RELATIVE` relocations discussed earlier. Another type of relocation which only
|
||
appears in an executable is a `COPY` relocation, which I will discuss later.
|
||
|
||
## Position Dependent Shared Libraries
|
||
|
||
I realized that in part 4 I forgot to say one of the important reasons that ELF
|
||
shared libraries use PLT and GOT tables. The idea of a shared library is to
|
||
permit mapping the same shared library into different processes. This only
|
||
works at maximum efficiency if the shared library code looks the same in each
|
||
process. If it does not look the same, then each process will need its own
|
||
private copy, and the savings in physical memory and sharing will be lost.
|
||
|
||
As discussed in part 4, when the dynamic linker loads a shared library which
|
||
contains position dependent code, it must apply a set of dynamic relocations.
|
||
Those relocations will change the code in the shared library, and it will no
|
||
longer be sharable.
|
||
|
||
The advantage of the PLT and GOT is that they move the relocations elsewhere,
|
||
to the PLT and GOT tables themselves. Those tables can then be put into a
|
||
read-write part of the shared library. This part of the shared library will be
|
||
much smaller than the code. The PLT and GOT tables will be different in each
|
||
process using the shared library, but the code will be the same.
|
||
|
||
I’ll be taking a vacation for the long weekend. My next post will most likely
|
||
be on Tuesday.
|
||
|