# .eh_frame

When gcc generates code that handles exceptions, it produces tables that
describe how to unwind the stack. These tables are found in the `.eh_frame`
section. The format of the `.eh_frame` section is very similar to the format of
a DWARF `.debug_frame` section. Unfortunately, it is not precisely identical. I
don’t know of any documentation which describes this format. The following
should be read in conjunction with the relevant section of the DWARF standard,
available from http://dwarfstd.org.

The `.eh_frame` section is a sequence of records. Each record is either a CIE
(Common Information Entry) or an FDE (Frame Description Entry). In general
there is one CIE per object file, and each CIE is associated with a list of
FDEs. Each FDE is typically associated with a single function. The CIE and the
FDE together describe how to unwind to the caller if the current instruction
pointer is in the range covered by the FDE.

There should be exactly one FDE covering each instruction that may be
executing when an exception occurs. By default an exception can only occur
during a function call or a throw. When using the `-fnon-call-exceptions` gcc
option, an exception can also occur on most memory references and floating
point operations. When using `-fasynchronous-unwind-tables`, the FDEs will cover
every instruction, to permit unwinding from a signal handler.

The general format of a CIE or FDE starts as follows:

* Length of record. Read 4 bytes. If they are not `0xffffffff`, they are the
  length of the CIE or FDE record. Otherwise the next 64 bits hold the length,
  and this is a 64-bit DWARF format. This is like `.debug_frame`.
* A 4-byte ID. For a CIE this is 0. For an FDE it is the byte offset from this
  field to the start of the CIE with which this FDE is associated. The byte
  offset points to the length field of the CIE. A positive value goes backward;
  that is, you have to subtract the value of the ID field from the current byte
  position to get the CIE position. This differs from `.debug_frame` in that
  the offset is relative rather than being an offset into the `.debug_frame`
  section.

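As a rough illustration of the two header fields above, here is a minimal Python sketch that reads the length and ID of one record and, for an FDE, computes the position of its CIE. The function name `read_record_header` and the returned tuple layout are my own inventions for this example, and I am assuming little-endian data. It does not handle a zero-length record, which is commonly used to terminate the section.

```python
import struct

def read_record_header(section: bytes, offset: int):
    """Return (kind, body_start, record_end, cie_offset) for the record at offset.

    Hypothetical helper for illustration; assumes little-endian data.
    """
    (length,) = struct.unpack_from("<I", section, offset)
    pos = offset + 4
    if length == 0xFFFFFFFF:
        # 64-bit DWARF format: the real length is in the next 8 bytes.
        (length,) = struct.unpack_from("<Q", section, pos)
        pos += 8
    record_end = pos + length  # the length counts the bytes after the length field
    id_pos = pos
    (record_id,) = struct.unpack_from("<I", section, id_pos)  # 4-byte ID, as described above
    if record_id == 0:
        return "CIE", id_pos + 4, record_end, None
    # For an FDE, subtract the ID from the position of the ID field itself.
    return "FDE", id_pos + 4, record_end, id_pos - record_id

# A made-up section: one 12-byte CIE followed by one 12-byte FDE that refers to it.
section = struct.pack("<II4s", 8, 0, b"\x01\x00\x01\x78")
section += struct.pack("<II4s", 8, 16, b"\x00\x00\x00\x00")
print(read_record_header(section, 0))
print(read_record_header(section, 12))
```

Walking the whole section is then a matter of calling this repeatedly, starting each time at the previous record's `record_end`.
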
A CIE record continues as follows:

* 1 byte CIE version. As of this writing this should be 1 or 3.
* NUL-terminated augmentation string. This is a sequence of characters,
  described further below. Very old versions of gcc used the string “eh” here,
  but I won’t document that.
* Code alignment factor, an unsigned LEB128 (LEB128 is a DWARF encoding for
  numbers which I won’t describe here). This should always be 1 for `.eh_frame`.
* Data alignment factor, a signed LEB128. This is a constant factored out of
  offset instructions, as in `.debug_frame`.
* The return address register. In CIE version 1 this is a single byte; in CIE
  version 3 this is an unsigned LEB128. This indicates which column in the
  frame table represents the return address.

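To make the field list above concrete, here is a small Python sketch that decodes these fixed CIE fields from the bytes following the ID. The LEB128 readers follow the DWARF standard's description; the names `read_uleb128`, `read_sleb128` and `parse_cie_fixed_fields`, and the dictionary keys, are just choices for this example.

```python
def read_uleb128(data: bytes, pos: int):
    """Decode an unsigned LEB128 at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos

def read_sleb128(data: bytes, pos: int):
    """Decode a signed LEB128 at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, pos

def parse_cie_fixed_fields(data: bytes, pos: int):
    """Parse version, augmentation string, alignment factors and return register."""
    version = data[pos]
    pos += 1
    end = data.index(b"\x00", pos)  # the augmentation string is NUL-terminated
    augmentation = data[pos:end].decode("ascii")
    pos = end + 1
    code_align, pos = read_uleb128(data, pos)
    data_align, pos = read_sleb128(data, pos)
    if version == 1:
        return_reg = data[pos]  # a single byte in version 1
        pos += 1
    else:
        return_reg, pos = read_uleb128(data, pos)
    fields = {"version": version, "augmentation": augmentation,
              "code_align": code_align, "data_align": data_align,
              "return_reg": return_reg}
    return fields, pos

# Example bytes I made up for illustration: version 1, "zR", factors 1 and -8, register 16.
cie_fields, next_pos = parse_cie_fixed_fields(bytes.fromhex("017a5200017810"), 0)
print(cie_fields, next_pos)
```
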
The next fields of the CIE depend on the augmentation string.

* If the augmentation string starts with ‘z’, we now find an unsigned LEB128
  which is the length of the augmentation data, rounded up so that the CIE ends
  on an address boundary. This is used to skip to the end of the augmentation
  data if an unrecognized augmentation character is seen.
* If the next character in the augmentation string is ‘L’, the next byte in the
  CIE is the LSDA (Language Specific Data Area) encoding. This is a
  `DW_EH_PE_xxx` value (described later). The default is `DW_EH_PE_absptr`.
* If the next character in the augmentation string is ‘R’, the next byte in the
  CIE is the FDE encoding. This is a `DW_EH_PE_xxx` value. The default is
  `DW_EH_PE_absptr`.
* The character ‘S’ in the augmentation string means that this CIE represents a
  stack frame for the invocation of a signal handler. When unwinding the stack,
  signal stack frames are handled slightly differently: the instruction pointer
  is assumed to be before the next instruction to execute rather than after it.
* If the next character in the augmentation string is ‘P’, the next byte in the
  CIE is the personality encoding, a `DW_EH_PE_xxx` value. This is followed by
  a pointer to the personality function, encoded using the personality
  encoding. I’ll describe the personality function some other day.

The remaining bytes are an array of `DW_CFA_xxx` opcodes which define the
initial values for the frame table. This is then followed by `DW_CFA_nop`
padding bytes as required to match the total length of the CIE.

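The augmentation handling described above might be sketched like this in Python. This is only an illustration under some assumptions of mine: the names `parse_cie_augmentation` and `FIXED_SIZES` are hypothetical, pointers are assumed to be 8 bytes unless `pointer_size` says otherwise, and the personality pointer is returned as raw bytes rather than decoded. The returned position is where the initial `DW_CFA_xxx` opcodes begin.

```python
# Byte sizes of the fixed-width DW_EH_PE_xxx value formats (low 4 bits of an encoding).
FIXED_SIZES = {0x02: 2, 0x03: 4, 0x04: 8, 0x0A: 2, 0x0B: 4, 0x0C: 8}

def read_uleb128(data: bytes, pos: int):
    """Decode an unsigned LEB128 at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos

def parse_cie_augmentation(augmentation: str, data: bytes, pos: int, pointer_size: int = 8):
    """Parse CIE augmentation data at pos; return (info, start_of_initial_opcodes)."""
    info = {"lsda_encoding": None, "fde_encoding": None,
            "signal_frame": False, "personality": None}
    if not augmentation.startswith("z"):
        return info, pos  # no augmentation data to read
    aug_len, pos = read_uleb128(data, pos)
    aug_end = pos + aug_len
    for ch in augmentation[1:]:
        if ch == "L":
            info["lsda_encoding"] = data[pos]
            pos += 1
        elif ch == "R":
            info["fde_encoding"] = data[pos]
            pos += 1
        elif ch == "S":
            info["signal_frame"] = True  # 'S' carries no data bytes
        elif ch == "P":
            encoding = data[pos]
            pos += 1
            fmt = encoding & 0x0F
            if fmt in (0x01, 0x09):
                raise NotImplementedError("LEB128 personality pointers are not handled in this sketch")
            size = FIXED_SIZES.get(fmt, pointer_size)  # absptr and signed use the pointer size
            info["personality"] = (encoding, data[pos:pos + size])  # left undecoded
            pos += size
        else:
            break  # unrecognized character: rely on aug_len to skip the rest
    return info, aug_end

# Made-up augmentation data for "zPLR": length 7, then P, L and R payloads, then opcodes.
example = bytes.fromhex("079b112233441b1b0c070890")
aug_info, opcodes_start = parse_cie_augmentation("zPLR", example, 0)
print(aug_info, opcodes_start)
```

Returning `aug_end` rather than the running position is what lets a parser step over data for augmentation characters it does not understand.
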
An FDE starts with the length and ID described above, and then continues as
follows.

* The starting address to which this FDE applies. This is encoded using the FDE
  encoding specified by the associated CIE.
* The number of bytes after the start address to which this FDE applies. This
  is encoded using the value format of the FDE encoding. Because it is a length
  rather than an address, a relative modifier such as `DW_EH_PE_pcrel` is not
  applied to it.
* If the CIE augmentation string starts with ‘z’, the FDE next has an unsigned
  LEB128 which is the total size of the FDE augmentation data. This may be used
  to skip data associated with unrecognized augmentation characters.
* If the CIE does not specify `DW_EH_PE_omit` as the LSDA encoding, the FDE
  next has a pointer to the LSDA, encoded as specified by the CIE.

The remaining bytes in the FDE are an array of `DW_CFA_xxx` opcodes which set
values in the frame table for unwinding to the caller.

The `DW_EH_PE_xxx` encodings describe how to encode values in a CIE or FDE. The
basic encoding is as follows:

* `DW_EH_PE_absptr = 0x00`: An absolute pointer. The size is determined by
  whether this is a 32-bit or 64-bit address space, and will be 32 or 64 bits.
* `DW_EH_PE_omit = 0xff`: The value is omitted.
* `DW_EH_PE_uleb128 = 0x01`: The value is an unsigned LEB128.
* `DW_EH_PE_udata2 = 0x02`, `DW_EH_PE_udata4 = 0x03`, `DW_EH_PE_udata8 = 0x04`:
  The value is stored as unsigned data with the specified number of bytes.
* `DW_EH_PE_signed = 0x08`: A signed number. The size is determined by whether
  this is a 32-bit or 64-bit address space. I don’t think this ever appears in
  a CIE or FDE in practice.
* `DW_EH_PE_sleb128 = 0x09`: A signed LEB128. Not used in practice.
* `DW_EH_PE_sdata2 = 0x0a`, `DW_EH_PE_sdata4 = 0x0b`, `DW_EH_PE_sdata8 = 0x0c`:
  The value is stored as signed data with the specified number of bytes.
  `DW_EH_PE_sdata4` is common in practice when combined with the
  `DW_EH_PE_pcrel` modifier, for example in gcc output for x86-64.

In addition to the above basic encodings, there are modifiers.

* `DW_EH_PE_pcrel = 0x10`: Value is PC relative.
* `DW_EH_PE_textrel = 0x20`: Value is text relative.
* `DW_EH_PE_datarel = 0x30`: Value is data relative.
* `DW_EH_PE_funcrel = 0x40`: Value is relative to start of function.
* `DW_EH_PE_aligned = 0x50`: Value is aligned: padding bytes are inserted as
  required to make value be naturally aligned.
* `DW_EH_PE_indirect = 0x80`: This is actually the address of the real value.

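Putting the basic encodings and modifiers together, here is a minimal Python sketch of decoding one encoded value. It is deliberately incomplete: it assumes little-endian data, handles only the fixed-size formats with either no modifier or `DW_EH_PE_pcrel`, and does not dereference `DW_EH_PE_indirect` values. The function name `read_encoded` and the `section_addr` parameter (the runtime address of the first byte of `data`) are hypothetical names for this example.

```python
import struct

DW_EH_PE_omit = 0xFF
DW_EH_PE_pcrel = 0x10

# struct codes for the fixed-size value formats (low 4 bits of the encoding).
_FORMAT_CODES = {0x02: "H", 0x03: "I", 0x04: "Q", 0x0A: "h", 0x0B: "i", 0x0C: "q"}

def read_encoded(data: bytes, pos: int, encoding: int,
                 section_addr: int, pointer_size: int = 8):
    """Decode one DW_EH_PE_xxx-encoded value at pos; return (value, new_pos).

    If DW_EH_PE_indirect (0x80) is set, the returned value is the address of
    the real value; reading it from memory is left to the caller.
    """
    if encoding == DW_EH_PE_omit:
        return None, pos
    fmt = encoding & 0x0F       # how the bytes are stored
    modifier = encoding & 0x70  # what the value is relative to
    if fmt == 0x00:             # DW_EH_PE_absptr
        code = "Q" if pointer_size == 8 else "I"
    elif fmt == 0x08:           # DW_EH_PE_signed
        code = "q" if pointer_size == 8 else "i"
    elif fmt in _FORMAT_CODES:
        code = _FORMAT_CODES[fmt]
    else:
        raise NotImplementedError("LEB128 formats are not handled in this sketch")
    (value,) = struct.unpack_from("<" + code, data, pos)
    if modifier == DW_EH_PE_pcrel:
        # PC relative means relative to the address of the encoded field itself.
        value += section_addr + pos
    elif modifier != 0:
        raise NotImplementedError("only pcrel is handled in this sketch")
    return value, pos + struct.calcsize("<" + code)

# pcrel | sdata4 (0x1b): a signed 32-bit offset from the field's own address.
blob = b"\x00" * 4 + struct.pack("<i", -0x20)
print(read_encoded(blob, 4, 0x1B, section_addr=0x1000))
```

For the other relative modifiers a real unwinder needs the matching base address (text, data, or function start), which is why this sketch simply refuses them.
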
If you follow all that, and also read up on `.debug_frame`, then you have
enough information to unwind the stack at runtime, e.g. to implement glibc’s
backtrace function. Later I’ll describe the LSDA and the personality function,
which work together to implement exception catching on top of stack unwinding.