# .gcc_except_table

Throwing an exception in C++ requires more than unwinding the stack. As the
program unwinds, local variable destructors must be executed. Catch clauses
must be examined to see if they should catch the exception. Exception
specifications must be checked to see if the exception should be redirected to
the unexpected handler. Similar issues arise in Go, Java, and even C when using
gcc’s cleanup function attribute.

As I described earlier, each CIE in the unwind data may contain a pointer to a
personality function, and each FDE may contain a pointer to the LSDA, the
Language Specific Data Area. Each language has its own personality function.
The LSDA is only used by the personality function, so it could in principle
differ for each language. However, at least for gcc, every language uses the
same format, since the LSDA is generated by the language-independent
middle-end.

The personality function takes five arguments:

1. An `int` version number, currently 1.
2. A bitmask of actions.
3. An exception class, a 64-bit unsigned integer which is specific to a language.
4. A pointer to information about the specific exception being thrown.
5. Unwinder state information.

The exception class permits code written in one language to work correctly when
an exception is thrown by code written in a different language. The value for
g++ is `"GNUCC++\0"` (or `"GNUCC++\1"` for a dependent exception, which is used
when rethrowing an exception). The value for Go is `"GNUCGO\0\0"`. The
exception-specific information can only be examined if the exception class is
recognized.

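To make the "64-bit unsigned integer" concrete, here is a minimal sketch of how an eight-character tag could be packed into one. It assumes the first character lands in the most significant byte; the helper names `exception_class` and `is_gxx_exception` are made up for this illustration and are not part of any unwinder API.

```python
def exception_class(tag: bytes) -> int:
    """Pack an 8-byte language tag into a 64-bit unsigned integer.

    Assumption: the first character ends up in the most significant byte.
    """
    if len(tag) != 8:
        raise ValueError("an exception class is exactly 8 bytes")
    return int.from_bytes(tag, "big")


GXX_CLASS = exception_class(b"GNUCC++\x00")
GXX_DEPENDENT_CLASS = exception_class(b"GNUCC++\x01")
GO_CLASS = exception_class(b"GNUCGO\x00\x00")


def is_gxx_exception(cls: int) -> bool:
    """True for both primary and dependent g++ exceptions."""
    return cls in (GXX_CLASS, GXX_DEPENDENT_CLASS)


# A personality function would only inspect the exception object
# after a check along these lines succeeds.
print(hex(GXX_CLASS), is_gxx_exception(GXX_CLASS))
print(hex(GO_CLASS), is_gxx_exception(GO_CLASS))
```
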
Unwinding the stack for an exception is done in two phases. In the first phase,
the unwinder walks up the stack passing the action `_UA_SEARCH_PHASE` (which
has the value 1) to each personality function that it finds. The personality
function should examine the LSDA to see if there is a handler for the exception
being thrown. It should return `_URC_HANDLER_FOUND` (`6`) if there is or
`_URC_CONTINUE_UNWIND` (`8`) if there isn’t. The search phase will continue
until a handler is found or until the top of the stack is reached. The unwinder
will not actually change anything while walking. If the top of the stack is
reached, the unwinder will simply return, and the calling code will take the
appropriate action, which for C++ is to call `std::terminate`. Because of the
two-phase unwinding approach, if `std::terminate` dumps core, a backtrace will
show the code which threw the exception.

If a handler is found, the second phase begins. The unwinder walks up the stack
passing the action `_UA_CLEANUP_PHASE` (`2`) to each personality function. The
unwinder will also set `_UA_FORCE_UNWIND` (`8`) in the actions bitmask if the
personality function may not catch the exception, because the unwinding is
happening due to some event like thread cancellation. The unwinder will walk up
the stack until it finds the handler, that is, the stack frame for which the
personality function returned `_URC_HANDLER_FOUND`. When it calls that
function, the unwinder will pass `_UA_HANDLER_FRAME` (`4`) in the actions
bitmask. This time, the unwinder will change things as it goes, removing stack
frames.

In order to run destructors, the personality function will call `_Unwind_SetIP`
on the context parameter to set the program counter to point to the cleanup
routine, and then return `_URC_INSTALL_CONTEXT` (`7`) to tell the unwinder to
branch to the current context. The address which starts the cleanup is known as
a landing pad. The cleanup should do whatever it needs to do, and then call
`_Unwind_Resume`. The exception information needs to be passed to
`_Unwind_Resume`. The personality routine arranges to pass the exception
information to the cleanup by calling `_Unwind_SetGR` passing
`__builtin_eh_return_data_regno(0)` and the exception information passed to the
personality routine. Each target which supports this approach has to dedicate
two registers to holding exception information. This is the first one.

The personality function which finds the handler works pretty much the same
way. It may also use `_Unwind_SetGR` to set a value in
`__builtin_eh_return_data_regno(1)` to indicate which exception was found. The
exception handler may rethrow the exception via `_Unwind_RaiseException` or it
may simply continue a normal execution path.

At this point we’ve seen everything except how the personality function decides
whether it needs to run a cleanup or catch an exception. The personality
function makes this decision based on the LSDA. As mentioned above, while the
LSDA could be language dependent, in practice it is not. There is a different
personality function for each language, but they all do more or less the same
thing, omitting aspects which are not relevant for the language (e.g., there is
a personality function for C, but it only runs cleanups and does not bother to
look for exception handlers).

The LSDA is found in the section `.gcc_except_table` (the personality function
is just a function and lives in the `.text` section as usual). The personality
function gets a pointer to it by calling `_Unwind_GetLanguageSpecificData`. The
LSDA starts with the following fields:

1. A 1 byte encoding of the following field (a `DW_EH_PE_xxx` value).
2. If the encoding is not `DW_EH_PE_omit`, the landing pad base. This is the
   base from which landing pad offsets are computed. If this is omitted, the
   base comes from calling `_Unwind_GetRegionStart`, which returns the beginning
   of the code described by the current FDE. In practice this field is normally
   omitted.
3. A 1 byte encoding of the entries in the type table (a `DW_EH_PE_xxx` value).
4. If the encoding is not `DW_EH_PE_omit`, the types table pointer. This is an
   unsigned LEB128 value, and is the byte offset from the end of this field to
   the base of the types table used for exception matching.
5. A 1 byte encoding of the fields in the call-site table (a `DW_EH_PE_xxx`
   value).
6. An unsigned LEB128 value holding the length in bytes of the call-site table.

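The six header fields can be decoded with a few lines of code. The following is a sketch rather than a complete parser: it only handles an omitted landing pad base, the function names `read_uleb128` and `parse_lsda_header` are my own, and it assumes the types table offset is counted from the end of its own field. The constants `0xFF` and `0x01` are the standard values of `DW_EH_PE_omit` and `DW_EH_PE_uleb128`.

```python
DW_EH_PE_omit = 0xFF
DW_EH_PE_uleb128 = 0x01


def read_uleb128(data: bytes, pos: int):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def parse_lsda_header(data: bytes, pos: int = 0) -> dict:
    """Parse the LSDA header; all positions are offsets into `data`."""
    header = {}

    # Fields 1 and 2: landing pad base encoding and optional value.
    lpstart_encoding = data[pos]
    pos += 1
    if lpstart_encoding != DW_EH_PE_omit:
        # Simplification: a real parser handles every DW_EH_PE_xxx form.
        raise NotImplementedError("explicit landing pad base not handled")
    header["lpstart"] = None  # caller falls back to _Unwind_GetRegionStart

    # Fields 3 and 4: types table encoding and optional offset.
    header["ttype_encoding"] = data[pos]
    pos += 1
    header["ttype_base"] = None
    if header["ttype_encoding"] != DW_EH_PE_omit:
        offset, pos = read_uleb128(data, pos)
        # Assumption: the offset is counted from the end of this field.
        header["ttype_base"] = pos + offset

    # Fields 5 and 6: call-site encoding and table length in bytes.
    header["call_site_encoding"] = data[pos]
    pos += 1
    length, pos = read_uleb128(data, pos)
    header["call_site_start"] = pos
    header["call_site_end"] = pos + length
    return header
```
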
This header is immediately followed by the call-site table, whose total length
in bytes is given by the last field of the header. Each entry in the call-site
table describes a particular sequence of instructions within the function that
the FDE describes. Each entry has four fields:

1. The start of the instructions for the current call site, a byte offset from
   the landing pad base. This is encoded using the encoding from the header.
2. The length of the instructions for the current call site, in bytes. This is
   encoded using the encoding from the header.
3. A pointer to the landing pad for this sequence of instructions, or 0 if
   there isn’t one. This is a byte offset from the landing pad base. This is
   encoded using the encoding from the header.
4. The action to take, an unsigned LEB128. This is 1 plus a byte offset into
   the action table. The value zero means that there is no action.

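As an illustration, the sketch below decodes a call-site table into its four-field entries. It assumes the header selected `DW_EH_PE_uleb128` for the first three fields, which is only one of the possible encodings, and the names `CallSite` and `parse_call_sites` are hypothetical.

```python
from typing import NamedTuple


class CallSite(NamedTuple):
    start: int        # offset of the first instruction from the base
    length: int       # size of the instruction range in bytes
    landing_pad: int  # offset of the landing pad, 0 if there is none
    action: int       # 1 + offset into the action table, 0 if no action


def read_uleb128(data: bytes, pos: int):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def parse_call_sites(table: bytes) -> list:
    """Decode every entry, assuming all four fields are ULEB128 values."""
    entries = []
    pos = 0
    while pos < len(table):
        start, pos = read_uleb128(table, pos)
        length, pos = read_uleb128(table, pos)
        landing_pad, pos = read_uleb128(table, pos)
        action, pos = read_uleb128(table, pos)
        entries.append(CallSite(start, length, landing_pad, action))
    return entries


# Three made-up entries; the last one has a two-byte length field (128).
example = bytes([0x00, 0x10, 0x00, 0x00,
                 0x10, 0x05, 0x30, 0x01,
                 0x15, 0x80, 0x01, 0x00, 0x00])
for entry in parse_call_sites(example):
    print(entry)
```
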
The call-site table is sorted by the start address field. If the personality
function finds that there is no entry for the current PC in the call-site
table, then there is no exception information. This should not happen in normal
operation, and in C++ will lead to a call to `std::terminate`. If there is an
entry in the call-site table, but the landing pad is zero, then there is
nothing to do: there are no destructors to run or exceptions to catch. This is
a normal case, and the unwinder will simply continue. If there is a landing pad
but the action record is zero, then there are destructors to run but no
exceptions to catch. The personality function will arrange to run the
destructors as described above, and unwinding will continue.

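The decision just described reduces to a small lookup. The sketch below works on already-decoded entries. `CallSite`, `find_call_site` and `classify` are hypothetical names, the returned strings are labels for this illustration only, and `pc_offset` is assumed to be relative to the same base as the entries.

```python
from typing import NamedTuple, Optional


class CallSite(NamedTuple):
    start: int        # offset of the first instruction from the base
    length: int       # size of the instruction range in bytes
    landing_pad: int  # offset of the landing pad, 0 if there is none
    action: int       # 1 + offset into the action table, 0 if no action


def find_call_site(entries: list, pc_offset: int) -> Optional[CallSite]:
    """Return the entry covering `pc_offset`, relying on the sort order."""
    for entry in entries:
        if pc_offset < entry.start:
            break  # sorted by start, so no later entry can match
        if pc_offset < entry.start + entry.length:
            return entry
    return None


def classify(entries: list, pc_offset: int) -> str:
    """Map the current PC to one of the four outcomes in the text."""
    entry = find_call_site(entries, pc_offset)
    if entry is None:
        return "terminate"  # no exception information at all
    if entry.landing_pad == 0:
        return "continue"   # nothing to run or catch in this frame
    if entry.action == 0:
        return "cleanup"    # destructors only
    return "actions"        # consult the action table
```
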
Otherwise, we have an offset into the action table. Each entry in the action
table is a pair of signed LEB128 values. The first number is a type filter. The
second number is a byte offset to the next entry in the action table. A byte
offset of 0 ends the current set of actions.

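Following a chain of action records might look like the sketch below. One detail is an assumption on my part, not something stated above: the "byte offset to the next entry" is taken to be relative to the position of the offset field itself. The function names are hypothetical, and the example bytes are made up.

```python
def read_sleb128(data: bytes, pos: int):
    """Decode a signed LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, pos


def action_filters(action_table: bytes, action: int) -> list:
    """Collect the type filters reachable from a call-site action value."""
    filters = []
    if action == 0:
        return filters  # no action at all
    pos = action - 1  # the call-site field stores 1 + byte offset
    while True:
        type_filter, offset_pos = read_sleb128(action_table, pos)
        next_offset, _ = read_sleb128(action_table, offset_pos)
        filters.append(type_filter)
        if next_offset == 0:
            return filters  # end of this set of actions
        # Assumption: the offset is relative to its own position.
        pos = offset_pos + next_offset


# Record at offset 0: filter 0 (cleanup), end of chain.
# Record at offset 2: filter 1, then 0x7D (-3) chaining back to offset 0.
example_table = bytes([0x00, 0x00, 0x01, 0x7D])
print(action_filters(example_table, 3))
```
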
A type filter of zero indicates a cleanup, which is the same as an action
record of zero in the call-site table. This means that there is a cleanup to be
called even if none of the types match.

A positive type filter is an index into the types table. This is a negative
index: the value 1 means the entry preceding the types table base, 2 means the
entry before that, etc. The size of entries in the types table comes from the
encoding in the header, as does the base of the types table. Each entry in the
types table is a pointer to a type information structure. If this type
information structure matches the type of the exception, then we have found a
handler for this exception. The type filter value is a switch value that will
be passed to the handler in exception register 1. The actual comparison of the
type information, and determining the type information from the exception
pointer, really is language dependent. In C++ each entry is a pointer to a
`std::type_info` structure. A `NULL` pointer in the types table is a catch-all
handler.

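The negative indexing is easy to get wrong, so here is a small sketch of it. It assumes fixed-size 4-byte little-endian entries, which is only one possible encoding, and it compares raw entry values instead of performing a real language-specific type match. The names `type_entry` and `matches` are hypothetical.

```python
def type_entry(lsda: bytes, ttype_base: int, type_filter: int,
               entry_size: int = 4) -> int:
    """Return the raw types table entry selected by a positive filter.

    Assumption: entries are fixed-size little-endian values.
    """
    if type_filter <= 0:
        raise ValueError("only positive filters index the types table")
    # Filter 1 is the entry just before the base, 2 the one before that.
    pos = ttype_base - type_filter * entry_size
    return int.from_bytes(lsda[pos:pos + entry_size], "little")


def matches(entry_value: int, thrown_type: int) -> bool:
    """Toy comparison: a zero (NULL) entry is a catch-all handler.

    A real personality function compares type information in a
    language-specific way instead of comparing raw values.
    """
    return entry_value == 0 or entry_value == thrown_type


# Made-up data: 4 filler bytes, then two entries ending at the base (12).
example = (b"\xaa" * 4
           + (0x1000).to_bytes(4, "little")  # selected by filter 2
           + (0).to_bytes(4, "little"))      # selected by filter 1
print(hex(type_entry(example, 12, 2)), type_entry(example, 12, 1))
```
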
A negative type filter is a byte offset into the types table of a
`NULL`-terminated list of pointers to type information structures. If the type
of the current exception does not match any of the entries in the list, then
there is an exception specification error. This is treated as an exception
handler with a negative switch value.

I think that covers everything about how gcc unwinds the stack and throws
exceptions.