airs-notes/gcc_except_table.md

9.2 KiB
Raw Blame History

.gcc_except_table

Throwing an exception in C++ requires more than unwinding the stack. As the program unwinds, local variable destructors must be executed. Catch clauses must be examined to see if they should catch the exception. Exception specifications must be checked to see if the exception should be redirected to the unexpected handler. Similar issues arise in Go, Java, and even C when using gccs cleanup function attribute.

As I described earlier, each CIE in the unwind data may contain a pointer to a personality function, and each FDE may contain a pointer to the LSDA, the Language Specific Data Area. Each language has its own personality function. The LSDA is only used by the personality function, so it could in principle differ for each language. However, at least for gcc, every language uses the same format, since the LSDA is generated by the language-independent middle-end.

The personality function takes five arguments:

  1. A int version number, currently 1.
  2. A bitmask of actions.
  3. An exception class, a 64-bit unsigned integer which is specific to a language.
  4. A pointer to information about the specific exception being thrown.
  5. Unwinder state information.

The exception class permits code written in one language to work correctly when an exception is thrown by code written in a different language. The value for g++ is “GNUCC++\0” (or “GNUCC++\1” for a dependent exception, which is used when rethrowing an exception). The value for Go is “GNUCGO\0\0”. The exception specific information can only be examined if the exception class is recognized.

Unwinding the stack for an exception is done in two phases. In the first phase, the unwinder walks up the stack passing the action _UA_SEARCH_PHASE (which has the value 1) to each personality function that it finds. The personality function should examine the LSDA to see if there is a handler for the exception being thrown. It should return _URC_HANDLER_FOUND (6) if there is or _URC_CONTINUE_UNWIND (8) if there isnt. The search phase will continue until a handler is found or until the top of the stack is reached. The unwinder will not actually change anything while walking. If the top of the stack is reached the unwinder will simply return, and the calling code will take the appropriate action, which for C++ is to call std::terminate. Because of the two phase unwinding approach, if std::terminate dumps core, a backtrace will show the code which threw the exception.

If a handler is found, the second phase begins. The unwinder walks up the stack passing the action _UA_CLEANUP_PHASE (2) to each personality function. The unwinder will also set _UA_FORCE_UNWIND (8) in the actions bitmask if the personality function may not catch the exception, because the unwinding is happening due to some event like thread cancellation. The unwinder will walk up the stack until it finds the handler—the stack frame for which the personality function returned _URC_HANDLER_FOUND. When it calls that function, the unwinder will pass _UA_HANDLER_FRAME (4) in the actions bitmask. This time, the unwinder will changes things as it goes, removing stack frames.

In order to run destructors, the personality function will call _Unwind_SetIP on the context parameter to set the program counter to point to the cleanup routine, and then return _URC_INSTALL_CONTEXT (7) to tell the unwinder to branch to the current context. The address which starts the cleanup is known as a landing pad. The cleanup should do whatever it needs to do, and then call _Unwind_Resume. The exception information needs to be passed to _Unwind_Resume. The personality routine arranges to pass the exception information to the cleanup by calling _Unwind_SetGR passing __builtin_eh_return_data_regno(0) and the exception information passed to the personality routine. Each target which supports this approach has to dedicate two registers to holding exception information. This is the first one.

The personality function which finds the handler works pretty much the same way. It may also use _Unwind_SetGR to set a value in __builtin_eh_return_data_regno(1) to indicate which exception was found. The exception handler may rethrow the exception via _Unwind_RaiseException or it may simply continue a normal execution path.

At this point weve seen everything except how the personality function decides whether it needs to run a cleanup or catch an exception. The personality function makes this decision based on the LSDA. As mentioned above, while the LSDA could be language dependent, in practice it is not. There is a different personality function for each language, but they all do more or less the same thing, omitting aspects which are not relevant for the language (e.g., there is a personality function for C, but it only runs cleanups and does not bother to look for exception handlers).

The LSDA is found in the section .gcc_except_table (the personality function is just a function and lives in the .text section as usual). The personality function gets a pointer to it by calling _Unwind_GetLanguageSpecificData. The LSDA starts with the following fields:

  1. A 1 byte encoding of the following field (a DW_EH_PE_xxx value).
  2. If the encoding is not DW_EH_PE_omit, the landing pad base. This is the base from which landing pad offsets are computed. If this is omitted, the base comes from calling _Unwind_GetRegionStart, which returns the beginning of the code described by the current FDE. In practice this field is normally omitted.
  3. A 1 byte encoding of the entries in the type table (a DW_EH_PE_xxx value).
  4. If the encoding is not DW_EH_PE_omit, the types table pointer. This is an unsigned LEB128 value, and is the byte offset from this field to the start of the types table used for exception matching.
  5. A 1 byte encoding of the fields in the call-site table (a DW_EH_PE_xxx value).
  6. An unsigned LEB128 value holding the length in bytes of the call-site table.

This header is immediately followed by the call-site table. Each entry in the call-site table has four fields. The number of bytes in the header gives the total length. Each entry in the call-site table describes a particular sequence of instructions within the function that the FDE desribes.

  1. The start of the instructions for the current call site, a byte offset from the landing pad base. This is encoded using the encoding from the header.
  2. The length of the instructions for the current call site, in bytes. This is encoded using the encoding from the header.
  3. A pointer to the landing pad for this sequence of instructions, or 0 if there isnt one. This is a byte offset from the landing pad base. This is encoded using the encoding from the header.
  4. The action to take, an unsigned LEB128. This is 1 plus a byte offset into the action table. The value zero means that there is no action.

The call-site table is sorted by the start address field. If the personality function finds that there is no entry for the current PC in the call-site table, then there is no exception information. This should not happen in normal operation, and in C++ will lead to a call to std::terminate. If there is an entry in the call-site table, but the landing pad is zero, then there is nothing to do: there are no destructors to run or exceptions to catch. This is a normal case, and the unwinder will simply continue. If the action record is zero, then there are destructors to run but no exceptions to catch. The personality function will arrange to run the destructors as described above, and unwinding will continue.

Otherwise, we have an offset into the action table. Each entry in the action table is a pair of signed LEB128 values. The first number is a type filter. The second number is a byte offset to the next entry in the action table. A byte offset of 0 ends the current set of actions.

A type filter of zero indicates a cleanup, which is the same as an action record of zero in the call-site table. This means that there is a cleanup to be called even if none of the types match.

A positive type filter is an index into the types table. This is a negative index: the value 1 means the entry preceding the types table base, 2 means the entry before that, etc. The size of entries in the types table comes from the encoding in the header, as does the base of the types table. Each entry in the types table is a pointer to a type information structure. If this type information structure matches the type of the exception, then we have found a handler for this exception. The type filter value is a switch value will be passed to the handler in exception register 1. The actual comparison of the type information, and determining the type information from the exception pointer, really is language dependent. In C++ this is a pointer to a std::type_info structure. A NULL pointer in the types table is a catch-all handler.

A negative type filter is a byte offset into the types table of a NULL terminated list of pointers to type information structures. If the type of the current exception does not match any of the entries in the list, then there is an exception specification error. This is treated as an exception handler with a negative switch value.

I think that covers everything about how gcc unwinds the stack and throws exceptions.