airs-notes/maskray-3.md

1051 lines
42 KiB
Markdown
Raw Normal View History

2021-02-03 00:08:25 +00:00
# C++ exception handling ABI
I wrote [an article](maskray-1.md) a few weeks ago to introduce stack unwinding
in detail. Today I will introduce C++ exception handling, an application of
stack unwinding. Exception handling has a variety of ABI (interoperability of
C++ implementations), the most widely used of which is [Itanium C++ ABI:
Exception Handling](https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html)
## Itanium C++ ABI: Exception Handling
Simplified exception handling process (from throw to catch):
* Call `__cxa_allocate_exception` to allocate space to store the exception
object and the exception header `__cxa_exception`
* Jump to `__cxa_throw`, set the `__cxa_exception` fields and then jump to
`_Unwind_RaiseException`
* In `_Unwind_RaiseException`, execute the search phase, call personality
routines to find matching try catch (type matching)
* In `_Unwind_RaiseException`, execute the cleanup phase: call personality
routines to find stack frames containing out-of-scope variables, and for each
stack frame, jump to its landing pad to execute the constructors. The landing
pad uses `_Unwind_Resume` to jump back to the cleanup phase
* The cleanup phase executed by `_Unwind_RaiseException` jumps to the landing
pad corresponding to the matching try catch
* The landing pad calls `__cxa_begin_catch`, executes the catch code, and then
calls `__cxa_end_catch`
* `__cxa_end_catch` destroys the exception object
Note: each stack frame may use a different personality routine. It is common
that many may share the same routine, though.
Among these steps, `_Unwind_RaiseException` is responsible for stack unwinding
and is language independent. The language-related concepts (catch block,
out-of-scope variable) in stack unwinding are interpreted/encapsulated by the
personality. This is a key idea that makes the ABI applicable to other
languages and allows other languages to be mixed with C++.
Therefore, Itanium C++ ABI: Exception Handling is divided into Level 1 Base ABI
and Level 2 C++ ABI. Base ABI describes the language-independent stack
unwinding part and defines the `_Unwind_*` API. Common implementations are:
* libgcc: `libgcc_s.so.1` and `libgcc_eh.a`
* Multiple libraries named libunwind (`libunwind.so` or `libunwind.a`). If you
use Clang, you can use `--rtlib=compiler-rt --unwindlib=libunwind` to choose
to link to libunwind, you can use llvm-project/libunwind or nongnu.org/libunwind
The C++ ABI is related to the C++ language and defines the `__cxa_*` API
(`__cxa_allocate_exception`, `__cxa_throw`, `__cxa_begin_catch`, etc.). Common
implementations are:
* libsupc++, part of libstdc++
* libc++abi in llvm-project
The C++ standard library implementation in llvm-project, libc++, can leverage
libc++abi, libcxxrt or libsupc++, but libc++abi is recommended.
## Level 1 Base ABI
### Data structures
The main data structure is:
```c
// Level 1
struct _Unwind_Exception {
_Unwind_Exception_Class exception_class; // an identifier, used to tell whether the exception is native
_Unwind_Exception_Cleanup_Fn exception_cleanup;
_Unwind_Word private_1; // zero: normal unwind; non-zero: forced unwind, the _Unwind_Stop_Fn function
_Unwind_Word private_2; // saved stack pointer
} __attribute__((aligned));
```
`exception_class` and `exception_cleanup` are set by the API that throws
exceptions in Level 2. The Level 1 API does not process `exception_class`, but
passes it to the personality routine. Personality routines use this value to
distinguish native and foreign exceptions.
> libc++abi `__cxa_throw` will set `exception_class` to `uint64_t` representing
> `"CLNGC++\0"`. libsupc++ uses `uint64_t` which means `"GNUCC++\0"`. The ABI
> requires that the lower bits contain `"C++\0"`. The exceptions thrown by
> libstdc++ will be treated as foreign exceptions by libc++abi. Only `catch
> (...)` can catch foreign exceptions.
>
> Exception propagation implementation mechanism will use another
> exception_class identifier to represent dependent exceptions.
`exception_cleanup` stores the destroying delete function of this exception
object, which is used by `__cxa_end_catch` to destroy a foreign exception.
The private unwinder state (`private_1` and `private_2`) in an exception object
should be neither read by nor written to by personality routines or other parts
of the language-specific runtime.
The information required for the Unwind operation (for a given IP/SP, how to
obtain the register information such as the IP/SP of the upper stack frame) is
implementation-dependent, and Level 1 ABI does not define it. In the ELF
system, `.eh_frame` and `.eh_frame_hdr` (`PT_EH_FRAME` program header) store
unwind information. See [Stack unwinding](maskray-1.md).
### Level 1 API
`_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *obj);` Perform
stack unwinding for exceptions. It is noreturn under normal circumstances, and
will give control to matched catch handlers (catch block) or non-catch handlers
(code blocks that need to execute destructors) like longjmp. It is a two-phase
process, divided into phase 1 (search phase) and phase 2 (cleanup phase).
* In the search phase, find matched catch handler and record the stack pointer
in `private_2`
* Trace the call chain based on IP/SP and other saved registers
* For each stack frame, skip if there is no personality routine; call if
there is (actions set to `_UA_SEARCH_PHASE`)
* If personality returns `_URC_CONTINUE_UNWIND`, continue searching
* If personality returns `_URC_HANDLER_FOUND`, it means that a matched catch
handler or unmatched exception specification is found, and the search stops
* In the cleanup phase, jump to non-catch handlers (usually local variable
destructors), and then transfer the control to the matched catch handler
located in the search phase
* Trace the call chain based on IP/SP and other saved registers
* For each stack frame, skip if there is no personality routine; call if
there is one (actions are set to `_UA_CLEANUP_PHASE`, and the stack frame
marked by search phase will also set `_UA_HANDLER_FRAME`)
* If personality returns `_URC_CONTINUE_UNWIND`, it means there is no landing
pad, continue to unwind
* If personality returns `_URC_INSTALL_CONTEXT`, it means there is a landing
pad, jump to the landing pad
* For intermediate stack frames that are not marked in the search phase, the
landing pad performs cleanup work (usually destructors of out-of-scope
variables), and calls `_Unwind_Resume` to jump back to the cleanup phase
* For the stack frame marked by the search phase, the landing pad calls
`__cxa_begin_catch`, then executes the code in the catch block, and finally
calls `__cxa_end_catch` to destroy the exception object
The point of the two-phase process is to avoid any actual stack unwinding if
there is no handler. If there are just cleanup frames, an abort function can be
called. Cleanup frames are also less expensive than matching a handler.
However, parsing `.gcc_except_table` is probably not much less expensive than
additionally matching a handler:)
```cpp
static _Unwind_Reason_Code unwind_phase1(unw_context_t *uc, _Unwind_Context *ctx,
_Unwind_Exception *obj) {
// Search phase: unwind and call personality with _UA_SEARCH_PHASE for each frame
// until a handler (catch block) is found.
unw_init_local(uc, ctx);
for(;;) {
if (ctx->fdeMissing) return _URC_END_OF_STACK;
if (!step(ctx)) return _URC_FATAL_PHASE1_ERROR;
ctx->getFdeAndCieFromIP();
if (!ctx->personality) continue;
switch (ctx->personality(1, _UA_SEARCH_PHASE, obj->exception_class, obj, ctx)) {
case _URC_CONTINUE_UNWIND: break;
case _URC_HANDLER_FOUND:
unw_get_reg(ctx, UNW_REG_SP, &obj->private_2);
return _URC_NO_REASON;
default: return _URC_FATAL_PHASE1_ERROR; // e.g. stack corruption
}
}
return _URC_NO_REASON;
}
static _Unwind_Reason_Code unwind_phase2(unw_context_t *uc, _Unwind_Context *ctx,
_Unwind_Exception *obj) {
// Cleanup phase: unwind and call personality with _UA_CLEANUP_PHASE for each frame
// until reaching the handler. Restore the register state and transfer control.
unw_init_local(uc, ctx);
for(;;) {
if (ctx->fdeMissing) return _URC_END_OF_STACK;
if (!step(ctx)) return _URC_FATAL_PHASE2_ERROR;
ctx->getFdeAndCieFromIP();
if (!ctx->personality) continue;
_Unwind_Action actions = _UA_CLEANUP_PHASE;
size_t sp;
unw_get_reg(ctx, UNW_REG_SP, &sp);
if (sp == obj->private_2) actions |= _UA_HANDLER_FRAME;
switch (ctx->personality(1, actions, obj->exception_class, obj, ctx)) {
case _URC_CONTINUE_UNWIND:
break;
case _URC_INSTALL_CONTEXT:
unw_resume(ctx); // Return if there is an error
return _URC_FATAL_PHASE2_ERROR;
default: return _URC_FATAL_PHASE2_ERROR; // Unknown result code
}
}
return _URC_FATAL_PHASE2_ERROR;
}
_Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *obj) {
unw_context_t uc;
_Unwind_Context ctx;
__unw_getcontext(&uc);
_Unwind_Reason_Code phase1 = unwind_phase1(&uc, &ctx, obj);
if (phase1 != _URC_NO_REASON) return phase1;
unwind_phase2(&uc, &ctx, obj);
}
```
> C++ does not support resumptive exception handling (correcting the
> exceptional condition and resuming execution at the point where it was
> raised), so the two-phase process is not necessary, but two-phase allows C++
> and other languages to coexist on the call stack.
`_Unwind_Reason_Code _Unwind_ForcedUnwind(_Unwind_Exception *obj, _Unwind_Stop_Fn stop, void *stop_parameter);`
Execute forced unwinding: Skip the search phase and perform a slightly
different cleanup phase. `private_2` is used as the parameter of the stop
function. It is similar to a foreign exception but is rarely used.
`void _Unwind_Resume(_Unwind_Exception *obj);` Continue the unwind process of
phase 2. It is similar to longjmp, is noreturn, and is the only Level 1 API
that is directly called by the compiler. The compiler usually calls this
function at the end of non-catch handlers.
`void _Unwind_DeleteException(_Unwind_Exception *obj);` Destroy the specified
exception object. It is the only Level 1 API that handles `exception_cleanup` and
is called by `__cxa_end_catch`.
Many implementations provide extensions. Notably
`_Unwind_Reason_Code _Unwind_Backtrace(_Unwind_Trace_Fn callback, void *ref);`
is another special unwind process: it ignores personality and notifies an
external callback of stack frame information.
## Level 2 C++ ABI
This part deals with language-related concepts such as throw, catch blocks, and
out-of-scope variable destructors in C++.
### Data structures
Each thread has a global exception stack, `caughtExceptions` stores the top
(latest) exception on the stack, and `__cxa_exception::nextException` points to
the next exception in the stack.
```c
struct __cxa_eh_globals {
__cxa_exception *caughtExceptions;
unsigned uncaughtExceptions;
};
```
The definition of `__cxa_exception` is as follows, and the end of it stores the
`_Unwind_Exception` defined by Base ABI. `__cxa_exception` adds C++ semantic
information on the basis of `_Unwind_Exception`.
```cpp
// Level 2
struct __cxa_exception {
void *reserve; // here on 64-bit platforms
size_t referenceCount; // here on 64-bit platforms
std::type_info *exceptionType;
void (*exceptionDestructor)(void *);
unexpected_handler unexpectedHandler; // by default std::get_unexpected()
terminate_handler terminateHandler; // by default std::get_terminate()
__cxa_exception *nextException; // linked to the next exception on the thread stack
int handlerCount; // incremented in __cxa_begin_catch, decremented in __cxa_end_catch, negated in __cxa_rethrow; last non-dependent performs the clean
// The following fields cache information the catch handler found in phase 1.
int handlerSwitchValue; // ttypeIndex in libc++abi
const char *actionRecord;
const char *languageSpecificData;
void *catchTemp; // landingPad
void *adjustedPtr; // adjusted pointer of the exception object
_Unwind_Exception unwindHeader;
};
```
The information needed to process the exception (for a given IP, whether it is
in a try catch, whether there are out-of-scope variable destructors that need
to be executed, whether there is a dynamic exception specification) is called
language-specific data area (LSDA), which is the implementation Related, Level
2 ABI is not defined.
### Landing pad
A landing pad is a section of code related to exceptions in the text section
which performs one of the three tasks:
* In the cleanup clause, call destructors of out-of-scope variables or
callbacks registered by `__attribute__((cleanup(...)))`, and then use
`_Unwind_Resume` to jump back to cleanup phase
* A catch clause which captures the exception: call the constructors of
out-of-scope variables, then call `__cxa_begin_catch`, execute the catch
code, and finally call `__cxa_end_catch`
* rethrow: call destructors of out-of-scope variables in the catch clause,
then call `__cxa_end_catch`, and then use `_Unwind_Resume` to jump back to
cleanup phase
If a try block has multiple catch clauses, there will be multiple action table
entries in series in the language-specific data area, but the landing pad
includes all (conceptually merged) catch clauses. Before the personality
transfers control to the landing pad, it will call `_Unwind_SetGP` to set
`__buitin_eh_return_data_regno(1)` to store switchValue and inform the landing
pad which type matches.
A rethrow is triggered by `__cxa_rethrow` in the middle of the execution of the
catch code. It needs to destruct the local variables defined by the catch
clause and call `__cxa_end_catch` to offset the `__cxa_begin_catch` called at
the beginning of the catch clause.
### `.gcc_except_table`
The language-specific data area on the ELF platforms is usually stored in the
`.gcc_except_table` section. This section is parsed by `__gxx_personality_v0`
and `__gcc_personality_v0`. Its structure is very simple:
* header (@LPStart, @TType and call sites coding, the starting offset of action
records)
* call site table: Describe the landing pad offset (0 if not exists) and action
record offset (biased by 1, 0 for no action) that should be executed for each
call site (an address range)
* action table
* type table (referennced by postive switch values)
* dynamic exception specification (deprecated in C++, so rarely used)
(referenced by negative switch values)
Here is an example:
```
.section .gcc_except_table,"a",@progbits
.p2align 2
GCC_except_table0:
.Lexception0:
.byte 255 # @LPStart Encoding = omit
.byte 3 # @TType Encoding = udata4
.uleb128 .Lttbase0-.Lttbaseref0 # The start of action records
.Lttbaseref0:
.byte 1 # Call site Encoding = uleb128
.uleb128 .Lcst_end0-.Lcst_begin0
.Lcst_begin0: # 2 call site code ranges
.uleb128 .Ltmp0-.Lfunc_begin0 # >> Call Site 1 <<
.uleb128 .Ltmp1-.Ltmp0 # Call between .Ltmp0 and .Ltmp1
.uleb128 .Ltmp2-.Lfunc_begin0 # jumps to .Ltmp2
.byte 1 # On action: 1
.uleb128 .Ltmp1-.Lfunc_begin0 # >> Call Site 2 <<
.uleb128 .Lfunc_end0-.Ltmp1 # Call between .Ltmp1 and .Lfunc_end0
.byte 0 # has no landing pad
.byte 0 # On action: cleanup
.Lcst_end0:
.byte 1 # >> Action Record 1 <<
# Catch TypeInfo 1
.byte 0 # No further actions
.p2align 2
# >> Catch TypeInfos <<
.long _ZTIi # TypeInfo 1
```
Each call site record has two values besides call site offset and length:
landing pad offset and action record offset.
* The landing pad offset is 0. The action record offset should also be 0. No
landing pad
* The landing pad offset is not 0. With landing pad
* The action record offset is 0, also called cleanup (the description of
"cleanup" is somewhat ambiguous, because Level 1 has the term clean phase),
usually describing local variable destructors and
`__attribute__((cleanup(...)))`
* The action record offset is not 0. The action record offset points to an
action record in the action table. catch or noexcept specifier or exception
specification
Each action record has two values:
* switch value (SLEB128): a positive index indicates the the TypeInfo of the
catch type in the type table; a negative number indicates the offset of the
exception specification; 0 indicates a cleanup action which is similar to an
action record offset of 0 in the call site record
* offset to the next action record: 0 indicates there is no next action record.
This singly linked list form can describe multiple catches or an exception
specification list
> The offset to next action record can be used not only as a singly linked
> list, but also as a trie, but it is rare such compression can find its usage
> in the wild.
The values of landing pad offset/action record offset corresponding to
different areas in the program:
* A non-try block without local variable destructor:
`landing_pad_offset==0 && action_record_offset==0`
* A non-try block with local variable destructors:
`landing_pad_offset!=0 && action_record_offset==0`. phase 2 should stop and
call cleanup
* A non-try block with `__attribute__((cleanup(...)))`:
`landing_pad_offset!=0 && action_record_offset==0`. Same as above
* A try block: `landing_pad_offset!=0 && action_record_offset!=0`. The landing
pad points to the code block obtained by catch splicing. Action record
describes a catch for a switch value greater than 0
* A try block with `catch (...)`: Same as above. The action record is a switch
value greater than 0 points to an item with a value of 0 in the type table
(indicating catch any)
* In a function with noexcept specifier, it is possible to propagate the
exception to the caller area: `landing_pad_offset!=0 && action_record_offset!=0`.
The landing pad points to the code block that calls `std::terminate`. The
action record is a switch value greater than 0 points to an item with a value
of 0 in the type table (indicating catch any)
* In a function with an exception specifier, it may propagate the exception to
the caller area: `landing_pad_offset!=0 && action_record_offset!=0`. The
landing pad points to the code block that calls `__cxa_call_unexpected`.
Action record is a switch value less than 0 describing an exception specifier
list
### Level 2 API
`void *__cxa_allocate_exception(size_t thrown_size);`. The compiler generates a
call to this function for `throw A();` and allocates a section of memory to
store `__cxa_exception` and `A` object. `__cxa_exception` is immediately to the
left of `A` object. The following function illustrates the relationship between
the address of the exception object operated by the program and
`__cxa_exception`:
```cpp
static void *thrown_object_from_cxa_exception(__cxa_exception *exception_header) {
return static_cast<void *>(exception_header + 1);
}
```
`void __cxa_throw(void *thrown, std::type_info *tinfo, void (*destructor)(void
*));` Call the above function to find the `__cxa_exception` header, and fill in
each field (`referenceCount`, `exception_class`, `unexpectedHandler`,
`terminateHandler`, `exceptionType`, `exceptionDestructor`,
`unwindHeader.exception_cleanup`) and then call `_Unwind_RaiseException`. This
function is noreturn.
`void *__cxa_begin_catch(void *obj);` The compiler generates a call to this
function at the beginning of the catch block. For a native exception,
* Add handlerCount
* Push the global exception stack of the thread to decrease uncaught_exception
* Return the adjusted pointer of the exception object
For a foreign exception (there is not necessarily a `__cxa_exception` header),
* Push if the global exception stack of the thread is empty, otherwise execute
`std::terminate` (I dont know if there is a field similar to
`__cxa_exception::nextException`)
* Return `static_cast<_Unwind_Exception *>(obj) + 1` (assuming
`_Unwind_Exception` is next to the thrown object)
Simplified implementation:
```cpp
void __cxa_throw(void *thrown, std::type_info *tinfo, void (*destructor)(void *)) {
__cxa_exception *hdr = (__cxa_exception *)thrown - 1;
hdr->exceptionType = tinfo; hdr->destructor = destructor;
hdr->unexpectedHandler = std::get_unexpected();
hdr->terminateHandler = std::get_terminate();
hdr->unwindHeader.exception_class = ...;
__cxa_get_globals()->uncaughtExceptions++;
_Unwind_RaiseException(&hdr->unwindHeader);
// Failed to unwind, e.g. the .eh_frame FDE is absent.
__cxa_begin_catch(&hdr->unwindHeader);
std::terminate();
}
```
`void __cxa_end_catch();` is called at the end of the catch block or when
rethrow. For native exception:
* Get the current exception from the global exception stack of the thread,
reduce handlerCount
* When handlerCount reaches 0, pop the global exception stack of the thread
* If this is a native exception: call `__cxa_free_exception` when handlerCount
is decreased to 0 (if this is a dependent exception, decrease referenceCount
and call `__cxa_free_exception` when it reaches 0)
For a foreign exception,
* Call `_Unwind_DeleteException`
* Execute `__cxa_eh_globals::uncaughtExceptions = nullptr;` (due to the nature
of `__cxa_begin_catch`, there is exactly one exception in the stack)
`void __cxa_rethrow();` will mark the exception object, so that when
`handlerCount` is reduced to 0 by `__cxa_end_catch`, it will not be destroyed,
because this object will be reused by the cleanup phase restored by
`_Unwind_Resume`.
Note that, except for `__cxa_begin_catch` and `__cxa_end_catch`, most `__cxa_*`
functions cannot handle foreign exceptions (they do not have the
`__cxa_exception` header).
### Examples
For the following code:
```cpp
#include <stdio.h>
struct A { ~A(); };
struct B { ~B(); };
void foo() { throw 0xB612; }
void bar() { B b; foo(); }
void qux() { try { A a; bar(); } catch (int x) { puts(""); } }
```
The compiled assembly conceptually looks like this:
```cpp
void foo() {
__cxa_exception *thrown = __cxa_allocate_exception(4);
*thrown = 42;
__cxa_throw(thrown, &typeid(int), /*destructor=*/nullptr);
}
void bar() {
B b; foo(); return;
landing_pad: b.~B(); _Unwind_Resume();
}
void qux() {
A a; bar(); return;
landing_pad: __cxa_begin_catch(obj); puts(""); __cxa_end_catch(obj);
}
```
Control flow:
* `qux` calls `bar`, `bar` calls `foo`, `foo` throws an exception
* `foo` dynamically allocates a memory block, stores the thrown int and
`__cxa_exception` header, and then executes `__cxa_throw`
* `__cxa_throw` fills in other fields of `__cxa_exception` and calls
`_Unwind_RaiseException`
Next, `_Unwind_RaiseException` drives the two-phase process of Level 1.
* `_Unwind_RaiseException` executes phase 1: search phase
* For `bar`, call personality with `_UA_SEARCH_PHASE` as the actions
parameter and return `_URC_CONTINUE_UNWIND` (no catch handler)
* For `qux`, call personality with `_UA_SEARCH_PHASE` as the actions
parameter and return `_URC_HANDLER_FOUND` (with catch handler)
* The stack pointer of the stack frame that marked `qux` will be marked
(stored in `private_2`) and the search will stop
* `_Unwind_RaiseException` executes phase 2: cleanup phase
* Bar's stack frame is not marked by search phase, call personality with
`_UA_CLEANUP_PHASE` as actions parameter, return `_URC_INSTALL_CONTEXT`
* Jump to the landing pad of the bar's stack frame
* After cleaning the landing pad, use `_Unwind_Resume` to return to the
cleanup phase
* The stack frame of `qux` is marked by search phase, call personality with
`_UA_CLEANUP_PHASE|_UA_HANDLER_FRAME` as the actions parameter, and return
`_UA_INSTALL_CONTEXT`
* Jump to the landing pad of the `qux` stack frame
* The landing pad calls `__cxa_begin_catch`, executes the catch code, and
then calls `__cxa_end_catch`
## `__gxx_personality_v0`
A personality routine is called by Level 1 phase 1 and phase 2 to provide
language-related processing. Different languages, implementations or
architectures may use different personality routines. Common personalities are
as follows:
* `__gxx_personality_v0`: C++
* `__gxx_personality_sj0`: sjlj
* `__gcc_personality_v0`: C `-fexceptions` for `__attribute__((cleanup(...)))`
* `__CxxFrameHandler3`: Windows MSVC
* `__gxx_personality_seh0`: MinGW-w64 `-fseh-exceptions`
* `__objc_personality_v0`: ObjC in the macOS environment
The most common C++ implementation on ELF systems is `__gxx_personality_v0`. It
is implemented by:
* GCC: `libstdc++-v3/libsupc++/eh_personality.cc`
* libc++abi: `src/cxa_personality.cpp`
`_Unwind_Reason_Code (*__personality_routine)(int version, _Unwind_Action action, uint64 exceptionClass, _Unwind_Exception *exceptionObject, _Unwind_Context *context);`
In the absence of errors:
* For `_UA_SEARCH_PHASE`, returns
* `_URC_CONTINUE_UNWIND`: no lsda, or there is no landing pad, there is a
non-catch handler or a matched exception specification
* `_URC_HANDLER_FOUND`: there is a matched catch handler or an unmatched
exception specification
* For `_UA_CLEANUP_PHASE`, returns
* `_URC_CONTINUE_UNWIND`: no lsda, or there is no landing pad, or (not
produced by a compiler) there is no cleanup action
* `_URC_INSTALL_CONTEXT`: the other cases
Before transferring control to the landing pad, the personality will call
`_Unwind_SetGP` to set two registers (architecture related,
`__buitin_eh_return_data_regno(0)` and `__buitin_eh_return_data_regno(1)`) to
store `_Unwind_Exception *` and `switchValue`.
Code:
```cpp
_unwind_Reason_Code __gxx_personality_v0(int version, _Unwind_Action actions, uint64_t exceptionClass, _Unwind_Exception *exc, _Unwind_Context *ctx) {
if (actions == (_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME) && is_native) {
auto *hdr = (__cxa_exception *)(exc+1) - 1;
// Load cached results from phase 1.
results.switchValue = hdr->handlerSwitchValue;
results.actionRecord = hdr->actionRecord;
results.languageSpecificData = hdr->languageSpecificData;
results.landingPad = reinterpret_cast<uintptr_t>(hdr->catchTemp);
results.adjustedPtr = hdr->adjustedPtr;
_Unwind_SetGR(...);
_Unwind_SetGR(...);
_Unwind_SetIP(ctx, res.landingPad);
return _URC_INSTALL_CONTEXT;
}
scan_eh_tab(results, actions, native_exception, unwind_exception, context);
if (results.reason == _URC_CONTINUE_UNWIND ||
results.reason == _URC_FATAL_PHASE1_ERROR)
return results.reason;
if (actions & _UA_SEARCH_PHASE) {
auto *hdr = (__cxa_exception *)(exc+1) - 1;
// Cache LSDA results in hdr.
hdr->handlerSwitchValue = results.switchValue;
hdr->actionRecord = results.actionRecord;
hdr->languageSpecificData = results.languageSpecificData;
hdr->catchTemp = reinterpret_cast<void *>(results.landingPad);
hdr->adjustedPtr = results.adjustedPtr;
return _URC_HANDLER_FOUND;
}
// _UA_CLEANUP_PHASE
_Unwind_SetGR(...);
_Unwind_SetGR(...);
_Unwind_SetIP(ctx, res.landingPad);
return _URC_INSTALL_CONTEXT;
}
```
For a native exception, when the personality returns `_URC_HANDLER_FOUND` in
the search phase, the LSDA related information of the stack frame will be
cached. When the personality is called again in the cleanup phase with the
argument `actions == (_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME)`, the personality
loads the cache and there is no need to parse `.gcc_except_table`.
In the remaining three cases, the personality has to parse `.gcc_except_table`:
* `actions & _UA_SEARCH_PHASE`
* `actions & _UA_CLEANUP_PHASE && actions & _UA_HANDLER_FRAME && !is_native`:
`catch (...)` can catch a foreign exception. An exception specification
terminates upon a foreign exception.
* `actions & _UA_CLEANUP_PHASE && !(actions & _UA_HANDLER_FRAME)`: non-catch
handlers and unmatched catch handlers, matched exception specification.
Another case is `_Unwind_ForcedUnwind`.
```cpp
static void scan_eh_tab(...) {
...
const uint8_t *lsda = (const uint8_t *)_Unwind_GetLanguageSpecificData(context);
if (lsda == nullptr) { res.reason = _URC_CONTINUE_UNWIND; return; }
res.languageSpecificData = lsda;
uintptr_t ipOffset = _Unwind_GetIP(context) - 1 - _Unwind_GetRegionStart(context);
for each call site entry {
if (!(start <= ipOffset && ipOffset < start + length))
continue;
res.landingPad = landingPad;
if (landingPad == 0) { res.reason = _URC_CONTINUE_UNWIND; return; }
if (actionRecord == 0) { // cleanup
res.reason = actions & _UA_SEARCH_PHASE ? _URC_CONTINUE_UNWIND : _URC_HANDLER_FOUND;
return;
}
// A catch or a dynamic exception specification.
const uint8_t *action = actionTableStart + (actionRecord - 1);
bool hasCleanup = false;
for(;;) {
res.actionRecord = action;
int64_t switchValue = readSLEB128(&action);
if (switchValue > 0) { // catch
auto *catchType = ...;
if (catchType == nullptr) { // catch (...)
res.switchValue = switchValue; res.actionRecord = action;
res.adjustedPtr = getThrownObjectPtr(exc); res.reason = _URC_HANDLER_FOUND;
return;
} else if (is_native) { // catch (T ...)
auto *hdr = (__cxa_exception *)(exc+1) - 1;
if (catchType->can_catch(hdr->exceptionType, adjustedPtr)) {
res.switchValue = switchValue; res.actionRecord = action;
res.adjustedPtr = adjustedPtr; res.reason = _URC_HANDLER_FOUND;
return;
}
}
} else if (switchValue < 0) { // dynamic exception specification
if (actions & _UA_FORCE_UNWIND) {
// Skip if forced unwinding.
} else if (is_native) {
if (!exception_spec_can_catch) {
// The landing pad will call __cxa_call_unexpected.
assert(actions & _UA_SEARCH_PHASE);
res.switchValue = switchValue; res.actionRecord = action;
res.adjustedPtr = adjustedPtr; res.reason = _URC_HANDLER_FOUND;
return;
}
} else {
// A foreign exception cannot be matched by the exception specification. The landing pad will call __cxa_call_unexpected.
res.switchValue = switchValue; res.actionRecord = action;
res.adjustedPtr = getThrownObjectPtr(exc); res.reason = _URC_HANDLER_FOUND;
return;
}
} else { // switchValue == 0: cleanup
hasCleanup = true;
}
const uint8_t *temp = action;
int64_t actionOffset = readSLEB128(&temp);
if (actionOffset == 0) { // End of action list
res.reason = hasCleanup && actions & _UA_CLEANUP_PHASE
? _URC_HANDLER_FOUND : _URC_CONTINUE_UNWIND;
return;
}
action += actionOffset;
}
}
call_terminate();
}
```
### `__gcc_personality_v0'
libgcc and compiler-rt/lib/builtins implement this function to handle
`__attribute__((cleanup(...)))`. The implementation does not return
`_URC_HANDLER_FOUND` in the search phase, so the cleanup handler cannot serve
as a catch handler. However, we can supply our own implementation to return
`_URC_HANDLER_FOUND` in the search phase... On x86-64,
`__buitin_eh_return_data_regno(0)` is RAX. We can let the cleanup handler pass
RAX to the landing pad.
```cpp
// a.cc
#include <exception>
#include <stdio.h>
extern "C" void my_catch();
extern "C" void throw_exception() { throw 42; }
int main() {
fprintf(stderr, "uncaught exceptions: %d\n", std::uncaught_exceptions());
my_catch();
fprintf(stderr, "uncaught exceptions: %d\n", std::uncaught_exceptions());
}
```
```c
// b.c
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unwind.h>
void throw_exception();
struct __cxa_eh_globals {
struct __cxa_exception *caughtExceptions;
unsigned uncaughtExceptions;
};
struct __cxa_eh_globals *__cxa_get_globals();
static uintptr_t readULEB128(const uint8_t **a) {
uintptr_t res = 0, shift = 0;
const uint8_t *p = *a;
uint8_t b;
do {
b = *p++;
res |= (b & 0x7f) << shift;
shift += 7;
} while (b & 0x80);
*a = p;
return res;
}
_Unwind_Reason_Code __gcc_personality_v0(int version, _Unwind_Action actions,
uint64_t exception_class,
struct _Unwind_Exception *obj,
struct _Unwind_Context *ctx) {
const uint8_t *lsda = _Unwind_GetLanguageSpecificData(ctx);
if (lsda == 0)
return _URC_CONTINUE_UNWIND;
uintptr_t func = _Unwind_GetRegionStart(ctx);
uintptr_t pc = _Unwind_GetIP(ctx) - 1 - func;
if (*lsda++ != 255) // Skip LPStart
readULEB128(&lsda);
if (*lsda++ != 255) // Skip TType
readULEB128(&lsda);
uintptr_t call_site_table_len = 0;
if (*lsda++ == 1)
call_site_table_len = readULEB128(&lsda);
const uint8_t *end = lsda + call_site_table_len;
while (lsda < end) {
uintptr_t start = readULEB128(&lsda), len = readULEB128(&lsda),
lpad = readULEB128(&lsda);
if (!(start <= pc && pc < start + len))
continue;
if (lpad == 0)
return _URC_CONTINUE_UNWIND;
if (actions & _UA_SEARCH_PHASE)
return _URC_HANDLER_FOUND;
_Unwind_SetGR(ctx, __builtin_eh_return_data_regno(0), (uintptr_t)obj);
_Unwind_SetGR(ctx, __builtin_eh_return_data_regno(1), 0); // switchValue==0
_Unwind_SetIP(ctx, func + lpad);
return _URC_INSTALL_CONTEXT;
}
return _URC_FATAL_PHASE2_ERROR;
}
struct Catch {
struct _Unwind_Exception *obj;
jmp_buf env;
bool do_catch;
};
__attribute__((used))
static void my_jump(struct Catch *c) {
if (c->do_catch) {
struct __cxa_eh_globals *globals = __cxa_get_globals();
globals->uncaughtExceptions--;
longjmp(c->env, 1);
}
}
__attribute__((naked)) static void my_cleanup(struct Catch *c) {
asm("movq %rax, (%rdi); jmp my_jump");
}
void my_catch() {
__attribute__((cleanup(my_cleanup))) struct Catch c;
if (setjmp(c.env) == 0) {
c.do_catch = 1;
throw_exception();
} else {
fprintf(stderr, "caught exception: %p\n", c.obj);
fprintf(stderr, "value: %d\n", *(int *)(c.obj + 1));
c.do_catch = 0;
}
}
```
```
% clang -c -fexceptions a.cc b.c
% clang++ a.o b.o
% ./a.out
uncaught exceptions: 0
caught exception: 0x10f7f10
value: 42
uncaught exceptions: 0
```
### Rethrow
The landing pad section briefly described the code executed by rethrow. Usually
caught exception will be destroyed in `__cxa_end_catch`, so `__cxa_rethrow`
will mark the exception object and increase `handlerCount`.
C++11 introduced Exception Propagation (N2179; `std::rethrow_exception` etc),
and libstdc++ uses `__cxa_dependent_exception` to achieve. For design see
https://gcc.gnu.org/legacy-ml/libstdc++/2008-05/msg00079.html
```c
struct __cxa_dependent_exception {
void *reserve;
void *primaryException;
};
```
`std::current_exception` and `std::rethrow_exception` will increase the
reference count.
In libstdc++, `__cxa_rethrow` calls GCC extension `_Unwind_Resume_or_Rethrow`
which can resume forced unwinding.
### LLVM IR
In construction.
* nounwind: cannot unwind
* unwtables: force generation of the unwind table regardless of nounwind
```
if uwtables
if nounwind
CantUnwind
else
Unwind Table
else
do nothing
```
## Compiler behavior
* `-fno-exceptions -fno-asynchronous-unwind-tables`: neither `.eh_frame` nor
`.gcc_except_table` exists
* `-fno-exceptions -fasynchronous-unwind-tables`: `.eh_frame` exists,
`.gcc_except_table` doesn't
* `-fexceptions`: both `.eh_frame` and `.gcc_except_table` exist
When an exception propagates from a function to its caller:
* no `.eh_frame`: `_Unwind_RaiseException` returns `_URC_END_OF_STACK`.
`__cxa_throw` calls `std::terminate`
* `.eh_frame` without `.gcc_except_table`: pass-through (local variable
destructors are not called). This is the case of
`-fno-exceptions -fasynchronous-unwind-tables`.
* `.eh_frame` with empty `.gcc_except_table`: `__gxx_personality_v0` calls
`std::terminate` since no call site code range matches
* `.eh_frame` with proper `.gcc_except_table`: unwind
Combined with the above description, when an exception will propagate to a
caller of a noexcept function:
* `-fno-exceptions -fno-asynchronous-unwind-tables`: `std::terminate`
* `-fno-exceptions -fasynchronous-unwind-tables`: pass-through. Local variable
destructors are not called.
* `-fexceptions`
* Clang: `.gcc_except_table` which arranges for calling `__clang_call_terminate`
* GCC: produce header-only `.gcc_except_table` (4 bytes)
## Misc
### Use libc++ and libc++abi
On Linux, compared with clang, clang++ additionally links against
libstdc++/libc++ and libm.
Dynamically link against libc++.so (which depends on libc++abi.so)
(additionally specify `-pthread` if threads are used):
```
clang++ -stdlib=libc++ -nostdlib++ a.cc -lc++ -lc++abi
# clang -stdlib=libc++ a.cc -lc++ -lc++abi does not pass -lm to the linker.
```
If compile actions and link actions are separate (`-stdlib=libc++` passes
`-lc++` but its position is undesired, so just don't use it):
```
clang++ -nostdlib++ a.cc -lc++ -lc++abi
```
Statically link in libc++.a (which includes the members of libc++abi.a). This
requires a `-DLIBCXX_ENABLE_STATIC_ABI_LIBRARY=on` build:
```
clang++ -stdlib=libc++ -static-libstdc++ -nostdlib++ a.cc -pthread
```
Statically link in libc++.a and libc++abi.a. This is a bit inferior because
there is a duplicate -lc++ passed by the driver.
```
clang++ -stdlib=libc++ -static-libstdc++ -nostdlib++ a.cc -Wl,--push-state,-Bstatic -lc++ -lc++abi -Wl,--pop-state -pthread
```
### libc++abi and libsupc++
It is worth noting that the `<exception> <stdexcept>` type layout provided by
libc++abi (such as `logic_error`, `runtime_error`, etc.) are specifically
compatible with libsupc++. After GCC 5 libstdc++ abandoned ref-counted
`std::string`, libsupc++ still uses `__cow_string` for `logic_error` and other
exception classes. libc++abi uses a similar ref-counted string.
libsupc++ and libc++abi do not use inline namespace and have conflicting symbol
names. Therefore, usually a libc++/libc++abi application cannot use a shared
object (ODR violation) of a dynamically linked `libstdc++.so`.
If you make some efforts, you can still solve this problem: compile the
non-libsupc++ part of libstdc++ to get self-made `libstdc++.so.6`. The
executable file link libc++abi provides the C++ ABI symbols required by
`libstdc++.so.6`.
### Monolithic `.gcc_except_table`
Prior to Clang 12, a monolithic `.gcc_except_table` was used. Like many other
metadata sections, the main problem with the monolithic sections is that they
cannot be garbage collected by the linker. For RISC-V `-mrelax` and basic block
sections, there is a bigger problem: `.gcc_except_table` has relocations pointing
to text sections local symbols. If the pointed text sections are discarded in
the COMDAT group, these relocations will be rejected by the linker (`error:
relocation refers to a symbol in a discarded section`).
![`.eh_frame` with monolithic `.gcc_except_table`](img/eh_frame_and_monolithic_gcc_except_table.svg)
![monolithic `.gcc_except_table`](img/monolithic_gcc_except_table.svg)
The solution is to use fragmented
`.gcc_except_table`(https://reviews.llvm.org/D83655).
![fragmented `.gcc_except_table`](img/fragmented_gcc_except_table.svg)
But the actual deployment is not that simple :) LLD processes `--gc-sections`
first (it is not clear which `.eh_frame` pieces are live), and then processes
(and garbage collects) `.eh_frame`.
During `--gc-sections`, all `.eh_frame` pieces are live. They will mark all
`.gcc_except_table.*` live. According to the GC rules of the section group, a
`.gcc_except_table.*` will mark other sections (including `.text.*`) live in
the same section group. The result is that `.text.*` in all section groups
cannot be GC, resulting in increased input size.
![bad GC with `.gcc_except_table.*`](img/lsda_gc.svg)
https://reviews.llvm.org/D91579 fixed this problem: For `.eh_frame`, do not mark
`.gcc_except_table` in section group.
![good GC with `.gcc_except_table.*`](img/lsda_gc_new.svg)
### `clang -fbasic-block-sections=`
This option produces one section for each basic block (more aggressive than
`-ffunction-sections`) for aggressive machine basic block optimizations. There
are some challenges integrating LSDA into this framework.
You can either allocate a `.gcc_except_table` for each basic block section
needing LSDA, or let all basic block sections use the same `.gcc_except_table`.
The LLVM implementation chose the latter, which has several advantages:
* No duplicate headers
* Sharable type table
* Sharable action table (this only matters for the deprecated exception
specification)
There is only one LPStart when using the same `.gcc_except_table`, and it is
necessary to ensure that all offsets from landing pads to LPStart can be
represented by relocations. Because most architectures do not have a difference
relocation type (`R_RISCV_SUB*`), placing landing pads in the same section is
the choice.
### Exception handling ABI for the ARM architecture
The overall structure is the same as Itanium C++ ABI: Exception Handling, with
some differences in data structure, `_Unwind_*`, etc.
[Stack unwinding](maskray-1.md) contains a few notes.
### Compact Exception Tables for MIPS ABIs
In construction.
Use `.eh_frame_entry` and `.gnu_extab` to describe.
Design thoughts:
* Exception code ranges are sorted and must be linearly searched. Therefore it
would be more compact to specify each relative to the previous one, rather
than relative to a fixed base.
* The landing pad is often close to the exception region that uses it.
Therefore it is better to use the end of the exception region as the
reference point, than use the function base address.
* The action table can be integrated directly with the exception region
definition itself. This removes one indirection. The threading of actions can
still occur, by providing an offset to the next exception encoding of
interest.
* Often the action threading is to the next exception region, so optimizing
that case is important.
* Catch types and exception specification type lists cannot easily be encoded
inline with the exception regions themselves. It is necessary to preserve the
unique indices that are automatically created by the DWARF scheme.
It uses compact unwind descriptors similar to ARM EH. Builtin PR1 means there
is no language-dependent data, Builtin PR2 is used for C/C++