# Linkers part 16
## C++ Template Instantiation
There is still more C++ fun at link time, though somewhat less related to the
linker proper. A C++ program can declare templates, and instantiate them with
specific types. Ideally those specific instantiations will only appear once in
a program, not once per source file which instantiates the templates. There are
a few ways to make this work.

For object file formats which support COMDAT and vague linkage, which I
described yesterday, the simplest and most reliable mechanism is for the
compiler to generate all the template instantiations required for a source file
and put them into the object file. They should be marked as COMDAT, so that the
linker discards all but one copy. This ensures that all template instantiations
will be available at link time, and that the executable will have only one
copy. This is what gcc does by default for systems which support it. The
obvious disadvantages are the time required to compile all the duplicate
template instantiations and the space they take up in the object files. This is
sometimes called the Borland model, as this is what Borland's C++ compiler did.

Another approach is to not generate any of the template instantiations at
compile time. Instead, when linking, if we need a template instantiation which
is not found, invoke the compiler to build it. This can be done either by
running the linker and looking for error messages or by using a linker plugin
to handle an undefined symbol error. The difficulties with this approach are
finding the source code to compile and finding the right options to pass to the
compiler. Typically the source code is placed into a repository file of some
sort at compile time, so that it is available at link time. The complexities of
getting the compilation steps right are why this approach is not the default.
When it works, though, it can be faster than the duplicate instantiation
approach. This is sometimes called the Cfront model.

gcc also supports explicit template instantiation, which can be used to control
exactly where templates are instantiated. This approach can work if you have
complete control over your source code base, and can instantiate all required
templates in some central place. This approach is used for gcc's C++ library,
libstdc++.

C++98 defined a keyword `export` which was supposed to permit exporting template
definitions in such a way that they could be read back in by the compiler. gcc
never supported this keyword, and C++11 removed the feature. Had it worked, it
could have been a slightly more reliable way of using a repository when using
the Cfront model.

## Exception Frames
C++ and other languages support exceptions. When an exception is thrown in one
function and caught in another, the program needs to reset the stack pointer
and registers to the point where the exception is caught. While resetting the
stack pointer, the program needs to identify all local variables in the part of
the stack being discarded, and run their destructors if any. This process is
known as unwinding the stack.

The information needed to unwind the stack is normally stored in tables in the
program. Supporting library code is used to read the tables and perform the
necessary operations. I'm not going to describe the details of those tables
here. However, there is a linker optimization which applies to them.

The support libraries need to be able to find the exception tables at runtime
when an exception occurs. An exception can be thrown in one shared library and
caught in a different shared library, so finding all the required exception
tables can be a nontrivial operation. One approach that can be used is to
register the exception tables at program startup time or shared library load
time. The registration can be done at the right time using the global
constructor mechanism.

However, this approach imposes a runtime cost for exception support even if no
exception is ever thrown, in that it takes longer for the program to start. The
linker can
optimize this by building tables which can be used to find the exception
tables. The tables built by the GNU linker are sorted for fast lookup by the
runtime library. The tables are put into a `PT_GNU_EH_FRAME` segment. The
supporting libraries then need a way to look up a segment of this type. This is
done via the `dl_iterate_phdr` API provided by the GNU dynamic linker.

Note that if the compiler believes that the linker will generate a
`PT_GNU_EH_FRAME` segment, it won't generate the startup code to register the
exception tables. Thus the linker must not fail to create this segment.

Since the GNU linker needs to look at the exception tables in order to generate
the `PT_GNU_EH_FRAME` segment, it will also optimize by discarding duplicate
exception table information.

I know this section is rather short on details. I hope the general idea is
clear.

More tomorrow.