88 lines
4.8 KiB
Markdown
88 lines
4.8 KiB
Markdown
# Linkers part 16
|
||
|
||
## C++ Template Instantiation
|
||
|
||
There is still more C++ fun at link time, though somewhat less related to the
|
||
linker proper. A C++ program can declare templates, and instantiate them with
|
||
specific types. Ideally those specific instantiations will only appear once in
|
||
a program, not once per source file which instantiates the templates. There are
|
||
a few ways to make this work.
|
||
|
||
For object file formats which support COMDAT and vague linkage, which I
|
||
described yesterday, the simplest and most reliable mechanism is for the
|
||
compiler to generate all the template instantiations required for a source file
|
||
and put them into the object file. They should be marked as COMDAT, so that the
|
||
linker discards all but one copy. This ensures that all template instantiations
|
||
will be available at link time, and that the executable will have only one
|
||
copy. This is what gcc does by default for systems which support it. The
|
||
obvious disadvantages are the time required to compile all the duplicate
|
||
template instantiations and the space they take up in the object files. This is
|
||
sometimes called the Borland model, as this is what Borland’s C++ compiler did.
|
||
|
||
Another approach is to not generate any of the template instantiations at
|
||
compile time. Instead, when linking, if we need a template instantiation which
|
||
is not found, invoke the compiler to build it. This can be done either by
|
||
running the linker and looking for error messages or by using a linker plugin
|
||
to handle an undefined symbol error. The difficulties with this approach are to
|
||
find the source code to compile and to find the right options to pass to the
|
||
compiler. Typically the source code is placed into a repository file of some
|
||
sort at compile time, so that it is available at link time. The complexities of
|
||
getting the compilation steps right are why this approach is not the default.
|
||
When it works, though, it can be faster than the duplicate instantiation
|
||
approach. This is sometimes called the Cfront model.
|
||
|
||
gcc also supports explicit template instantiation, which can be used to control
|
||
exactly where templates are instantiated. This approach can work if you have
|
||
complete control over your source code base, and can instantiate all required
|
||
templates in some central place. This approach is used for gcc’s C++ library,
|
||
libstdc++.
|
||
|
||
C++ defines a keyword export which is supposed to permit exporting template
|
||
definitions in such a way that they can be read back in by the compiler. gcc
|
||
does not support this keyword. If it worked, it could be a slightly more
|
||
reliable way of using a repository when using the Cfront model.
|
||
|
||
## Exception Frames
|
||
|
||
C++ and other languages support exceptions. When an exception is thrown in one
|
||
function and caught in another, the program needs to reset the stack pointer
|
||
and registers to the point where the exception is caught. While resetting the
|
||
stack pointer, the program needs to identify all local variables in the part of
|
||
the stack being discarded, and run their destructors if any. This process is
|
||
known as unwinding the stack.
|
||
|
||
The information needed to unwind the stack is normally stored in tables in the
|
||
program. Supporting library code is used to read the tables and perform the
|
||
necessary operations. I’m not going to describe the details of those tables
|
||
here. However, there is a linker optimization which applies to them.
|
||
|
||
The support libraries need to be able to find the exception tables at runtime
|
||
when an exception occurs. An exception can be thrown in one shared library and
|
||
caught in a different shared library, so finding all the required exception
|
||
tables can be a nontrivial operation. One approach that can be used is to
|
||
register the exception tables at program startup time or shared library load
|
||
time. The registration can be done at the right time using the global
|
||
constructor mechanism.
|
||
|
||
However, this approach imposes a runtime cost for exceptions, in that it takes
|
||
longer for the program to start. Therefore, this is not ideal. The linker can
|
||
optimize this by building tables which can be used to find the exception
|
||
tables. The tables built by the GNU linker are sorted for fast lookup by the
|
||
runtime library. The tables are put into a `PT_GNU_EH_FRAME` segment. The
|
||
supporting libraries then need a way to look up a segment of this type. This is
|
||
done via the `dl_iterate_phdr` API provided by the GNU dynamic linker.
|
||
|
||
Note that if the compiler believes that the linker will generate a
|
||
`PT_GNU_EH_FRAME` segment, it won’t generate the startup code to register the
|
||
exception tables. Thus the linker must not fail to create this segment.
|
||
|
||
Since the GNU linker needs to look at the exception tables in order to generate
|
||
the `PT_GNU_EH_FRAME` segment, it will also optimize by discarding duplicate
|
||
exception table information.
|
||
|
||
I know this is section is rather short on details. I hope the general idea is
|
||
clear.
|
||
|
||
More tomorrow.
|
||
|