111 lines
5.9 KiB
Markdown
111 lines
5.9 KiB
Markdown
# Linkers part 12
|
||
|
||
I apologize for the pause in posts. We moved over the weekend. Last Friday AT&T
|
||
told me that the new DSL was working at our new house. However, it did not
|
||
actually start working outside the house until Wednesday. Then a problem with
|
||
the internal wiring meant that it was not working inside the house until today.
|
||
I am now finally back online at home.
|
||
|
||
## Symbol Resolution
|
||
|
||
I find that symbol resolution is one of the trickier aspects of a linker.
|
||
Symbol resolution is what the linker does the second and subsequent times that
|
||
it sees a particular symbol. I’ve already touched on the topic in a few
|
||
previous entries, but let’s look at it in a bit more depth.
|
||
|
||
Some symbols are local to a specific object files. We can ignore these for the
|
||
purposes of symbol resolution, as by definition the linker will never see them
|
||
more than once. In ELF these are the symbols with a binding of `STB_LOCAL`.
|
||
|
||
In general, symbols are resolved by name: every symbol with the same name is
|
||
the same entity. We’ve already seen a few exceptions to that general rule. A
|
||
symbol can have a version: two symbols with the same name but different
|
||
versions are different symbols. A symbol can have non-default visibility: a
|
||
symbol with hidden visibility in one shared library is not the same as a symbol
|
||
with the same name in a different shared library.
|
||
|
||
The characteristics of a symbol which matter for resolution are:
|
||
|
||
* The symbol name
|
||
* The symbol version.
|
||
* Whether the symbol is the default version or not.
|
||
* Whether the symbol is a definition or a reference or a common symbol.
|
||
* The symbol visibility.
|
||
* Whether the symbol is weak or strong (i.e., non-weak).
|
||
* Whether the symbol is defined in a regular object file being included in the
|
||
output, or in a shared library.
|
||
* Whether the symbol is thread local.
|
||
* Whether the symbol refers to a function or a variable.
|
||
|
||
The goal of symbol resolution is to determine the final value of the symbol.
|
||
After all symbols are resolved, we should know the specific object file or
|
||
shared library which defines the symbol, and we should know the symbol’s type,
|
||
size, etc. It is possible that some symbols will remain undefined after all the
|
||
symbol tables have been read; in general this is only an error if some
|
||
relocation refers to that symbol.
|
||
|
||
At this point I’d like to present a simple algorithm for symbol resolution, but
|
||
I don’t think I can. I’ll try to hit all the high points, though. Let’s assume
|
||
that we have two symbols with the same name. Let’s call the symbol we saw first
|
||
A and the new symbol B. (I’m going to ignore symbol visibility in the algorithm
|
||
below; the effects of visibility should be obvious, I hope.)
|
||
|
||
1. If A has a version:
|
||
* If B has a version different from A, they are actually different symbols.
|
||
* If B has the same version as A, they are the same symbol; carry on.
|
||
* If B does not have a version, and A is the default version of the symbol,
|
||
they are the same symbol; carry on.
|
||
* Otherwise B is probably a different symbol. But note that if A and B are
|
||
both undefined references, then it is possible that A refers to the default
|
||
version of the symbol but we don’t yet know that. In that case, if B does
|
||
not have a version, A and B really are the same symbol. We can’t tell until
|
||
we see the actual definition.
|
||
2. If A does not have a version:
|
||
* If B does not have a version, they are the same symbol; carry on.
|
||
* If B has a version, and it is the default version, they are the same
|
||
symbol; carry on.
|
||
* Otherwise, B is probably a different symbol, as above.
|
||
3. If A is thread local and B is not, or vice-versa, then we have an error.
|
||
4. If A is an undefined reference:
|
||
* If B is an undefined reference, then we can complete the resolution, and
|
||
more or less ignore B.
|
||
* If B is a definition or a common symbol, then we can resolve A to B.
|
||
5. If A is a strong definition in an object file:
|
||
* If B is an undefined reference, then we resolve B to A.
|
||
* If B is a strong definition in an object file, then we have a multiple
|
||
definition error.
|
||
* If B is a weak definition in an object file, then A overrides B. In effect,
|
||
B is ignored.
|
||
* If B is a common symbol, then we treat B as an undefined reference.
|
||
* If B is a definition in a shared library, then A overrides B. The dynamic
|
||
linker will change all references to B in the shared library to refer to A
|
||
instead.
|
||
6. If A is a weak definition in an object file, we act just like the strong
|
||
definition case, with one exception: if B is a strong definition in an
|
||
object file. In the original SVR4 linker, this case was treated as a
|
||
multiple definition error. In the Solaris and GNU linkers, this case is
|
||
handled by letting B override A.
|
||
7. If A is a common symbol in an object file:
|
||
* If B is a common symbol, we set the size of A to be the maximum of the size
|
||
of A and the size of B, and then treat B as an undefined reference.
|
||
* If B is a definition in a shared library with function type, then A
|
||
overrides B (this oddball case is required to correctly handle some Unix
|
||
system libraries).
|
||
* Otherwise, we treat A as an undefined reference.
|
||
8. If A is a definition in a shared library, then if B is a definition in a
|
||
regular object (strong or weak), it overrides A. Otherwise we act as though
|
||
A were defined in an object file.
|
||
9. If A is a common symbol in a shared library, we have a funny case. Symbols
|
||
in shared libraries must have addresses, so they can’t be common in the same
|
||
sense as symbols in an object file. But ELF does permit symbols in a shared
|
||
library to have the type `STT_COMMON` (this is a relatively recent
|
||
addition). For purposes of symbol resolution, if A is a common symbol in a
|
||
shared library, we still treat it as a definition, unless B is also a common
|
||
symbol. In the latter case, B overrides A, and the size of B is set to the
|
||
maximum of the size of A and the size of B.
|
||
|
||
I hope I got all that right.
|
||
|
||
More tomorrow, assuming the Internet connection holds up.
|
||
|