463 lines
22 KiB
Markdown
463 lines
22 KiB
Markdown
|
# Copy relocations, canonical PLT entries and protected visibility
|
||
|
|
||
|
Background:
|
||
|
|
||
|
* `-fno-pic` can only be used by executables. On most platforms and
|
||
|
architectures, direct access relocations are used to reference external data
|
||
|
symbols.
|
||
|
* `-fpic` can be used by both executables and shared objects. Windows has
|
||
|
`__declspec(dllimport)` but most other binary formats allow a default
|
||
|
visibility external data to be resolved to a shared object, so generally
|
||
|
direct access relocations are disallowed.
|
||
|
* `-fpie` was introduced as a mode similar to `-fpic` for ELF: the compiler can
|
||
|
make the assumption that the produced object file can only be used by
|
||
|
executables, thus all definitions are non-preemptible and thus
|
||
|
interprocedural optimizations can apply on them.
|
||
|
|
||
|
For
|
||
|
|
||
|
```c
|
||
|
extern int a;
|
||
|
int *foo() { return &a; }
|
||
|
```
|
||
|
|
||
|
`-fno-pic` typically produces an absolute relocation (a PC-relative relocation
|
||
|
can be used as well). On ELF x86-64 it is usually `R_X86_64_32` in the position
|
||
|
dependent small code model. If a is defined in the executable (by another
|
||
|
translation unit), everything works fine. If a turns out to be defined in a
|
||
|
shared object, its real address will be non-constant at link time. Either
|
||
|
action needs to be taken:
|
||
|
|
||
|
* Emit a dynamic relocation in every use site. Text sections are usually
|
||
|
non-writable. A dynamic relocation applied on a non-writable section is
|
||
|
called a text relocation.
|
||
|
* Emit a single copy relocation. Copy relocations only work for executables.
|
||
|
The linker obtains the size of the symbol, allocates the bytes in `.bss`
|
||
|
(this may make the object writable. On LLD a readonly area may be picked.),
|
||
|
and emit an `R_*_COPY` relocation. All references resolve to the new location.
|
||
|
|
||
|
Multiple text relocations are even less acceptable, so on ELF a copy relocation
|
||
|
is generally used. Here is a nice description from [Rich
|
||
|
Felker](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55012): "Copy relocations
|
||
|
are not a case of overriding the definition in the abstract machine, but an
|
||
|
implementation detail used to support data objects in shared libraries when the
|
||
|
main program is non-PIC."
|
||
|
|
||
|
Copy relocations have drawbacks:
|
||
|
|
||
|
* Break page sharing.
|
||
|
* Make the symbol properties (e.g. size) part of ABI.
|
||
|
* If the shared object is linked with `-Bsymbolic` or `--dynamic-list` and
|
||
|
defines a data symbol copy relocated by the executable, the address of the
|
||
|
symbol may be different in the shared object and in the executable.
|
||
|
|
||
|
What went poorly was that `-fno-pic` code had no way to avoid copy relocations
|
||
|
on ELF. Traditionally copy relocations could only occur in `-fno-pic` code. A
|
||
|
GCC 5 change made this possible for x86-64. Please read on.
|
||
|
|
||
|
## x86-64: copy relocations and `-fpie`
|
||
|
|
||
|
`-fpic` using GOT indirection for external data symbols has cost. Making
|
||
|
`-fpie` similar to `-fpic` in this regard incurs costs if the data symbol turns
|
||
|
out to be defined in the executable. Having the data symbol defined in another
|
||
|
translation unit linked into the executable is very common, especially if the
|
||
|
vendor uses fully/mostly statically linking mode.
|
||
|
|
||
|
In GCC 5, ["x86-64: Optimize access to globals in PIE with copy
|
||
|
reloc"](https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=77ad54d911dd7cb88caf697ac213929f6132fdcf)
|
||
|
started to use direct access relocations for external data symbols on x86-64 in
|
||
|
`-fpie` mode.
|
||
|
|
||
|
```c
|
||
|
extern int a;
|
||
|
int foo() { return a; }
|
||
|
```
|
||
|
|
||
|
* GCC<5: `movq a@GOTPCREL(%rip), %rax; movl (%rax), %eax` (8 bytes)
|
||
|
* GCC>=5: `movl a(%rip), %eax` (6 bytes)
|
||
|
|
||
|
This change is actually useful for architectures other than x86-64 but is never
|
||
|
implemented for other architectures. What went wrong: the change was
|
||
|
implemented as an inflexible configure-time choice (`HAVE_LD_PIE_COPYRELOC`),
|
||
|
defaulting to such a behavior if ld supports PIE copy relocations (most
|
||
|
binutils installations). Keep in mind that such a `-fpie` default [breaks
|
||
|
`-Bsymbolic` and `--dynamic-list` in shared objects](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888).
|
||
|
|
||
|
Clang addressed the inflexible configure-time choice via an opt-in option
|
||
|
`-mpie-copy-relocations` (D19996).
|
||
|
|
||
|
I noticed that:
|
||
|
|
||
|
* The option can be used for `-fno-pic` code as well to prevent copy
|
||
|
relocations on ELF. This is occasionally users want (if their shared objects
|
||
|
use `-Bsymbolic` and export data symbols (usually undesired from API
|
||
|
perspecitives but can avoid costs at times)), and they switch from `-fno-pic`
|
||
|
to `-fpic` just for this purpose.
|
||
|
* The option name should describe the code generation behavior, instead of the
|
||
|
inferred behavior at the linking stage on a partibular binary format.
|
||
|
* The option does not need to tie to ELF.
|
||
|
* On COFF, the behavior is like always `-fdirect-access-external-data`.
|
||
|
`__declspec(dllimport)` is needed to enable indirect access.
|
||
|
* On Mach-O, the behavior is like `-fdirect-access-external-data` for
|
||
|
`-fno-pic` (only available on arm) and the opposite for `-fpic`.
|
||
|
* H.J. Lu introduced `R_X86_64_GOTPCRELX` and `R_X86_64_REX_GOTPCRELX` as GOT
|
||
|
optimization to x86-64 psABI. This is great! With the optimization, GOT
|
||
|
indirection can be optimized, so the incured cost is very low now.
|
||
|
|
||
|
So I proposed an alternative option `-f[no-]direct-access-external-data`:
|
||
|
https://reviews.llvm.org/D92633
|
||
|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112. My wish on the GCC side is
|
||
|
to drop `HAVE_LD_PIE_COPYRELOC` and (x86-64) default to GOT indirection for
|
||
|
external data symbols in `-fpie` mode.
|
||
|
|
||
|
Please keep in mind that `-f[no-]semantic-interposition` is for definitions
|
||
|
while `-f[no-]direct-access-external-data` is for undefined data symbols. GCC 5
|
||
|
introduced `-fno-semantic-interposition` to use local aliases for references to
|
||
|
definitions in the same translation unit.
|
||
|
|
||
|
## `STV_PROTECTED`
|
||
|
|
||
|
Now let's consider how `STV_PROTECTED` comes into play. Here is the generic ABI
|
||
|
definition:
|
||
|
|
||
|
> A symbol defined in the current component is protected if it is visible in
|
||
|
> other components but not preemptable, meaning that any reference to such a
|
||
|
> symbol from within the defining component must be resolved to the definition
|
||
|
> in that component, even if there is a definition in another component that
|
||
|
> would preempt by the default rules. A symbol with `STB_LOCAL` binding may not
|
||
|
> have `STV_PROTECTED` visibility. If a symbol definition with `STV_PROTECTED`
|
||
|
> visibility from a shared object is taken as resolving a reference from an
|
||
|
> executable or another shared object, the `SHN_UNDEF` symbol table entry
|
||
|
> created has `STV_DEFAULT` visibility.
|
||
|
|
||
|
A non-local `STV_DEFAULT` defined symbol is by default preemptible in a shared
|
||
|
object on ELF. `STV_PROTECTED` can make the symbol non-preemptible. You may
|
||
|
have noticed that I use "preemptible" while the generic ABI uses "preemptable"
|
||
|
and LLVM IR uses "`dso_preemptable`". Both forms work. "preemptible" is my
|
||
|
opition because it is more common.
|
||
|
|
||
|
### Protected data symbols and copy relocations
|
||
|
|
||
|
Many folks consider that copy relocations are best-effort support provided by
|
||
|
the toolchain. `STV_PROTECTED` is intended as an optimization and the
|
||
|
optimization can error out if it can't be done for whatever reason. Since copy
|
||
|
relocations are already oftentimes unacceptable, it is natural to think that we
|
||
|
should just disallow copy relocations on protected data symbols.
|
||
|
|
||
|
However, GNU ld 2.26 made a change which enabled copy relocations on protected
|
||
|
data symbols for i386 and x86-64.
|
||
|
|
||
|
A glibc change ["Add `ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA` to
|
||
|
x86"](https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=62da1e3b00b51383ffa7efc89d8addda0502e107)
|
||
|
is needed to make copy relocations on protected data symbols work.
|
||
|
["[AArch64][BZ #17711] Fix extern protected data handling"](https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0910702c4d2cf9e8302b35c9519548726e1ac489)
|
||
|
and ["[ARM][BZ #17711] Fix extern protected data handling"](https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3bcea719ddd6ce399d7bccb492c40af77d216e42)
|
||
|
ported the thing to arm and aarch64.
|
||
|
|
||
|
Despite the glibc support, GNU ld aarch64 errors relocation
|
||
|
`R_AARCH64_ADR_PREL_PG_HI21` against symbol `foo` which may bind externally can
|
||
|
not be used when making a shared object; recompile with `-fPIC`.
|
||
|
|
||
|
powerpc64 ELFv2 is interesting: TOC indirection (TOC is a variant of GOT) is
|
||
|
used everywhere, data symbols normally have no direct access relocations, so
|
||
|
this is not a problem.
|
||
|
|
||
|
```c
|
||
|
// b.c
|
||
|
__attribute__((visibility("protected"))) int foo;
|
||
|
// a.c
|
||
|
extern int foo;
|
||
|
int main() { return foo; }
|
||
|
```
|
||
|
|
||
|
```
|
||
|
gcc -fuse-ld=bfd -fpic -shared b.c -o b.so
|
||
|
gcc -fuse-ld=bfd -pie -fno-pic a.c ./b.so
|
||
|
```
|
||
|
|
||
|
gold does not allow copy relocations on protected data symbols, but it misses
|
||
|
some cases: https://sourceware.org/bugzilla/show_bug.cgi?id=19823.
|
||
|
|
||
|
### Protected data symbols and direct accesses
|
||
|
|
||
|
If a protected data symbol in a shared object is copy relocated, allowing
|
||
|
direct accesses will cause the shared object to operate on a different copy
|
||
|
from the executable. Therefore, direct accesses to protected data symbols have
|
||
|
to be disallowed in `-fpic` code, just in case the symbols may be copy
|
||
|
relocated. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65248 changed GCC 5 to
|
||
|
use GOT indirection for protected external data.
|
||
|
|
||
|
```c
|
||
|
__attribute__((visibility("protected"))) int foo;
|
||
|
int val() { return foo; }
|
||
|
// -fPIC: GOT on at least aarch64, arm, i386, x86-64
|
||
|
```
|
||
|
|
||
|
This caused unneeded pessimization for protected external data. Clang always
|
||
|
treats protected similar to hidden/internal.
|
||
|
|
||
|
For older GCC (and all versions of Clang), direct accesses are produced in
|
||
|
`-fpic` code. Mixing such object files can silently break copy relocations on
|
||
|
protected data symbols. Therefore, GNU ld made the change
|
||
|
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=ca3fe95e469b9daec153caa2c90665f5daaec2b5
|
||
|
to error in `-shared` mode.
|
||
|
|
||
|
```
|
||
|
% cat a.s
|
||
|
leaq foo(%rip), %rax
|
||
|
|
||
|
.data
|
||
|
.global foo
|
||
|
.protected foo
|
||
|
foo:
|
||
|
```
|
||
|
```
|
||
|
% gcc -fuse-ld=bfd -shared a.s
|
||
|
/usr/bin/ld.bfd: /tmp/ccchu3Xo.o: relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object
|
||
|
/usr/bin/ld.bfd: final link failed: bad value
|
||
|
collect2: error: ld returned 1 exit status
|
||
|
```
|
||
|
|
||
|
This led to a heated discussion
|
||
|
https://sourceware.org/legacy-ml/binutils/2016-03/msg00312.html. Swift folks
|
||
|
noticed this https://bugs.swift.org/browse/SR-1023 and their reaction was to
|
||
|
switch from GNU ld to gold.
|
||
|
|
||
|
GNU ld's aarch64 port does not have the diagnostic.
|
||
|
|
||
|
binutils commit ["x86: Clear `extern_protected_data` for
|
||
|
`GNU_PROPERTY_NO_COPY_ON_PROTECTED`"](https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=73784fa565bd66f1ac165816c03e5217b7d67bbc)
|
||
|
introduced
|
||
|
`GNU_PROPERTY_NO_COPY_ON_PROTECTED`. With this property, `ld -shared` will not
|
||
|
error for relocation `R_X86_64_PC32` against protected symbol `foo` can not be
|
||
|
used when making a shared object.
|
||
|
|
||
|
The two issues above are the costs enabling copy relocations on protected data
|
||
|
symbols. Personally I don't think copy relocations on protected data symbols
|
||
|
are actually leveraged. GNU ld's x86 port can just (1) reject such copy
|
||
|
relocations and (2) allow direct accesses referencing protected data symbols in
|
||
|
`-shared` mode. But I am not really clear about the glibc case. I wish
|
||
|
`GNU_PROPERTY_NO_COPY_ON_PROTECTED` can become the default or be phased out in
|
||
|
the future.
|
||
|
|
||
|
### Protected function symbols and canonical PLT entries
|
||
|
|
||
|
```c
|
||
|
// b.c
|
||
|
__attribute__((visibility("protected"))) void *foo () {
|
||
|
return (void *)foo;
|
||
|
}
|
||
|
```
|
||
|
|
||
|
GNU ld's aarch64 and x86 ports rejects the above code. On many other
|
||
|
architectures including powerpc the code is supported.
|
||
|
|
||
|
```
|
||
|
% gcc -fpic -shared b.c -fuse-ld=bfd b.c -o b.so
|
||
|
/usr/bin/ld.bfd: /tmp/cc3Ay0Gh.o: relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object
|
||
|
/usr/bin/ld.bfd: final link failed: bad value
|
||
|
collect2: error: ld returned 1 exit status
|
||
|
% gcc -shared -fuse-ld=bfd -fpic b.c -o b.so
|
||
|
/usr/bin/ld.bfd: /tmp/ccXdBqMf.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `foo' which may bind externally can not be used when making a shared object; recompile with -fPIC
|
||
|
/tmp/ccXdBqMf.o: in function `foo':
|
||
|
a.c:(.text+0x0): dangerous relocation: unsupported relocation
|
||
|
collect2: error: ld returned 1 exit status
|
||
|
```
|
||
|
|
||
|
The rejection is mainly a historical issue to make pointer equality work with
|
||
|
`-fno-pic` code. The GNU ld idea is that:
|
||
|
|
||
|
* The compiler emits GOT-generating relocations for `-fpic` code (in reality it
|
||
|
does it for declarations but not for definitions).
|
||
|
* `-fno-pic` main executable uses direct access relocation types and gets a
|
||
|
canonical PLT entry.
|
||
|
* glibc ld.so resolves the GOT in the shared object to the canonical PLT entry.
|
||
|
|
||
|
Actually we can take the interepretation that a canonical PLT entry is
|
||
|
incompatible with a shared `STV_PROTECTED` definition, and reject the attempt
|
||
|
to create a canonical PLT entry (gold/LLD). And we can keep producing direct
|
||
|
access relocations referencing protected symbols for `-fpic` code.
|
||
|
`STV_PROTECTED` is no different from `STV_HIDDEN`.
|
||
|
|
||
|
On many architectures, a branch instruction uses a branch specific relocation
|
||
|
type (e.g. `R_AARCH64_CALL26`, `R_PPC64_REL24`, `R_RISCV_CALL_PLT`). This is
|
||
|
great because the address is insignificant and the linker can arrange for a
|
||
|
regular PLT if the symbol turns out to be external.
|
||
|
|
||
|
On i386, a branch in `-fno-pic` code emits an `R_386_PC32` relocation, which is
|
||
|
indistinguishable from an address taken operation. If the symbol turns out to
|
||
|
be external, the linker has to employ a tricky called "canonical PLT entry"
|
||
|
(`st_shndx=0, st_value!=0`). The term is a parlance within a few LLD
|
||
|
developers, but not broadly adopted.
|
||
|
|
||
|
```c
|
||
|
// a.c
|
||
|
extern void foo(void);
|
||
|
int main() { foo(); }
|
||
|
```
|
||
|
```
|
||
|
% gcc -m32 -shared -fuse-ld=bfd -fpic b.c -o b.so
|
||
|
% gcc -m32 -fno-pic -no-pie -fuse-ld=lld a.c ./b.so
|
||
|
|
||
|
% gcc -m32 -fno-pic a.c ./b.so -fuse-ld=lld
|
||
|
ld.lld: error: cannot preempt symbol: foo
|
||
|
>>> defined in ./b.so
|
||
|
>>> referenced by a.c
|
||
|
>>> /tmp/ccDGhzEy.o:(main)
|
||
|
collect2: error: ld returned 1 exit status
|
||
|
|
||
|
% gcc -m32 -fno-pic -no-pie a.c ./b.so -fuse-ld=bfd
|
||
|
# canonical PLT entry; foo has different addresses in a.out and b.so.
|
||
|
% gcc -m32 -fno-pic -pie a.c ./b.so -fuse-ld=bfd
|
||
|
/usr/bin/ld.bfd: /tmp/ccZ3Rl8Y.o: warning: relocation against `foo' in read-only section `.text'
|
||
|
/usr/bin/ld.bfd: warning: creating DT_TEXTREL in a PIE
|
||
|
% gcc -m32 -fno-pic -pie a.c ./b.so -fuse-ld=bfd -z text
|
||
|
/usr/bin/ld.bfd: /tmp/ccUv8wXc.o: warning: relocation against `foo' in read-only section `.text'
|
||
|
/usr/bin/ld.bfd: read-only segment has dynamic relocations
|
||
|
collect2: error: ld returned 1 exit status
|
||
|
```
|
||
|
|
||
|
This used to be a problem for x86-64 as well, until ["x86-64: Generate branch
|
||
|
with PLT32 relocation"](https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=bd7ab16b4537788ad53521c45469a1bdae84ad4a)
|
||
|
changed call/jmp foo to emit `R_X86_64_PLT32` instead of `R_X86_64_PC32`. Note:
|
||
|
(`-fpie`/`-fpic`) `call/jmp foo@PLT` always emits `R_X86_64_PLT32`.
|
||
|
|
||
|
The relocation type name is a bit misleading, `_PLT32` does not mean that a PLT
|
||
|
will always be created. Rather, it is optional: the linker can resolve `_PLT32`
|
||
|
to any place where the function will be called. If the symbol is preemptible,
|
||
|
the place is usually the PLT entry. If the symbol is non-preemptible, the
|
||
|
linker can convert `_PLT32` into `_PC32`. A function symbol can be either
|
||
|
branched or taken address. For an address taken operation, the function symbol
|
||
|
is used in a manner similar to a data symbol. `R_386_PLT32` cannot be used. LLD
|
||
|
and gold will just reject the link if text relocations are disabled.
|
||
|
|
||
|
On i386, my proposal is that branches to a default visibility function
|
||
|
declaration should use `R_386_PLT32` instead of `R_386_PC32`, in a manner
|
||
|
similar to x86-64. Originally I thought an assembler change sufficed:
|
||
|
https://sourceware.org/bugzilla/show_bug.cgi?id=27169. Please read the next
|
||
|
section why this should be changed on the compiler side.
|
||
|
|
||
|
### Non-default visibility ifunc and `R_386_PC32`
|
||
|
|
||
|
For a call to a hidden function declaration, the compiler produces an
|
||
|
`R_386_PC32` relocation. The relocation is an indicator that EBX may not be set
|
||
|
up.
|
||
|
|
||
|
If the declaration refers to an ifunc definition, the linker will resolve the
|
||
|
`R_386_PC32` to an IPLT entry. For `-pie` and `-shared` links, the IPLT entry
|
||
|
references EBX. If the call site does not set up EBX to be
|
||
|
`_GLOBAL_OFFSET_TABLE_`, the IPLT call will be incorrect.
|
||
|
|
||
|
GNU ld has implemented a diagnostic (["i686 ifunc and non-default symbol
|
||
|
visibility"](https://sourceware.org/bugzilla/show_bug.cgi?id=20515)) to catch
|
||
|
the problem. If we change `call/jmp foo` to always use `R_386_PLT32`, such a
|
||
|
diagnostic will be lost.
|
||
|
|
||
|
Can we change the compiler to emit `call/jmp foo@PLT` for default visibility
|
||
|
function declarations? If the compiler emits such a modifier but does not set
|
||
|
up EBX, the ifunc can still be non-preemptible (e.g. hidden in another
|
||
|
translation unit or `-Bsymbolic`) and we will still have a dilemma.
|
||
|
|
||
|
Personally, I think avoiding a canonical PLT entry is more useful than a ld
|
||
|
ifunc diagnostic. i386 ABI is legacy and the x86 maintainer will not make the
|
||
|
change, though.
|
||
|
|
||
|
## Summary
|
||
|
|
||
|
I hope the above give an overview to interested readers. Symbol interposition
|
||
|
is subtle. One has to think about all the factors related to symbol
|
||
|
interposition and the relevant toolchain fixes are like a whack-a-mole game. I
|
||
|
appreciate all the prior discussions and I believe many unsatisfactory things
|
||
|
can be fixed in a quite backward-compatible way.
|
||
|
|
||
|
Some features are inherently incompatible. We make the trade-off in favor of
|
||
|
more important features. Here are two things that should not work. However, if
|
||
|
`-fpie` or `-fno-direct-access-external-data` is specified, both limitations
|
||
|
will be circumvented.
|
||
|
|
||
|
* Copy relocations on protected data symbols.
|
||
|
* Canonical PLT entries on protected function symbols. With the `R_386_PLT32`
|
||
|
change, this issue will only affect function pointers.
|
||
|
|
||
|
People sometimes simply just say: "protected visibility does not work." I'd
|
||
|
argue that Clang+gold/LLD works quite well.
|
||
|
|
||
|
The things on GCC+GNU ld side are inconsistent, though. Here is a list of
|
||
|
changes I wish can happen:
|
||
|
|
||
|
* GCC: add `-f[no-]direct-access-external-data`.
|
||
|
* GCC: drop `HAVE_LD_PIE_COPYRELOC` in favor of `-f[no-]direct-access-external-data`.
|
||
|
* GCC x86-64: default to GOT indirection for external data symbols in `-fpie`
|
||
|
mode.
|
||
|
* GCC or GNU as i386: emit `R_386_PLT32` for branches to undefined function
|
||
|
symbols.
|
||
|
* GNU ld x86: disallow copy relocations on protected data symbols. (I think
|
||
|
canonical PLT entries on protected symbols have been disallowed.)
|
||
|
* GCC aarch64/arm/x86/...: allow direct access relocations on protected symbols
|
||
|
in `-fpic` mode.
|
||
|
* GNU ld aarch64/x86: allow direct access relocations on protected data symbols
|
||
|
in `-shared` mode.
|
||
|
|
||
|
The breaking changes for GCC+GNU ld:
|
||
|
|
||
|
* The "copy relocations on protected data symbols" scheme has been supported in
|
||
|
the past few years with GNU ld on x86, but it did not work before circa 2015,
|
||
|
and should not work in the future. Fortunately the breaking surface may be
|
||
|
narrow: this scheme does not work with gold or LLD. Many architectures don't
|
||
|
work.
|
||
|
* ld is not the only consumer of `R_386_PLT32`. The Linux kernel has code
|
||
|
resolving relocations and it needs to be fixed (patch uploaded: https://github.com/ClangBuiltLinux/linux/issues/1210).
|
||
|
|
||
|
I'll conclude thie article with random notes on other binary formats:
|
||
|
|
||
|
Windows/COFF `__declspec(dllimport)` gives us a different perspecitive how
|
||
|
external references can be designed. The annotation is verbose but
|
||
|
differentiates the two cases (1) the symbol has to be defined in the same
|
||
|
linkage unit (2) the symbol can be defined in another linkage unit. If we lift
|
||
|
the "the symbol visibility is decided by the most constrained visibility"
|
||
|
requirement for protected->default, a COFF undefined/defined symbol is quite
|
||
|
like a protected undefined/defined symbol in ELF. `__declspec(dllimport)` gives
|
||
|
the undefined symbol default visibility (i.e. the LLVM IR `dllimport` is
|
||
|
redundant). `__declspec(dllexport)` is something which cannot be modeled with
|
||
|
the existing ELF visibilities.
|
||
|
|
||
|
For an undefined variable, Mach-O uses `__attribute__((visibility("hidden")))`
|
||
|
to say "a definition must be available in another translation unit in the same
|
||
|
linkage unit" but does not actually mark the undefined symbol anyway. COFF uses
|
||
|
`__declspec(dllimport)` to convey this. In ELF,
|
||
|
`__attribute__((visibility("hidden")))` additionally makes the undefined symbol
|
||
|
unexportable. The Mach-O notation actually resembles COFF: it can be exported
|
||
|
by the definition in another translation unit. From its behavior, I think it
|
||
|
would be more appropriately mapped to LLVM IR protected instead of hidden.
|
||
|
|
||
|
## Appendix
|
||
|
|
||
|
For a `STB_GLOBAL`/`STB_WEAK` symbol,
|
||
|
|
||
|
`STV_DEFAULT`: both compiler & linker need to assume such symbols can be
|
||
|
preempted in `-fpic` mode. The compiler emits GOT indirection by default. GCC
|
||
|
`-fno-semantic-interposition` uses local aliases on defined non-weak function
|
||
|
symbols for x86 (unimplemented in other architectures). Clang
|
||
|
`-fno-semantic-interposition` uses local aliases on defined non-weak symbols
|
||
|
(both function and data) for x86.
|
||
|
|
||
|
`STV_PROTECTED`: GCC `-fpic` uses GOT indirection for data symbols, regardless
|
||
|
of defined or undefined. This pessimization is to make a misfeature "copy
|
||
|
relocation on protected data symbol" work
|
||
|
(https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected#protected-data-symbols-and-direct-accesses).
|
||
|
Clang code generation treats `STV_PROTECTED` the same way as `STV_HIDDEN`.
|
||
|
|
||
|
`STV_HIDDEN`: non-preemptible, regardless of defined or undefined. The compiler
|
||
|
suppresses GOT indirection, unless undefined `STB_WEAK`.
|
||
|
|
||
|
For defined symbols, `-fno-pic`/`-fpie` can avoid GOT indirection for
|
||
|
`STV_DEFAULT` (and GCC `STV_PROTECTED`). `-fvisibility=hidden` can change
|
||
|
visibility.
|
||
|
|
||
|
For undefined symbols, `-fpie`/`-fpic` use GOT indirection by default. Clang
|
||
|
`-fno-direct-access-external-data` (discussed in my article) can avoid GOT
|
||
|
indirection. If you `-fpic -fno-direct-access-external-data` & `ld
|
||
|
-shared`, you'll need additional linker options to make the linker know defined
|
||
|
non-`STB_LOCAL` `STV_DEFAULT` symbols are non-preemptible.
|
||
|
|