# dynso
Define dynamic shared objects and resolvable symbols at runtime, without
creating an ELF file anywhere or touching the filesystem.

It only works on glibc, though: it will explode in your face if you try to run
it with e.g. musl. Additionally, your glibc binaries must *not* be stripped of
their symbol tables!
## Usage
If you ever use this in production, you, together with everyone else using it,
will die.
Other than that, here's an example:
```c
// create a library
struct dynso_lib* l;
dynso_create(&l, 0, /* base address of the library - you can keep this at 0 */
    (char*)"this is just a display name", "libtest", /* latter is the soname */
    NULL, LM_ID_BASE /* from dlfcn.h, you need to define _GNU_SOURCE first! */);

// define some symbols...
dynso_add_sym(l, "testsym", (void*)0x694201337);
dynso_add_sym_ex(l, "testfunction", a_function,
    STT_FUNC /* from elf.h */, 32 /* symbol size */);

// this loads all symbols into the global context, which means they can now
// be looked up by dlsym(), and be resolved by other dynamic libraries that
// depend on it. adding more symbols won't be possible anymore, though.
dynso_bind(l);

void* x = dlsym(RTLD_DEFAULT, "testsym");
printf(" dlsym(\"testsym\") = %p\n", x);
x = dlsym(RTLD_DEFAULT, "testfunction");
printf(" dlsym(\"testfunction\") = %p\n", x);

void (*somefunc)(void) = x;
printf("calling the resolved function:\n");
somefunc();

// free the used memory
dynso_remove(l);
```
Example output:
```
dlsym("testsym") = 0x694201337
dlsym("testfunction") = 0x5589a1437d75
calling the resolved function:
hello world!
```
## Dependencies
glibc, seems to work with 2.30.
## Compilation
```sh
make
```
## Installation
Don't.
## How it works
Basically, it works by manipulating `ld.so`'s internal data structures. (That
is, if it works at all.)

glibc's `ld.so` internally keeps track of all DSOs using a thing
called a `link_map`, which is essentially a linked list of loaded DSOs. It is
documented in your system's `link.h` header file, and can be accessed from
`_r_debug.r_map`. (`dlopen` also returns `link_map`s, cast to a void pointer.)
However, that header file is lying to you. Internally, glibc adds *lots* of
stuff to this struct, as you can witness in `include/link.h` in the glibc
source code repository. With this knowledge, we can readily manipulate a lot
of things in order to have it do what we want.
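
To get a feel for the public half of that struct, here is a minimal sketch that walks the chain starting at `_r_debug.r_map` and prints each entry. It only touches the fields documented in `link.h`; the function name `list_loaded_dsos` is made up for this example, and it assumes a dynamically linked program on glibc.

```c
#include <link.h>   /* struct link_map, struct r_debug, _r_debug */
#include <stdio.h>

/* Print every DSO on the public link_map chain and return how many there
 * are. Hypothetical helper, not part of dynso. */
int list_loaded_dsos(void)
{
    int count = 0;

    for (struct link_map *m = _r_debug.r_map; m != NULL; m = m->l_next) {
        /* l_name is an empty string for the main program */
        const char *name = (m->l_name && m->l_name[0]) ? m->l_name : "(no name)";
        printf("%p  %s\n", (void *)m->l_addr, name);
        count++;
    }
    return count;
}
```

Calling `list_loaded_dsos()` from `main` should list at least the program itself, libc and `ld.so`. The hidden fields that glibc appends to each entry are not visible through this header, which is exactly the problem described next.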

Alright, that's great, but how do you create a new DSO without using
`dlopen`? For that, you can look at how `ld.so` adds the
[vDSO](https://lwn.net/Articles/446528/) to the `link_map` chain: it calls
`_dl_new_object`, an internal function. This one returns a new `link_map`
object, which is then initialized, and then added to the global `link_map`
chain by calling `_dl_add_to_namespace_list`. Additionally, a call to
`_dl_setup_hash` seems to be needed to keep symbol resolution code happy.

So, how do you get to those functions? They aren't exported: you won't be
finding them in `ld.so`'s `.dynsym` section, which is the section containing
all exported symbols. However, when unstripped, `ld.so` also has a *second*
symbol table, `.symtab`. This one does contain a number of internal symbols in
its list, including the ones we need!
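
As a rough illustration of that idea, the sketch below maps an ELF file and searches its `.symtab` (section type `SHT_SYMTAB`) for a name. Whether dynso reads the table this way is an assumption; `symtab_lookup` is a hypothetical helper, it only handles 64-bit ELF files, and it trusts the file to be well-formed (no bounds checks).

```c
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Look up `name` in the .symtab of the 64-bit ELF file at `path`.
 * On success, store the symbol's (unrelocated) st_value in *value and
 * return 0; return -1 if the file or the symbol cannot be found. */
int symtab_lookup(const char *path, const char *name, Elf64_Addr *value)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size < (off_t)sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }
    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    int ret = -1;
    const uint8_t *base = map;
    const Elf64_Ehdr *eh = map;

    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
        eh->e_ident[EI_CLASS] == ELFCLASS64) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(base + eh->e_shoff);

        for (unsigned i = 0; ret != 0 && i < eh->e_shnum; i++) {
            if (sh[i].sh_type != SHT_SYMTAB)
                continue;
            /* sh_link names the string table that holds the symbol names */
            const Elf64_Sym *syms = (const Elf64_Sym *)(base + sh[i].sh_offset);
            const char *strs = (const char *)(base + sh[sh[i].sh_link].sh_offset);
            size_t n = sh[i].sh_size / sizeof(Elf64_Sym);

            for (size_t j = 0; j < n; j++) {
                if (strcmp(strs + syms[j].st_name, name) == 0) {
                    *value = syms[j].st_value;
                    ret = 0;
                    break;
                }
            }
        }
    }
    munmap(map, (size_t)st.st_size);
    return ret;
}
```

For a shared object such as `ld.so`, the value found this way is relative to the load base, so the runtime address would be that value plus the `l_addr` of the matching `link_map` entry. On a stripped binary there is no `SHT_SYMTAB` section and the lookup simply fails.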

Now that we can instantiate new `link_map` objects, how do we add symbols to
them to make them resolvable? This is where the 'hidden' part of a `link_map`
comes into play: when glibc tries to resolve a symbol (`elf/dl-lookup.c`),
it will do a lookup in a hash table which maps symbol names of a single DSO to
their entries in the symbol table. Two lookup algorithms are used: one made by
the GNU people, and a legacy one invented for SysV. At first sight the former
looked more complicated to get to work correctly, so I opted for the latter.

The SysV algorithm calculates the symbol name's hash modulo some value (the
bucket count, which is taken from a field in the `link_map`), which it then
uses to index a bucket table (`l_buckets`), from where it reads an index into
the actual symbol table. From that point on, it follows the chain table
(`l_chain`), where each entry gives the index of the next symbol to try, until
it finds a matching symbol or runs out of candidates.
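
A minimal sketch of that lookup, using the standard SysV layout with a bucket array and a chain array. The hash function is the one from the System V ABI; `struct toy_symtab` and `sysv_lookup` are made-up names for illustration, and a plain array of names stands in for the real symbol and string tables.

```c
#include <stdint.h>
#include <string.h>

/* The classic SysV ELF hash function. */
uint32_t sysv_hash(const char *name)
{
    uint32_t h = 0;

    for (const unsigned char *p = (const unsigned char *)name; *p; p++) {
        h = (h << 4) + *p;
        uint32_t g = h & 0xf0000000u;
        if (g)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Toy stand-in for the tables a link_map points at (hypothetical type). */
struct toy_symtab {
    const char *const *names;   /* names[i] is the name of symbol i; index 0 is unused */
    uint32_t nbuckets;
    const uint32_t *buckets;    /* first symbol index for each hash bucket */
    const uint32_t *chain;      /* chain[i] is the next symbol index to try after i */
};

/* Return the index of `name`, or 0 (STN_UNDEF) if it is not present. */
uint32_t sysv_lookup(const struct toy_symtab *t, const char *name)
{
    uint32_t i = t->buckets[sysv_hash(name) % t->nbuckets];

    for (; i != 0; i = t->chain[i]) {
        if (strcmp(t->names[i], name) == 0)
            return i;
    }
    return 0;
}
```

With a single bucket and `chain[i] = i + 1` (ending in 0), every lookup degenerates into a walk over the whole table in order. That is slow, but it is about the easiest table to generate at runtime.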

If you've written some code that accomplishes the above, you'll notice that
lookups with `dlsym()` will still return nothing. That's because `ld.so` does
not just check *all* DSOs for a given symbol, as that would not fit how symbols
are supposed to work in the ELF ABI: each DSO has a separate 'scope' of other
DSOs it can access symbols of, some symbols have certain visibility parameters
set (default, internal, hidden, and protected; especially the last one
requires this approach based on scopes), and there are other complicating
factors.

However, a DSO's scope is also accessible from this very same `link_map`
struct, so we can just inject ourselves whenever that's needed. After doing
that, `dlsym()` works!
## License
```
be gay, do crimes, death to america
```