# dynso

Define dynamic shared objects and resolvable symbols at runtime, without
creating an ELF file anywhere or touching the filesystem.

It only works on glibc; it will explode in your face if you try to run it
with e.g. musl. Additionally, your glibc binaries must *not* be stripped of
their symbol tables!

## Usage

If you ever use this in production, you, together with everyone else using it,
will die.

Other than that, here's an example:

```c
// create a library
struct dynso_lib* l;
dynso_create(&l, 0, /* base address of the library - you can keep this at 0 */
    (char*)"this is just a display name", "libtest", /* latter is the soname */
    NULL, LM_ID_BASE /* from dlfcn.h, you need to define _GNU_SOURCE first! */);

// define some symbols...
dynso_add_sym(l, "testsym", (void*)0x694201337);
dynso_add_sym_ex(l, "testfunction", a_function,
    STT_FUNC /* from elf.h */, 32 /* symbol size */);

// this loads all symbols into the global context, which means they can now
// be looked up by dlsym(), and be resolved by other dynamic libraries that
// depend on it. adding more symbols won't be possible anymore, though.
dynso_bind(l);

void* x = dlsym(RTLD_DEFAULT, "testsym");
printf(" dlsym(\"testsym\") = %p\n", x);
x = dlsym(RTLD_DEFAULT, "testfunction");
printf(" dlsym(\"testfunction\") = %p\n", x);

void (*somefunc)(void) = x;
printf("calling the resolved function:\n");
somefunc();

// free the used memory
dynso_remove(l);
```

Example output:

```
dlsym("testsym") = 0x694201337
dlsym("testfunction") = 0x5589a1437d75
calling the resolved function:
hello world!
```

## Dependencies

glibc, seems to work with 2.30.

## Compilation

```sh
make
```

## Installation

Don't.

## How it works

Basically, it works by manipulating `ld.so`'s internal data structures. (That
is, if it works at all.)

glibc's `ld.so` internally keeps track of all DSOs using a thing
called a `link_map`, which is essentially a linked list of loaded DSOs. It is
documented in your system's `link.h` header file, and can be accessed from
`_r_debug.r_map`. (`dlopen` also returns `link_map`s, cast to a void pointer.)

However, that header file is lying to you. Internally, glibc adds *lots* of
stuff to this struct, as you can witness in `include/link.h` in the glibc
source code repository. With this knowledge, we can readily manipulate a lot
of things in order to have it do what we want.

Alright, that's great, but how do you create a new DSO without using
`dlopen`? For that, you can look at how `ld.so` adds the
[vDSO](https://lwn.net/Articles/446528/) to the `link_map` chain: it calls
`_dl_new_object`, an internal function. This one returns a new `link_map`
object, which is then initialized, and then added to the global `link_map` chain
by calling `_dl_add_to_namespace_list`. Additionally, a call to
`_dl_setup_hash` seems to be needed to keep symbol resolution code happy.

So, how do you get to those functions? They aren't exported: you won't be
finding them in `ld.so`'s `.dynsym` section, which is the section containing
all exported symbols. However, when unstripped, `ld.so` also has a *second*
symbol table, `.symtab`. This one does contain a number of internal symbols in
its list, including the ones we need!

Now that we can instantiate new `link_map` objects, how do we add symbols to
them to make them resolvable? This is where the 'hidden' part of a `link_map`
comes into play: when glibc tries to resolve a symbol (`elf/dl-lookup.c`),
it will do a lookup in a hash table which maps symbol names of a single DSO to
their entries in the symbol table. Two lookup algorithms are used: one made by
the GNU people, and a legacy one invented for SysV. At first sight the former
looked more complicated to get to work correctly, so I opted for the latter.

The SysV algorithm calculates the symbol name's hash modulo some value (which
is taken from a field in the `link_map`), which it then uses to index a bucket
table (`l_buckets`), from where it reads an index into the actual symbol table.
From that point on, it follows the chain table (`l_chain`) from one symbol
index to the next until it finds a matching symbol.

If you've written some code that accomplishes the above, you'll notice that
lookups with `dlsym()` will still return nothing. `ld.so` does not just
check *all* DSOs for a given symbol, as this would not work with how symbols
are supposed to work in the ELF ABI: each DSO has a separate 'scope' of other
DSOs it can access symbols of, some symbols have certain visibility parameters
set (default, internal, hidden, and protected; the last one especially
requires this approach based on scopes), and there are other complicating
factors.

However, a DSO's scope is also accessible from this very same `link_map`
struct, so we can just inject ourselves whenever that's needed. After doing
that, `dlsym()` works!

## License

```
be gay, do crimes, death to america
```