From dc9fa36d69c56937fd1041715a3e3f9d96185cf5 Mon Sep 17 00:00:00 2001
From: sys64738
Date: Mon, 28 Dec 2020 00:37:38 +0100
Subject: [PATCH] wirteup ig

---
 README.md | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 69 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 1e5ad15..01ca570 100644
--- a/README.md
+++ b/README.md
@@ -3,9 +3,9 @@
 Define dynamic shared objects and resolvable symbols at runtime, without
 creating an ELF file anywhere or touching the filesystem.
 
-It works by manipulating `ld.so`'s internal data structures. (That is, if it
-works at all). It also only works on glibc, it will explode in your face if you
-try to run it with eg. musl.
+It also only works on glibc, it will explode in your face if you try to run it
+with eg. musl. Additionally, your glibc binaries must *not* be stripped of
+their symbol tables!
 
 ## Usage
 
@@ -17,7 +17,7 @@ Other than that, here's an example:
 ```c
 // create a library
 struct dynso_lib* l;
-dynso_create(&l, 0,
+dynso_create(&l, 0, /* base address of the library - you can keep this at 0 */
     (char*)"this is just a display name", "libtest", /* latter is the soname */
     NULL, LM_ID_BASE /* from dlfcn.h, you need to define _GNU_SOURCE first! */);
 
@@ -44,6 +44,15 @@ somefunc();
 dynso_remove(l);
 ```
 
+Example output:
+
+```
+ dlsym("testsym") = 0x694201337
+ dlsym("testfunction") = 0x5589a1437d75
+calling the resolved function:
+hello world!
+```
+
 ## Dependencies
 
 glibc, seems to work with 2.30.
@@ -56,7 +65,62 @@ make
 
 ## Installation
 
-no
+Don't.
+
+## How it works
+
+Basically, it works by manipulating `ld.so`'s internal data structures. (That
+is, if it works at all).
+
+glibc's `ld.so` internally keeps track of all DSOs using a thing
+called a `link_map`, which is essentially a linked list of loaded DSOs. It is
+documented in your system's `link.h` header file, and can be accessed from
+`_r_debug.r_map`. (`dlopen` also returns `link_map`s, cast to a void pointer.)
+
+However, that header file is lying to you. Internally, glibc adds *lots* of
+stuff to this struct, as you can witness in `include/link.h` in the glibc
+source code repository. With this knowledge, we can readily manipulate a lot
+of things in order to have it do what we want.
+
+Alright, that's great, but how do you create a new DSO without using
+`dlopen`? For that, you can look at how `ld.so` adds the
+[vDSO](https://lwn.net/Articles/446528/) to the `link_map` chain: it calls
+`_dl_new_object`, an internal function. This one returns a new `link_map`
+object, which is then initialized, and then added to the global `link_map` chain
+by calling `_dl_add_to_namespace_list`. Additionally, a call to
+`_dl_setup_hash` seems to be needed to keep symbol resolution code happy.
+
+So, how do you get to those functions? They aren't exported: you won't be
+finding them in `ld.so`'s `.dynsym` section, which is the section containing
+all exported symbols. However, when unstripped, `ld.so` also has a *second*
+symbol table, `.symtab`. This one does contain a number of internal symbols in
+its list, including the ones we need!
+
+Now that we can instantiate new `link_map` objects, how do we add symbols to
+them to make them resolvable? This is where the 'hidden' part of a `link_map`
+comes into play: when glibc tries to resolve a symbol (`elf/dl-lookup.c`),
+it will do a lookup in a hash table which maps symbol names of a single DSO to
+their entries in the symbol table. Two lookup algorithms are used: one made by
+the GNU people, and a legacy one invented for SysV. At first sight the former
+looked more complicated to get to work correctly, so I opted for the latter.
+
+The SysV algorithm calculates the symbol name's hash modulo some value (which
+is taken from a field in the `link_map`), which it then uses to index a bucket
+table (`l_buckets`), from where it reads an index into the actual symbol table.
+From that point on, it follows a second table (`l_chain`) from one symbol index
+to the next until it finds a matching symbol.
+
+If you've written some code that accomplishes the above, you'll notice that
+lookups with `dlsym()` will still return nothing. `ld.so` does not simply check
+*all* DSOs for a given symbol, as that would not fit how symbols are supposed
+to work in the ELF ABI: each DSO has a separate 'scope' of other DSOs it can
+access symbols of, some symbols have certain visibility settings (default,
+internal, hidden, and protected; the last one in particular requires this
+scope-based approach), and there are other complicating factors.
+
+However, a DSO's scope is also accessible from this very same `link_map`
+struct, so we can just inject ourselves whenever that's needed. After doing
+that, `dlsym()` works!
 
 ## License