Generate a DSO at runtime and load symbols from it using dlsym(), without creating an actual ELF or touching the filesystem
Triss dc9fa36d69 wirteup ig 2020-12-28 00:37:38 +01:00

dynso

Define dynamic shared objects and resolvable symbols at runtime, without creating an ELF file anywhere or touching the filesystem.

It only works on glibc; it will explode in your face if you try to run it with e.g. musl. Additionally, your glibc binaries must not be stripped of their symbol tables!

Usage

If you ever use this in production, you, together with everyone else using it, will die.

Other than that, here's an example:

// create a library
struct dynso_lib* l;
dynso_create(&l, 0, /* base address of the library - you can keep this at 0 */
	(char*)"this is just a display name", "libtest", /* latter is the soname */
	NULL, LM_ID_BASE /* from dlfcn.h, you need to define _GNU_SOURCE first! */);

// define some symbols...
dynso_add_sym(l, "testsym", (void*)0x694201337);
dynso_add_sym_ex(l, "testfunction", a_function,
	STT_FUNC /* from elf.h */, 32 /* symbol size */);

// this loads all symbols into the global context, which means they can now
// be looked up by dlsym(), and be resolved by other dynamic libraries that
// depend on it. adding more symbols won't be possible anymore, though.
dynso_bind(l);

void* x = dlsym(RTLD_DEFAULT, "testsym");
printf("  dlsym(\"testsym\") = %p\n", x);
x = dlsym(RTLD_DEFAULT, "testfunction");
printf("  dlsym(\"testfunction\") = %p\n", x);

void (*somefunc)(void) = x;
printf("calling the resolved function:\n");
somefunc();

// free the used memory
dynso_remove(l);

Example output:

  dlsym("testsym") = 0x694201337
  dlsym("testfunction") = 0x5589a1437d75
calling the resolved function:
hello world!

Dependencies

glibc, seems to work with 2.30.

Compilation

make

Installation

Don't.

How it works

Basically, it works by manipulating ld.so's internal data structures. (That is, if it works at all.)

glibc's ld.so internally keeps track of all DSOs using a thing called a link_map, which is essentially a linked list of loaded DSOs. It is documented in your system's link.h header file, and can be accessed from _r_debug.r_map. (dlopen also returns link_maps, cast to a void pointer.)
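To make the link_map chain a bit more tangible, here is a minimal sketch that walks it and prints every loaded DSO. It only uses the public part of the struct from link.h. One assumption: rather than naming _r_debug directly, it reaches the same r_debug structure through the DT_DEBUG entry of the program's own dynamic section, which sidesteps copy-relocation surprises. The helper names are made up for this example, and it assumes a dynamically linked executable.

```c
#define _GNU_SOURCE
#include <link.h>    /* struct link_map, struct r_debug, ElfW(), _DYNAMIC */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: locate ld.so's r_debug through the DT_DEBUG entry
 * of our own dynamic section. Returns NULL if there is no such entry. */
struct r_debug *find_r_debug(void)
{
    for (ElfW(Dyn) *d = _DYNAMIC; d->d_tag != DT_NULL; d++)
        if (d->d_tag == DT_DEBUG)
            return (struct r_debug *)(uintptr_t)d->d_un.d_ptr;
    return NULL;
}

/* Hypothetical helper: walk the chain starting at r_map and print each
 * entry's load address and name. Returns the number of entries seen. */
int list_link_maps(FILE *out)
{
    struct r_debug *dbg = find_r_debug();
    int n = 0;

    if (dbg == NULL)
        return 0;
    for (struct link_map *m = dbg->r_map; m != NULL; m = m->l_next, n++) {
        /* The main program (and usually the vDSO) has an empty name. */
        const char *name = (m->l_name && m->l_name[0]) ? m->l_name : "(no name)";
        fprintf(out, "%#lx  %s\n", (unsigned long)m->l_addr, name);
    }
    return n;
}
```

Calling list_link_maps(stdout) from main() should list at least the main program, libc and ld.so itself.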

However, that header file is lying to you. Internally, glibc adds lots of stuff to this struct, as you can witness in include/link.h in the glibc source code repository. With this knowledge, we can readily manipulate a lot of things in order to have it do what we want.

Alright, that's great, but how do you create a new DSO without using dlopen? For that, you can look at how ld.so adds the vDSO to the link_map chain: it calls _dl_new_object, an internal function. This one returns a new link_map object, which is then initialized, and then added to the global link_map chain by calling _dl_add_to_namespace_list. Additionally, a call to _dl_setup_hash seems to be needed to keep the symbol resolution code happy.
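The call sequence can be sketched roughly like this. Big caveat: the three signatures below are assumptions, transcribed from memory of glibc 2.30's internal headers; they are not a stable API and may differ on your glibc. The struct and make_fake_dso() are hypothetical names for this example, not dynso's actual code, and the position of the _dl_setup_hash call relative to the other two is a guess.

```c
#define _GNU_SOURCE
#include <dlfcn.h>   /* Lmid_t, LM_ID_BASE */
#include <link.h>    /* struct link_map (public prefix only) */
#include <stddef.h>

/* ASSUMED signatures of glibc-internal functions (glibc 2.30 era). */
typedef struct link_map *(*dl_new_object_fn)(char *realname, const char *libname,
                                             int type, struct link_map *loader,
                                             int mode, Lmid_t nsid);
typedef void (*dl_setup_hash_fn)(struct link_map *map);
typedef void (*dl_add_to_namespace_list_fn)(struct link_map *map, Lmid_t nsid);

/* Hypothetical container for the resolved internal function pointers. */
struct ldso_internals {
    dl_new_object_fn new_object;
    dl_setup_hash_fn setup_hash;
    dl_add_to_namespace_list_fn add_to_namespace_list;
};

/* Hypothetical helper: create a link_map and hook it into the chain.
 * Returns NULL if any of the internal functions is unresolved. */
struct link_map *make_fake_dso(const struct ldso_internals *in, char *realname,
                               const char *soname, int type, Lmid_t nsid)
{
    if (in == NULL || in->new_object == NULL || in->setup_hash == NULL ||
        in->add_to_namespace_list == NULL)
        return NULL;

    struct link_map *map = in->new_object(realname, soname, type, NULL, 0, nsid);
    if (map == NULL)
        return NULL;

    /* ... fill in the internal (non-public) fields of the link_map here ... */

    in->setup_hash(map);                 /* set up the symbol hash lookup */
    in->add_to_namespace_list(map, nsid);/* make it part of the global chain */
    return map;
}
```

Keeping the pointers in a struct means the code refuses to do anything when symbol resolution failed, instead of jumping to address zero inside the dynamic linker.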

So, how do you get to those functions? They aren't exported: you won't be finding them in ld.so's .dynsym section, which is the section containing all exported symbols. However, when unstripped, ld.so also has a second symbol table, .symtab. This one does contain a number of internal symbols in its list, including the ones we need!
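As a rough illustration of the .symtab trick, the sketch below maps an ELF file from disk and searches its SHT_SYMTAB section for a name. The function names are invented for this example, it assumes the file has the same ELF class as the running process, and it trusts the file's headers without much bounds checking.

```c
#define _GNU_SOURCE
#include <elf.h>
#include <fcntl.h>
#include <link.h>      /* ElfW() */
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Search a mapped ELF image for `name` in its .symtab (SHT_SYMTAB).
 * Returns the symbol's st_value, or 0 if there is no match or no .symtab. */
static uintptr_t symtab_lookup_image(const uint8_t *img, size_t len,
                                     const char *name)
{
    const ElfW(Ehdr) *eh = (const ElfW(Ehdr) *)img;

    if (len < sizeof *eh || memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
        return 0;

    const ElfW(Shdr) *sh = (const ElfW(Shdr) *)(img + eh->e_shoff);
    for (size_t i = 0; i < eh->e_shnum; i++) {
        if (sh[i].sh_type != SHT_SYMTAB)
            continue;
        /* sh_link of a symbol table names its string table section. */
        const ElfW(Sym) *syms = (const ElfW(Sym) *)(img + sh[i].sh_offset);
        const char *strs = (const char *)(img + sh[sh[i].sh_link].sh_offset);
        size_t nsyms = sh[i].sh_size / sizeof(ElfW(Sym));

        for (size_t j = 0; j < nsyms; j++)
            if (strcmp(strs + syms[j].st_name, name) == 0)
                return (uintptr_t)syms[j].st_value;
    }
    return 0;
}

/* Map `path` read-only and look up `name` in it. Returns 0 on any failure. */
uintptr_t symtab_lookup_file(const char *path, const char *name)
{
    uintptr_t result = 0;
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return 0;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        void *img = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (img != MAP_FAILED) {
            result = symtab_lookup_image(img, (size_t)st.st_size, name);
            munmap(img, (size_t)st.st_size);
        }
    }
    close(fd);
    return result;
}
```

For a position-independent object like ld.so, the value found this way is an offset; the runtime address would be that offset plus the l_addr of ld.so's own link_map entry. A result of 0 for _dl_new_object is a good hint that your ld.so is stripped.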

Now that we can instantiate new link_map objects, how do we add symbols to them to make them resolvable? This is where the 'hidden' part of a link_map comes into play: when glibc tries to resolve a symbol (elf/dl-lookup.c), it will do a lookup in a hash table which maps symbol names of a single DSO to their entries in the symbol table. Two lookup algorithms are used: one made by the GNU people, and a legacy one invented for SysV. At first sight the former looked more complicated to get to work correctly, so I opted for the latter.

The SysV algorithm calculates the symbol name's hash modulo some value (the bucket count, which is taken from a field in the link_map), which it then uses to index a bucket table (l_buckets), from where it reads an index into the actual symbol table. From that point on, it follows a second table (l_chain) from one symbol index to the next until it finds a matching symbol or reaches the end of the chain.
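Here is a small, self-contained sketch of a SysV-style lookup, using the hash function from the System V gABI and the usual bucket/chain layout. The sysv_table struct and demo_lookup() are made up for this example (glibc keeps the equivalent data in link_map fields), and the demo table is built by hand rather than read from a real DSO.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The SysV ELF hash function, as specified in the System V gABI. */
uint32_t sysv_hash(const char *name)
{
    uint32_t h = 0, g;

    while (*name) {
        h = (h << 4) + (unsigned char)*name++;
        g = h & 0xf0000000u;
        if (g)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Hypothetical stand-in for the relevant link_map fields. */
struct sysv_table {
    uint32_t nbuckets;
    const uint32_t *buckets;   /* hash % nbuckets -> first symbol index */
    const uint32_t *chain;     /* symbol index -> next symbol index */
    const Elf64_Sym *symtab;
    const char *strtab;
};

/* Pick a bucket, then follow the chain until the name matches or the
 * chain ends at STN_UNDEF (index 0). */
const Elf64_Sym *sysv_lookup(const struct sysv_table *t, const char *name)
{
    for (uint32_t i = t->buckets[sysv_hash(name) % t->nbuckets];
         i != STN_UNDEF; i = t->chain[i])
        if (strcmp(t->strtab + t->symtab[i].st_name, name) == 0)
            return &t->symtab[i];
    return NULL;
}

/* Demo: build a two-symbol table by hand and look `name` up in it.
 * Returns the symbol's value, or 0 if it is not defined. */
uint64_t demo_lookup(const char *name)
{
    /* String table offsets: 0 = "", 1 = "testsym", 9 = "testfunction". */
    static const char strtab[] = "\0testsym\0testfunction";
    Elf64_Sym symtab[3] = {0};          /* index 0 is the reserved null symbol */
    uint32_t buckets[2] = {STN_UNDEF, STN_UNDEF};
    uint32_t chain[3] = {STN_UNDEF, STN_UNDEF, STN_UNDEF};

    symtab[1].st_name = 1; symtab[1].st_value = 0x694201337;
    symtab[2].st_name = 9; symtab[2].st_value = 0x1000;

    for (uint32_t i = 1; i < 3; i++) {
        uint32_t b = sysv_hash(strtab + symtab[i].st_name) % 2;
        chain[i] = buckets[b];          /* push onto the front of the bucket */
        buckets[b] = i;
    }

    struct sysv_table t = {2, buckets, chain, symtab, strtab};
    const Elf64_Sym *s = sysv_lookup(&t, name);
    return s ? s->st_value : 0;
}
```

Since the chain array is indexed by symbol number, it needs exactly as many entries as the symbol table, and entry 0 doubles as the end-of-chain marker.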

If you've written some code that accomplishes the above, you'll notice that lookups with dlsym() will still return nothing. ld.so doesn't just check all DSOs for a given symbol, as that would not fit how symbols are supposed to work in the ELF ABI: each DSO has a separate 'scope' of other DSOs it can access symbols of, some symbols have certain visibility parameters set (default, internal, hidden, and protected; especially the last one requires this approach based on scopes), and there are other complicating factors.
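For reference, ELF stores a symbol's visibility in the low bits of st_other, separately from its binding (local, global, weak), which lives in st_info. A tiny sketch for decoding it with the macros from elf.h (the function name is made up for this example):

```c
#include <elf.h>

/* Map the visibility bits of a symbol's st_other field to a name. */
const char *visibility_name(unsigned char st_other)
{
    switch (ELF64_ST_VISIBILITY(st_other)) {
    case STV_DEFAULT:   return "default";
    case STV_INTERNAL:  return "internal";
    case STV_HIDDEN:    return "hidden";
    case STV_PROTECTED: return "protected";
    }
    return "unknown";
}
```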

However, a DSO's scope is also accessible from this very same link_map struct, so we can just inject ourselves whenever that's needed. After doing that, dlsym() works!

License

be gay, do crimes, death to america