## manipulating python RNG using the seed: `coinflip`
>Your fate lies in a coin flip. Will you take control of your destiny?
>
>Author: nneonneo
>
>`nc coinflip.ctf.maplebacon.org 1337`

Files:
[server.py](https://git.lain.faith/haskal/writeups/raw/branch/main/2023/maple/coinflip/server.py)

### summary
the server uses python's `Random` to generate coin flips as `1` or `0` bits: a `1` loses `$1` and a
`0` wins `$1`. user input directly seeds the RNG, and the seed can be constructed in a way that
forces almost all of the coin flips to come up `0`. using such a computed seed, we can trick the
server into giving us lots of money and win the game
### challenge source
we are given the python source code for the challenge. it's not very long, so reviewing it is
straightforward
#### the `Coin` class
```python
class Coin:
    def __init__(self, coin_id):
        self.random = Random(coin_id)
        self.flips_left = 0
        self.buffer = None

    def flip(self):
        if self.flips_left == 0:
            self.buffer = self.random.getrandbits(32)
            self.flips_left = 32
        res = self.buffer & 1
        self.buffer >>= 1
        self.flips_left -= 1
        return res
```
this is an implementation of random coin flips which uses python's builtin `Random` class with a
given seed `coin_id`. every time a coin flip is needed, it is taken as the next bit (lowest bit
first) of a 32-bit output from the `Random` instance, and when all 32 bits are used up a new 32-bit
value is requested from `Random`

this is an interesting implementation, because you might expect something more idiomatic like
`random.randint(0, 1)` or `random.choice([True, False])`
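
as a quick sanity check of the bit order, here's a small sketch (the seed `1234` is an arbitrary pick, and `Coin` is copied from the challenge source) that compares 64 flips against the bits of two `getrandbits(32)` outputs, lowest bit first

```python
from random import Random


class Coin:
    # copied from the challenge's server.py
    def __init__(self, coin_id):
        self.random = Random(coin_id)
        self.flips_left = 0
        self.buffer = None

    def flip(self):
        if self.flips_left == 0:
            self.buffer = self.random.getrandbits(32)
            self.flips_left = 32
        res = self.buffer & 1
        self.buffer >>= 1
        self.flips_left -= 1
        return res


seed = 1234  # arbitrary seed, only for illustration
coin = Coin(seed)
flips = [coin.flip() for _ in range(64)]

# rebuild the same bits straight from the RNG output words
rng = Random(seed)
expected = []
for _ in range(2):
    word = rng.getrandbits(32)
    expected += [(word >> bit) & 1 for bit in range(32)]

print(flips == expected)
```

so a 32-bit output word that is all zero translates directly into 32 winning flips in a row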
#### the "main function"
```python
if __name__ == "__main__":
    signal.alarm(60)
    print("Welcome to Maple Betting!")
    print("We'll be betting on the outcome of a fair coin flip.")
    print("You'll start with $1 - try to make lots of money and you'll get flags!")

    game_id = input("Which coin would you like to use? ")
    num_rounds = input("How many rounds do you want to go for? ")
    num_rounds = int(num_rounds)
    if num_rounds > 20_000_000:
        print("Can't play that long, I'm afraid.")
        exit(1)
```
there is a time limit of 60 seconds to interact with the server, and we are asked for a "game id"
and a number of rounds to play (at most 20 million). these are the only user inputs to the program
```python
    print("Alright, let's go!")
    coin = Coin(int(game_id, 0))
    money = 1
    for nr in range(num_rounds):
        money += [1, -1][coin.flip()]
        if money <= 0:
            print(f"Oops, you went broke at round {nr+1}!")
            exit(1)
```
the "game id" is parsed as an integer and used to construct `Coin` (where it becomes the seed for
the RNG). then, the server plays the rounds you requested. on each round, a `1` or `0` bit is
generated: if the result is `0` you win `$1`, and if the result is `1` you lose `$1`
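
one detail worth a quick look is the `int(game_id, 0)` call: base `0` means python picks the base from the literal's prefix, so the seed can be sent as decimal or as a `0x` hex literal. a tiny sketch (the example strings are made up for illustration)

```python
# base 0 lets python infer the base from the prefix of the literal
examples = ["255", "0xff", "0o377", "0b11111111"]
values = [int(text, 0) for text in examples]
print(values)

# arbitrarily large values are fine too, which matters for a multi-kilobit seed
big = int("0x" + "f" * 64, 0)
print(big.bit_length())
```

as far as i know, newer python versions (3.11 and later) refuse to parse decimal strings longer than 4300 digits by default, while hex strings are not limited, so hex is the safer encoding for a very large seed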
```python
    print(f"You finished with ${money} in the pot.")
    if money < 18_000:
        print("At least you didn't go broke!")
    elif money < 7_000_000:
        print(f"Pretty good!")
    else:
        print(f"What the hell?! You bankrupted the casino! Take your spoils: {FLAG}")
```
so in order to win the game and get the flag, we need to finish with at least `$7000000`
### naive approach
since we can control the seed, one approach is to try seed values until one wins the necessary
amount of money

something like this
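```python
from server import Coin

seed = 0
best_seed = 0
best_money = 0
while True:
    coin = Coin(seed)
    money = 1  # start with $1, like the server does
    for _ in range(20_000_000):
        money += [1, -1][coin.flip()]
        # remember the highest balance seen so far and which seed reached it
        if money > best_money:
            best_money = money
            best_seed = seed
            print(best_seed, best_money)
        if money <= 0:
            break
    seed += 1
```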
running this for a while might get you money up to like, `$15000` or so depending on how long you run
it, but ultimately it's nowhere near the needed `$7000000` and the improvements rapidly diminish with
further running time. clearly this approach is not going to work
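
a back-of-the-envelope estimate shows why. this is only a rough sketch: it assumes the flips behave like independent fair `+1`/`-1` steps and ignores the going-broke rule, in which case the balance after `n` rounds has a standard deviation of about `sqrt(n)`

```python
import math

rounds = 20_000_000   # maximum number of rounds the server allows
target = 7_000_000    # balance needed for the flag

# standard deviation of a sum of `rounds` independent +1/-1 steps
sigma = math.sqrt(rounds)

# how many standard deviations away from the mean the target is
z_score = target / sigma

print(f"sigma ~ {sigma:.0f}, target is ~ {z_score:.0f} sigma away")
```

a few sigma (balances in the low tens of thousands) is about as far as brute force can realistically reach, while the target sits more than a thousand sigma out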
### background: python RNG (Mersenne twister)
to find out how we can manipulate the python RNG, let's take a closer look at the implementation

the RNG in python is a [Mersenne Twister](https://en.wikipedia.org/wiki/Mersenne_Twister). a quick
summary of how the Mersenne twister works:

- a state array `MT` of `n` 32-bit integers is initialized using a given seed value (`n` is 624 here)
- the index `index` is set to `n`
- to generate a value, pick the next `MT[index]` and increment `index`. additionally the output
  value is "tempered" with some bitwise operations
- if `index >= n` when a value is requested, first run the `twist` operation and reset `index` to
  `0` (notice `twist` is also run the first time a random value is generated)
- `twist` scrambles `MT` using some more bitwise operations

we can find the full implementation of the Mersenne twister in CPython in
[Modules/_randommodule.c](https://github.com/python/cpython/blob/main/Modules/_randommodule.c)

here's the implementation of generating a 32 bit number. if you do some tracing through calls in
CPython, you can find that this is the underlying C function that gets called by the
`getrandbits(32)` call of `Coin.flip`. i've added comments to explain the main parts of the function
```c
static uint32_t
genrand_uint32(RandomObject *self)
{
    uint32_t y;
    static const uint32_t mag01[2] = {0x0U, MATRIX_A};
    /* mag01[x] = x * MATRIX_A  for x=0,1 */
    uint32_t *mt;

    mt = self->state;
    // XXX: the contents of this block are the "twist" operation
    if (self->index >= N) { /* generate N words at one time */
        int kk;

        for (kk=0;kk<N-M;kk++) {
            y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
            mt[kk] = mt[kk+M] ^ (y >> 1) ^ mag01[y & 0x1U];
        }
        for (;kk<N-1;kk++) {
            y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
            mt[kk] = mt[kk+(M-N)] ^ (y >> 1) ^ mag01[y & 0x1U];
        }
        y = (mt[N-1]&UPPER_MASK)|(mt[0]&LOWER_MASK);
        mt[N-1] = mt[M-1] ^ (y >> 1) ^ mag01[y & 0x1U];

        self->index = 0;
    }

    // XXX: the following extracts the MT array value and applies the "temper" operation
    y = mt[self->index++];
    y ^= (y >> 11);
    y ^= (y << 7) & 0x9d2c5680U;
    y ^= (y << 15) & 0xefc60000U;
    y ^= (y >> 18);
    return y;
}
```
to recap, in order to win the coinflip game, we need a lot of coin flips to be a `0` bit, which
means we need the output values to consist mostly of `0` bits

looking at the "temper" operation above, we can see that if the value `y` is `0x00000000` before
tempering, it will still be `0x00000000` after tempering, since every step only XORs `y` with
shifted and masked copies of itself. this means that we want the state array elements themselves to
be `0x00000000`

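
to make the tempering step concrete, here is a small python port of those four lines (the `temper` helper name is mine, and the seed `2023` is arbitrary), checked against the state that CPython exposes through `getstate()`

```python
from random import Random


def temper(y):
    # python port of the tempering steps at the end of genrand_uint32
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y & 0xFFFFFFFF


rng = Random(2023)            # arbitrary seed, just for the comparison
first = rng.getrandbits(32)   # the first call runs the twist, then reads mt[0]
state = rng.getstate()[1]     # 624 state words followed by the current index
second = rng.getrandbits(32)  # reads mt[1] from the same twisted state

print(temper(0))
print(temper(state[0]) == first, temper(state[1]) == second)
```

a zero state word gives a zero output word, which is 32 winning flips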
additionally, during the "twist" operation we also find that having `MT` values of mostly
`0x00000000` results in "twist" producing an output `MT` which still contains mostly `0x00000000`,
since each new word is only built by XORing together bits of a few existing words

it's starting to look pretty convenient that `0` is the bit needed to win money!
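
the claim about "twist" can be checked with a rough python port of the twist block. the `twist` helper is my own naming, and the constants `M = 397`, `MATRIX_A = 0x9908b0df` and the two masks are the standard MT19937 values, which the excerpt above doesn't show. the starting state is the one we are aiming for: `0x80000000` followed by zeros

```python
from random import Random

N, M = 624, 397
MATRIX_A = 0x9908B0DF
UPPER_MASK, LOWER_MASK = 0x80000000, 0x7FFFFFFF


def twist(mt):
    # in-place update over a copy, same order of operations as the C loops
    mt = list(mt)
    for kk in range(N):
        y = (mt[kk] & UPPER_MASK) | (mt[(kk + 1) % N] & LOWER_MASK)
        mt[kk] = mt[(kk + M) % N] ^ (y >> 1) ^ (MATRIX_A if y & 1 else 0)
    return mt


start = [0x80000000] + [0] * (N - 1)
after = twist(start)
nonzero = sum(1 for word in after if word != 0)

# cross-check against CPython: load the same state, force one twist, read it back
rng = Random()
rng.setstate((3, tuple(start) + (N,), None))
rng.getrandbits(32)
cpython_after = list(rng.getstate()[1][:N])

print(nonzero, after == cpython_after)
```

only a handful of words become non-zero per twist, so the state (and with it the output) stays mostly zero for a long time before it slowly fills up with `1` bits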
### `init_by_array`: how to get known values into `MT`
the last step of background information is to understand how python generates the `MT` array using a
given seed

this is the C function that seeds the RNG using a given Python argument (abridged)
```c
static int
random_seed(RandomObject *self, PyObject *arg)
{
    int result = -1;  /* guilty until proved innocent */
    PyObject *n = NULL;
    uint32_t *key = NULL;
    size_t bits, keyused;
    int res;

    if (arg == NULL || arg == Py_None) {
        // ... seed from /dev/urandom ...
        return 0;
    }
    // ...
    n = /* coerce argument into PyLong */;
    // ...

    /* Now split n into 32-bit chunks, from the right. */
    bits = _PyLong_NumBits(n);
    if (bits == (size_t)-1 && PyErr_Occurred())
        goto Done;

    /* Figure out how many 32-bit chunks this gives us. */
    keyused = bits == 0 ? 1 : (bits - 1) / 32 + 1;

    /* Convert seed to byte sequence. */
    key = (uint32_t *)PyMem_Malloc((size_t)4 * keyused);
    if (key == NULL) {
        PyErr_NoMemory();
        goto Done;
    }
    res = _PyLong_AsByteArray((PyLongObject *)n,
                              (unsigned char *)key, keyused * 4,
                              PY_LITTLE_ENDIAN,
                              0); /* unsigned */
    if (res == -1) {
        goto Done;
    }
    // ...
    /* XXX: call init_by_array using uint32_t[] from the PyLong value */
    init_by_array(self, key, keyused);
    // ...
}
```
to summarize, the way seeding works is:

- given a `PyLong` value (ie, any integer in python), an array of `uint32_t` is produced consisting
  of 32-bit chunks of the given value, least significant chunk first
- `init_by_array` is called with that array

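
the chunking step can be sketched in python like this. `seed_to_key` is a hypothetical helper name, and the sketch only handles non-negative integers

```python
def seed_to_key(seed):
    """Split a non-negative int into 32-bit words, least significant word first."""
    bits = seed.bit_length()
    # same chunk count as the C code: at least one word, otherwise ceil(bits / 32)
    keyused = 1 if bits == 0 else (bits - 1) // 32 + 1
    raw = seed.to_bytes(4 * keyused, "little")
    return [int.from_bytes(raw[4 * i:4 * i + 4], "little") for i in range(keyused)]


print([hex(word) for word in seed_to_key(0x1122334455667788)])
```

note that the number of words depends on the bit length of the seed, so a seed whose top words are zero produces a shorter key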
next, let's look at `init_by_array` and its helper function `init_genrand`
```c
/* initializes mt[N] with a seed */
static void
init_genrand(RandomObject *self, uint32_t s)
{
    int mti;
    uint32_t *mt;

    mt = self->state;
    mt[0]= s;
    for (mti=1; mti<N; mti++) {
        mt[mti] =
        (1812433253U * (mt[mti-1] ^ (mt[mti-1] >> 30)) + mti);
        /* See Knuth TAOCP Vol2. 3rd Ed. P.106 for multiplier. */
        /* In the previous versions, MSBs of the seed affect   */
        /* only MSBs of the array mt[].                        */
        /* 2002/01/09 modified by Makoto Matsumoto             */
    }
    self->index = mti;
    return;
}

/* initialize by an array with array-length */
/* init_key is the array for initializing keys */
/* key_length is its length */
static void
init_by_array(RandomObject *self, uint32_t init_key[], size_t key_length)
{
    size_t i, j, k;       /* was signed in the original code. RDH 12/16/2002 */
    uint32_t *mt;

    mt = self->state;
    init_genrand(self, 19650218U);
    i=1; j=0;
    k = (N>key_length ? N : key_length);
    for (; k; k--) {
        mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >> 30)) * 1664525U))
                 + init_key[j] + (uint32_t)j; /* non linear */
        i++; j++;
        if (i>=N) { mt[0] = mt[N-1]; i=1; }
        if (j>=key_length) j=0;
    }
    for (k=N-1; k; k--) {
        mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >> 30)) * 1566083941U))
                 - (uint32_t)i; /* non linear */
        i++;
        if (i>=N) { mt[0] = mt[N-1]; i=1; }
    }

    mt[0] = 0x80000000U; /* MSB is 1; assuring non-zero initial array */
}
```
this differs from the simplest method of seeding a Mersenne twister, which is `init_genrand` with a
single 32-bit seed. the procedure here is:

- call `init_genrand` with a *static seed*
- combine the resulting `MT` array with the `init_key` (recall this is 32-bit chunks of the seed
  value from python) using some bitwise operations and arithmetic

this is useful for us, because the seed can be as long as we like: with one 32-bit key word per
state word, it is actually possible to control almost all values in `MT` (except for the first one,
which is hardcoded to `0x80000000`). this would not be possible with just `init_genrand`, which only
takes a single 32-bit seed
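
to double check the reading of the C code, here is a python port of both functions, compared against what CPython actually stores after seeding. the function names mirror the C ones, `seed_to_key` is a hypothetical helper for the chunking step, and the test seeds are arbitrary

```python
from random import Random

N = 624
MASK = 0xFFFFFFFF


def init_genrand(s):
    mt = [s & MASK]
    for i in range(1, N):
        prev = mt[-1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + i) & MASK)
    return mt


def init_by_array(key):
    mt = init_genrand(19650218)
    i, j = 1, 0
    for _ in range(max(N, len(key))):
        prev = mt[i - 1]
        mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1664525)) + key[j] + j) & MASK
        i += 1
        j += 1
        if i >= N:
            mt[0] = mt[N - 1]
            i = 1
        if j >= len(key):
            j = 0
    for _ in range(N - 1):
        prev = mt[i - 1]
        mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1566083941)) - i) & MASK
        i += 1
        if i >= N:
            mt[0] = mt[N - 1]
            i = 1
    mt[0] = 0x80000000
    return mt


def seed_to_key(seed):
    # non-negative int -> list of 32-bit words, least significant first
    words = [seed & MASK]
    seed >>= 32
    while seed:
        words.append(seed & MASK)
        seed >>= 32
    return words


def matches_cpython(seed):
    return tuple(init_by_array(seed_to_key(seed))) == Random(seed).getstate()[1][:N]


print(all(matches_cpython(s) for s in (0, 1, 1337, 2**64 + 5, 3**500)))
```

the python integers here are unbounded, so every assignment is masked to 32 bits to imitate `uint32_t` wraparound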
### cracking the Mersenne twister (actually)
[z3](https://github.com/Z3Prover/z3) can be used to reverse the `init_by_array` function and produce
an input that causes `MT` to contain the desired values. this is made possible by the operations all
being deterministic arithmetic on 32-bit integers, and all loops running for a fixed number of
iterations

first, we set up `init_state.txt`, which contains the output of `init_genrand(19650218)` (one value
per line), since that is static

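
one possible way to produce that file is sketched below. this is not necessarily how the original file was made: `write_init_state` is a hypothetical helper, and the one-decimal-integer-per-line format is inferred from how the solver script reads the file

```python
N = 624
MASK = 0xFFFFFFFF


def init_genrand(s):
    # python port of CPython's init_genrand
    mt = [s & MASK]
    for i in range(1, N):
        prev = mt[-1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + i) & MASK)
    return mt


def write_init_state(path):
    # one decimal integer per line, so int(line.strip()) can read it back
    with open(path, "w") as f:
        for word in init_genrand(19650218):
            f.write(f"{word}\n")
```

calling `write_init_state("init_state.txt")` once is enough, since the static seed never changes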
then, the following script implements the reversing
```python
import struct

import z3

N = 624  # number of 32-bit words in the MT state
L = N    # number of 32-bit words in the seed we solve for

with open("init_state.txt", "r") as f:
    init_mt = [int(x.strip()) for x in f]


def crack():
    s = z3.Solver()
    init_key = [z3.BitVec(f"init_key_{i}", 32) for i in range(L)]
    mt = [z3.BitVecVal(x, 32) for x in init_mt]
```
we initialize an `init_key` of all symbolic values, and an `MT` (which in the real code would have
just been generated by `init_genrand`) to the required concrete values

next, we simulate the same operations as `init_by_array`, implemented in python
```python
    i = 1
    j = 0
    k = N
    while k > 0:
        # z3.LShR is a logical shift like C's unsigned >>; z3's own >> is an arithmetic shift
        prev = mt[i-1] ^ z3.LShR(mt[i-1], 30)
        mt[i] = (mt[i] ^ (prev * z3.BitVecVal(1664525, 32))) + init_key[j] + z3.BitVecVal(j, 32)
        i += 1
        j += 1
        if i >= N:
            mt[0] = mt[N-1]
            i = 1
        if j >= L:
            j = 0
        k -= 1
    k = N - 1
    while k > 0:
        prev = mt[i-1] ^ z3.LShR(mt[i-1], 30)
        mt[i] = (mt[i] ^ (prev * z3.BitVecVal(1566083941, 32))) - z3.BitVecVal(i, 32)
        i += 1
        if i >= N:
            mt[0] = mt[N-1]
            i = 1
        k -= 1
```
finally, we add constraints on the output values `mt[1]` through `mt[N-1]` so that they are all zero
(`mt[0]` is overwritten with `0x80000000` anyway)
```python
    for idx in range(1, N):
        s.add(mt[idx] == 0)
```
now, we are ready to solve with z3 and output the result as an integer literal which would be passed
to the `Random` constructor, which we concatenate from the 32-bit chunks
```python
    # make sure the solver actually found a solution before reading the model
    assert s.check() == z3.sat
    m = s.model()
    x = bytearray()
    for ik in init_key:
        ikv = m[ik].as_long()
        x += struct.pack("<I", ikv)
    print(hex(int.from_bytes(x, byteorder='little')))


if __name__ == "__main__":
    crack()
```
running this produces a solution value fairly quickly. the full script can be found at
[coinflip_crack.py](https://git.lain.faith/haskal/writeups/raw/branch/main/2023/maple/coinflip_crack.py)
and
[init_state.txt](https://git.lain.faith/haskal/writeups/raw/branch/main/2023/maple/init_state.txt)
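
as a cross-check that doesn't need z3 (so this is *not* the method used above, just an alternative sketch), the key can also be computed word by word, because each loop step of `init_by_array` only adds one fresh key word to one state word. the helper names and the choice of `0` for the first key word are mine

```python
from random import Random

N = 624
MASK = 0xFFFFFFFF


def init_genrand(s):
    mt = [s & MASK]
    for i in range(1, N):
        prev = mt[-1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + i) & MASK)
    return mt


def mix(prev, mult):
    # the ((mt[i-1] ^ (mt[i-1] >> 30)) * constant) term of init_by_array
    return ((prev ^ (prev >> 30)) * mult) & MASK


def zero_state_key(free_word=0):
    init = init_genrand(19650218)
    # state needed after the first loop so the second loop ends with mt[1..N-1] == 0:
    # mt[i] = i for i >= 3, mt[1] = 1, and mt[2] cancels the mix of mt[1]
    after1 = list(range(N))
    after1[2] = 2 ^ mix(1, 1566083941)
    key = [0] * N
    key[0] = free_word & MASK  # the first-pass value of mt[1] is a free choice
    first_mt1 = ((init[1] ^ mix(init[0], 1664525)) + key[0]) & MASK
    prev = first_mt1
    for i in range(2, N):
        # solve mt[i] = (init[i] ^ mix(mt[i-1])) + key[i-1] + (i-1) for key[i-1]
        key[i - 1] = (after1[i] - (init[i] ^ mix(prev, 1664525)) - (i - 1)) & MASK
        prev = after1[i]
    # the last iteration of the first loop rewrites mt[1], with mt[0] = mt[N-1]
    key[N - 1] = (after1[1] - (first_mt1 ^ mix(after1[N - 1], 1664525)) - (N - 1)) & MASK
    return key


key = zero_state_key()
# the top word must be non-zero, otherwise CPython would see a shorter key
assert key[-1] != 0
seed = sum(word << (32 * idx) for idx, word in enumerate(key))

state = Random(seed).getstate()[1]
print(hex(state[0]), all(word == 0 for word in state[1:N]))
```

any seed that reaches this state produces the same flip sequence, since the output only depends on the state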
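
next, we need to find out how many rounds to run with this seed, since the state slowly fills up
with `1` bits and playing for too long would start losing money again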
```python
from server import Coin

with open("computed_seed.txt", "r") as f:
    seed = int(f.read().strip(), 0)

coin = Coin(seed)
money = 0
for i in range(20_000_000):
    money += [1, -1][coin.flip()]
    # money starts at 0 here but at $1 on the server, so playing i rounds
    # there finishes with exactly $7000001
    if money > 7_000_000:
        print(i)
        break
```
with this we get `14819684`
### jackpot!!!
running the server with the computed seed and number of rounds results in the flag! (reproduced here
only locally, i don't remember what the server flag was)
```text
$ (sleep 2; cat seed.txt; sleep 2; echo 14819684) | python3 server.py
Welcome to Maple Betting!
We'll be betting on the outcome of a fair coin flip.
You'll start with $1 - try to make lots of money and you'll get flags!
Which coin would you like to use? How many rounds do you want to go for? Alright, let's go!
You finished with $7000001 in the pot.
What the hell?! You bankrupted the casino! Take your spoils: ctf{test_flag_test_flag}
```