manipulating python RNG using the seed: coinflip

Your fate lies in a coin flip. Will you take control of your destiny?

Author: nneonneo

nc coinflip.ctf.maplebacon.org 1337

Files: server.py

summary

python Random is used to generate random coin flips as 1 or 0 bits, where if you get 1 then you lose money, and if you get 0 you win money. user input directly seeds the RNG, which can be constructed in a way that forces most of the coin flips to go in the 0 direction. using this computed seed, we can trick the server into giving us lots of money and winning the game

challenge source

we are given the python source code for the challenge. it's not very long, so reviewing it is straightforward

the Coin class

class Coin:
    def __init__(self, coin_id):
        self.random = Random(coin_id)
        self.flips_left = 0
        self.buffer = None

    def flip(self):
        if self.flips_left == 0:
            self.buffer = self.random.getrandbits(32)
            self.flips_left = 32
        res = self.buffer & 1
        self.buffer >>= 1
        self.flips_left -= 1
        return res

this is an implementation of random coin flips, which uses python's builtin Random class with a given seed coin_id. every time a coin flip is needed, it is taken as the next bit of 32-bit output from the Random instance, and when all bits are used a new 32-bit value is requested from Random

this is an interesting implementation because you might expect something more typical like random.randint(0, 1) or random.choice([True, False]) which might be more idiomatic
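to make the bit order concrete, here is a small sketch that checks the flips against the raw getrandbits(32) words. the Coin class is copied from server.py; the seed 1337 is just an arbitrary value picked for illustration

```python
from random import Random


class Coin:  # copied from server.py
    def __init__(self, coin_id):
        self.random = Random(coin_id)
        self.flips_left = 0
        self.buffer = None

    def flip(self):
        if self.flips_left == 0:
            self.buffer = self.random.getrandbits(32)
            self.flips_left = 32
        res = self.buffer & 1
        self.buffer >>= 1
        self.flips_left -= 1
        return res


seed = 1337  # arbitrary example seed
coin = Coin(seed)
flips = [coin.flip() for _ in range(64)]

# the same seed fed to a plain Random gives the same 32-bit words
ref = Random(seed)
words = [ref.getrandbits(32) for _ in range(2)]

# each word is consumed least significant bit first
expected = [(w >> i) & 1 for w in words for i in range(32)]
print(flips == expected)
```

so every coin flip is exactly one bit of the Mersenne twister output stream, which is what makes the rest of the attack possible.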

the "main function"

if __name__ == "__main__":
    signal.alarm(60)
    print("Welcome to Maple Betting!")
    print("We'll be betting on the outcome of a fair coin flip.")
    print("You'll start with $1 - try to make lots of money and you'll get flags!")

    game_id = input("Which coin would you like to use? ")
    num_rounds = input("How many rounds do you want to go for? ")
    num_rounds = int(num_rounds)
    if num_rounds > 20_000_000:
        print("Can't play that long, I'm afraid.")
        exit(1)

there is a time limit of 60 seconds to interact with the server, and we are asked for a "game id" and a number of rounds to play (at most 20000000). these are the only user inputs to the program

    print("Alright, let's go!")
    coin = Coin(int(game_id, 0))
    money = 1
    for nr in range(num_rounds):
        money += [1, -1][coin.flip()]
        if money <= 0:
            print(f"Oops, you went broke at round {nr+1}!")
            exit(1)

the "game id" is used to construct Coin (and becomes the seed for the RNG). then, the server plays the rounds you requested. on each round, a 1 or 0 bit is generated, and if the result is 0, you win $1, and if the result is 1 then you lose $1

    print(f"You finished with ${money} in the pot.")
    if money < 18_000:
        print("At least you didn't go broke!")
    elif money < 7_000_000:
        print(f"Pretty good!")
    else:
        print(f"What the hell?! You bankrupted the casino! Take your spoils: {FLAG}")

so in order to win the game and get the flag, we need to finish with at least $7000000, without ever dropping to $0 along the way
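for testing offline it helps to have the game loop as a function. the following is a minimal sketch of that, not the server code itself: flips, play and wins_flag are hypothetical helper names, and flips is meant to produce the same bit stream as Coin.flip

```python
from random import Random


def flips(seed):
    """Yield the same bits as Coin(seed).flip(): 32-bit words, LSB first."""
    rng = Random(seed)
    while True:
        word = rng.getrandbits(32)
        for i in range(32):
            yield (word >> i) & 1


def play(seed, num_rounds):
    """Return the final pot, or None if we go broke on the way."""
    money = 1  # the server starts us with $1
    bits = flips(seed)
    for _ in range(num_rounds):
        money += [1, -1][next(bits)]
        if money <= 0:
            return None
    return money


def wins_flag(seed, num_rounds):
    """True if this seed and round count would reach the flag branch."""
    money = play(seed, num_rounds)
    return money is not None and money >= 7_000_000
```

any candidate seed can then be checked locally with wins_flag(seed, rounds) before spending one of the 60 second connections on it.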

naive approach

since we can control the seed, one approach is to try seed values until one wins the necessary amount of games

something like this

seed = 0
best_seed = 0
best_money = 0
while True:
    coin = Coin(seed)
    money = 1  # the server starts us with $1
    for _ in range(20_000_000):
        money += [1, -1][coin.flip()]
        if money > best_money:
            best_money = money
            best_seed = seed

        if money <= 0:
            break

    seed += 1

running this for a while might get you money up to like, $15000 or so depending on how long you run it, but ultimately it's nowhere near the needed $7000000 and the improvements rapidly diminish with further running time. clearly this approach is not going to work
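a rough back-of-the-envelope check shows why. assuming the flips of an ordinary seed behave like independent fair bits (an assumption, but a reasonable one for a Mersenne twister), the net winnings after n rounds are a random walk with standard deviation sqrt(n)

```python
import math

rounds = 20_000_000   # the most rounds the server allows
target = 7_000_000    # the pot needed for the flag

# standard deviation of a fair +1/-1 random walk after `rounds` steps
sigma = math.sqrt(rounds)

# how many standard deviations away the target is
z_score = target / sigma

print(f"sigma is about {sigma:.0f}, target is about {z_score:.0f} sigma away")
```

a few thousand dollars is the natural scale here, so no amount of brute forcing ordinary seeds gets anywhere near the target.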

background: python RNG (Mersenne twister)

to find out how we can manipulate the python RNG, let's take a closer look at the implementation

the RNG in python is a Mersenne Twister. a quick summary of how the Mersenne twister works

  • a state array MT of 32-bit integers is initialized using a given seed value
  • an index into MT is set to n, the length of the state array (624 in python)
  • to generate a value, pick the next MT[index] and increment index. additionally the output value is "tempered" with some bitwise operations
  • if index >= n, run the twist operation (notice twist is also run the first time a random value is generated)
    • twist scrambles MT using some more bitwise operations

we can find the full implementation of the Mersenne twister in CPython in Modules/_randommodule.c

here's the function that generates a 32-bit number. if you trace through the calls in CPython, you find that this is the underlying C function behind the getrandbits(32) call in Coin.flip. i've added comments (marked XXX) to explain the main parts of the function

static uint32_t
genrand_uint32(RandomObject *self)
{
    uint32_t y;
    static const uint32_t mag01[2] = {0x0U, MATRIX_A};
    /* mag01[x] = x * MATRIX_A  for x=0,1 */
    uint32_t *mt;

    mt = self->state;

    // XXX: the contents of this block are the "twist" operation

    if (self->index >= N) { /* generate N words at one time */
        int kk;

        for (kk=0;kk<N-M;kk++) {
            y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
            mt[kk] = mt[kk+M] ^ (y >> 1) ^ mag01[y & 0x1U];
        }
        for (;kk<N-1;kk++) {
            y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
            mt[kk] = mt[kk+(M-N)] ^ (y >> 1) ^ mag01[y & 0x1U];
        }
        y = (mt[N-1]&UPPER_MASK)|(mt[0]&LOWER_MASK);
        mt[N-1] = mt[M-1] ^ (y >> 1) ^ mag01[y & 0x1U];

        self->index = 0;
    }

    // XXX: the following extracts the MT array value and applies the "temper" operation

    y = mt[self->index++];
    y ^= (y >> 11);
    y ^= (y << 7) & 0x9d2c5680U;
    y ^= (y << 15) & 0xefc60000U;
    y ^= (y >> 18);
    return y;
}
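to experiment with this from python, the function can be ported fairly directly. this is a sketch, with twist, temper and outputs_from_state as my own names; the constants are the standard MT19937 ones (N, M, MATRIX_A and the two masks), and the port is checked against random.Random through getstate

```python
import random

N, M = 624, 397
MATRIX_A = 0x9908B0DF
UPPER_MASK = 0x80000000
LOWER_MASK = 0x7FFFFFFF


def twist(mt):
    """In-place port of the twist block; the three C loops are folded
    into one by taking indices modulo N."""
    for kk in range(N):
        y = (mt[kk] & UPPER_MASK) | (mt[(kk + 1) % N] & LOWER_MASK)
        mt[kk] = mt[(kk + M) % N] ^ (y >> 1) ^ (MATRIX_A if y & 1 else 0)


def temper(y):
    """Port of the tempering steps applied to each output word."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y


def outputs_from_state(mt, count, index=N):
    """Generate `count` 32-bit outputs from a copy of the state."""
    mt = list(mt)
    out = []
    for _ in range(count):
        if index >= N:
            twist(mt)
            index = 0
        out.append(temper(mt[index]))
        index += 1
    return out


# compare against CPython: getstate()[1] is the 624 words plus the index
rng = random.Random(1234)
state = rng.getstate()[1]
expected = [rng.getrandbits(32) for _ in range(1000)]
print(outputs_from_state(state[:N], 1000, state[N]) == expected)
```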

to recap, in order to win the coinflip game, we need a lot of coin flips to be a 0 bit, which means we need a lot of the output values to have mostly 0 bits in them

looking at the "temper" operation above, we can see that if the value y is 0x00000000 before tempering, it will still be 0x00000000 after tempering. this means that we need the state array elements themselves to be 0x00000000 to get what we want

additionally, during the "twist" operation we also find that having MT values of mostly 0x00000000 results in "twist" producing an output MT which still contains mostly 0x00000000

it's starting to look pretty convenient that 0 is the bit needed to win money!
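both observations can be checked without reimplementing anything, by loading a hand-made state into random.Random with setstate. this is a sketch that assumes the state we are aiming for: 0x80000000 in the first word, zeros everywhere else, and the index at 624 so that the first output triggers a twist

```python
import random

N = 624

# target state: first word 0x80000000, the rest zero, index = N
target_state = [0x80000000] + [0] * (N - 1)

rng = random.Random()
# version 3 state tuple: (version, 624 words + index, gauss_next)
rng.setstate((3, tuple(target_state + [N]), None))

# one full block of outputs after the first twist
words = [rng.getrandbits(32) for _ in range(N)]

nonzero_words = sum(1 for w in words if w)
one_bits = sum(bin(w).count("1") for w in words)

print(f"{nonzero_words} of {N} words are nonzero")
print(f"{one_bits} of {N * 32} flips in this block would lose")
```

the nonzero bits do spread a little further with every twist, so the advantage is not permanent, but it lasts for a very large number of flips.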

init_by_array: how to get known values into MT

the last step of background information is to understand how python generates the MT array using a given seed

this is the C function that seeds the RNG using a given Python argument

static int
random_seed(RandomObject *self, PyObject *arg)
{
    int result = -1;  /* guilty until proved innocent */
    PyObject *n = NULL;
    uint32_t *key = NULL;
    size_t bits, keyused;
    int res;

    if (arg == NULL || arg == Py_None) {
        // ... seed from /dev/urandom ...
        return 0;
    }

    // ...
    n = /* coerce argument into PyLong */;
    // ...

    /* Now split n into 32-bit chunks, from the right. */
    bits = _PyLong_NumBits(n);
    if (bits == (size_t)-1 && PyErr_Occurred())
        goto Done;

    /* Figure out how many 32-bit chunks this gives us. */
    keyused = bits == 0 ? 1 : (bits - 1) / 32 + 1;

    /* Convert seed to byte sequence. */
    key = (uint32_t *)PyMem_Malloc((size_t)4 * keyused);
    if (key == NULL) {
        PyErr_NoMemory();
        goto Done;
    }
    res = _PyLong_AsByteArray((PyLongObject *)n,
                              (unsigned char *)key, keyused * 4,
                              PY_LITTLE_ENDIAN,
                              0); /* unsigned */
    if (res == -1) {
        goto Done;
    }

    // ...
    /* XXX: call init_by_array using uint32_t[] from the PyLong value */
    init_by_array(self, key, keyused);
    // ...
}

to summarize, the way seeding works is

  • given a PyLong value (ie, any integer in python), an array of uint32_t is produced consisting of 32-bit chunks of the given value
  • init_by_array is called with the given array

next, let's look at init_by_array and its helper function init_genrand

/* initializes mt[N] with a seed */
static void
init_genrand(RandomObject *self, uint32_t s)
{
    int mti;
    uint32_t *mt;

    mt = self->state;
    mt[0]= s;
    for (mti=1; mti<N; mti++) {
        mt[mti] =
        (1812433253U * (mt[mti-1] ^ (mt[mti-1] >> 30)) + mti);
        /* See Knuth TAOCP Vol2. 3rd Ed. P.106 for multiplier. */
        /* In the previous versions, MSBs of the seed affect   */
        /* only MSBs of the array mt[].                                */
        /* 2002/01/09 modified by Makoto Matsumoto                     */
    }
    self->index = mti;
    return;
}

/* initialize by an array with array-length */
/* init_key is the array for initializing keys */
/* key_length is its length */
static void
init_by_array(RandomObject *self, uint32_t init_key[], size_t key_length)
{
    size_t i, j, k;       /* was signed in the original code. RDH 12/16/2002 */
    uint32_t *mt;

    mt = self->state;
    init_genrand(self, 19650218U);
    i=1; j=0;
    k = (N>key_length ? N : key_length);
    for (; k; k--) {
        mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >> 30)) * 1664525U))
                 + init_key[j] + (uint32_t)j; /* non linear */
        i++; j++;
        if (i>=N) { mt[0] = mt[N-1]; i=1; }
        if (j>=key_length) j=0;
    }
    for (k=N-1; k; k--) {
        mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >> 30)) * 1566083941U))
                 - (uint32_t)i; /* non linear */
        i++;
        if (i>=N) { mt[0] = mt[N-1]; i=1; }
    }

    mt[0] = 0x80000000U; /* MSB is 1; assuring non-zero initial array */
}

this departs from the traditional method of seeding a Mersenne twister. the original method is init_genrand, which implements the canonical seeding function. the procedure here is:

  • call init_genrand with a static seed
  • combine the resulting MT array with the init_key (recall this is 32-bit chunks of the seed value from python) using some bitwise operations and arithmetic

this is useful for us, because this means it is actually possible to control almost all values in MT (except for the first one, which is hardcoded to 0x80000000). this would not be possible with just the canonical seeding implementation
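to be sure the reading of the C code is right, both functions can be ported to python and compared with the real state from random.Random(seed).getstate(). this is a sketch of such a port; seeded_state is a hypothetical helper, and it assumes non-negative int seeds

```python
import random

N = 624
MASK = 0xFFFFFFFF


def init_genrand(s):
    """Port of init_genrand: expand one 32-bit seed into N words."""
    mt = [s & MASK]
    for mti in range(1, N):
        prev = mt[mti - 1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + mti) & MASK)
    return mt


def init_by_array(key):
    """Port of init_by_array: mix the key words into the static state."""
    mt = init_genrand(19650218)
    i, j = 1, 0
    for _ in range(max(N, len(key))):
        prev = mt[i - 1]
        mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1664525)) + key[j] + j) & MASK
        i += 1
        j += 1
        if i >= N:
            mt[0] = mt[N - 1]
            i = 1
        if j >= len(key):
            j = 0
    for _ in range(N - 1):
        prev = mt[i - 1]
        mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1566083941)) - i) & MASK
        i += 1
        if i >= N:
            mt[0] = mt[N - 1]
            i = 1
    mt[0] = 0x80000000
    return mt


def seeded_state(seed):
    """State array produced by seeding with a non-negative int."""
    nchunks = max(1, (seed.bit_length() + 31) // 32)
    key = [(seed >> (32 * i)) & MASK for i in range(nchunks)]
    return init_by_array(key)


for seed in (0, 1, 2**64 + 5):
    real = list(random.Random(seed).getstate()[1][:N])
    print(seed, seeded_state(seed) == real)
```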

cracking the Mersenne twister (actually)

z3 can be used to reverse the init_by_array function and produce an input that causes MT to contain the desired values. this is particularly made possible by the operations being all deterministic arithmetic on 32-bit integers, and all loops being a fixed number of iterations

first, we set up init_state.txt which contains the output of init_genrand(19650218) since that is static
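one way to produce that file is a small python port of init_genrand. this is a sketch rather than the exact tool used for the writeup; write_init_state is a hypothetical helper, and the one-integer-per-line format is chosen to match what the cracking script reads

```python
N = 624


def init_genrand(s):
    """Port of init_genrand: expand one 32-bit seed into N words."""
    mt = [s & 0xFFFFFFFF]
    for mti in range(1, N):
        prev = mt[mti - 1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + mti) & 0xFFFFFFFF)
    return mt


def write_init_state(path="init_state.txt"):
    """Write the static initial state, one decimal integer per line."""
    with open(path, "w") as f:
        for word in init_genrand(19650218):
            f.write(f"{word}\n")


if __name__ == "__main__":
    write_init_state()
```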

then, the following script implements the reversing

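import struct

import z3

N = 624  # number of 32-bit words in the MT state
L = N    # number of 32-bit key words we let z3 choose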

with open("init_state.txt", "r") as f:
    init_mt = [int(x.strip()) for x in f]

def crack():
    s = z3.Solver()

    init_key = [z3.BitVec(f"init_key_{i}", 32) for i in range(L)]
    mt = [z3.BitVecVal(x, 32) for x in init_mt]

we initialize an init_key of all symbolic values, and an MT array (which in the real code would have just been generated by init_genrand) holding the concrete values from init_state.txt

next, we simulate the same operations as init_by_array, implemented in python

    # note: >> on a z3 bitvector is an arithmetic (sign-extending) shift,
    # so z3.LShR is used to match the logical shift on uint32_t in C
    i = 1
    j = 0
    k = N
    while k > 0:
        mt[i] = (mt[i] ^ ((mt[i-1] ^ z3.LShR(mt[i-1], 30)) * z3.BitVecVal(1664525, 32))) + init_key[j] + z3.BitVecVal(j, 32)
        i += 1
        j += 1
        if i >= N:
            mt[0] = mt[N-1]
            i = 1
        if j >= L:
            j = 0

        k -= 1

    k = N - 1
    while k > 0:
        mt[i] = (mt[i] ^ ((mt[i-1] ^ z3.LShR(mt[i-1], 30)) * z3.BitVecVal(1566083941, 32))) - z3.BitVecVal(i, 32)
        i += 1
        if i >= N:
            mt[0] = mt[N-1]
            i = 1

        k -= 1

finally, we add constraints on the output values 1..N-1 so that they are all zero (index 0 is skipped because init_by_array overwrites it with 0x80000000 anyway)

    for idx in range(1, N):
        s.add(mt[idx] == 0)

now, we are ready to solve with z3 and output the result as an integer literal which would be passed to the Random constructor, which we concatenate from the 32-bit chunks

    assert s.check() == z3.sat  # fail loudly if there is no solution
    m = s.model()

    x = bytearray()
    for ik in init_key:
        ikv = m[ik].as_long()
        x += struct.pack("<I", ikv)

    print(hex(int.from_bytes(x, byteorder='little')))

if __name__ == "__main__":
    crack()

running this produces a solution value fairly quickly. the full script can be found at coinflip_crack.py and init_state.txt
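before moving on it is worth confirming that the seed really produces the state we asked for. here is a minimal sketch of such a check using getstate; zeroed_state is a hypothetical helper name

```python
import random

N = 624


def zeroed_state(seed):
    """True if seeding with `seed` leaves MT as 0x80000000 followed by zeros."""
    state = random.Random(seed).getstate()[1]  # 624 words plus the index
    words, index = state[:N], state[N]
    return index == N and words[0] == 0x80000000 and not any(words[1:])


# an ordinary seed does not have this property
print(zeroed_state(12345))
```

for the real seed this would be called on the integer parsed from the script output, for example int(text, 0) on the printed hex literal, and should report True.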

next, we need to find out how many rounds to run with this seed

from server import Coin
with open("computed_seed.txt", "r") as f:
    seed = int(f.read().strip(), 0)
coin = Coin(seed)
money = 0
for i in range(20_000_000):
    money += [1, -1][coin.flip()]
    if money > 7_000_000:
        print(i)
        break

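with this we get 14819684. note that the script starts from $0 rather than the server's $1 and prints the zero-based index of the flip, so this is not exactly the smallest number of rounds that works, but it is close enough: on the server, 14819684 rounds ends at $7000001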

jackpot!!!

running the server with the computed seed and number of rounds results in the flag! (reproduced here only locally, i don't remember what the server flag was)

$ (sleep 2; cat seed.txt; sleep 2; echo 14819684) | python3 server.py
Welcome to Maple Betting!
We'll be betting on the outcome of a fair coin flip.
You'll start with $1 - try to make lots of money and you'll get flags!
Which coin would you like to use? How many rounds do you want to go for? Alright, let's go!
You finished with $7000001 in the pot.
What the hell?! You bankrupted the casino! Take your spoils: ctf{test_flag_test_flag}