writeups/2020/gctf/sharky/README.md

# sharky

writeup by [haskal](https://awoo.systems) for [BLÅHAJ](https://blahaj.awoo.systems)

**Crypto**
**231 points**
**39 solves**

>Can you find the round keys?

provided files: `challenge.py`, `sha256.py`

## tldr

for this i used [Z3](https://github.com/Z3Prover/z3) except with angr's
[claripy](https://docs.angr.io/advanced-topics/claripy) frontend instead of the direct frontend
because i've never used the direct frontend. using claripy/z3 we can construct a symbolic emulation
of the provided sha256 algorithm with the unknowns being symbolic variables. by applying constraints
for the hash output you can have z3 solve for the unknown hash constants

## writeup

initial analysis shows this is a modified sha256 algorithm with the first 8 round constants changed
to random values. the objective is to figure out what these values are given a known hash of a known
plaintext. from `challenge.py`:

```python
NUM_KEYS = 8
MSG = b'Encoded with random keys'
```

first, i checked to see if anything else was funky with the hand coded sha256 implementation by
initializing the "unknown" round constants to the standard constants and checking if the resulting
hash matched the expected "standard" hash for this message. it does, so the only difference is the
first 8 round constants

[sha256](https://en.wikipedia.org/wiki/SHA-2) (this links to wikipedia because i literally studied
sha256 from the wikipedia article during the CTF) starts by padding the message to a multiple of 512
bits. then, for each chunk of 512 bits it expands the chunk to 2048 bits by bit rotations, xors, and
arithmetic. the result is called `w`. `w` is compressed into a 256 bit state `h'` by taking the
previous state and doing 64 rounds of more bit rotations, bitwise ops, and arithmetic combining the
previous state with the nth word of `w` and the nth word of `k`, the sha256 constants table. `h`
also starts with known constants. also sha256 basically operates on 32-bit words, always. so all the
symbolic variables should be 32-bit

in this case we see the message is short enough so there is only one 512-bit chunk of padded input.
this makes things easier. i started by precomputing the `w` array for the known input, since this
doesn't depend on any (modified) constants. then, i took the verbatim python implementation of the
algorithm and stripped out every component except the round function `compression_step` that
combines `h'`, `w[i]` and `k[i]`. i applied some basic simplifications to optimize symbolic
execution such as replacing the manual bit rotate operation with the `claripy.RotateRight` builtin
and removing the bitmasking done after addition, since claripy bit vectors can't change size.
otherwise it's the original provided code, except now it's operating on symbolic variables instead
of actual numbers. it looks like this

```python
def rotate_right(v, n):
    # w = (v >> n) | (v << (32 - n))
    # return w & 0xffffffff
    return claripy.RotateRight(v, n)

def compression_step(state, k_i, w_i):
    a, b, c, d, e, f, g, h = state
    s1 = rotate_right(e, 6) ^ rotate_right(e, 11) ^ rotate_right(e, 25)
    ch = (e & f) ^ (~e & g)
    tmp1 = (h + s1 + ch + k_i + w_i)
    s0 = rotate_right(a, 2) ^ rotate_right(a, 13) ^ rotate_right(a, 22)
    maj = (a & b) ^ (a & c) ^ (b & c)
    tmp2 = (tmp1 + s0 + maj)
    tmp3 = (d + tmp1)
    return (tmp2, a, b, c, tmp3, e, f, g)
```

i figured this was good enough for z3 to work with and just threw it in

```
init_h = [
    # standard constants
    0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c,
    0x1f83d9ab, 0x5be0cd19
]
init_h = [claripy.BVV(x, 32) for x in init_h]

secrets = [claripy.BVS(f"rc_{i}", 32) for i in range(8)]

init_k = secrets + [claripy.BVV(x, 32) for x in [
    ... rest of the standard constants
]]

def compression(state, w):
    round_keys = init_k
    for i in range(64):
        state = compression_step(state, round_keys[i], w[i])
    return state

w = [...] # computed w

real_hash = sys.argv[1]
real_hash = binascii.unhexlify(real_hash)
real_hash = struct.unpack(">8L", real_hash)

solver = claripy.Solver()

state = init_h[:]
out_state = compression(state, w)
state = [x + y for x, y in zip(state, out_state)]

for i in range(8):
    solver.add(state[i] == real_hash[i])

for i in range(8):
    print(solver.eval(secrets[i], 1)[0])
```

this did Not terminate at all. this is a side effect of me never knowing what i'm doing but in
general it seems that z3 does very poorly when you have few constraints that are each very large and
complex. in fact these ones were so large you literally cannot print them to the console

so i opted to do some handholding. i noticed that since constants 8-63 are the standard ones, the
state from round 8 to the final hash is completely known, and we can backtrack from the final hash
to the round 8 state by solving each round individually. here's the annotated code for this

```python
# i set up round 63 with the final addition that occurs after all rounds are complete
solver = claripy.Solver()
state = [claripy.BVS(f"state_{i}", 32) for i in range(8)]
out_state = compression_step(state, w[63], init_k[63])
for i in range(8):
    solver.add(out_state[i] + init_h[i] == real_hash[i])

# this is a helper function that evaluates the input state of a round that produces a known output
# state and reassigns a new wanted_state for that
def get_wanted_state():
    global solver
    global state
    wanted_state = [None]*8
    for i in range(8):
        res = solver.eval(state[i], 10)
        if len(res) != 1:
            print("ERROR", i, res)
            raise SystemExit()
        res = res[0]
        wanted_state[i] = res
    return wanted_state

# calculate the input to round 63
wanted_state = get_wanted_state()

# emulate 62 downto 8
# each round, we create a symbolic state array for the input, add constraints for the previously
# calculated output, and evaluate
for round in range(62, 8 - 1, -1):
    print("emulating round", round)

    solver = claripy.Solver()
    state = [claripy.BVS(f"state_{round}_{i}", 32) for i in range(8)]
    out_state = compression_step(state, w[round], init_k[round])
    for i in range(8):
        solver.add(out_state[i] == wanted_state[i])

    wanted_state = get_wanted_state()

# now we have the exact state that round 8 started with
print("round 8 input state: ", [hex(x) for x in wanted_state])
```

now there should be a lot less work for z3 to guess on, it only needs to work through rounds 0-7
with the symbolic round constants

```python
state = init_h[:]
out_state = compression_round_0_7(state, w)
state = [x + y for x, y in zip(state, out_state)]

for i in range(8):
    solver.add(state[i] == wanted_state[i])

for i in range(8):
    print(solver.eval(secrets[i], 1)[0])
```

this took absolutely ages to complete and when it did the result was `unsat`. yikes

i'm still not sure what exactly the issue was. it's likely i just made a typo somewhere and didn't
see it. but i deleted this and replaced it with a new strategy, which is to continue going backwards
from round 7 to round 0 instead of going forwards from 0 to 7. what this does is creates a larger
number of constraints -- instead of a single set of 8 constraints on the output we have constraints
for each intermediate state in between each round. but each constraint is individually less complex,
which results in the solving going faster and actually working (!!)

```python
solver = claripy.Solver()

# go from round 7 down to 0
for round in range(7, -1, -1):
    print("emulating weird round", round)
    # create symbolic variables for this intermediate state
    state = [claripy.BVS(f"test_r{round}_{i}", 32) for i in range(8)]
    out_state = compression_step(state, w[round], init_k[round])
    for i in range(8):
        # constrain the output to the wanted input state for the next round
        solver.add(out_state[i] == wanted_state[i])

    # transfer input state to next round's wanted state
    wanted_state = [None]*8
    for i in range(8):
        wanted_state[i] = state[i]

# add the initial state constraints
for i in range(8):
    solver.add(state[i] == init_h[i])

print("answers")
for i in range(len(secrets)):
    res = solver.eval(secrets[i], 1)
    sys.stdout.write(hex(res[0]))
    if i < len(secrets) - 1:
        sys.stdout.write(",")
    sys.stdout.flush()

sys.stdout.write("\n")
sys.stdout.flush()
print("SHONKS")
```

this works. for some inputs it takes a long time but most of the time it comes up with the answer
quickly. i was too lazy for an automatic script so i simply pasted the answer into netcat

```
$ python solve.py abdc6d366dd37bb452b56335cd8bfbc043e2284001891d358e04a9e76c3b2a49
...
answers
0x9da77f78,0x24597709,0x4fbc375,0xaa5cd315,0xcd73dc57,0x12dafbf7,0x7e206bb4,0xb9097fba
SHONKS
```

the lesson here is if you want z3 to terminate sometime this millenium it's better to provide a
larger number of small constraints than a small number of Giant constraints even if they're
semantically equivalent. i would probably know this if i had ever been formally trained in symbolic
execution,

---

since my team is [BLÅHAJ](https://blahaj.awoo.systems) and this challenge is named `sharky` it was
mandatory to get this flag, so i'm happy i figured it out even though crypto isn't really my strong
suit (yet,)