247 lines
9.1 KiB
Markdown
247 lines
9.1 KiB
Markdown
# sharky
|
|
|
|
writeup by [haskal](https://awoo.systems) for [BLÅHAJ](https://blahaj.awoo.systems)
|
|
|
|
**Crypto**
|
|
**231 points**
|
|
**39 solves**
|
|
|
|
>Can you find the round keys?
|
|
|
|
provided files: `challenge.py`, `sha256.py`
|
|
|
|
## tldr
|
|
|
|
for this i used [Z3](https://github.com/Z3Prover/z3) except with angr's
|
|
[claripy](https://docs.angr.io/advanced-topics/claripy) frontend instead of the direct frontend
|
|
because i've never used the direct frontend. using claripy/z3 we can construct a symbolic emulation
|
|
of the provided sha256 algorithm with the unknowns being symbolic variables. by applying constraints
|
|
for the hash output you can have z3 solve for the unknown hash constants
|
|
|
|
## writeup
|
|
|
|
initial analysis shows this is a modified sha256 algorithm with the first 8 round constants changed
|
|
to random values. the objective is to figure out what these values are given a known hash of a known
|
|
plaintext. from `challenge.py`:
|
|
|
|
```python
|
|
NUM_KEYS = 8
|
|
MSG = b'Encoded with random keys'
|
|
```
|
|
|
|
first, i checked to see if anything else was funky with the hand coded sha256 implementation by
|
|
initializing the "unknown" round constants to the standard constants and checking if the resulting
|
|
hash matched the expected "standard" hash for this message. it does, so the only difference is the
|
|
first 8 round constants
|
|
|
|
[sha256](https://en.wikipedia.org/wiki/SHA-2) (this links to wikipedia because i literally studied
|
|
sha256 from the wikipedia article during the CTF) starts by padding the message to a multiple of 512
|
|
bits. then, for each chunk of 512 bits it expands the chunk to 2048 bits by bit rotations, xors, and
|
|
arithmetic. the result is called `w`. `w` is compressed into a 256 bit state `h'` by taking the
|
|
previous state and doing 64 rounds of more bit rotations, bitwise ops, and arithmetic combining the
|
|
previous state with the nth word of `w` and the nth word of `k`, the sha256 constants table. `h`
|
|
also starts with known constants. also sha256 basically operates on 32-bit words, always. so all the
|
|
symbolic variables should be 32-bit
|
|
|
|
in this case we see the message is short enough so there is only one 512-bit chunk of padded input.
|
|
this makes things easier. i started by precomputing the `w` array for the known input, since this
|
|
doesn't depend on any (modified) constants. then, i took the verbatim python implementation of the
|
|
algorithm and stripped out every component except the round function `compression_step` that
|
|
combines `h'`, `w[i]` and `k[i]`. i applied some basic simplifications to optimize symbolic
|
|
execution such as replacing the manual bit rotate operation with the `claripy.RotateRight` builtin
|
|
and removing the bitmasking done after addition, since claripy bit vectors can't change size.
|
|
otherwise it's the original provided code, except now it's operating on symbolic variables instead
|
|
of actual numbers. it looks like this
|
|
|
|
```python
|
|
def rotate_right(v, n):
|
|
# w = (v >> n) | (v << (32 - n))
|
|
# return w & 0xffffffff
|
|
return claripy.RotateRight(v, n)
|
|
|
|
def compression_step(state, k_i, w_i):
|
|
a, b, c, d, e, f, g, h = state
|
|
s1 = rotate_right(e, 6) ^ rotate_right(e, 11) ^ rotate_right(e, 25)
|
|
ch = (e & f) ^ (~e & g)
|
|
tmp1 = (h + s1 + ch + k_i + w_i)
|
|
s0 = rotate_right(a, 2) ^ rotate_right(a, 13) ^ rotate_right(a, 22)
|
|
maj = (a & b) ^ (a & c) ^ (b & c)
|
|
tmp2 = (tmp1 + s0 + maj)
|
|
tmp3 = (d + tmp1)
|
|
return (tmp2, a, b, c, tmp3, e, f, g)
|
|
```
|
|
|
|
i figured this was good enough for z3 to work with and just threw it in
|
|
|
|
```
|
|
init_h = [
|
|
# standard constants
|
|
0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c,
|
|
0x1f83d9ab, 0x5be0cd19
|
|
]
|
|
init_h = [claripy.BVV(x, 32) for x in init_h]
|
|
|
|
secrets = [claripy.BVS(f"rc_{i}", 32) for i in range(8)]
|
|
|
|
init_k = secrets + [claripy.BVV(x, 32) for x in [
|
|
... rest of the standard constants
|
|
]]
|
|
|
|
def compression(state, w):
|
|
round_keys = init_k
|
|
for i in range(64):
|
|
state = compression_step(state, round_keys[i], w[i])
|
|
return state
|
|
|
|
w = [...] # computed w
|
|
|
|
real_hash = sys.argv[1]
|
|
real_hash = binascii.unhexlify(real_hash)
|
|
real_hash = struct.unpack(">8L", real_hash)
|
|
|
|
solver = claripy.Solver()
|
|
|
|
state = init_h[:]
|
|
out_state = compression(state, w)
|
|
state = [x + y for x, y in zip(state, out_state)]
|
|
|
|
for i in range(8):
|
|
solver.add(state[i] == real_hash[i])
|
|
|
|
for i in range(8):
|
|
print(solver.eval(secrets[i], 1)[0])
|
|
```
|
|
|
|
this did Not terminate at all. this is a side effect of me never knowing what i'm doing but in
|
|
general it seems that z3 does very poorly when you have few constraints that are each very large and
|
|
complex. in fact these ones were so large you literally cannot print them to the console
|
|
|
|
so i opted to do some handholding. i noticed that since constants 8-63 are the standard ones, the
|
|
state from round 8 to the final hash is completely known, and we can backtrack from the final hash
|
|
to the round 8 state by solving each round individually. here's the annotated code for this
|
|
|
|
```python
|
|
# i set up round 63 with the final addition that occurs after all rounds are complete
|
|
solver = claripy.Solver()
|
|
state = [claripy.BVS(f"state_{i}", 32) for i in range(8)]
|
|
out_state = compression_step(state, w[63], init_k[63])
|
|
for i in range(8):
|
|
solver.add(out_state[i] + init_h[i] == real_hash[i])
|
|
|
|
# this is a helper function that evaluates the input state of a round that produces a known output
|
|
# state and reassigns a new wanted_state for that
|
|
def get_wanted_state():
|
|
global solver
|
|
global state
|
|
wanted_state = [None]*8
|
|
for i in range(8):
|
|
res = solver.eval(state[i], 10)
|
|
if len(res) != 1:
|
|
print("ERROR", i, res)
|
|
raise SystemExit()
|
|
res = res[0]
|
|
wanted_state[i] = res
|
|
return wanted_state
|
|
|
|
# calculate the input to round 63
|
|
wanted_state = get_wanted_state()
|
|
|
|
# emulate 62 downto 8
|
|
# each round, we create a symbolic state array for the input, add constraints for the previously
|
|
# calculated output, and evaluate
|
|
for round in range(62, 8 - 1, -1):
|
|
print("emulating round", round)
|
|
|
|
solver = claripy.Solver()
|
|
state = [claripy.BVS(f"state_{round}_{i}", 32) for i in range(8)]
|
|
out_state = compression_step(state, w[round], init_k[round])
|
|
for i in range(8):
|
|
solver.add(out_state[i] == wanted_state[i])
|
|
|
|
wanted_state = get_wanted_state()
|
|
|
|
# now we have the exact state that round 8 started with
|
|
print("round 8 input state: ", [hex(x) for x in wanted_state])
|
|
```
|
|
|
|
now there should be a lot less work for z3 to guess on, it only needs to work through rounds 0-7
|
|
with the symbolic round constants
|
|
|
|
```python
|
|
state = init_h[:]
|
|
out_state = compression_round_0_7(state, w)
|
|
state = [x + y for x, y in zip(state, out_state)]
|
|
|
|
for i in range(8):
|
|
solver.add(state[i] == wanted_state[i])
|
|
|
|
for i in range(8):
|
|
print(solver.eval(secrets[i], 1)[0])
|
|
```
|
|
|
|
this took absolutely ages to complete and when it did the result was `unsat`. yikes
|
|
|
|
i'm still not sure what exactly the issue was. it's likely i just made a typo somewhere and didn't
|
|
see it. but i deleted this and replaced it with a new strategy, which is to continue going backwards
|
|
from round 7 to round 0 instead of going forwards from 0 to 7. what this does is creates a larger
|
|
number of constraints -- instead of a single set of 8 constraints on the output we have constraints
|
|
for each intermediate state in between each round. but each constraint is individually less complex,
|
|
which results in the solving going faster and actually working (!!)
|
|
|
|
```python
|
|
solver = claripy.Solver()
|
|
|
|
# go from round 7 down to 0
|
|
for round in range(7, -1, -1):
|
|
print("emulating weird round", round)
|
|
# create symbolic variables for this intermediate state
|
|
state = [claripy.BVS(f"test_r{round}_{i}", 32) for i in range(8)]
|
|
out_state = compression_step(state, w[round], init_k[round])
|
|
for i in range(8):
|
|
# constrain the output to the wanted input state for the next round
|
|
solver.add(out_state[i] == wanted_state[i])
|
|
|
|
# transfer input state to next round's wanted state
|
|
wanted_state = [None]*8
|
|
for i in range(8):
|
|
wanted_state[i] = state[i]
|
|
|
|
# add the initial state constraints
|
|
for i in range(8):
|
|
solver.add(state[i] == init_h[i])
|
|
|
|
print("answers")
|
|
for i in range(len(secrets)):
|
|
res = solver.eval(secrets[i], 1)
|
|
sys.stdout.write(hex(res[0]))
|
|
if i < len(secrets) - 1:
|
|
sys.stdout.write(",")
|
|
sys.stdout.flush()
|
|
|
|
sys.stdout.write("\n")
|
|
sys.stdout.flush()
|
|
print("SHONKS")
|
|
```
|
|
|
|
this works. for some inputs it takes a long time but most of the time it comes up with the answer
|
|
quickly. i was too lazy for an automatic script so i simply pasted the answer into netcat
|
|
|
|
```
|
|
$ python solve.py abdc6d366dd37bb452b56335cd8bfbc043e2284001891d358e04a9e76c3b2a49
|
|
...
|
|
answers
|
|
0x9da77f78,0x24597709,0x4fbc375,0xaa5cd315,0xcd73dc57,0x12dafbf7,0x7e206bb4,0xb9097fba
|
|
SHONKS
|
|
```
|
|
|
|
the lesson here is if you want z3 to terminate sometime this millenium it's better to provide a
|
|
larger number of small constraints than a small number of Giant constraints even if they're
|
|
semantically equivalent. i would probably know this if i had ever been formally trained in symbolic
|
|
execution,
|
|
|
|
---
|
|
|
|
since my team is [BLÅHAJ](https://blahaj.awoo.systems) and this challenge is named `sharky` it was
|
|
mandatory to get this flag, so i'm happy i figured it out even though crypto isn't really my strong
|
|
suit (yet,)
|