## copying and pasting code you don't entirely understand for fun and profit: *blocky noncense*
> I designed this foolproof signing blockchain. I'll even let you sign as many signatures as you
> want!

it's time for everyone's favorite part of crypto challenges, *source review*
### `blocks.sage`
```python
from Crypto.Cipher import AES
import sig_sage as sig # this is generated from sig.sage
import hashlib

class Chain:
    def __init__(self, seed):
        self.flag = b"csaw{[REDACTED]}"
        self.ecdsa = sig.ECDSA(seed)
        self.blocks = {0: [hashlib.sha1(self.flag).digest(), self.ecdsa.sign(self.flag)]}

    def commit(self, message, num):
        formatted = self.blocks[num-1][0] + message
        sig = self.ecdsa.sign(formatted)
        self.blocks[num] = [hashlib.sha256(message).digest(), sig]

    def view_messages(self):
        return self.blocks

    def verify_sig(self, r, s, message):
        t = self.ecdsa.verify(r, s, message)
        return t
```
this is the definition of the blockchain. the first block stores the SHA-1 hash of the flag and a
signature of the flag. every subsequent block stores the SHA-256 (!) hash of its message, along with
a signature over the previous block's stored hash concatenated with the message
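to make the data flow concrete, here's a minimal pure-python sketch of what gets stored versus what gets signed when a block is committed. `block_entry` and `signed_digest` are made-up helper names (not part of the challenge code), and the flag value is a placeholder

```python
import hashlib

def block_entry(message: bytes) -> bytes:
    # what Chain.commit stores as the block's "hash": SHA-256 of the message alone
    return hashlib.sha256(message).digest()

def signed_digest(prev_block_hash: bytes, message: bytes) -> int:
    # what actually gets signed: SHA-1 over (previous block's stored hash + message),
    # turned into an integer the same way bytes_to_long does (big-endian)
    formatted = prev_block_hash + message
    return int.from_bytes(hashlib.sha1(formatted).digest(), "big")

# block 0 stores the SHA-1 of the flag (placeholder flag here)
block0 = hashlib.sha1(b"csaw{placeholder}").digest()
h1 = signed_digest(block0, b"meow0")   # integer signed for block 1
block1 = block_entry(b"meow0")         # bytes stored in block 1
h2 = signed_digest(block1, b"meow1")   # integer signed for block 2
```

the takeaway is that the stored hash and the signed hash are different things, so the solver has to recompute the signed one itself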
### `sig.sage`
```python
from Crypto.Util.number import *
from Crypto.Cipher import AES
import random
import hashlib

def _hash(msg):
    return bytes_to_long(hashlib.sha1(msg).digest())
```
the hashes used for signature calculations are SHA-1. tbh i'm not sure why some hashes are SHA-1 and
others are SHA-256, and it's not relevant for the solution but we need to keep it in mind to be able
to replicate operations in the solution code
```python
class LCG:
    def __init__(self, seed, q):
        self.q = q
        self.a = random.randint(2,self.q)
        self.b = random.randint(2,self.a)
        self.c = random.randint(2,self.b)
        self.d = random.randint(2,self.c)
        self.seed = seed

    def next(self):
        seed = (self.a*self.seed^3 + self.b*self.seed^2 + self.c*self.seed + self.d) % self.q
        self.seed = seed
        return seed
```
the class is named after a Linear Congruential Generator, a type of PRNG that computes new state
from the previous state modulo `q`. a real LCG is linear (`a*x + b`), but this one uses a cubic
polynomial with random coefficients, so the nonces follow a degree-3 recurrence
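if you want to poke at the generator without sage, here's a rough plain-python equivalent. `CubicGenerator` is a made-up name, and the seed and small modulus are arbitrary values picked for illustration

```python
import random

class CubicGenerator:
    """Plain-Python stand-in for the challenge's LCG class (hypothetical name)."""

    def __init__(self, seed, q, rng=random):
        self.q = q
        # same nested coefficient ranges as the challenge code
        self.a = rng.randint(2, q)
        self.b = rng.randint(2, self.a)
        self.c = rng.randint(2, self.b)
        self.d = rng.randint(2, self.c)
        self.state = seed

    def next(self):
        x = self.state
        # note the **: in plain Python ^ is XOR, only Sage's preparser reads it as a power
        self.state = (self.a * x**3 + self.b * x**2 + self.c * x + self.d) % self.q
        return self.state

gen = CubicGenerator(seed=1234, q=1000003, rng=random.Random(0))
outputs = [gen.next() for _ in range(5)]
```

every output is fully determined by the previous one plus four secret coefficients, which is the structure the attack later exploits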
```python
class ECDSA:
    def __init__(self, seed):
        self.p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
        self.E = EllipticCurve(GF(self.p), [0, 7])
        self.G = self.E.lift_x(55066263022277343669578718895168534326250603453777594175500187360389116729240)
        self.order = 115792089237316195423570985008687907852837564279074904382605163141518161494337
        self.priv_key = random.randint(2,self.order)
        self.pub_key = self.G*self.priv_key
        self.lcg = LCG(seed, self.order)
```
it's always useful to double check elliptic curve params, so if we look up these numbers we can find
out that this is the curve `secp256k1` (also used in bitcoin). so it's just a standard curve, no
cheese going on
the other interesting bit here is the LCG is initialized with a modulus equal to the order of
`secp256k1` - this is sort of convenient for later
```python
    # (still inside class ECDSA)
    def sign(self, msg):
        nonce = self.lcg.next()
        hashed = _hash(msg)
        r = int((self.G*nonce)[0]) % self.order
        assert r != 0
        s = (pow(nonce,-1,self.order)*(hashed + r*self.priv_key)) % self.order
        return (r,s)
```
the sign function generates an ECDSA signature using the SHA-1 hash of the given message, and a
nonce value is computed using the LCG
one important fact you might know about ECDSA is that it's very dangerous to reuse nonces because it
allows recovery of the private key. however, nonce reuse doesn't seem to be an issue here, so we
have to find something else
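for reference, this is roughly what the classic nonce-reuse recovery looks like. it's a sketch with made-up key and nonce values, and `r` is just a stand-in number instead of a real `(G*k).x % q`, since the algebra only needs `r` to be the same in both signatures

```python
import hashlib

q = 115792089237316195423570985008687907852837564279074904382605163141518161494337

def h(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha1(msg).digest(), "big")

def sign_with_nonce(msg, d, k, r):
    # same formula as the challenge's sign(): s = k^-1 * (H(m) + r*d) mod q
    return (pow(k, -1, q) * (h(msg) + r * d)) % q

d = 0x1234567890ABCDEF  # made-up private key
k = 0xCAFEBABE          # nonce reused for both signatures (the bug)
r = 0xDEADBEEF          # stand-in for (G*k).x % q

m1, m2 = b"meow0", b"meow1"
s1 = sign_with_nonce(m1, d, k, r)
s2 = sign_with_nonce(m2, d, k, r)

# s1 - s2 = k^-1 * (h1 - h2)  =>  k = (h1 - h2) / (s1 - s2)
k_rec = ((h(m1) - h(m2)) * pow((s1 - s2) % q, -1, q)) % q
# s1 * k = h1 + r*d  =>  d = (s1*k - h1) / r
d_rec = ((s1 * k_rec - h(m1)) * pow(r, -1, q)) % q
```

the challenge never repeats a nonce, so this exact trick doesn't apply, but the nonces being *related* turns out to be enough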
also i cut out the `verify` function because we don't actually care
### `chall.sage`
it basically just makes a menu to interact with the challenge, it's not super interesting
### elliptic curve crypto
[skip this section](#duckduckbing-solve-this-for-me)
i'm going to actually go over ECC even though you really don't actually need to know what's going on
at all for this. just for context
**Group**: it's kind of like a generalization of addition (or multiplication??? don't think about it
too hard). for example, the integers with the integer addition operation are a group. specifically,
groups are a **set of elements** with an associative **binary operator** that takes two elements and
produces a third element that's in the set, and there's an **identity element** and an **inverse for
every element**. also the operation doesn't have to be commutative
**Field**: a generalization of addition, subtraction, multiplication, and division. something we're
interested in here is a *finite field* which is also called a *Galois field* (after this weird
french revolutionary who solved all of math in his teens and then died in a duel -- fucking
frenchmen am i right). a convenient type of finite field is integers modulo a prime number `p`. you
can take a moment to convince yourself that all arithmetic still works kinda like normal when it's
all modulo a prime number (also think of division as a multiplicative inverse). or don't, i'm not a
textbook author. this is called `GF(p)`
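if you do want to convince yourself, here's a tiny sketch of arithmetic in `GF(17)` using nothing but python integers. the prime and the operands are arbitrary small values, and `pow(x, -1, p)` needs python 3.8 or newer

```python
p = 17  # small prime, for illustration only

a, b = 5, 12
add = (a + b) % p        # 17 % 17 = 0
mul = (a * b) % p        # 60 % 17 = 9
inv_b = pow(b, -1, p)    # multiplicative inverse of b: 12 * 10 = 120 = 7*17 + 1, so 10
div = (a * inv_b) % p    # "a / b" in GF(p): 50 % 17 = 16

# every nonzero element has an inverse, which is what makes this a field
all_invertible = all((x * pow(x, -1, p)) % p == 1 for x in range(1, p))
```

"division" here just means multiplying by the inverse, and you can check that `div * b % p` gives back `a`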
**Elliptic Curve**: an elliptic curve is defined by the equation `y^2 = x^3 + ax + b`. elliptic
curves can also be defined over *Galois fields*, in which case all the coordinates of points on the
curve will be elements of the field (for `GF(p)`, integers mod `p`). elliptic curves can
conveniently form *groups* with the group operation of point addition. don't worry about point
addition except knowing that it takes two curve points and produces a third point
when considering some elliptic curve `E` on `GF(p)`, a group can be formed using a generator point
`G` that is on the curve, and that is capable of producing every other point in the group by using
the group operator some number of times. so we say that `G` "multiplied" by some scalar `s` as in
`G*s` means that you are applying the group operation on `G` with itself `s` times (this is called
scalar multiplication). confusingly, some people write the group additively (`A+B` for two points,
`G*s` for repeated addition) and others write it multiplicatively (`A*B` for two points, in which
case `G*s` is really more analogous to "raising to a power"). math people are normal, don't worry
about it
the elliptic curve group also has an *order* `q`, which is the number of elements in the group. the
order is not necessarily the same as the size of the field that the curve is defined on (`p != q`)
there are a bunch of standard curves which are predefined for use in cryptography, so you don't have
to generate your own curves from scratch, which is hard. NIST publishes several (like P-256), and
SECG publishes others. `secp256k1` is one of the SECG curves, not a NIST one
the property that makes this useful for cryptography is the difficulty of the *Discrete Logarithm
Problem*. basically, if you do `G*s` for some secret `s`, it's computationally Very Hard™ to recover
what `s` was
and now, finally, we get to signatures. first, generate a keypair. `d` is the private key, a scalar
in the range `1..q-1`, and the public key is `Q = G*d`, a curve point
to sign a message `m`, take its hash `H(m)`. then, generate a random nonce `k` (the RNG must be very
strong). produce two values
```text
r = (G*k).x % q
s = (k^-1) * (H(m) + r*d) % q
```
(btw here `(expr).x` means the x-coordinate of a curve point. also sorry for the confusing mixture
of group operation and regular arithmetic in `Zmod(q)`. for further questions, consult the nearest
wikipedia and/or crypto girl infodump)
the signature consists of the pair `(r, s)`, which you can also see in the challenge code
next, to verify a signature
```text
u1 = H(m) * s^-1 % q
u2 = r * s^-1 % q
r == (G*u1 + Q*u2).x % q
```
extracting the private key from a signature would require solving the discrete log problem, and
signatures cannot be forged without the private key (as long as the nonces are unpredictable)
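here's a toy pure-python sketch of the sign and verify formulas on the same curve, in case seeing it run helps. the helper names (`add`, `mul`, `H`, `sign`, `verify`) are mine, it skips the `r == 0` / `s == 0` edge cases, and it is not constant-time or otherwise safe for real use

```python
import hashlib
import secrets

# secp256k1 parameters, copied from the challenge's ECDSA class
p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
q = 115792089237316195423570985008687907852837564279074904382605163141518161494337
Gx = 55066263022277343669578718895168534326250603453777594175500187360389116729240
# p % 4 == 3, so a square root of x^3 + 7 is (x^3 + 7)^((p+1)/4); either root works here
Gy = pow(Gx**3 + 7, (p + 1) // 4, p)
G = (Gx, Gy)

def add(P, Q):
    # the group operation; None plays the role of the identity element
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def mul(P, n):
    # double-and-add scalar multiplication, i.e. P*n
    R = None
    while n:
        if n & 1:
            R = add(R, P)
        P = add(P, P)
        n >>= 1
    return R

def H(m):
    return int.from_bytes(hashlib.sha1(m).digest(), "big")

def sign(m, d):
    k = secrets.randbelow(q - 1) + 1  # fresh, unpredictable nonce in 1..q-1
    r = mul(G, k)[0] % q
    s = pow(k, -1, q) * (H(m) + r * d) % q
    return r, s

def verify(m, r, s, Q):
    s_inv = pow(s, -1, q)
    u1, u2 = H(m) * s_inv % q, r * s_inv % q
    return add(mul(G, u1), mul(Q, u2))[0] % q == r

d = 0x1234567890ABCDEF        # made-up private key
Q = mul(G, d)                 # matching public key
r, s = sign(b"meow", d)
ok = verify(b"meow", r, s, Q)
```

the only difference from the challenge's `sign` is where `k` comes from: here it's a CSPRNG, there it's the cubic generator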
and that's basically it. now you know about elliptic curves
### duckduckbing solve this for me
top 10 CTF skills you need to have in 2023 -- search online for the solution to a problem
so i immediately went and searched for something like "ECDSA LCG attack", y'know just based on the
algorithms that are present here and what we want to do, and found this 2023 paper:
<https://eprint.iacr.org/2023/305.pdf>
that's pretty convenient, but we don't have time to read a crypto paper, so scroll down to where
they list the PoC repo:
<https://github.com/kudelskisecurity/ecdsa-polynomial-nonce-recurrence-attack>
(the key takeaway is that you can construct a polynomial out of the signatures, knowing the
structure of the recurrence relation for the nonces, and then one of the roots of the polynomial is
the private key used for the signatures)
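the first building block is that every nonce, and so every difference of two nonces, is a degree-1 expression in the unknown private key. here's a small numeric sanity check of that, with made-up random values standing in for the key, nonces, hashes and `r` values

```python
import random

q = 115792089237316195423570985008687907852837564279074904382605163141518161494337
rng = random.Random(1)

d = rng.randrange(1, q)                       # made-up private key
k = [rng.randrange(1, q) for _ in range(2)]   # two arbitrary nonces
h = [rng.randrange(1, q) for _ in range(2)]   # stand-ins for message hashes
r = [rng.randrange(1, q) for _ in range(2)]   # stand-ins for (G*k).x % q
s = [pow(k[i], -1, q) * (h[i] + r[i] * d) % q for i in range(2)]
s_inv = [pow(x, -1, q) for x in s]

# rearranging s = k^-1 * (h + r*d) gives k = s^-1 * (h + r*d), so
# k_0 - k_1 = d*(r_0/s_0 - r_1/s_1) + (h_0/s_0 - h_1/s_1), which is linear in d
slope = (r[0] * s_inv[0] - r[1] * s_inv[1]) % q
const = (h[0] * s_inv[0] - h[1] * s_inv[1]) % q
k_diff = (slope * d + const) % q
```

`slope` and `const` only use public values, so an attacker can write down `k_0 - k_1` as a polynomial in `d` without knowing either nonce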
we need to adapt this to the challenge implementation of #blockchain, so next we take a look at the
basic implementation of the attack, ignoring all the stuff that is specific to bitcoin and ethereum,
which is in `original-attack/recurrence_nonces.py`
first, there's a comment explaining how many signatures we need to collect for our case
```python
# N = the number of signatures to use, N >= 4
# the degree of the recurrence relation is N-3
# the number of unknown coefficients in the recurrence equation is N-2
# the degree of the final polynomial in d is 1 + Sum_(i=1)^(i=N-3)i
```
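the generator has 4 unknown coefficients (`a`, `b`, `c`, `d`), and per the comment the number of
unknown coefficients is `N-2`, so we need `N=6` signatures. that also matches the recurrence degree
`N-3 = 3` for a cubic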
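the bookkeeping from that comment, written out as a quick sketch. the function names are made up, the formulas are just the ones in the PoC comment

```python
def signatures_needed(unknown_coefficients: int) -> int:
    # PoC comment: number of unknown coefficients = N - 2
    return unknown_coefficients + 2

def final_degree(n: int) -> int:
    # PoC comment: degree in d = 1 + Sum_(i=1)^(i=N-3) i
    return 1 + sum(range(1, n - 2))

N = signatures_needed(4)   # coefficients a, b, c, d
deg = final_degree(N)      # 1 + (1 + 2 + 3)
```

so with 6 signatures the final polynomial in the private key has degree 7, which is tiny as far as root finding goes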
now, we simply need to do the pro gamer strategy of copying and pasting code we barely understand
```python
N = 6
order=115792089237316195423570985008687907852837564279074904382605163141518161494337
Z = GF(order)
R = PolynomialRing(Z, names=('dd',))
(dd,) = R._first_ngens(1)

# the polynomial we construct will have degree 1 + Sum_(i=1)^(i=N-3)i in dd
# our task here is to compute this polynomial in a constructive way starting from the N signatures in the given list order
# the generic formula will be given in terms of differences of nonces, i.e. k_ij = k_i - k_j where i and j are the signature indexes
# each k_ij is a first-degree polynomial in dd
# this function has the goal of returning it given i and j
def k_ij_poly(i, j):
    hi = Z(h[i])
    hj = Z(h[j])
    s_invi = Z(s_inv[i])
    s_invj = Z(s_inv[j])
    ri = Z(r[i])
    rj = Z(r[j])
    poly = dd*(ri*s_invi - rj*s_invj) + hi*s_invi - hj*s_invj
    return poly

# the idea is to compute the polynomial recursively from the given degree down to 0
# the algorithm is as follows:
# for 4 signatures the second degree polynomial is:
# k_12*k_12 - k_23*k_01
# so we can compute its coefficients.
# the polynomial for N signatures has degree 1 + Sum_(i=1)^(i=N-3)i and can be derived from the one for N-1 signatures
# let's define dpoly(i, j) recursively as the dpoly of degree i starting with index j
def dpoly(n, i, j):
    if i == 0:
        return (k_ij_poly(j+1, j+2))*(k_ij_poly(j+1, j+2)) - (k_ij_poly(j+2, j+3))*(k_ij_poly(j+0, j+1))
    else:
        left = dpoly(n, i-1, j)
        for m in range(1,i+2):
            left = left*(k_ij_poly(j+m, j+i+2))
        right = dpoly(n, i-1, j+1)
        for m in range(1,i+2):
            right = right*(k_ij_poly(j, j+m))
        return (left - right)
```
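if copying the recursion blind feels too cursed, here's a plain-python check of what it claims: when the nonces really come from a cubic recurrence mod `q`, the expression evaluates to zero. the coefficients and starting state are made-up random values, and `dpoly_value` is my numeric rewrite of `dpoly`, not part of the PoC

```python
import random

q = 115792089237316195423570985008687907852837564279074904382605163141518161494337
rng = random.Random(2023)

# made-up cubic recurrence coefficients and starting state
a, b, c, d0 = (rng.randrange(2, q) for _ in range(4))
state = rng.randrange(2, q)
k = []
for _ in range(6):
    state = (a * state**3 + b * state**2 + c * state + d0) % q
    k.append(state)

def k_ij(i, j):
    # the real nonce difference; in the attack this is a degree-1 polynomial in dd
    return (k[i] - k[j]) % q

def dpoly_value(i, j):
    # same recursion as dpoly, evaluated on numbers instead of polynomials
    if i == 0:
        return (k_ij(j + 1, j + 2) ** 2 - k_ij(j + 2, j + 3) * k_ij(j, j + 1)) % q
    left = dpoly_value(i - 1, j)
    for m in range(1, i + 2):
        left = left * k_ij(j + m, j + i + 2) % q
    right = dpoly_value(i - 1, j + 1)
    for m in range(1, i + 2):
        right = right * k_ij(j, j + m) % q
    return (left - right) % q

N = 6
residual = dpoly_value(N - 4, 0)  # 0 for nonces produced by a cubic recurrence
```

in the real attack the nonces are unknown, so each `k_ij` is the linear polynomial in `dd` instead, and "this expression is zero" becomes "the private key is a root of this polynomial"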
### rest of the fucking owl
now with the "hard" stuff out of the way we're actually gaming
let's build the rest of the solution
```python
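import binascii
import hashlib
import ecdsa  # ecdsa.numbertheory.inverse_mod is used further down
import nclib
from Crypto.Util.number import bytes_to_long
def _hex(x):
    return binascii.hexlify(x).decode()
def _hash(msg):
    return bytes_to_long(hashlib.sha1(msg).digest())
conn = nclib.Netcat(("crypto.csaw.io", 5002))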
```
so the strategy is: we make 6 messages with known contents, and then we retrieve the contents of the
blockchain, which includes the message hashes (note: SHA-256 hashes, the ones used for block
chaining, not the SHA-1 hashes used for the signature itself) and signatures. this also retrieves
the original block containing the flag. since we don't know the flag we don't include that block's
signature in the attack algorithm, but we do need its stored hash, because it's the prefix of the
data that gets signed for our first message
```python
h = []
sgns = []
msgs = []

def get_messages(count):
    all_msgs = []
    conn.recv_until(b": ")
    conn.send_line(b"2")
    for i in range(count):
        conn.recv_until(f"Block {i}\n".encode())
        msg_line = conn.recv_line()
        print(msg_line)
        msg = binascii.unhexlify(msg_line.strip().split(b" ")[1])
        sig_line = conn.recv_line()
        sig_rs = sig_line.decode().split("(")[1].split(")")[0]
        r, s = sig_rs.split(", ")
        r = int(r)
        s = int(s)
        all_msgs.append((msg, r, s))
    return all_msgs

for i in range(N):
    msg = f"meow{i}".encode()
    msgs.append(msg)
    conn.recv_until(b": ")
    conn.send_line(b"1")
    conn.send_line(_hex(msg))

all_msgs = get_messages(N+1)
```
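the parsing step can be tried offline too. this is a sketch of the same splitting logic on sample lines: the message line is copied from the run output further down, but the signature line (including its `Signature` prefix and the numbers in it) is made up, since all the parser relies on is the `(r, s)` part

```python
import binascii

def parse_block(msg_line: bytes, sig_line: bytes):
    # "Message <hex digest>" -> raw digest bytes
    digest = binascii.unhexlify(msg_line.strip().split(b" ")[1])
    # anything shaped like "...(r, s)..." -> two integers
    inner = sig_line.decode().split("(")[1].split(")")[0]
    r, s = (int(x) for x in inner.split(", "))
    return digest, r, s

sample_msg = b"Message 0058d5ebdef0730f71787e6ff2617ffdd3d0060f\n"
sample_sig = b"Signature (12345, 67890)\n"  # hypothetical line format
digest, r, s = parse_block(sample_msg, sample_sig)
```

block 0's digest is 20 bytes (SHA-1 of the flag) while every later one is 32 bytes (SHA-256), which is a handy check that the blocks were read in the right order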
next, we process the data into a form suitable for the paper's PoC code
```python
for i in range(N):
    (_, r, s) = all_msgs[i+1]
    prev_msg = all_msgs[i][0]
    real_hash = _hash(prev_msg + msgs[i])
    h.append(real_hash)
    sgns.append((r,s))

s_inv = []
s = []
r = []
for i in range(N):
    s.append(sgns[i][1])
    r.append(sgns[i][0])
    s_inv.append(ecdsa.numbertheory.inverse_mod(s[i], order))
```
finally, do the attack
```python
poly_target = dpoly(N-4, N-4, 0)
d_guesses = poly_target.roots()
print("results!!!!!\n\n\n")
print(d_guesses)
```
and send it to the server, which should then give back the flag
```python
if len(d_guesses) == 1:
    conn.recv_until(b": ")
    conn.send_line(b"4")
    res = str(d_guesses[0][0]).encode()
    print("SENDING", res)
    conn.send_line(res)
    print(conn.recv_all(timeout=5))
```
let's run this on the local copy and you can see it prints the flag
```bash
ncat --exec "/usr/bin/sage chall.sage" -lp 1337 &
sage attack.sage
b'Message 0058d5ebdef0730f71787e6ff2617ffdd3d0060f\n'
b'Message 5be17c92f7ecb50ab5a44969957c3c036122f7dbd8e2cd4fd7a5c1974ba22f66\n'
b'Message fec07ac858649497dedb5136d9c3eff1e61394c549286888fb0a893d74f035f2\n'
b'Message 1dbbae23136fd579d032d52cb6c5b516ead212103f0d2285092b9c5f1412bced\n'
b'Message 4231d9b332338107def060b2a271bf6d6aded7b643b54744538e99901b38886a\n'
b'Message 2014ed4b6e2ba09e8d3c5be90d81881d4ac30531adc7644436122386d2ee42b3\n'
b'Message 844005fbfdd684f7e9e07f751a48d2550b288f702bb3ed0004735471a41b1229\n'
(((k12*k12-k23*k01)*k13*k23-(k23*k23-k34*k12)*k01*k02)*k14*k24*k34-((k23*k23-k34*k12)*k24*k34-(k34*k34-k45*k23)*k12*k13)*k01*k02*k03)results!!!!!
[(109627846673527107412168645315617142862787762347616623632546915862323406286018, 1)]
SENDING b'109627846673527107412168645315617142862787762347616623632546915862323406286018'
b"So you think you can get the flag huh? Try your luck.\nPrivate Key: You must be our admin. Here's the flag b'csaw{[REDACTED]}'\n"
```
full attack script:
<https://git.lain.faith/haskal/writeups/raw/branch/main/2023/csawq/attack_blocky.sage>