has-writeup/payload/leakycrypto
xenia 553522c799 huge major cleanup for pdf generation 2020-05-26 06:27:26 -04:00
README.md huge major cleanup for pdf generation 2020-05-26 06:27:26 -04:00
attack.py added leakycrypto 2020-05-26 01:19:24 -04:00

Leaky Crypto

Many optimized implementations of AES use lookup tables to combine the steps of each round (SubBytes, ShiftRows, MixColumns, AddRoundKey) into a handful of table lookups and XORs. For some X (the plaintext or the result of the previous round) and some K (the round key), both are split bytewise, and the XOR of each corresponding byte pair is used as the index into a lookup table. During the first round of AES, X is the plaintext of the message and K is the original key. Accordingly, given some known plaintext, leaking the lookup-table index for a particular byte leaks the corresponding key byte. Four lookup tables are used in each round of AES (except the last), and the table used for a given byte is determined by its index mod 4. We used this paper as a reference for both our understanding of AES and the attack detailed below.
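The relation between plaintext, key, and first-round table index can be sketched in a few lines of Python. This is only an illustration of the XOR relation described above, not an AES implementation; the function names are hypothetical and do not come from `attack.py`.

```python
def first_round_indices(plaintext, key):
    """Table index used for each byte in the first round: p_i XOR k_i."""
    return [p ^ k for p, k in zip(plaintext, key)]


def table_for_byte(i):
    """Which of the four lookup tables byte i uses (its index mod 4)."""
    return i % 4


def recover_key_byte(plaintext_byte, leaked_index):
    """If the table index leaks and the plaintext is known, XOR undoes the mix."""
    return plaintext_byte ^ leaked_index
```

Because XOR is its own inverse, knowing any two of the plaintext byte, key byte, and table index gives the third.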

Many CPUs cache RAM accesses to speed up subsequent accesses to the same address, because accessing RAM is quite slow and accessing cache is quite fast. On systems that implement such caching, this implies a correlation between the time it takes to encrypt a particular plaintext and how often the same value of a plaintext byte XORed with a key byte recurs, since a repeated table index hits the cache. Accordingly, for every pair of byte indexes $i, j$ in a family (with $p$ being the plaintext, and families corresponding to which lookup table is being used) and every value of $p_i \oplus p_j$, we calculate the average encryption time over all messages with that value. We then determine whether, for any pair of bytes $p_i, p_j$, some value of $p_i \oplus p_j$ gives a statistically significantly shorter encryption time than the overall average. If so, we can conclude that $p_i \oplus k_i = p_j \oplus k_j$, and therefore $p_i \oplus p_j = k_i \oplus k_j$. From this information, we obtain a redundant system of equations relating key bytes at different indexes to each other. Note that for this attack to work, we must know at least one key byte in each family in order to actually solve each system of equations. Additionally, because of how caches work, this attack only leaks the most significant $q$ bits of each byte ($q$ being related to the number of items in a cache line). Once the set of possible partial keys (accounting for the ambiguity in the least significant bits of each derived byte) has been obtained by the above method, an attacker may brute force the remaining unknown key bytes.
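Solving one family's system of equations amounts to propagating a known key byte through the recovered XOR relations. The sketch below is a minimal illustration under stated assumptions: `solve_family`, its argument layout, and the default of two ambiguous low bits (`low_bits=2`) are hypothetical choices made for the example, not values taken from the writeup.

```python
def solve_family(known, relations, low_bits=2):
    """Propagate known key bytes through relations k_i ^ k_j = delta.

    known:     {byte_index: key_byte} for bytes that are known exactly
    relations: {(i, j): delta}, where only the high bits of delta are trusted
    low_bits:  number of ambiguous low bits per derived byte (an assumption)

    Returns {byte_index: (value, exact)}; derived bytes have their low bits
    cleared and exact=False.
    """
    mask = (0xFF << low_bits) & 0xFF
    solved = {i: (k, True) for i, k in known.items()}
    changed = True
    while changed:
        changed = False
        for (i, j), delta in relations.items():
            # A relation can be used in either direction.
            for src, dst in ((i, j), (j, i)):
                if src in solved and dst not in solved:
                    solved[dst] = ((solved[src][0] ^ delta) & mask, False)
                    changed = True
    return solved
```

A family with no known byte never gets a starting point, which is why at least one known key byte per family is needed.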

In the case of Leaky Crypto, a set of 100,000 plaintexts and corresponding encryption times is provided, along with the first six bytes of the encryption key. We ran an analyzer program[^1] against these plaintexts to find the probable relations between key bytes at different indexes, based on the XOR of the corresponding plaintext bytes. Per the above, the plaintexts and timing data provided enough information to derive the systems of equations for the key bytes, and the first six bytes of the key provided enough information to actually solve them. Given the ambiguity of the low bits of each derived key byte, we obtained $2^{14}$ partial keys with three unknown bytes each. Thus, we reduced the problem from searching $2^{128}$ possible keys to searching only $2^{38}$ (each of the $2^{14}$ partial keys leaves $2^{24}$ values for its three unknown bytes). We fed our derived partial keys into Hulk to brute force the remaining bytes for each candidate partial key. After 30 minutes, we successfully brute forced the key.
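The search-space arithmetic can be checked with a short sketch. It assumes a split that is consistent with the numbers above but not stated in the writeup: six exactly known bytes, seven derived bytes with two ambiguous low bits each, and three fully unknown bytes. The helper names and the template layout are hypothetical.

```python
from itertools import product


def search_space_bits(n_high_only, n_unknown, low_bits=2):
    """Bits left to guess: ambiguous low bits plus fully unknown bytes."""
    return n_high_only * low_bits + n_unknown * 8


def candidate_bytes(high_value, low_bits=2):
    """All byte values sharing the trusted high bits of high_value."""
    base = high_value & ~((1 << low_bits) - 1) & 0xFF
    return [base | low for low in range(1 << low_bits)]


def partial_keys(template, low_bits=2):
    """Expand a key template into partial keys.

    template holds one entry per key byte: (value, True) for an exact byte,
    (value, False) for a byte with only its high bits known, or None for a
    fully unknown byte (left as None for the brute-force stage).
    """
    options = []
    for slot in template:
        if slot is None:
            options.append([None])
        elif slot[1]:
            options.append([slot[0]])
        else:
            options.append(candidate_bytes(slot[0], low_bits))
    return product(*options)
```

Under these assumptions, `search_space_bits(7, 3)` gives 38, and a template with seven high-bits-only bytes expands to 4^7 = 2^14 partial keys.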

    from itertools import combinations
    import numpy as np

    def find_outliers(corpus, num_samps, i, j):
        """Return the num_samps XOR values with the lowest mean time for bytes i, j."""
        return corpus[i][j].argsort()[:num_samps]
    
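    def guess_bytes(corpus, known_keybytes, num_samps, avg):
        """Collect, for each byte pair in a family, the XOR values whose mean
        encryption time is well below the overall average."""
        # NOTE: known_keybytes is accepted but not used here; solving for the
        # unknown key bytes from the printed candidates is a separate step.
        candidates = []
        for base in range(4):
            # Bytes whose indexes are equal mod 4 share a lookup table.
            family = [base, base + 4, base + 8, base + 12]
            for i, j in combinations(family, 2):
                guesses = find_outliers(corpus, num_samps, i, j)
                guesses2 = []
                for guess in guesses:
                    cnt = corpus[i][j][guess]
                    # Keep values at least 10 timing units faster than average.
                    if cnt - avg < -10:
                        guesses2.append((i, j, guess, cnt - avg))
                    print(i, j, guess, cnt - avg)
                candidates.append(tuple(guesses2))
        print(candidates)
        return candidates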
            
    if __name__ == '__main__':
        known_keybytes = bytes.fromhex("64c7072487f2")
        secret_data = "c1a5fe7beb2c70bfab98926627dcff8b9671edc52441....."
    
        data = set()
        with open("test.txt", "r") as fp:
            for line in fp:
                pt, timing = line.strip().split(',')
                pt = bytes.fromhex(pt)
                timing = int(timing)
                data.add((pt, timing))
            
        tavg = sum((d[1] for d in data)) / len(data)
        print("tavg: %d" % tavg)
    
        known_tly = np.zeros((16, 16, 256))
    
        for base in range(4):
            print("Building corpus for family %d" % base)
            family = [base, base + 4, base + 8, base + 12]
            for combo in combinations(family, 2):
                times = np.zeros(256)
                counts = np.zeros(256)
                i,j = combo
                print("Working on %d, %d" % (i, j))
                for d in data:
                    n = d[0][i] ^ d[0][j]
                    c = d[1]
                    times[n] += c
                    counts[n] += 1
                for c in range(256):
                    cnorm = times[c] / counts[c]
                    known_tly[i][j][c] = cnorm
                    known_tly[j][i][c] = cnorm
                
        guess_bytes(known_tly, known_keybytes, 4, tavg)