Triss 057f41dcd2 | ||
---|---|---|
img | ||
src | ||
.gitignore | ||
CMakeLists.txt | ||
README.md | ||
pico_sdk_import.cmake |
README.md
rorand
Ring oscillator-based TRNG for the RP2040
What is it
rorand
provides a true random number generator (TRNG) using only hardware
that's available inside the
RP2040
chip. It can be used either in its 'raw' form, or with an extra
whitening/compression step based on Ascon-Xofa.
Comparison
Name | Throughput | Entropy assurance | Extra components | Whitening | URL |
---|---|---|---|---|---|
rorand (this work) | 40 kbit/s | passes Dieharder (see below) | none | Ascon-Xofa | you are here! |
Adafruit Trinkey TRNG | (40 kbit/s comms. limit) | Common Criteria EAL6+ | Infineon Trust M | ? | https://learn.adafruit.com/trinkey-qt2040-true-random-number-generator/overview |
caprand | ? | ? | one 10nF capacitor | ChaCha20 | https://github.com/mkj/caprand |
rp2040-entropy | 300 kbit/s? | fails a few dieharder tests | none (ADC noise) | none | https://github.com/hashky/rp2040-entropy |
rprand | ~10 kbit/s | ? | none (ROSC+DMA) | CRC32 | https://github.com/alastairpatrick/rprand |
pico-sdk | ? (probably >1Mbit/s) | ? (seed only, RNG is Xorshiro) | none (ROSC seed) | Xorshiro128 | https://github.com/raspberrypi/pico-sdk/blob/master/src/rp2_common/pico_rand/rand.c |
Cornell ECE4760 | ? | convergence to normal distribution | none (ROSC seed) | newlib/RAND48 | https://people.ece.cornell.edu/land/courses/ece4760/RP2040/C_SDK_random/index_random.html |
Requirements
- Pico C SDK
- System must not run from ROSC
- System clock must be at least 48 MHz (80 MHz or higher preferred, 125 MHz or higher ideal)
- SysTick not used by something else
How it works
The RP2040 includes a ring oscillator (unimaginatively called ROSC). Such an oscillator is nothing more than a chain of inverters wired up in a loop. If the number of inverters is odd, the circuit will oscillate.
Ring oscillators (just like many other oscillators) exhibit a phenomenon called jitter: the length of the clock period varies ever so slightly from one period to the next. This stems from various forms of noise (thermal noise, shot noise, etc.) in the semiconductor, which are ultimately caused by the inherent randomness of quantum physics.
The RP2040 ROSC has a register called "RANDOMBIT", which is basically a way of sampling the current state of the oscillator. On first sight, this sounds like it would be random. However, much of the perceived 'randomness' would actually stem from aliasing, because the code reading this register would be undersampling.
Instead, let's use the jitter phenomenon slightly differently: the phase
(jitter) noise from multiple oscillator cycles accumulates. Thus, the actual
real-world time it takes to complete N
clock cycles has a larger variance
than the jitter of just one clock cycle. This is visualized in the image below:
By using the COUNT register of the RP2040 ROSC, we can time how long it takes for it to reach zero (compared to eg. the ARM SysTick timer running from the CPU clock). We can use the least-significant bit of this timing difference (counted in CPU cycles) as the randomness output!
(As a sidenote: 'real professional' ring oscillator-based TRNGs typically use a "transient effect ring oscillator" (TERO) design, which is a bit more robust. Though, it also needs multiple oscillators and counters, and we don't have that available here. So, the basic RO TRNG it is.)
Well, almost. We still need to do a crucial final step: entropy extraction. There are two types of entropy extractors used most often: the parity filter (which XORs the last N output bits together), and the Von Neumann filter (described in the link). The Von Neumann filter is able to remove all statistical bias, but has more requirements on the probability distribution of the inputs (they must be "Independent and identically distributed", or IID for short). Meanwhile, the parity filter has fewer requirements (only independence of inputs), but it cannot remove all bias in practice (though it converges to this for larger and larger N).
For a ring oscillator, one would expect the random bits (generated as described above) to be IID, so we can use the Von Neumann extractor. (This assumption seems to hold in practice.) Oh, and we can still XOR this bit with the RANDOMBIT register, because it's there.
This is essentially how the "raw" output is generated. rorand
also provides a
"cooked" output, which feeds the output of the "raw" TRNG into Ascon-Xofa. This
creature is a cryptographic PRNG based on what cryptographers call a "sponge
construction". Sponge
constructions have all sorts of cool properties (which I won't talk about here
because it'd detract from the article, sadly), but one of them is that you can
build an extensible-output function (XOF) from them. XOFs are like hash
functions, but the output length can be made arbitrarily long. This makes it
rather easy to make a cryptographically-secure reseedable PRNG.
Ascon is the finalist for the "lightweight portfolio" of the CAESAR competition. It's a rather new (but promising, and built from solid foundations) cryptographic cipher, specifically designed for slow embedded devices, such as the RP2040. It should thus be faster than other options like ChaCha20/BLAKE2 or ARC4Random (which is now considered insecure). AEGIS and Deoxys-II, the other two CAESAR finalists for other purposes, are based on the AES round transformation, and would thus be quite a bit slower.
How good is it
Entropy
The randomness obtained from this TRNG seems to pass the Dieharder test suite. The output is given below.
Note that a few entries are marked as WEAK
, but those seem to appear and
disappear when I try and retry different runs. This seems to happen due to the
used sample size: one megabyte of random data takes a few minutes to generate,
so much larger samples are quite a bit of a hassle to obtain. Running output
from /dev/random
with the same sizes also has the WEAK
warnings pop up and
disappear at random, so it should be fine.
WARNING: However, these WEAK
warnings seem to pop up more often with
decreasing RP2040 system clock speed (and, slightly, voltage as well). While no
FAILED
errors have popped up, depending on your needs, only use rorand
with
a high enough system clock frequency (≥80 MHz preferred, ≥125 MHz ideal).
Dieharder isn't the most stringent test suite, though. TestU01's BigCrush suite is much more comprehensive, but requires much more data (many gigabytes). Due to the limited throughput of a hardware TRNG, using this test isn't very feasible. Real evaluations such as FIPS 140-2, or Common Criteria EAL4 or higher are completely out of my means.
In conclusion, don't trust someone's life --- neither your own nor someone else's --- with this. But otherwise, you should be fine. If you need the assurance (or if you're paranoid), please use a FIPS- or Common Criteria-certified hardware module, though.
Dieharder output (raw output, RP2040 running at 250 MHz, core voltage 1.15 V):
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
diehard_birthdays| 0| 100| 100|0.22538373| PASSED
diehard_operm5| 0| 1000000| 100|0.63794545| PASSED
diehard_rank_32x32| 0| 40000| 100|0.73090642| PASSED
diehard_rank_6x8| 0| 100000| 100|0.99254745| PASSED
diehard_bitstream| 0| 2097152| 100|0.58114397| PASSED
diehard_opso| 0| 2097152| 100|0.18319256| PASSED
diehard_oqso| 0| 2097152| 100|0.93914840| PASSED
diehard_dna| 0| 2097152| 100|0.74871515| PASSED
diehard_count_1s_str| 0| 256000| 100|0.20413208| PASSED
diehard_count_1s_byt| 0| 256000| 100|0.56507750| PASSED
diehard_parking_lot| 0| 12000| 100|0.43004782| PASSED
diehard_2dsphere| 2| 8000| 100|0.68150341| PASSED
diehard_3dsphere| 3| 4000| 100|0.50961304| PASSED
diehard_squeeze| 0| 100000| 100|0.73895828| PASSED
diehard_sums| 0| 100| 100|0.89067279| PASSED
diehard_runs| 0| 100000| 100|0.88549264| PASSED
diehard_runs| 0| 100000| 100|0.17826705| PASSED
diehard_craps| 0| 200000| 100|0.74363795| PASSED
diehard_craps| 0| 200000| 100|0.92809716| PASSED
marsaglia_tsang_gcd| 0| 10000000| 100|0.21033260| PASSED
marsaglia_tsang_gcd| 0| 10000000| 100|0.90826850| PASSED
sts_monobit| 1| 100000| 100|0.66704052| PASSED
sts_runs| 2| 100000| 100|0.82376608| PASSED
sts_serial| 1| 100000| 100|0.40789077| PASSED
sts_serial| 2| 100000| 100|0.11533197| PASSED
sts_serial| 3| 100000| 100|0.57906405| PASSED
sts_serial| 3| 100000| 100|0.88247272| PASSED
sts_serial| 4| 100000| 100|0.35109410| PASSED
sts_serial| 4| 100000| 100|0.06001064| PASSED
sts_serial| 5| 100000| 100|0.24597035| PASSED
sts_serial| 5| 100000| 100|0.42776664| PASSED
sts_serial| 6| 100000| 100|0.34671248| PASSED
sts_serial| 6| 100000| 100|0.63443378| PASSED
sts_serial| 7| 100000| 100|0.34901523| PASSED
sts_serial| 7| 100000| 100|0.87482945| PASSED
sts_serial| 8| 100000| 100|0.83333396| PASSED
sts_serial| 8| 100000| 100|0.98096154| PASSED
sts_serial| 9| 100000| 100|0.18613150| PASSED
sts_serial| 9| 100000| 100|0.98222632| PASSED
sts_serial| 10| 100000| 100|0.06064720| PASSED
sts_serial| 10| 100000| 100|0.53659409| PASSED
sts_serial| 11| 100000| 100|0.48783888| PASSED
sts_serial| 11| 100000| 100|0.44013196| PASSED
sts_serial| 12| 100000| 100|0.89056741| PASSED
sts_serial| 12| 100000| 100|0.55084354| PASSED
sts_serial| 13| 100000| 100|0.55991536| PASSED
sts_serial| 13| 100000| 100|0.79919790| PASSED
sts_serial| 14| 100000| 100|0.99559967| WEAK
sts_serial| 14| 100000| 100|0.60230091| PASSED
sts_serial| 15| 100000| 100|0.91600221| PASSED
sts_serial| 15| 100000| 100|0.29385991| PASSED
sts_serial| 16| 100000| 100|0.80170010| PASSED
sts_serial| 16| 100000| 100|0.89169608| PASSED
rgb_bitdist| 1| 100000| 100|0.25864662| PASSED
rgb_bitdist| 2| 100000| 100|0.41244241| PASSED
rgb_bitdist| 3| 100000| 100|0.80837541| PASSED
rgb_bitdist| 4| 100000| 100|0.93133416| PASSED
rgb_bitdist| 5| 100000| 100|0.70903269| PASSED
rgb_bitdist| 6| 100000| 100|0.29717638| PASSED
rgb_bitdist| 7| 100000| 100|0.71361547| PASSED
rgb_bitdist| 8| 100000| 100|0.03450956| PASSED
rgb_bitdist| 9| 100000| 100|0.14545557| PASSED
rgb_bitdist| 10| 100000| 100|0.90299368| PASSED
rgb_bitdist| 11| 100000| 100|0.53321027| PASSED
rgb_bitdist| 12| 100000| 100|0.65243210| PASSED
rgb_minimum_distance| 2| 10000| 1000|0.29295560| PASSED
rgb_minimum_distance| 3| 10000| 1000|0.28825100| PASSED
rgb_minimum_distance| 4| 10000| 1000|0.84249889| PASSED
rgb_minimum_distance| 5| 10000| 1000|0.53807064| PASSED
rgb_permutations| 2| 100000| 100|0.72182964| PASSED
rgb_permutations| 3| 100000| 100|0.98132372| PASSED
rgb_permutations| 4| 100000| 100|0.32426721| PASSED
rgb_permutations| 5| 100000| 100|0.94033412| PASSED
rgb_lagged_sum| 0| 1000000| 100|0.78964883| PASSED
rgb_lagged_sum| 1| 1000000| 100|0.82881744| PASSED
rgb_lagged_sum| 2| 1000000| 100|0.90805811| PASSED
rgb_lagged_sum| 3| 1000000| 100|0.21428921| PASSED
rgb_lagged_sum| 4| 1000000| 100|0.67964064| PASSED
rgb_lagged_sum| 5| 1000000| 100|0.64463634| PASSED
rgb_lagged_sum| 6| 1000000| 100|0.94268050| PASSED
rgb_lagged_sum| 7| 1000000| 100|0.33288741| PASSED
rgb_lagged_sum| 8| 1000000| 100|0.79565468| PASSED
rgb_lagged_sum| 9| 1000000| 100|0.26291492| PASSED
rgb_lagged_sum| 10| 1000000| 100|0.23887951| PASSED
rgb_lagged_sum| 11| 1000000| 100|0.75511918| PASSED
rgb_lagged_sum| 12| 1000000| 100|0.44806331| PASSED
rgb_lagged_sum| 13| 1000000| 100|0.57950849| PASSED
rgb_lagged_sum| 14| 1000000| 100|0.37109436| PASSED
rgb_lagged_sum| 15| 1000000| 100|0.89758378| PASSED
rgb_lagged_sum| 16| 1000000| 100|0.06000037| PASSED
rgb_lagged_sum| 17| 1000000| 100|0.87051454| PASSED
rgb_lagged_sum| 18| 1000000| 100|0.47509380| PASSED
rgb_lagged_sum| 19| 1000000| 100|0.34523765| PASSED
rgb_lagged_sum| 20| 1000000| 100|0.98418490| PASSED
rgb_lagged_sum| 21| 1000000| 100|0.17652983| PASSED
rgb_lagged_sum| 22| 1000000| 100|0.17374595| PASSED
rgb_lagged_sum| 23| 1000000| 100|0.63170066| PASSED
rgb_lagged_sum| 24| 1000000| 100|0.28953394| PASSED
rgb_lagged_sum| 25| 1000000| 100|0.64849257| PASSED
rgb_lagged_sum| 26| 1000000| 100|0.78468683| PASSED
rgb_lagged_sum| 27| 1000000| 100|0.36484261| PASSED
rgb_lagged_sum| 28| 1000000| 100|0.66712700| PASSED
rgb_lagged_sum| 29| 1000000| 100|0.48604229| PASSED
rgb_lagged_sum| 30| 1000000| 100|0.17591412| PASSED
rgb_lagged_sum| 31| 1000000| 100|0.32433586| PASSED
rgb_lagged_sum| 32| 1000000| 100|0.12971163| PASSED
rgb_kstest_test| 0| 10000| 1000|0.47803450| PASSED
dab_bytedistrib| 0| 51200000| 1|0.29203091| PASSED
dab_dct| 256| 50000| 1|0.06993184| PASSED
Preparing to run test 207. ntuple = 0
dab_filltree| 32| 15000000| 1|0.38387488| PASSED
dab_filltree| 32| 15000000| 1|0.55295440| PASSED
Preparing to run test 208. ntuple = 0
dab_filltree2| 0| 5000000| 1|0.24922510| PASSED
dab_filltree2| 1| 5000000| 1|0.98227801| PASSED
Preparing to run test 209. ntuple = 0
dab_monobit2| 12| 65000000| 1|0.71153312| PASSED
At 48 MHz and a core voltage of 0.95 V, there would be three or four WARN
ings
(but still no FAILED
messages).
Throughput
The TRNG is rather CPU-intensive (due to its IRQ-blocking busyloop) and not very fast. The "cooked" (whitened) throughput is about 10 times higher than the "raw" throughput.
System clock | Raw TRNG throughput |
---|---|
48 MHz | 10 kbit/s |
125 MHz | 40 kbit/s |
250 MHz | 50 kbit/s |
What is missing?
- Online health tests: I don't know how to do this. I don't know what a "chi-squared" is. Health tests are rather important for TRNGs (to make sure nothing weird is happening), so this is not a small downside.
- DMA+PIO-based implementation: to relieve the CPU from some of its load. Currently, it's 100% a spinloop on the CPU, with interrupts disabled.
- Multicore mutex: currently, both CPU cores could run this code, which
would then modify the ROSC
COUNT
register in parallell. This is bad and will lead to bad results. Don't do this. (Or alternatively, add mutex guards, or bug me to do so.) - More useful
rand()
functions: currently, the code just gives you bits. Turning these bits into a random integer in a range, or a random float, ..., is currently left as an exercise for the reader. (Read: I totally forgot about these.) - Protections against physical attacks: if an adversary is in the physical proximity of the device, it's game over. They could attach an SWD debugger to the RP2040 (it has no protection against this), use a power side channel or EM emanations to look at the ROSC doing its oscillate-y thing, or tamper with the environment (e.g. freeze spray or harmonic EM emissions) to slow down the ROSC or make it lock onto a different frequency. I don't plan to add any defenses against such attacks. (Please get a properly certified hardware module if such attacks are part of your threat model, I'm begging you, don't use this.)
Usage
Here's a quick example:
#include <stdint.h>
#include <limits.h>
#include "rorand.h" /* raw rng */
#include "rourand.h" /* 'cooked'/whitened rng */
static void health_alarm_callback(const char* msg) {
// TRNG health alarm raised!
iprintf("TRNG health alarm! %s\n", msg);
}
int main() {
/** raw randomness **/
enum rourand_error r = rorand_init(health_alarm_callback /* can be NULL */);
if (r != rr_ok) {
panic("rorand failed to initialize!");
}
// get 128 bits of raw randomness
uint8_t data[16];
rorand_get(data, sizeof(data)*CHAR_BIT); // specified in bits
/** 'cooked'/whitened randomness **/
int rate = 0; // rate at which the CSPRNG will reseed with true random data. 0 is default.
// you (the user) must call rorand_init first before calling rourand_init!
struct rourand_state* ur = rourand_init(rorand_get, rate);
if (!ur) {
panic("rourand failed to initialize!");
}
rourand_get(ur, data, sizeof(data)); // specified in bytes
rourand_free(ur);
}
License
License is TBD.
The Ascon reference implementation is licensed under CC0 1.0.
The Raspberry Pico C SDK is licensed under the BSD-3-Clause license.