add readme

This commit is contained in:
Triss 2023-08-24 04:13:04 +02:00
parent 1117391aa6
commit 6aa14a75d4
3 changed files with 356 additions and 0 deletions

354
README.md
View File

@ -1 +1,355 @@
# rorand
Ring oscillator-based TRNG for the RP2040
## What is it
`rorand` provides a true random number generator (TRNG) using only hardware
that's available inside the
[RP2040](https://www.raspberrypi.com/documentation/microcontrollers/rp2040.html)
chip. It can be used either in its 'raw' form, or with an extra
whitening/compression step based on [Ascon](https://ascon.iaik.tugraz.at/)-Xofa.
## Comparison
| Name | Throughput | Entropy assurance | Extra components | Whitening | URL |
|:--------------------- |:------------------------ |:------------------------------ |:---------------- |:---------- |:------------- |
| rorand (this work) | 40 kbit/s | passes Dieharder (see below) | none | Ascon-Xofa | you are here! |
| Adafruit Trinkey TRNG | (40 kbit/s comms. limit) | Common Criteria EAL6+ | Infineon Trust M | ? | https://learn.adafruit.com/trinkey-qt2040-true-random-number-generator/overview |
| caprand | ? | ? | one 10nF capacitor | ChaCha20 | https://github.com/mkj/caprand |
| rp2040-entropy | 300 kbit/s? | fails a few dieharder tests | none (ADC noise) | none | https://github.com/hashky/rp2040-entropy |
| rprand | ~10 kbit/s | ? | none (ROSC+DMA) | CRC32 | https://github.com/alastairpatrick/rprand |
| pico-sdk | ? (probably >1Mbit/s) | ? (seed only, RNG is Xorshiro) | none (ROSC seed) | Xorshiro128 | https://github.com/raspberrypi/pico-sdk/blob/master/src/rp2_common/pico_rand/rand.c |
| Cornell ECE4760 | ? | convergence to normal distribution | none (ROSC seed) | newlib/RAND48 | https://people.ece.cornell.edu/land/courses/ece4760/RP2040/C_SDK_random/index_random.html |
## Requirements
* Pico C SDK
* System must not run from ROSC, and the system clock must be at least 48 MHz
(80 MHz or higher preferred, 125 MHz or higher ideal).
* SysTick not used by something else
## How it works
The RP2040 includes a [ring
oscillator](https://en.wikipedia.org/wiki/Ring_oscillator) (unimaginatively
called ROSC). Such an oscillator is nothing more than a chain of inverters
wired up in a loop. If the number of inverters is odd, the circuit will
oscillate.
Ring oscillators (just like many other oscillators) exhibit a phenomenon called
*jitter*: the length of the clock period varies ever so slightly from one
period to the next. This stems from various forms of noise (thermal noise, shot
noise, etc.) in the semiconductor, which are ultimately caused by the inherent
randomness of quantum physics.
The RP2040 ROSC has a register called
"[RANDOMBIT](https://datasheets.raspberrypi.com/rp2040/rp2040-datasheet.pdf#reg-rosc-RANDOMBIT)",
which is basically a way of sampling the current state of the oscillator. On
first sight, this *sounds* like it would be random. However, much of the
perceived 'randomness' would actually stem from
[aliasing](https://en.wikipedia.org/wiki/Aliasing), because the code reading
this register would be undersampling.
Instead, let's use the jitter phenomenon slightly differently: the phase
(jitter) noise from multiple oscillator cycles accumulates. Thus, the actual
real-world time it takes to complete `N` clock cycles has a larger variance
than the jitter of just one clock cycle. This is visualized in the image below:
![A diagram of multiple jittery squarewaves overlaid. They all start at the
same time, but their ending time varies a lot (while the first few edges all
happen at mostly the same moment). Image from slides by Adriaan Petermans.](img/jitter.png)
By using the [COUNT](https://datasheets.raspberrypi.com/rp2040/rp2040-datasheet.pdf#reg-rosc-COUNT)
register of the RP2040 ROSC, we can time how long it takes for it to reach zero
(compared to eg. the ARM SysTick timer running from the CPU clock). We can use
the least-significant bit of this timing difference (counted in CPU cycles) as
the randomness output!
(As a sidenote: 'real professional' ring oscillator-based TRNGs typically use a
"transient effect ring oscillator" (TERO) design, which is a bit more robust.
Though, it also needs multiple oscillators and counters, and we don't have that
available here. So, the basic RO TRNG it is.)
Well, almost. We still need to do a crucial final step: [entropy
extraction](https://en.wikipedia.org/wiki/Randomness_extractor). There are two
types of entropy extractors used most often: the parity filter (which XORs the
last N output bits together), and the Von Neumann filter (described in the link).
The Von Neumann filter is able to remove all [statistical
bias](https://en.wikipedia.org/wiki/Bias_(statistics)), but has more
requirements on the probability distribution of the inputs (they must be
"Independent and identically distributed", or IID for short). Meanwhile, the
parity filter has fewer requirements (only independence of inputs), but it
cannot remove all bias in practice (though it converges to this for larger and
larger N).
For a ring oscillator, one would expect the random bits (generated as described
above) to be IID, so we can use the Von Neumann extractor. (This assumption
seems to hold in practice.) Oh, and we can still XOR this bit with the
RANDOMBIT register, because it's there.
This is essentially how the "raw" output is generated. `rorand` also provides a
"cooked" output, which feeds the output of the "raw" TRNG into Ascon-Xofa. This
creature is a cryptographic PRNG based on what cryptographers call a "[sponge
construction](https://en.wikipedia.org/wiki/Sponge_function)". Sponge
constructions have all sorts of cool properties (which I won't talk about here
because it'd detract from the article, sadly), but one of them is that you can
build an *extensible-output function* (XOF) from them. XOFs are like hash
functions, but the output length can be made arbitrarily long. This makes it
rather easy to make a cryptographically-secure *reseedable* PRNG.
Ascon is the finalist for the "lightweight portfolio" of the [CAESAR
competition](https://competitions.cr.yp.to/caesar-submissions.html). It's a
rather new (but promising, and built from solid foundations) cryptographic
cipher, specifically designed for slow embedded devices, such as the RP2040. It
should thus be faster than other options like ChaCha20/BLAKE2 or ARC4Random
(which is now considered insecure). AEGIS and Deoxys-II, the other two CAESAR
finalists for other purposes, are based on the AES round transformation, and
would thus be quite a bit slower.
## How good is it
### Entropy
The randomness obtained from this TRNG seems to pass the
[Dieharder](https://webhome.phy.duke.edu/~rgb/General/dieharder.php) test
suite. The output is given below.
Note that a few entries are marked as `WEAK`, but those seem to appear and
disappear when I try and retry different runs. This seems to happen due to the
used sample size: one megabyte of random data takes a few minutes to generate,
so much larger samples are quite a bit of a hassle to obtain. Running output
from `/dev/random` with the same sizes also has the `WEAK` warnings pop up and
disappear at random, so it should be fine.
**WARNING**: However, these `WEAK` warnings seem to pop up more often with
decreasing RP2040 system clock speed (and, slightly, voltage as well). While no
`FAILED` errors have popped up, depending on your needs, only use `rorand` with
a high enough system clock frequency (≥80 MHz preferred, ≥125 MHz ideal).
Dieharder isn't the most stringent test suite, though.
[TestU01](http://simul.iro.umontreal.ca/testu01/tu01.html)'s BigCrush suite is
much more comprehensive, but requires ***much*** more data (many gigabytes).
Thanks to the limited throughput of a hardware TRNG, using this test isn't very
feasible. Real evaluations such as FIPS 140-2, or Common Criteria EAL4 or
higher, is completely out of my means.
In conclusion, don't trust someone's life --- neither your own nor someone
else's --- with this. But otherwise, you should be fine. If you need the
assurance (or if you're paranoid), please use a FIPS- or Common
Criteria-certified hardware module, though.
Dieharder output (raw output, RP2040 running at 250 MHz, core voltage 1.15 V):
```
#=============================================================================#
test_name |ntup| tsamples |psamples| p-value |Assessment
#=============================================================================#
diehard_birthdays| 0| 100| 100|0.22538373| PASSED
diehard_operm5| 0| 1000000| 100|0.63794545| PASSED
diehard_rank_32x32| 0| 40000| 100|0.73090642| PASSED
diehard_rank_6x8| 0| 100000| 100|0.99254745| PASSED
diehard_bitstream| 0| 2097152| 100|0.58114397| PASSED
diehard_opso| 0| 2097152| 100|0.18319256| PASSED
diehard_oqso| 0| 2097152| 100|0.93914840| PASSED
diehard_dna| 0| 2097152| 100|0.74871515| PASSED
diehard_count_1s_str| 0| 256000| 100|0.20413208| PASSED
diehard_count_1s_byt| 0| 256000| 100|0.56507750| PASSED
diehard_parking_lot| 0| 12000| 100|0.43004782| PASSED
diehard_2dsphere| 2| 8000| 100|0.68150341| PASSED
diehard_3dsphere| 3| 4000| 100|0.50961304| PASSED
diehard_squeeze| 0| 100000| 100|0.73895828| PASSED
diehard_sums| 0| 100| 100|0.89067279| PASSED
diehard_runs| 0| 100000| 100|0.88549264| PASSED
diehard_runs| 0| 100000| 100|0.17826705| PASSED
diehard_craps| 0| 200000| 100|0.74363795| PASSED
diehard_craps| 0| 200000| 100|0.92809716| PASSED
marsaglia_tsang_gcd| 0| 10000000| 100|0.21033260| PASSED
marsaglia_tsang_gcd| 0| 10000000| 100|0.90826850| PASSED
sts_monobit| 1| 100000| 100|0.66704052| PASSED
sts_runs| 2| 100000| 100|0.82376608| PASSED
sts_serial| 1| 100000| 100|0.40789077| PASSED
sts_serial| 2| 100000| 100|0.11533197| PASSED
sts_serial| 3| 100000| 100|0.57906405| PASSED
sts_serial| 3| 100000| 100|0.88247272| PASSED
sts_serial| 4| 100000| 100|0.35109410| PASSED
sts_serial| 4| 100000| 100|0.06001064| PASSED
sts_serial| 5| 100000| 100|0.24597035| PASSED
sts_serial| 5| 100000| 100|0.42776664| PASSED
sts_serial| 6| 100000| 100|0.34671248| PASSED
sts_serial| 6| 100000| 100|0.63443378| PASSED
sts_serial| 7| 100000| 100|0.34901523| PASSED
sts_serial| 7| 100000| 100|0.87482945| PASSED
sts_serial| 8| 100000| 100|0.83333396| PASSED
sts_serial| 8| 100000| 100|0.98096154| PASSED
sts_serial| 9| 100000| 100|0.18613150| PASSED
sts_serial| 9| 100000| 100|0.98222632| PASSED
sts_serial| 10| 100000| 100|0.06064720| PASSED
sts_serial| 10| 100000| 100|0.53659409| PASSED
sts_serial| 11| 100000| 100|0.48783888| PASSED
sts_serial| 11| 100000| 100|0.44013196| PASSED
sts_serial| 12| 100000| 100|0.89056741| PASSED
sts_serial| 12| 100000| 100|0.55084354| PASSED
sts_serial| 13| 100000| 100|0.55991536| PASSED
sts_serial| 13| 100000| 100|0.79919790| PASSED
sts_serial| 14| 100000| 100|0.99559967| WEAK
sts_serial| 14| 100000| 100|0.60230091| PASSED
sts_serial| 15| 100000| 100|0.91600221| PASSED
sts_serial| 15| 100000| 100|0.29385991| PASSED
sts_serial| 16| 100000| 100|0.80170010| PASSED
sts_serial| 16| 100000| 100|0.89169608| PASSED
rgb_bitdist| 1| 100000| 100|0.25864662| PASSED
rgb_bitdist| 2| 100000| 100|0.41244241| PASSED
rgb_bitdist| 3| 100000| 100|0.80837541| PASSED
rgb_bitdist| 4| 100000| 100|0.93133416| PASSED
rgb_bitdist| 5| 100000| 100|0.70903269| PASSED
rgb_bitdist| 6| 100000| 100|0.29717638| PASSED
rgb_bitdist| 7| 100000| 100|0.71361547| PASSED
rgb_bitdist| 8| 100000| 100|0.03450956| PASSED
rgb_bitdist| 9| 100000| 100|0.14545557| PASSED
rgb_bitdist| 10| 100000| 100|0.90299368| PASSED
rgb_bitdist| 11| 100000| 100|0.53321027| PASSED
rgb_bitdist| 12| 100000| 100|0.65243210| PASSED
rgb_minimum_distance| 2| 10000| 1000|0.29295560| PASSED
rgb_minimum_distance| 3| 10000| 1000|0.28825100| PASSED
rgb_minimum_distance| 4| 10000| 1000|0.84249889| PASSED
rgb_minimum_distance| 5| 10000| 1000|0.53807064| PASSED
rgb_permutations| 2| 100000| 100|0.72182964| PASSED
rgb_permutations| 3| 100000| 100|0.98132372| PASSED
rgb_permutations| 4| 100000| 100|0.32426721| PASSED
rgb_permutations| 5| 100000| 100|0.94033412| PASSED
rgb_lagged_sum| 0| 1000000| 100|0.78964883| PASSED
rgb_lagged_sum| 1| 1000000| 100|0.82881744| PASSED
rgb_lagged_sum| 2| 1000000| 100|0.90805811| PASSED
rgb_lagged_sum| 3| 1000000| 100|0.21428921| PASSED
rgb_lagged_sum| 4| 1000000| 100|0.67964064| PASSED
rgb_lagged_sum| 5| 1000000| 100|0.64463634| PASSED
rgb_lagged_sum| 6| 1000000| 100|0.94268050| PASSED
rgb_lagged_sum| 7| 1000000| 100|0.33288741| PASSED
rgb_lagged_sum| 8| 1000000| 100|0.79565468| PASSED
rgb_lagged_sum| 9| 1000000| 100|0.26291492| PASSED
rgb_lagged_sum| 10| 1000000| 100|0.23887951| PASSED
rgb_lagged_sum| 11| 1000000| 100|0.75511918| PASSED
rgb_lagged_sum| 12| 1000000| 100|0.44806331| PASSED
rgb_lagged_sum| 13| 1000000| 100|0.57950849| PASSED
rgb_lagged_sum| 14| 1000000| 100|0.37109436| PASSED
rgb_lagged_sum| 15| 1000000| 100|0.89758378| PASSED
rgb_lagged_sum| 16| 1000000| 100|0.06000037| PASSED
rgb_lagged_sum| 17| 1000000| 100|0.87051454| PASSED
rgb_lagged_sum| 18| 1000000| 100|0.47509380| PASSED
rgb_lagged_sum| 19| 1000000| 100|0.34523765| PASSED
rgb_lagged_sum| 20| 1000000| 100|0.98418490| PASSED
rgb_lagged_sum| 21| 1000000| 100|0.17652983| PASSED
rgb_lagged_sum| 22| 1000000| 100|0.17374595| PASSED
rgb_lagged_sum| 23| 1000000| 100|0.63170066| PASSED
rgb_lagged_sum| 24| 1000000| 100|0.28953394| PASSED
rgb_lagged_sum| 25| 1000000| 100|0.64849257| PASSED
rgb_lagged_sum| 26| 1000000| 100|0.78468683| PASSED
rgb_lagged_sum| 27| 1000000| 100|0.36484261| PASSED
rgb_lagged_sum| 28| 1000000| 100|0.66712700| PASSED
rgb_lagged_sum| 29| 1000000| 100|0.48604229| PASSED
rgb_lagged_sum| 30| 1000000| 100|0.17591412| PASSED
rgb_lagged_sum| 31| 1000000| 100|0.32433586| PASSED
rgb_lagged_sum| 32| 1000000| 100|0.12971163| PASSED
rgb_kstest_test| 0| 10000| 1000|0.47803450| PASSED
dab_bytedistrib| 0| 51200000| 1|0.29203091| PASSED
dab_dct| 256| 50000| 1|0.06993184| PASSED
Preparing to run test 207. ntuple = 0
dab_filltree| 32| 15000000| 1|0.38387488| PASSED
dab_filltree| 32| 15000000| 1|0.55295440| PASSED
Preparing to run test 208. ntuple = 0
dab_filltree2| 0| 5000000| 1|0.24922510| PASSED
dab_filltree2| 1| 5000000| 1|0.98227801| PASSED
Preparing to run test 209. ntuple = 0
dab_monobit2| 12| 65000000| 1|0.71153312| PASSED
```
At 48 MHz and a core voltage of 0.95 V, there would be three or four `WARN`ings.
### Throughput
The TRNG is rather CPU-intensive (due to its IRQ-blocking busyloop) and not
very fast. The "cooked" (whitened) throughput is about 10 times higher than the
"raw" throughput.
| System clock | Raw TRNG throughput |
|:------------ |:------------------- |
| 48 MHz | 10 kbit/s |
| 125 MHz | 40 kbit/s |
| 250 MHz | 50 kbit/s |
## What is missing?
* **Online health tests**: I don't know how to do this. I don't know what a
"chi-squared" is. Health tests are rather important for TRNGs (to make sure
nothing weird is happening), so this is not a small downside.
* **DMA+PIO-based implementation**: to relieve the CPU from some of its load.
Currently, it's 100% a spinloop on the CPU, with interrupts disabled.
* **Multicore mutex**: currently, both CPU cores could run this code, which
would then modify the ROSC `COUNT` register in parallell. This is bad and
will lead to bad results. Don't do this. (Or alternatively, add mutex guards,
or bug me to do so.)
* **More useful `rand()` functions**: currently, the code just gives you bits,
and turning these bits into a random integer in a range, or a random float,
..., is left as an exercise for the reader.
* **Protections against physical attacks**: if an adversary is in the physical
proximity of the device, it's game over. They could attach an SWD debugger to
the RP2040 (it has no protection against this), use a power side channel or
EM emanations to look at the ROSC doing its oscillate-y thing, or tamper with
the environment (e.g. freeze spray or harmonic EM emissions) to slow down the
ROSC or make it lock onto a different frequency. I don't plan to add any
defenses against such attacks. (Please get a properly certified hardware
module if such attacks are part of your threat model, I'm begging you, don't
use this.)
## Usage
Here's a quick example:
```c
#include <stdint.h>
#include <limits.h>
#include "rorand.h" /* raw rng */
#include "rourand.h" /* 'cooked'/whitened rng */
static void health_alarm(const char* msg) {
// TRNG health alarm raised!
iprintf("TRNG health alarm! %s\n", msg);
}
int main() {
/** raw randomness **/
enum rourand_error r = rorand_init(health_alarm);
if (r != rr_ok) {
panic("rorand failed to initialize!");
}
// get 128 bits of raw randomness
uint8_t data[16];
rorand_get(data, sizeof(data)*CHAR_BIT); // specified in bits
/** 'cooked'/whitened randomness **/
int rate = 0; // rate at which the CSPRNG will reseed with true random data. 0 is default.
// you (the user) must call rorand_init first before calling rourand_init!
struct rourand_state* ur = rourand_init(rorand_get, rate);
if (!ur) {
panic("rourand failed to initialize!");
}
rourand_get(ur, data, sizeof(data)); // specified in bytes
rourand_free(ur);
}
```
## License
License is TBD.
The [Ascon reference implementation](https://github.com/ascon/ascon-c) is
licensed under CC0 1.0.
The [Raspberry Pico C SDK](https://github.com/raspberrypi/pico-sdk) is licensed
under the BSD-3-Clause license.

BIN
img/jitter.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 9.7 KiB

View File

@ -27,6 +27,8 @@ static uint8_t data[32];
int main() {
vreg_set_voltage(VREG_VOLTAGE_1_15);
set_sys_clock_khz(250*1000, true);
// high freq, high voltage
//vreg_set_voltage(VREG_VOLTAGE_1_25);
//set_sys_clock_khz(250*1000, true);