# MSP430FR BSL dumper
Tools to try to dump the MSP430FR BSL, mainly targeting the [MSP430FR5994
](https://www.ti.com/product/MSP430FR5994) (on an MSP-EXP430FR5994 devboard).

## Why

In 2009, Travis Goodspeed and Aurélien Francillon discovered that the [ROM BSL
in flash-based MSP430 units](https://www.ti.com/lit/ug/slau319ae/slau319ae.pdf)
can be used as a source of shellcode and ROP gadgets, and can even be called
from software to enable the BSL interface without authentication, allowing
readout of otherwise protected firmware.

Since then, TI has made the mask ROM BSL in newer MSP430 models, such as the
MSP430FR5994, an execute-only area of memory. Furthermore, it can only be
called from the "Z-area", which is the first 8 bytes of the BSL memory area.
Jumping to other locations causes a reset of the microcontroller. Read accesses
to the BSL are likewise forbidden. When JTAG/SBW is enabled, the BSL is not
usable at all.

However, it is still unclear whether these countermeasures are enough to stop
attacks that use the BSL as a source of shellcode, ROP gadgets, or a readout
backdoor. Hence this project.

## The idea

The MSP430FR bootloader ('BSL') resides at `0x1000`. This memory cannot be
read, and user code can only jump to `0x1000` or `0x1002`, called the "Z-area",
to run certain functions of the BSL. However, it is very likely that when the
CPU is running from inside this memory region, it can access this memory as
data, as that is often needed to store e.g. structs, lookup tables, and so on.
Several other "execute-only" memory implementations function in a similar way,
such as the Nintendo GameBoy Advance and DS boot ROMs ("BIOS"es, citation
below), as well as other systems analyzed by Schink and Obermaier (publication
also linked below).

The BSL (according to [the datasheet
](https://www.ti.com/lit/ug/slau550aa/slau550aa.pdf)) doesn't disable
interrupts. That means that, while the BSL is executing, it is possible to
interrupt this execution flow to jump to code controlled by the user. An
interrupt handler can inspect and modify the registers of the BSL code at the
time the interrupt happened, as well as the stack contents. By running a timer
at the same frequency as the CPU, and having it dump the register and stack
contents after an interval that increases by one cycle every iteration, it is
possible to trace the instruction flow of the CPU, as well as which registers
and stack contents it is accessing, and how, even though the code itself is not
visible. Furthermore, the MSP430 CPU uses a variable-length instruction set and
instructions can take a variable number of cycles. These traces can therefore
also be used to infer more information about which instructions are executed,
as the `pc` CPU register will never point to the middle of an instruction, and
will only advance to the next instruction once the current instruction has
finished executing.

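
The inference described above can be sketched on the host side. This is a minimal Python sketch, not the project's `logtracer.py`: it assumes the trace has already been parsed into `(delay, pc)` pairs (one per timer interval), and the names `Step` and `collapse_trace` are made up for illustration. It collapses consecutive samples with the same saved `pc` into one step, and uses the distance to the next `pc` as a guess at the instruction size.

```python
from itertools import groupby
from typing import List, NamedTuple, Optional, Sequence, Tuple


class Step(NamedTuple):
    pc: int                    # saved pc value, always an instruction boundary
    first_delay: int           # first timer delay at which this pc was seen
    samples_seen: int          # number of consecutive delays reporting this pc
    size_guess: Optional[int]  # bytes to the next pc, if it looks sequential


def collapse_trace(samples: Sequence[Tuple[int, int]]) -> List[Step]:
    """Collapse (delay, pc) samples into one Step per run of identical pc values."""
    ordered = sorted(samples)  # order by timer delay
    runs = []
    for pc, group in groupby(ordered, key=lambda sample: sample[1]):
        members = list(group)
        runs.append((pc, members[0][0], len(members)))

    steps = []
    for i, (pc, first_delay, count) in enumerate(runs):
        size = None
        if i + 1 < len(runs):
            delta = runs[i + 1][0] - pc
            # Assumption: MSP430X instructions are 2, 4, 6 or 8 bytes long.
            # Any other delta is treated as a jump, call or return. A short
            # forward jump is indistinguishable here and would be misread.
            if delta in (2, 4, 6, 8):
                size = delta
        steps.append(Step(pc, first_delay, count, size))
    return steps
```

Assuming the interrupt is only taken once the instruction in flight has completed, `samples_seen` approximates the cycle count of the instruction that ended at that `pc`. Real traces will also contain interrupt latency and repeated visits to the same address, which this sketch does not try to model.
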
Function epilogues typically first pop a number of values off the stack,
loading them back into registers, and then return. By controlling the stack
pointer value, these can be used as a way to perform arbitrary reads. However,
as we are targeting nonwritable memory, an interrupt needs to happen before
the return occurs, otherwise CPU execution becomes very unpredictable.

You can find these epilogues by staring at many, many execution traces
(obtained from these timer interrupts) and thinking really hard (this is the
hard, time-consuming and labor-intensive part).

Alternatively, by timing an interrupt or a DMA transfer such that it happens
after a function is called but before it returns, it is possible to overwrite
the memory popped off the stack when an epilogue executes, thereby gaining
control of a few register values as well as the program counter. Then, CPU
execution can be redirected to another code snippet that performs a memory read
before returning. With control over the address it reads from, this can be used
as an arbitrary read to read one word of the BSL, then return to user code to
do the next iteration.

The "using interrupts to figure out what execute-only code is doing" trick was
first (afaik) used by Martin Korth to find such a gadget inside the Nintendo DS
ARM7 boot ROM to read it out (and dump some keys), see [here
](http://problemkaputt.de/gbatek-bios-dumping.htm) and [here
](http://problemkaputt.de/gbatek-ds-memory-control-bios.htm), but is also
described in the academic literature, e.g. [here
](https://www.usenix.org/system/files/woot19-paper_schink.pdf).

The "use DMA to get ROP" trick comes from [here
](https://hexkyz.blogspot.com/2021/11/je-ne-sais-quoi-falcons-over-horizon.html)
(described near the end; the article is quite long).

## What has been implemented correctly

1. Memory in the BSL region cannot be read using data accesses from user code.
   Reads come back as `3f ff`, which decodes as an infinite loop.
1. Arbitrary code in the BSL region cannot be jumped to from user code: the
   CPU execution path has to go through the Z-area. Jumping anywhere else
   causes an infinite loop or a reset.
1. Even when returning from an interrupt serviced during BSL execution, it is
   not possible to return from this interrupt directly back to BSL code, as
   this counts as a jump-to-arbitrary-BSL-location.
1. DMA transfers cannot read from the BSL region at all (not from the Z-area,
   not from the BSL region during BSL execution).
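
To see why the `3f ff` readback in the first item is an infinite loop, the value can be decoded by hand. This is a small Python sketch of the MSP430 jump instruction format, assuming the value is the 16-bit instruction word `0x3FFF`; the function name `decode_jump` is made up for illustration.

```python
from typing import Optional, Tuple

# Condition codes of the MSP430 jump format (bits 12..10).
JUMP_MNEMONICS = ["jne", "jeq", "jnc", "jc", "jn", "jge", "jl", "jmp"]


def decode_jump(word: int, pc: int) -> Optional[Tuple[str, int]]:
    """Decode a jump-format instruction word located at address pc.

    Returns (mnemonic, target address), or None if the word is not a jump.
    """
    if (word >> 13) != 0b001:       # jump format has 001 in bits 15..13
        return None
    condition = (word >> 10) & 0x7
    offset = word & 0x3FF           # 10-bit signed offset, counted in words
    if offset & 0x200:
        offset -= 0x400
    target = (pc + 2 + 2 * offset) & 0xFFFFF  # 20-bit address space on MSP430X
    return JUMP_MNEMONICS[condition], target
```

For `0x3FFF` the condition is `jmp` and the offset is -1 word, so the target equals the address of the instruction itself: `decode_jump(0x3FFF, 0x1000)` gives `("jmp", 0x1000)`, i.e. `jmp $`.
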
## Vulnerabilities of the BSL against a readout attack

1. When the CPU is executing the BSL, it can perform data accesses to other BSL
   areas. Thus, if an arbitrary read gadget is found, it can be used to dump
   the entire BSL region. This is the same issue as present in the Nintendo
   DS ARM7 boot ROM.
1. The routine at `0x1002` provides such a gadget, *as indicated in SLAU550AA*.
1. BSL execution is allowed to be interrupted, thus the instruction flow
   can be traced by dumping CPU register values throughout the BSL execution.
   This allows for finding arbitrary read gadgets.
1. The routine at `0x1002` can also be used to return from interrupts, thus
   also bypassing that protection.

## Vulnerabilities of the BSL against use as a source of ROP gadgets
1. The routine at `0x1002` returns quickly, *as indicated in SLAU550AA*.
   Therefore, it can be used as an easy ROP entrypoint. This bypasses the "only
   call code from the Z-area" limitation.
1. Potentially, DMA transfers can also be used to change the stack contents,
   including return addresses, while the BSL is executing.

## Inaccuracies of the datasheets

1. The BSL clears all RAM from `0x1C00` to `0x3FC7`, not just `0x1C00` to
   `0x1FFF`.
1. The BSL also clears Tiny RAM and some "reserved" low addresses, from `6` to
   `0x1F`.
1. ~~The BSL sets up Timer A, while the datasheet only mentions Timer B usage
   in *other* BSLs, and nothing about this one.~~ This is wrong: it changes the
   clock settings, which has an influence on which clock source a timer uses.
1. The BSL communication method does not depend on the part number (e.g. 5994
   vs 59941), only the values in TLV are checked.
1. While the code has paths for other UART baudrate settings for the
   communication interface, only one is available.
1. The memory area from `0x1b00` to `0x1bff` also contains ROM code, with its
   own Z-area (also at the beginning, also 8 bytes in size). It has three
   entrypoints; the fourth is an infinite loop. (`0x3c00..0x3fff` looks like
   the same type of execute-only memory at first, but actually contains nothing,
   at least not according to the techniques used here.) The first, documented
   BSL region cannot access the second region directly; it must also go through
   the corresponding Z-area.
1. The BSL command "RX Data Block Fast" has the exact same implementation as
   the regular "RX Data Block" command. The name is a lie.

2022-04-09 19:55:52 +00:00
## What has not been checked
1. Pipelining: can code running at `0x0FFE` (or a similar address) access the
   BSL memory, (mis)using the possibility that the effective value of `pc`
   might differ from the executed address due to pipelining effects? (cf.
   [MerryMage's GBA BIOS dump](https://mary.rs/lab/gbabios/)) NOTE: `0x0FFE` is
   not backed by anything and always reads as 0, so getting this to work will
   be tricky. The MSP430FR5994 does not seem to show open bus properties.
1. DMA: can a DMA transfer be used to change the stack contents during BSL
   execution? (Most likely, just like interrupts can, I simply haven't checked.)
1. Dumping of the `0x1b00`..`0x1bff` region still needs to happen.

## Hashes
This is the hash of the memory region `0x1000` to `0x17FF`, on an MSP430FR5994,
with BSL 00.08.35.B3:

| Hash function | Value |
| ----- | ----- |
| MD5 | `4bb3bb753face80fffe1fef7a762884a` |
| SHA-1 | `1b4c13e006121a9b1c1ebcd4fbc6ec7c96cc017f` |
| SHA-256 | `e4d0d171013f847a357eebe5467bcd413ecb41dc01424b7e4ee636538d820766` |
| SHA-512 | `fed28a7e9643a551789075b79d9b04fa6e8cdca74d783c1c3830ece07e5c9141dda9532b3c442416a1ddab90d752e679c6918c0d5333ac6da9fd23ab6c33d1bb` |
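
A dump can be checked against these values with a few lines of Python. This is a minimal sketch using only the standard `hashlib` module; the function names `hash_dump` and `mismatches` are made up for illustration, and it assumes the dump is a raw binary image of exactly `0x1000` to `0x17FF` (0x800 bytes).

```python
import hashlib
from typing import Dict, List

# Expected values copied from the table above (MSP430FR5994, BSL 00.08.35.B3).
EXPECTED = {
    "md5": "4bb3bb753face80fffe1fef7a762884a",
    "sha1": "1b4c13e006121a9b1c1ebcd4fbc6ec7c96cc017f",
    "sha256": "e4d0d171013f847a357eebe5467bcd413ecb41dc01424b7e4ee636538d820766",
    "sha512": (
        "fed28a7e9643a551789075b79d9b04fa6e8cdca74d783c1c3830ece07e5c9141"
        "dda9532b3c442416a1ddab90d752e679c6918c0d5333ac6da9fd23ab6c33d1bb"
    ),
}

BSL_SIZE = 0x1800 - 0x1000  # 0x800 bytes


def hash_dump(data: bytes) -> Dict[str, str]:
    """Compute every hash listed in EXPECTED over the given bytes."""
    return {name: hashlib.new(name, data).hexdigest() for name in EXPECTED}


def mismatches(data: bytes) -> List[str]:
    """Return the names of the hashes that do not match the table."""
    if len(data) != BSL_SIZE:
        raise ValueError(f"expected {BSL_SIZE} bytes, got {len(data)}")
    actual = hash_dump(data)
    return [name for name, value in EXPECTED.items() if actual[name] != value]
```

For example, `mismatches(open("bsl_dump.bin", "rb").read())` returns an empty list for a matching dump, where `bsl_dump.bin` is a hypothetical file name. Other chips or BSL versions are expected to hash differently.
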
## Region 2 WIP stuff
* `0x1b00` entrypoint: basically halts the CPU. Not very useful.
* `0x1b02` jumps to `0x1bc2` which almost immediately disables interrupts. This
  code implements the "Mass Erase" command.
* `0x1b04` jumps to `0x1bd6` which almost immediately disables interrupts.

None of these functions return. As there is no known usable gadget yet inside
the second BSL region, the "interrupt disable" instructions cannot be skipped,
and thus timer interrupts cannot be used to trace the execution flow. However,
the NMI pin can still be used as an interrupt source, by sending
carefully-timed signals from another device at the moments where the timer
interrupt would otherwise happen. This requires a bit more setup, but it works
just fine.

Do note that, unlike with regular interrupts, the MSP430X CPU does need to
execute a `reti` instruction to reenable nonmaskable interrupts. As this would
normally return into the BSL from user code, doing this would cause a reset.
Luckily, it is still possible to change the return address stored on the stack,
so the experiment can be restarted from the beginning.

`0x1b00` halts the CPU by writing some code into tiny RAM (at address `0`) that
puts the CPU in a low power mode and performs an infinite loop. It then jumps
to that code.

`0x1b02` and `0x1b04` eventually converge to address `0x1bea`, where NMI
tracing starts failing for a yet unknown reason. It does not seem to be an "NMI
disable" instruction (by writing to an SFR or SYS register), as using DMA to
continuously rewrite these registers with values that enable these signals
doesn't seem to work. It is also not just the UART being disabled, as the LEDs
used for debugging stop blinking at this moment as well.

At this point, the second BSL code has not performed any stack accesses or
memory-to-register reads.

As `0x1b02` is called from the main BSL code as part of the "mass-erase FRAM"
command, it most likely implements this functionality, and hardly anything
else.

## Proof of concept
The code in `src/main.c` will dump the content of the BSL to `eUSCI_A0` in UART
mode, at 9600 baud. Tested on an MSP430FR5994, but no other chips.

By setting the `DUMP_MODE` preprocessor definition to 0, it can instead be used
as an instruction tracer, accompanied by `logtracer.py`. Setting `USE_NMI` to 1
will use an NMI-based tracer instead of a `Timer_A`-based one. The address to
jump to (to trace) will still have to be changed manually in the `do_trace()`
function body.

## Useful shellcode
### Jump to any location in the BSL
Works only for the first BSL region (`0x1000..0x1800`). `r14` will have the
fixed value `0xBEEF`.
```asm
pushx.a #address_to_jump_to
push.w r12
push.w r13
mov.w #0xdead, r13
mov.w #0xbeef, r14
br #0x1002
```
### Return from an interrupt into the BSL
Works only for the first BSL region (`0x1000..0x1800`). Destroys `r14`.
```asm
push.w r12
push.w r13
mov.w #0xdead, r13
mov.w #0xbeef, r14
; restore status register
mov.w 4(sp), sr
; this will restore r12 and r13 and then perform a reta (discarding the sr
; value which reti would preserve)
br #0x1002
```
### Enter bootloader mode, bypassing the password check, without clearing RAM
Do *not* send the BSL password authentication command.
Cf. Travis Goodspeed's BSL-reenable-shellcode ([a
](https://www.usenix.org/legacy/event/woot09/tech/full_papers/goodspeed.pdf),
[b](https://archive.org/details/Pocorgtfo02):5)
```asm
; unlock the BSL
mov.w #0xa5a5, 0x1c00
; this jumps to the initialization phase of the BSL, after the RAM clear
; and BSL password-reset-to-not-yet-checked phase
; this probably does need the clock set to 8 MHz to function correctly
pushx.a #0x16d4
push.w r12
push.w r13
mov.w #0xdead, r13
mov.w #0xbeef, r14
br #0x1002
```
Compare with the original, from PoC||GTFO 2:5:
```asm
mov #0xFFFF, r11 ;; Disable BSL password protection.
br &0x0c02 ;; Branch to the BSL Soft Entry Point
```