Dumping the MSP430FR BSL

Triss 72f557244e add more info in readme		2022-04-15 02:38:45 +02:00

MSP430FR BSL dumper

Tools to try to dump the MSP430FR BSL, mainly targeting the MSP430FR5994 (on an MSP-EXP430FR5994 devboard).

Why

In 2009, Travis Goodspeed and Aurélien Francillon discovered that the ROM BSL in flash-based MSP430 units can be used as a source of shellcode, ROP gadgets, and even called from software to enable the BSL interface without authentication to read out an otherwise protected firmware.

Since then, TI has made the mask ROM BSL in newer MSP430 models, such as the MSP430FR5994, an execute-only area of memory. Furthermore, it can only be called through the "Z-area", which is the first 8 bytes of the BSL memory area. Jumping to other locations causes a reset of the microcontroller. Read accesses to the BSL are likewise forbidden. When JTAG/SBW is enabled, the BSL is not usable at all.

However, it is still unclear whether these countermeasures are enough to stop attacks that use the BSL as a shellcode, ROP gadget, or readout backdoor source. Hence this project.

The idea

The MSP430FR bootloader ('BSL') resides at 0x1000. This memory cannot be read, and user code can only jump to 0x1000 or 0x1002, inside the "Z-area", to run certain functions of the BSL. However, it is very likely that the CPU can access this memory as data while it is executing from inside this region, as that is often needed to store e.g. structs, lookup tables, and so on. Several other "execute-only" memory implementations behave in a similar way, such as the Nintendo Game Boy Advance and DS boot ROMs ("BIOS"es, citation below), as well as other systems analyzed by Schink and Obermaier (publication also linked below).

The BSL (according to the datasheet) doesn't disable interrupts. That means that, while the BSL is executing, it is possible to interrupt its execution flow and jump to code controlled by the user. An interrupt handler can inspect and modify the registers of the BSL code as they were when the interrupt happened, as well as the stack contents. By running a timer at the same frequency as the CPU, and having it dump the register and stack contents after an interval that increases by one cycle every iteration, it is possible to trace the instruction flow of the CPU, as well as which registers and stack contents it accesses, and how, even though the code itself is not visible. Furthermore, the MSP430 CPU uses a variable-length instruction set and instructions take a variable number of cycles, so these traces can also be used to infer more information about which instructions are executed: the pc register never points to the middle of an instruction, and only advances to the next instruction once the current one has finished executing.
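The cycle-stepped sampling described above can be post-processed roughly as follows. This is a minimal sketch and not the logic of logtracer.py: the `Sample` and `Step` types, the `summarize_trace` name and the example pc values are all made up for illustration, and it assumes exactly one sample per delay value, with the delay growing by one CPU cycle per iteration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    delay: int  # timer delay, in CPU cycles, before the interrupt fired
    pc: int     # program counter value the interrupt handler found on the stack


@dataclass
class Step:
    pc: int
    samples: int            # consecutive delay values at which this pc was seen
    advance: Optional[int]  # byte distance to the next pc, if it looks like fall-through


def summarize_trace(samples: List[Sample]) -> List[Step]:
    """Collapse per-cycle samples into one entry per run of identical pc values."""
    steps: List[Step] = []
    for s in sorted(samples, key=lambda s: s.delay):
        if steps and steps[-1].pc == s.pc:
            steps[-1].samples += 1
        else:
            steps.append(Step(pc=s.pc, samples=1, advance=None))
    for cur, nxt in zip(steps, steps[1:]):
        delta = nxt.pc - cur.pc
        # MSP430X instructions are 2 to 8 bytes long; any other distance is
        # treated as a control-flow change (assumption of this sketch).
        if delta in (2, 4, 6, 8):
            cur.advance = delta
    return steps


# Made-up pc values, one sample per delay value.
trace = [Sample(d, pc) for d, pc in enumerate(
    [0x1000, 0x1000, 0x1004, 0x1006, 0x1006, 0x1006, 0x1100])]
steps = summarize_trace(trace)
```

The length of each run gives a rough cycle count, and a small positive `advance` is consistent with a fall-through instruction of that size. A short forward jump looks the same, so these values are hints rather than a disassembly.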

Function epilogues typically first pop a number of values off the stack and load these back into registers, and then return. By controlling the stack pointer value, these can be used as a way to perform arbitrary reads. However, as we are targeting nonwritable memory, an interrupt needs to happen before the return occurs, otherwise CPU execution becomes very unpredictable.

You can find these epilogues by staring at many, many execution traces (obtained from these timer interrupts) and thinking really hard (this is the hard, time-consuming and labor-intensive part).

Alternatively, by timing an interrupt or a DMA transfer such that it happens after a function is called but before it returns, it is possible to overwrite the memory popped off the stack when an epilogue executes, thereby gaining control of a few register values as well as the program counter. Then, CPU execution can be redirected to another code snippet that performs the memory read before returning. With control over the address it reads from, this becomes an arbitrary read that fetches one word of the BSL, then returns to user code for the next iteration.
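The word-at-a-time iteration can be sketched as below. `read_word` is a hypothetical callback standing in for "trigger the read gadget once and collect the result", and `dump_region` is likewise an illustrative name; neither is taken from this repository. The default range is the 0x1000..0x17FF region whose hashes are listed further down.

```python
import struct
from typing import Callable


def dump_region(read_word: Callable[[int], int],
                start: int = 0x1000, end: int = 0x1800) -> bytes:
    """Collect the half-open range [start, end) one 16-bit word at a time."""
    if start % 2 or end % 2:
        raise ValueError("word reads need even addresses")
    out = bytearray()
    for addr in range(start, end, 2):
        word = read_word(addr) & 0xFFFF
        out += struct.pack("<H", word)  # the MSP430 is little-endian
    return bytes(out)


# Stand-in gadget for demonstration only: returns the address itself.
fake_dump = dump_region(lambda addr: addr)
```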

The "using interrupts to figure out what execute-only code is doing" trick was first (afaik) used by Martin Korth to find such a gadget inside the Nintendo DS ARM7 boot ROM to read it out (and dump some keys), see here and here, but is also described in the academic literature, e.g. here.

The "use DMA to get ROP" trick comes from here; it is described near the end of a fairly long article.

What has been implemented correctly

Memory in the BSL region cannot be read using data accesses from user code. Reads come back as 3f ff, which decodes as an infinite loop.
Arbitrary code in the BSL region cannot be jumped to from user code; the CPU execution path has to go through the Z-area. Jumping anywhere else causes an infinite loop or a reset.
Even when returning from an interrupt serviced during BSL execution, it is not possible to return from this interrupt directly back to BSL code, as this counts as a jump-to-arbitrary-BSL-location.
DMA transfers cannot read from the BSL region at all (not from the Z-area, not from the BSL region during BSL execution).
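To see why a read of `3f ff` decodes as an infinite loop, here is a small decoder for the MSP430 jump instruction format. It is a sketch that assumes `3f ff` denotes the 16-bit word 0x3FFF; the `decode_jump` name is made up for illustration.

```python
def decode_jump(word: int, pc: int):
    """Decode an MSP430 jump-format word located at address pc.

    Returns (mnemonic, target), or None if the word is not a jump.
    """
    if (word >> 13) != 0b001:          # jump format: top three bits are 001
        return None
    mnemonics = ["jne", "jeq", "jnc", "jc", "jn", "jge", "jl", "jmp"]
    cond = (word >> 10) & 0b111        # 3-bit condition code
    offset = word & 0x3FF              # signed 10-bit offset, counted in words
    if offset & 0x200:
        offset -= 0x400
    target = (pc + 2 + 2 * offset) & 0xFFFFF
    return mnemonics[cond], target


result = decode_jump(0x3FFF, 0x1000)
```

For 0x3FFF the condition is "always" and the offset is -1 word, so the target equals the address of the instruction itself, i.e. `jmp $`.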

Vulnerabilities of the BSL against a readout attack

When the CPU is executing the BSL, it can perform data accesses to other BSL areas. Thus, if an arbitrary read gadget is found, it can be used to dump the entire BSL region. This is the same issue as present in the Nintendo DS ARM7 boot ROM.
The routine at 0x1002 provides such a gadget, as indicated in SLAU550AA.
BSL execution is allowed to be interrupted, so the instruction flow can be traced by dumping CPU register values throughout the BSL execution. This allows for finding arbitrary read gadgets.
The routine at 0x1002 can also be used to return from interrupts, thus also bypassing that protection.

Vulnerabilities of the BSL against use as a source of ROP gadgets

The routine at 0x1002 returns quickly, as indicated in SLAU550AA. Therefore, it can be used as an easy ROP entrypoint. This bypasses the "only call code from the Z-area" limitation.
Potentially, DMA transfers can also be used to change the stack contents, including return addresses, while the BSL is executing.

Inaccuracies of the datasheets

The BSL clears all RAM from 0x1C00 to 0x3FC7, not just 0x1C00 to 0x1FFF.
The BSL also clears Tiny RAM and some "reserved" low addresses, from 6 to 0x1F.
~~The BSL sets up Timer A, while the datasheet only mentions Timer B usage in other BSLs, and nothing about this one.~~ This is wrong, it changes the clock settings, which has an influence on which clock source a timer uses.
The BSL communication method does not depend on the part number (eg. 5994 vs 59941), only the values in TLV are checked.
While the code has paths for other UART baudrate settings for the communication interface, only one is available.
The memory area from 0x1b00 to 0x1bff also contains ROM code, with its own Z-area (also at the beginning, also 8 bytes in size). It has three entrypoints, the fourth is an infinite loop. (0x3c00..0x3fff looks like the same type of execute-only memory at first, but actually contains nothing, at least not according to the techniques used here.) The first, documented BSL region cannot access the second region directly, it must also go through the corresponding Z-area.
The BSL command "RX Data Block Fast" has the exact same implementation as the regular "RX Data Block" command. The name is a lie.

What has not been checked

Pipelining: can code running at 0x0FFE (or a similar address) access the BSL memory, (mis)using the possibility that the effective value of pc might differ from the executed address due to pipelining effects? (cf. MerryMage's GBA BIOS dump) NOTE: 0x0FFE is not backed by anything and always reads as 0, so getting this to work will be tricky. The MSP430FR5994 does not seem to show open bus properties.
DMA: can a DMA transfer be used to change the stack contents during BSL execution? (Most likely, just like interrupts can, I simply haven't checked.)
Dumping of the 0x1b00..0x1bff region still needs to happen.

Hashes

This is the hash of the memory region 0x1000 to 0x17FF, on an MSP430FR5994, with BSL 00.08.35.B3:

Hash function	value
MD5	`4bb3bb753face80fffe1fef7a762884a`
SHA-1	`1b4c13e006121a9b1c1ebcd4fbc6ec7c96cc017f`
SHA-256	`e4d0d171013f847a357eebe5467bcd413ecb41dc01424b7e4ee636538d820766`
SHA-512	`fed28a7e9643a551789075b79d9b04fa6e8cdca74d783c1c3830ece07e5c9141dda9532b3c442416a1ddab90d752e679c6918c0d5333ac6da9fd23ab6c33d1bb`
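To compare a dump against the values above, something like the following can be used. It is a minimal sketch using Python's standard `hashlib`; the `hash_dump` name and the `bsl.bin` file name in the note below are assumptions, not part of this repository.

```python
import hashlib

# Size of the region 0x1000..0x17FF in bytes.
EXPECTED_SIZE = 0x1800 - 0x1000


def hash_dump(data: bytes) -> dict:
    """Return the hex digests of a dump for the four hash functions listed above."""
    return {
        name: hashlib.new(name, data).hexdigest()
        for name in ("md5", "sha1", "sha256", "sha512")
    }
```

For example, `hash_dump(open("bsl.bin", "rb").read())` on a raw dump that is exactly `EXPECTED_SIZE` bytes long should reproduce the table on a chip with the same BSL version.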

Region 2 WIP stuff

0x1b00 entrypoint: basically halts the CPU. Not very useful.
0x1b02 jumps to 0x1bc2 which almost immediately disables interrupts. This code implements the "Mass Erase" command.
0x1b04 jumps to 0x1bd6 which almost immediately disables interrupts.

None of these functions return. As there is no known usable gadget yet within the second BSL region, the "interrupt disable" instructions cannot be skipped, and thus timer interrupts cannot be used to trace the execution flow. However, the NMI pin can still be used as an interrupt source, by sending carefully timed signals from another device at the moment the timer interrupt would otherwise happen. This requires a bit more setup, but it works just fine.

Do note that, unlike with regular interrupts, the MSP430X CPU needs to execute a reti instruction to re-enable nonmaskable interrupts. As this would normally return from user code into the BSL, it would cause a reset. Luckily, the return address on the stack can still be changed before the reti, so the experiment can be restarted from the beginning.

0x1b00 halts the CPU by writing some code into tiny RAM (at address 0) that sets the CPU in a low power mode and performs an infinite loop. It then jumps to that code.

0x1b02 and 0x1b04 eventually converge at address 0x1bea, where NMI tracing starts failing for an as yet unknown reason. It does not seem to be an "NMI disable" instruction (writing to an SFR or SYS register), as using DMA to continuously rewrite these registers with values that enable the NMI does not help. It is also not merely the UART being disabled, as the LEDs used for debugging stop blinking at this moment as well.

At this point, the second BSL code has not performed any stack accesses or memory-to-register reads.

As 0x1b02 is called from the main BSL code as part of the "mass-erase FRAM" command, it most likely implements this functionality, and hardly anything else.

Raspberry Pi Pico NMI signal generator

To generate the well-timed NMI signals, a Raspberry Pi Pico is used. After a trigger signal from P1.4 to GP14, it waits a specific amount of time, then pulls GP15, which should be connected to NMI/#RST, low for a few microseconds. The delay between the trigger and the NMI is configured over a UART interface on P6.0 and GP17. P1.5/GP16 is used as an "ack" signal from the Pico to the MSP430 to signal that the serial command has been received and processed. This prevents the MSP430 from sending a trigger signal before a new delay setting has been applied properly.

Some other pins are used for handshaking: GP19 is connected to P3.2, and the MSP430 waits until this pin is high before it starts tracing. This is needed because the NMI/#RST pin is shared between the actual NMI signal and the Spy-Bi-Wire debugging interface, which is used to upload new code. A switch that selects which line (SBWTDIO or GP15) is connected to the NMI/#RST pin takes care of this. However, P3.2 then also needs to be pulled low for as long as the NMI/#RST pin is connected to SBWTDIO. For this, another switch can be used to toggle P3.2 between GND and GP19/3V3.

The Raspberry Pi Pico code can be found in the nmigen/ folder (not to be confused with the HDL that had this name in the past).

Full connection table:

MSP430 pin	Intermediate pin	Raspberry Pico pin
GND		GND
P1.4		GP14
P1.5		GP16
	SW1.LEFT	GP19 or 3V3
P3.2	SW1.MID
GND	SW1.RIGHT	GND
P6.0		GP17
	SW2.LEFT	GP15
NMI/#RST	SW2.MID
SBWTDIO (from eZ-FET)	SW2.RIGHT

Proof of concept

The code in src/main.c will dump the content of the BSL to eUSCI_A0 in UART mode, at 9600 baud. Tested on an MSP430FR5994, but no other chips.

By setting the DUMP_MODE preprocessor definition to 0, it can instead be used as an instruction tracer, accompanied by logtracer.py. Setting USE_NMI to 1 will use an NMI-based tracer instead of a Timer_A-based one. The address to jump to (to trace) will still have to be changed manually in the do_trace() function body.

Useful shellcode

Jump to any location in the BSL

Works only for the first BSL region (0x1000..0x1800). r14 will have the fixed value 0xBEEF.

	pushx.a #address_to_jump_to
	push.w r12
	push.w r13
	mov.w #0xdead, r13
	mov.w #0xbeef, r14
	br #0x1002

Return from an interrupt into the BSL

Works only for the first BSL region (0x1000..0x1800). Destroys r14.

	push.w r12
	push.w r13
	mov.w #0xdead, r13
	mov.w #0xbeef, r14
	@ restore status register
	mov.w 4(sp), sr
	@ this will restore r12 and r13 and then perform a reta (discarding the sr
	@ value which reti would preserve)
	br #0x1002

Enter bootloader mode, bypassing the password check, without clearing RAM

Do not send the BSL password authentication command.

Cf. Travis Goodspeed's BSL-reenable-shellcode (a, b:5)

	@ unlock the BSL
	mov.w #0xa5a5, 0x1c00
	@ this jumps to the initialization phase of the BSL, after the RAM clear
	@ and BSL password-reset-to-not-yet-checked phase
	@ this probably does need the clock set to 8 MHz to function correctly
	pushx.a #0x16d4
	push.w r12
	push.w r13
	mov.w #0xdead, r13
	mov.w #0xbeef, r14
	br #0x1002

Compare with the original, from PoC||GTFO 2:5:

	mov #0xFFFF, r11   ;; Disable BSL password protection.
	br  &0x0c02        ;; Branch to the BSL Soft Entry Point