Break, Remap, Debug: Using the FPB Unit on ARM Cortex-M

This content originally appeared on Level Up Coding - Medium and was authored by Wadix Technologies

FPB (Flash Patch and Breakpoint Unit) Hardware Position Overview

1. Introduction

When you debug code that lives in Flash, “software” breakpoints aren’t always enough. The Flash Patch and Breakpoint (FPB) unit is a tiny hardware block in the ARM debug fabric that lets you do two powerful things without touching your binary: set true hardware breakpoints on instruction fetches, and — optionally — replace the fetched instruction with one you choose. In practice, that means you can halt exactly at a function in Flash, or even redirect the first instruction to a small veneer to tweak behavior for tests and hotfixes.

2. FPB Features

What is FPB?

The Flash Patch & Breakpoint (FPB) is a CoreSight debug component in ARMv7-M that watches instruction/literal fetches and can either raise a debug event or substitute the fetched word, enabling non-intrusive breakpoints and lightweight patches in code that runs from Flash.
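How many comparators a given part implements can be read back from FP_CTRL. The sketch below decodes such a value, assuming the ARMv7-M field positions (NUM_CODE1 in bits [7:4], NUM_LIT in bits [11:8], NUM_CODE2 in bits [14:12]); the type and function names are hypothetical and not part of any vendor header.

```c
#include <stdint.h>

/* Hypothetical helper type: comparator counts decoded from an FP_CTRL value. */
typedef struct {
    unsigned num_code; /* instruction-address comparators */
    unsigned num_lit;  /* literal-address comparators */
} fpb_counts_t;

/* Decode comparator counts, assuming the ARMv7-M FP_CTRL field layout. */
static fpb_counts_t fpb_decode_ctrl(uint32_t fp_ctrl)
{
    fpb_counts_t c;
    unsigned code1 = (fp_ctrl >> 4) & 0xFu;   /* NUM_CODE1, bits [7:4]   */
    unsigned code2 = (fp_ctrl >> 12) & 0x7u;  /* NUM_CODE2, bits [14:12] */
    c.num_code = (code2 << 4) | code1;        /* NUM_CODE2 holds the upper bits */
    c.num_lit  = (fp_ctrl >> 8) & 0xFu;       /* NUM_LIT, bits [11:8]    */
    return c;
}
```

For example, a register value of 0x260 decodes to 6 instruction comparators and 2 literal comparators, which matches the 8-entry layout used in the rest of this article.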

2.1 Hardware breakpoints

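The FPB’s comparators watch instruction fetches in the code region (0x00000000 to 0x1FFFFFFF). When a comparator matches an address, the core raises a debug event, either halting in place (halt mode) or entering the DebugMonitor exception, so you can stop in Flash without changing your binary. For this to work, both enables must be set: the global FP_CTRL.ENABLE (a write to FP_CTRL only takes effect if its KEY bit, bit 1, is written as 1 at the same time) and the per-comparator FP_COMP[n].ENABLE. In the ARMv7-M layout each comparator holds a word address in bits [28:2], and halfword granularity comes from the REPLACE field in bits [31:30]: 0b01 breaks on the lower halfword (address A), 0b10 on the upper halfword (A+2), and 0b11 on both. Bit 1 of the target address therefore tells you which REPLACE value to program. This is the simplest, most reliable way to break on code in Flash, and it’s what most debuggers use under the hood.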
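As a rough illustration of the register arithmetic, the sketch below builds an FP_COMP value for a breakpoint from a function address. It assumes the ARMv7-M FP_COMP layout (ENABLE in bit 0, word address in bits [28:2], REPLACE in bits [31:30]); the helper names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if addr lies in the code region the FPB can match (below 0x20000000). */
static int fpb_addr_in_code_region(uint32_t addr)
{
    return addr < 0x20000000u;
}

/* Build an FP_COMP value that breaks on the halfword at addr.
 * Assumes the ARMv7-M layout; addr may carry the Thumb bit (bit 0). */
static uint32_t fpb_breakpoint_value(uint32_t addr)
{
    addr &= ~1u;                                  /* drop the Thumb bit */
    uint32_t replace = (addr & 2u) ? 2u : 1u;     /* 0b10 = upper halfword, 0b01 = lower */
    return (replace << 30)                        /* REPLACE, bits [31:30] */
         | (addr & 0x1FFFFFFCu)                   /* COMP, word address bits [28:2] */
         | 1u;                                    /* ENABLE */
}
```

A function at 0x08000100 would give 0x48000101, and one at 0x08000102 would give 0x88000101: the same word address, but a different REPLACE value.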

Figure 1: FPB hardware breakpoint

2.2 Flash Patch (remap) — lightweight instruction replacement

Flash Patch (remap) lets the FPB swap one fetched word with a word you pre-load in RAM. You point FP_REMAP at a 32-byte table (8×32-bit) in SRAM; it must be 32-byte aligned and it lives in SRAM because hardware forces FP_REMAP[31:29]=0b001. Arm comparator n at the target word address with REPLACE=0b00 (the remap setting in the ARMv7-M layout); when the CPU fetches there, the FPB substitutes table[n] for the original 32-bit fetch. In Thumb, each 32-bit fetch contains two 16-bit halves: the lower halfword at address A and the upper at A+2. Remap always replaces the whole word; the other REPLACE values (0b01 lower, 0b10 upper, 0b11 both) select breakpoint halfwords, not which half gets overridden. To patch a single 16-bit instruction, build the replacement word from the new halfword plus an unchanged copy of its neighbour. A 32-bit Thumb-2 instruction such as B.W must be word-aligned to fit in one slot. Typical uses are swapping a single instruction, tweaking a literal, or replacing the first instruction with a branch to a veneer in Flash (needed because a single B.W only reaches ±16 MB, so long jumps to RAM require a veneer).
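Because a remap substitutes a full 32-bit word in the ARMv7-M FPB, patching one 16-bit Thumb instruction means carrying the neighbouring halfword over unchanged. The sketch below shows that bookkeeping on plain integers, assuming little-endian instruction fetches (the halfword at the lower address sits in bits 15:0); all helper names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if base is usable as a remap table address:
 * 32-byte aligned and inside the SRAM region 0x20000000..0x3FFFFFFF. */
static int fpb_table_base_ok(uint32_t base)
{
    return (base & 0x1Fu) == 0u && base >= 0x20000000u && base < 0x40000000u;
}

/* Address of the table slot that comparator n reads (one word per comparator). */
static uint32_t fpb_slot_addr(uint32_t table_base, unsigned n)
{
    return table_base + 4u * n;
}

/* Build the replacement word for a 16-bit instruction at insn_addr.
 * orig_word is the aligned 32-bit word currently in Flash; the halfword
 * that is not being patched is copied through unchanged. */
static uint32_t fpb_patch_word(uint32_t orig_word, uint32_t insn_addr, uint16_t new_hw)
{
    if (insn_addr & 2u) {
        /* instruction is the upper halfword (address A+2) */
        return (orig_word & 0x0000FFFFu) | ((uint32_t)new_hw << 16);
    }
    /* instruction is the lower halfword (address A) */
    return (orig_word & 0xFFFF0000u) | new_hw;
}
```

With an original word of 0x4770BF00 (a NOP followed by BX LR), replacing the lower halfword with 0x2001 yields 0x47702001, leaving the BX LR in place.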

Figure 2: FPB remap flow (credit: The Definitive Guide to ARM Cortex-M3 and Cortex-M4)

3. How to use FPB

3.1 Hardware breakpoint:

To halt when test() is fetched from Flash without touching the app, first set DEMCR.TRCENA=1 (the enable for the DWT/ITM trace blocks) and turn on the FPB by writing FP_CTRL with both ENABLE and KEY set; a write with KEY=0 is ignored. Then program one comparator (e.g., FP_COMP[0]) with the first-instruction address of test(): clear the Thumb bit (bit0), put the word address in bits [28:2], and use bit1 of the address to pick the REPLACE value (0b01 for the lower halfword, 0b10 for the upper); set the comparator’s ENABLE bit. Issue DSB; ISB barriers, and call test(). On the next fetch, the FPB match raises a debug event: your debugger halts immediately (halt mode) or the DebugMonitor exception fires, giving you a true hardware breakpoint in Flash with zero code changes.

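#include <stdint.h>
/* Also include your CMSIS device header: it provides CoreDebug, __DSB(), __ISB() and __NOP(). */

typedef struct
{
  volatile uint32_t FP_CTRL, FP_REMAP, FP_COMP[8];
} FPB_Type;

#define FPB ((FPB_Type*)0xE0002000UL)
#define FPB_CTRL_ENABLE (1u<<0)
#define FPB_CTRL_KEY (1u<<1) /* writes to FP_CTRL are ignored unless KEY is 1 */
#define FPB_COMP_ENABLE (1u<<0)
#define FPB_COMP_BP_LOWER (1u<<30) /* REPLACE=0b01: break on the halfword at A */
#define FPB_COMP_BP_UPPER (2u<<30) /* REPLACE=0b10: break on the halfword at A+2 */

/* noinline so the call really fetches from test()'s own address */
__attribute__((noinline)) void test(void)
{
  /*debugger halts here*/
  __NOP();
}

static inline void fpb_enable(void)
{
  /* TRCENA enables the DWT/ITM trace blocks */
  CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;
  FPB->FP_CTRL = FPB_CTRL_KEY | FPB_CTRL_ENABLE; __DSB(); __ISB();
}

void demo_breakpoint(void)
{
  fpb_enable();
  uint32_t a = ((uint32_t)(uintptr_t)test) & ~1u; /* clear Thumb bit */
  /* bit1 of the address selects which halfword of the word to break on */
  uint32_t replace = (a & 0x2u) ? FPB_COMP_BP_UPPER : FPB_COMP_BP_LOWER;
  FPB->FP_COMP[0] = replace | (a & 0x1FFFFFFCu)
  | FPB_COMP_ENABLE;
  __DSB();
  __ISB();
  test();
}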

int main(void)
{
  HAL_Init();
  SystemClock_Config();
  demo_breakpoint();
  while (1)
  {
  }
}

3.2 Flash patch (remap):

To change behavior at runtime without touching Flash, use FPB’s remap path: instead of executing the word fetched from Flash, the core can substitute a word you preload in a tiny table in SRAM. Point FP_REMAP at a 32-byte (8×32-bit) table that’s 32-byte aligned, then arm a comparator on the first instruction address of the function you want to intercept (clear the T-bit and program the word address, bits [28:2]).

Leave REPLACE at 0b00 to select remap; after a DSB; ISB, the next fetch at that address reads your replacement word from the table, not from Flash. Because remap swaps a whole aligned word, a 32-bit replacement such as B.W fits in one slot only when the target instruction is word-aligned. A practical pattern is to inject a 32-bit branch to a tiny veneer in Flash (which stays within the ±16 MB range of B.W), do your tweak there, then BX LR back to the original caller.
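Before arming a comparator it can help to sanity-check the word placed in the remap table. The sketch below, which assumes the Thumb-2 B.W encoding T4 and little-endian storage (first halfword in bits 15:0 of the word), tests whether a target is within reach and decodes a stored B.W back to its destination; the function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if a B.W placed at insn_addr can reach target (about +/-16 MB). */
static int bw_reachable(uint32_t insn_addr, uint32_t target)
{
    /* the branch offset is relative to the instruction address + 4 */
    int64_t off = (int64_t)(target & ~1u) - (int64_t)(insn_addr + 4u);
    return off >= -16777216 && off <= 16777214;
}

/* Returns 1 if word (as stored in memory) holds a B.W, encoding T4. */
static int is_bw(uint32_t word)
{
    uint32_t hw1 = word & 0xFFFFu, hw2 = word >> 16;
    return (hw1 & 0xF800u) == 0xF000u && (hw2 & 0xD000u) == 0x9000u;
}

/* Decode the destination of a B.W stored at insn_addr. */
static uint32_t decode_bw_target(uint32_t word, uint32_t insn_addr)
{
    uint32_t hw1 = word & 0xFFFFu, hw2 = word >> 16;
    uint32_t S  = (hw1 >> 10) & 1u;
    uint32_t J1 = (hw2 >> 13) & 1u, J2 = (hw2 >> 11) & 1u;
    uint32_t I1 = (~(J1 ^ S)) & 1u, I2 = (~(J2 ^ S)) & 1u;
    uint32_t imm = (S << 24) | (I1 << 23) | (I2 << 22)
                 | ((hw1 & 0x3FFu) << 12) | ((hw2 & 0x7FFu) << 1);
    if (S) {
        imm |= 0xFE000000u; /* sign-extend from bit 24 */
    }
    return insn_addr + 4u + imm;
}
```

As a check on the arithmetic, the word 0xB800F000 at 0x08000100 decodes to 0x08000104 (a branch to the next instruction), while a jump from Flash at 0x08000100 to SRAM at 0x20000000 is reported as out of range, which is why the veneer stays in Flash.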

/* 3.2 Flash patch (remap): replace first fetch of test() with a 32-bit B.W to a veneer */
#ifndef FPB_CTRL_KEY
#define FPB_CTRL_KEY (1u<<1) /* writes to FP_CTRL are ignored unless KEY is 1 */
#endif
#define FPB_COMP_REPLACE_REMAP (0u<<30) /* REPLACE=0b00 selects remap; 01/10/11 are breakpoint modes */
/* volatile: the FPB reads this table, the C code never does */
__attribute__((aligned(32))) static volatile uint32_t fpb_table[8];

__attribute__((noinline)) static void veneer_dec(void) {
  /* example: adjust state, then return */
  /* B.W leaves LR untouched, so the normal return goes to test()'s caller */
  __NOP();
}

/* Encode Thumb-2 unconditional B.W (T4); pc is the branch address + 4 */
static inline uint32_t encode_bw(uint32_t pc, uint32_t target) {
  uint32_t off = (target & ~1u) - pc; // signed byte offset, must be within ±16 MB
  uint32_t S=(off>>24)&1u, I1=(off>>23)&1u, I2=(off>>22)&1u;
  uint32_t J1=(~(I1^S))&1u, J2=(~(I2^S))&1u;
  uint32_t imm10=(off>>12)&0x3FFu, imm11=(off>>1)&0x7FFu;
  uint16_t hw1=(uint16_t)(0xF000u | (S<<10) | imm10);
  uint16_t hw2=(uint16_t)(0x9000u | (J1<<13) | (J2<<11) | imm11);
  return ((uint32_t)hw2<<16) | hw1; // first halfword sits at the lower address
}

void demo_patch(void) {
  /* 1) Enable CoreSight + FPB */
  CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;
  FPB->FP_CTRL = FPB_CTRL_KEY | FPB_CTRL_ENABLE; __DSB(); __ISB();
  /* 2) Program FP_REMAP to our 32-byte table in SRAM */
  uint32_t base = (uint32_t)(uintptr_t)fpb_table; // already 32-byte aligned
  FPB->FP_REMAP = base;
  /* 3) Build the replacement word: a 32-bit branch to a small veneer in FLASH */
  uint32_t entry = ((uint32_t)(uintptr_t)test) & ~1u; // clear T-bit
  if (entry & 0x2u) return; // remap swaps whole words; a B.W at A+2 would straddle two
  uint32_t pc_for_b = entry + 4u; // PC seen by the branch
  uint32_t bw = encode_bw(pc_for_b, (uint32_t)(uintptr_t)veneer_dec);
  /* 4) Comparator 0 reads slot 0 of the table */
  fpb_table[0] = bw;
  __DSB(); __ISB();
  /* 5) Arm comparator 0 on the word address with REPLACE=remap */
  FPB->FP_COMP[0] = (entry & 0x1FFFFFFCu)
  | FPB_COMP_REPLACE_REMAP | FPB_COMP_ENABLE;
  __DSB();
  __ISB();
  /* Call: first fetch of test() is replaced by our B.W → veneer runs, then returns */
  test();
}

4. Conclusion:

FPB shines when you need true hardware breakpoints in Flash without touching the binary and when you want fast, reversible hot patches — swap a single instruction, tweak a literal, or redirect a function prologue to a small veneer. It’s especially useful for production debugging and in-field diagnostics where re-flashing is risky.

