https://medium.com/@ly.lee/stm32-blue-pill-analyse-and-optimise-your-ram-and-rom-9fc805e16ed7
Someday our Blue Pill development tools will get so smart… And automatically flag seemingly innocuous changes we made (like adding sprintf()) that caused our Blue Pill program to bloat beyond Blue Pill’s built-in 64 KB of ROM and 20 KB of RAM.
But until that day comes, we have to stay vigilant. And learn the tips and tools in this article to prevent Blue Pill Bloat…
- We’ll create a sample Blue Pill program with Visual Studio Code and PlatformIO
- Study the Linker Script that was used to create the Blue Pill executable, and the Memory Layout that it enforces
- Learn what’s inside the Text, Data and BSS Sections of a Blue Pill executable
- Understand how the Stack and Heap are organised
- Analyse RAM and ROM usage with Google Sheets and the Linker Map
- Peek at the Assembly Code generated by the compiler, as well as the Vector Table and reset_handler() used during Blue Pill startup
- Lastly, some tips I have learnt from optimising a huge Blue Pill program
So if you’re giving up on Blue Pill because you thought 64 KB of ROM and 20 KB of RAM can’t do much… this article will amaze you!
💎 When you see sections marked by a diamond… If you’re new to STM32 Blue Pill programming, skip these sections because they will be hard to understand. In these sections I’ll explain some advanced Blue Pill features that we’ll be seeing in future articles, as we stack up more functions and optimise them.
Create A Sample Blue Pill Program
We’ll use Visual Studio Code and the PlatformIO Extension to create a simple LED blink program for analysing the RAM and ROM usage.
Follow the steps in the video below to install the PlatformIO Extension and create the sample program. Please copy the updated content from my GitHub repository for the following files:
If you don’t have a Blue Pill and ST Link, you may skip the Build and Upload steps in the video. This article doesn’t require a Blue Pill to run the demos. Click “CC” to view the instructions in the video…
Blue Pill Project Build
What happens when you click the Build button to build a Blue Pill executable?
1️⃣ The C compiler arm-none-eabi-gcc compiles main.c into main.o
2️⃣ The Linker arm-none-eabi-ld links main.o with some code libraries (Standard C Library, Math Library, libopencm3) to resolve the functions called by main.o
3️⃣ The Linker generates the Blue Pill executable firmware.elf and the Blue Pill ROM image firmware.bin, ready to be flashed into the Blue Pill
For our simple blink demo, the Linker generates a tiny Blue Pill executable file (744 bytes of ROM, 8 bytes of RAM)…
Linking .pioenvs/bluepill_f103c8/firmware.elf
Memory region   Used Size   Region Size   %age Used
rom:            744 B       64 KB         1.14%
ram:            8 B         20 KB         0.04%
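The “%age Used” column is simply Used Size divided by Region Size. Here’s a minimal sketch of that arithmetic in C, handy for checking the numbers by hand. The function name percent_used is made up for this illustration and is not part of PlatformIO or the Linker.

```c
#include <stdint.h>

/* Percentage of a memory region that is in use, matching the
   "%age Used" column of the build output.
   (Hypothetical helper, written only for this illustration.) */
double percent_used(uint32_t used_bytes, uint32_t region_bytes)
{
    if (region_bytes == 0) {
        return 0.0;  /* avoid dividing by zero for an empty region */
    }
    return 100.0 * (double)used_bytes / (double)region_bytes;
}
```

For the blink demo, percent_used(744, 64 * 1024) comes to roughly 1.14 and percent_used(8, 20 * 1024) to roughly 0.04, in line with the build output above.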
How does the Linker know what to put into RAM and ROM? It uses a Linker Script that contains a list of rules about what functions and variables to put into RAM or ROM. Let’s look at the Linker Script…
Linker Script
To understand the above memory layout, let’s look at the Linker Script that was used to generate the Blue Pill executable. Open the following file in Visual Studio Code…
For Windows:
%userprofile%\.platformio\packages\framework-libopencm3\lib\stm32\f1\stm32f103x8.ld
For Mac and Linux:
~/.platformio/packages/framework-libopencm3/lib/stm32/f1/stm32f103x8.ld
Or open the web version of the Linker Script (split into 2 files):
MEMORY /* Define memory regions. */
{
rom (rx) : ORIGIN = 0x08000000, LENGTH = 64K
ram (rwx) : ORIGIN = 0x20000000, LENGTH = 20K
}
This defines the overall memory layout for Blue Pill: 64 KB ROM starting at address 0x0800 0000, 20 KB RAM starting at 0x2000 0000. So all code and data addresses on the Blue Pill look like 0x08… or 0x20…
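To make the layout concrete, here’s a small sketch in C that decides whether an address falls in ROM, in RAM, or in neither. The origins and lengths are copied from the MEMORY block above. The names region_t and region_of are hypothetical, invented for this example.

```c
#include <stdint.h>

/* Values taken from the MEMORY block of the Linker Script. */
#define ROM_ORIGIN 0x08000000u
#define ROM_LENGTH (64u * 1024u)
#define RAM_ORIGIN 0x20000000u
#define RAM_LENGTH (20u * 1024u)

/* Hypothetical type and function names, for illustration only. */
typedef enum { REGION_NONE, REGION_ROM, REGION_RAM } region_t;

/* Classify an address according to the two memory regions. */
region_t region_of(uint32_t addr)
{
    if (addr >= ROM_ORIGIN && addr < ROM_ORIGIN + ROM_LENGTH) {
        return REGION_ROM;
    }
    if (addr >= RAM_ORIGIN && addr < RAM_ORIGIN + RAM_LENGTH) {
        return REGION_RAM;
    }
    return REGION_NONE;  /* outside both regions */
}
```

With these values the last valid ROM address is 0x0800 FFFF and the last valid RAM address is 0x2000 4FFF. Anything the Linker tries to place beyond those limits no longer fits in the region.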
Each Memory Region (RAM, ROM) consists of multiple Sections, which are defined as follows…
SECTIONS /* Define sections. */
{
    .text : {
        ... /* Vector table */
        ... /* Program code */
        ... /* Read-only data */
    } >rom
    .data : { /* Read-write initialised data */
        ...
    } >ram AT >rom
    .bss : { /* Read-write zero initialised data */
        ...
    } >ram
Text Section
The Text Section is stored in ROM (read only) and contains:
- Vector Table: Defines the program start address and the interrupt routines, to be explained in a while
- Executable Program Code: The compiled machine code for each function
- Read-Only Data: Constant strings and other constant data that cannot be changed
Data and BSS Sections
Global Variables in the Blue Pill program will be allocated in the Data Section or BSS Section (both are in RAM). What’s the difference?
int bss_var; // Will be allocated in BSS Section (RAM).
int data_var = 123; // Will be allocated in Data Section (RAM and ROM).
Let’s say we have the above two global variables defined. The first variable bss_var is automatically initialised to 0 (or NULL for pointers) according to the C specification. Global variables initialised to 0 or NULL are allocated in the BSS Section. (We’ll see later that the initialisation is done in reset_handler().)
As for data_var, it has an initial non-zero value, 123. data_var will have 2 copies in memory:
- One read-only copy in the ROM, within the Data Section, that remembers the initial value 123 permanently, across restarts.
- Another read-write copy in RAM, within the Data Section, that is set to 123 upon startup. But as the program runs, the value may change.
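Here’s a slightly larger sketch of where a few common declarations would typically land, assuming the GCC toolchain and the Linker Script shown earlier. bss_var and data_var come from the article. rom_table, next_count() and sum_table() are hypothetical names added for illustration. The exact placement can vary with compiler options, so treat the comments as the usual case rather than a guarantee.

```c
int bss_var;                   /* no initialiser: BSS Section (RAM only) */
int data_var = 123;            /* non-zero initialiser: Data Section
                                  (working copy in RAM, initial value in ROM) */
const int rom_table[3] = { 10, 20, 30 };  /* const: read-only data, kept in
                                             the Text Section (ROM only) */

/* Counts how many times it has been called. */
int next_count(void)
{
    static int calls;          /* static local starting at zero: BSS Section */
    calls++;
    return calls;
}

/* Adds up the constant table; reads directly from ROM on the Blue Pill. */
int sum_table(void)
{
    int total = 0;             /* ordinary local: lives on the Stack */
    for (int i = 0; i < 3; i++) {
        total += rom_table[i];
    }
    return total;
}
```

Marking lookup tables and strings as const is an easy way to keep them out of the 20 KB of RAM, since only non-const initialised data needs the second copy.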
That’s why we see two copies of Data Section in the Memory Layout (think of it as a Reference Copy vs a Working Copy). Also we see this peculiar rule in the Linker Script…
.data : { /* Read-write initialised data */
...
} >ram AT >rom
The directive >ram AT >rom simply means “allocate in RAM and in ROM”: the section’s run-time address is in RAM, while its initial contents are stored (loaded) in ROM. If you’re wondering why it’s called BSS… here’s the story.
Stack
PROVIDE(_stack = ORIGIN(ram) + LENGTH(ram));
The Stack keeps track of the local variables within each function call. It starts at the top of RAM and grows downwards: the initial stack pointer is 0x2000 5000, just past the last RAM address 0x2000 4FFF. So for this program…
void gpio_setup(void) { ... }
int main(void) {
    int stack_var = bss_var++ + data_var++;
    gpio_setup();
    ...
}
We will see in the Stack (from high address to low address)…
- One Stack Frame for main(). It contains the value of stack_var.
- One Stack Frame for gpio_setup(). It contains the values of the local variables in gpio_setup().
- Plus Stack Frames for other functions called by gpio_setup().
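What happens when the Stack hits the BSS Section? A bare-metal Blue Pill program has no built-in “Out Of Stack Space” check, so the Stack silently overwrites our variables and the program usually crashes soon after. We should never allow this to happen, so be careful with recursive functions.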
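To get a feel for how much room the Stack really has, here’s a rough sketch of the arithmetic in C. It mirrors the _stack rule from the Linker Script and subtracts the RAM already taken by the Data and BSS Sections (the “ram” figure in the build output). The function names are hypothetical, and the result is an upper bound shared between Stack and Heap, not a measurement of actual stack usage.

```c
#include <stdint.h>

/* Values taken from the MEMORY block of the Linker Script. */
#define RAM_ORIGIN 0x20000000u
#define RAM_LENGTH (20u * 1024u)

/* Initial stack pointer, mirroring _stack = ORIGIN(ram) + LENGTH(ram). */
uint32_t stack_top(void)
{
    return RAM_ORIGIN + RAM_LENGTH;
}

/* Bytes left over for Stack plus Heap once the Data and BSS Sections
   have taken static_bytes of RAM. (Hypothetical helper.) */
uint32_t ram_left_for_stack_and_heap(uint32_t static_bytes)
{
    return (static_bytes >= RAM_LENGTH) ? 0u : RAM_LENGTH - static_bytes;
}
```

For the blink demo (8 bytes of RAM used), ram_left_for_stack_and_heap(8) gives 20,472 bytes. Every global variable we add shrinks that number.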
Heap
If we use new in C++ and malloc() in C, the dynamic memory storage will be allocated from the Heap. The Heap lies between the BSS Section and the Stack. Yes it’s a tight squeeze in 20 KB of shared RAM and might cause problems with the Stack. That’s why I avoid using the Heap wherever possible.
💎 Can we create additional Memory Regions and Sections? Yes we can! I created custom Memory Regions bootrom and bootram, each with its own Sections, in this Linker Script (this allows me to partition the Blue Pill RAM and ROM for the Bootloader and for the Application): https://github.com/lupyuen/codal-libopencm3/blob/master/ld/stm32f103x8.ld
Linker Map
Now that we understand how RAM and ROM are used in Blue Pill, let’s learn to analyse the memory usage of a simple program: our blink demo.
After building the project, open the firmware.map file located at the top folder of the project. Or browse the web version.
firmware.map is the Linker Map that tells us all the RAM and ROM allocated for the functions and variables in our program. So if we run out of RAM or ROM, the Linker Map firmware.map is the right file to check. It was created when we specified this Linker Command-Line Option in platformio.ini: -Wl,-Map,firmware.map
firmware.map is very dense with lots of numbers. Later we’ll use Google Sheets to analyse the Linker Map. But meanwhile have a look at this to understand how the Linker Map is structured (skip to line 90 in firmware.map)…
Remember the variables bss_var, data_var, stack_var in main.c?
Let’s use the Linker Map to verify that bss_var is indeed allocated in the BSS Section and data_var is allocated in the Data Section. And stack_var should not appear in the Linker Map because it’s allocated on the Stack only when the main() function executes…
Disassemble the Blue Pill Executable
We’ll get back to the Linker Map in a while. There are times when you’ll have to account for every single byte of code or data in your program. Like when you’re investigating why a function takes up so much code or data. For such situations, inspecting the Assembly Code generated by the Blue Pill compiler may be helpful.
Let’s set up a Task in Visual Studio Code to dump (or disassemble) the Assembly Code into firmware.dump…
- Click Terminal → Configure Tasks
- In the tasks.json file that appears, replace the contents with the contents of this file: https://github.com/lupyuen/stm32bluepill-blink/blob/master/.vscode/tasks.json
- Click Terminal → Run Task → 🔎 Disassemble STM32 Blue Pill
This generates a firmware.dump file from the firmware.elf executable that was created by the PlatformIO build step. In case you’re curious, the command looks like this…
~/.platformio/packages/toolchain-gccarmnoneeabi/bin/arm-none-eabi-objdump --wide --syms --source --line-numbers .pioenvs/bluepill_f103c8/firmware.elf >firmware.dump
To see the Assembly Code, open the firmware.dump file located at the top folder of your project. Or open the web copy: https://github.com/lupyuen/stm32bluepill-blink/blob/master/firmware.dump
Here we see every single byte of Machine Code generated by the C compiler (the green column). To the right is the Assembly Code that corresponds to the Machine Code. The lines in yellow refer to the C Source Code that was used to generate the Assembly Code.
Fortunately the Blue Pill uses a RISC-based processor (Arm Cortex-M3) so the Assembly Code is easier to understand: ldr for Load Register, str for Store Register, … For details, check out the STM32 Cortex®-M3 programming manual.
Every Blue Pill program begins execution at the ROM start address 0x0800 0000… but don’t take my word for it, take a peek at firmware.dump at address 0x0800 0000…
That doesn’t look like executable Machine Code. But when we group the bytes into 32 bits (4 bytes each), familiar addresses begin to emerge (Hint: Blue Pill addresses all begin with 0x08 or 0x20)…
8000000: 0x2000 5000 → start of stack in RAM (grows downwards)
8000004: 0x0800 0255 → reset_handler() function in ROM
8000008: 0x0800 0251 → null_handler() function in ROM
800000C: 0x0800 024F → blocking_handler() function in ROM
8000010: 0x0800 024F → blocking_handler() function in ROM
8000014: 0x0800 024F → blocking_handler() function in ROM
If you cross-reference the 32-bit numbers with firmware.dump, you’ll realise that this block of numbers actually contains important info like…
- 0x2000 5000: Start address of the stack (which grows downwards). It also marks the end of RAM.
- 0x0800 0255: Address of the reset_handler() function, which is the entry point for the program and initialises the global variables.
- 0x0800 0251: Address of the null_handler() function. This is an interrupt service routine that does nothing.
- 0x0800 024F: Address of the blocking_handler() function. This is an interrupt service routine that loops forever (because the Blue Pill can’t recover from the exception that has occurred).
This list of addresses at 0x0800 0000 is known as the Vector Table.
Vector Table
Every Blue Pill program must have a Vector Table at the start of ROM, 0x0800 0000. Because without it, the Blue Pill won’t know where in RAM to allocate the Stack, and which function to call to execute the program (i.e. the reset_handler() function).
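The byte grouping we did by hand earlier can be sketched in a few lines of C. The Cortex-M3 stores each 32-bit word with the least-significant byte first (little-endian), so four consecutive bytes are assembled in that order. sample_vector_bytes is reconstructed from the first two words quoted in this article (0x2000 5000 and 0x0800 0255), not copied from firmware.dump. The names read_le32 and handler_address are hypothetical.

```c
#include <stdint.h>

/* First 8 bytes of the Vector Table, reconstructed from the two words
   quoted in the article: initial stack pointer and reset_handler(). */
const uint8_t sample_vector_bytes[8] = {
    0x00, 0x50, 0x00, 0x20,   /* 0x2000 5000 */
    0x55, 0x02, 0x00, 0x08    /* 0x0800 0255 */
};

/* Assemble one little-endian 32-bit word from 4 consecutive bytes. */
uint32_t read_le32(const uint8_t *bytes)
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Handler entries in the Vector Table have bit 0 set to mark Thumb code.
   Clearing that bit gives the address where the function's code begins. */
uint32_t handler_address(uint32_t vector_entry)
{
    return vector_entry & ~(uint32_t)1;
}
```

Reading the second word with read_le32(sample_vector_bytes + 4) gives 0x0800 0255, and handler_address() turns that into 0x0800 0254, which is where reset_handler() appears in firmware.dump.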
The Vector Table structure is defined here…
For Windows:
%userprofile%\.platformio\packages\framework-libopencm3\lib\stm32\f1\vector_nvic.c
For Mac and Linux:
~/.platformio/packages/framework-libopencm3/lib/stm32/f1/vector_nvic.c
The Vector Table is defined by Arm (not STM). The Vector Table also includes a complete list of Interrupt Service Routines that will be called when an interrupt is triggered. For example, rtc_alarm_isr() is the interrupt service routine that will be called when the Real-Time Clock triggers an Alarm Interrupt on the Blue Pill.
What happens when you don’t define any interrupt service routines? libopencm3 will provide one of the following default interrupt service routines for the interrupt, depending on the nature of the interrupt…
- null_handler(): This is an interrupt service routine that does nothing. null_handler() is the default interrupt service routine for non-critical interrupts, like the Real-Time Clock Alarm Interrupt.
- blocking_handler(): This is an interrupt service routine that loops forever and never returns. blocking_handler() is the default interrupt service routine for critical exceptions that will prevent Blue Pill from operating correctly. Hard Fault is a serious exception that uses the blocking_handler() by default.
reset_handler() Function
reset_handler() is the function that’s executed when the Blue Pill starts up. libopencm3 provides a default reset_handler() function. According to the code above…
- reset_handler() copies the Data Section from ROM to RAM. The Data Section contains variables that are initialised to non-zero values. (Remember that the Data Section exists in ROM and RAM?)
- reset_handler() initialises variables in the BSS Section to 0 or NULL (for pointers).
- reset_handler() calls our main() function after initialisation. So all variables in the Data and BSS Sections will be set to their proper initial values when our main() function runs.
- For C++ programs, reset_handler() also calls the C++ Constructor Methods to create global C++ objects. It calls the C++ Destructor Methods to destroy global C++ objects when main() returns. But this rarely happens because for most Blue Pill programs, the main() function runs forever in a loop handling events.
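The first two steps in the list above can be simulated on a desktop machine. The sketch below is not libopencm3’s reset_handler(). It uses plain arrays in place of real ROM and RAM, assumes the BSS Section sits directly after the Data Section, and every name in it (rom_data_image, sim_ram, init_sections, simulate_reset) is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define DATA_WORDS 2u   /* words in the simulated Data Section */
#define BSS_WORDS  3u   /* words in the simulated BSS Section  */

/* Simulated ROM copy of the Data Section (the initial values). */
static const uint32_t rom_data_image[DATA_WORDS] = { 123u, 456u };

/* Simulated RAM: Data Section first, BSS Section right after it. */
uint32_t sim_ram[DATA_WORDS + BSS_WORDS];

/* Copy initial values into Data, then zero BSS, the way a reset
   handler prepares RAM before calling main(). */
void init_sections(const uint32_t *data_load, uint32_t *data_start,
                   uint32_t *data_end, uint32_t *bss_end)
{
    const uint32_t *src = data_load;
    uint32_t *dest = data_start;

    while (dest < data_end) {
        *dest++ = *src++;      /* Data Section: ROM copy to RAM copy */
    }
    while (dest < bss_end) {
        *dest++ = 0u;          /* BSS Section: clear to zero */
    }
}

/* Fill the simulated RAM with junk (like RAM after power-up),
   then run the initialisation. */
void simulate_reset(void)
{
    for (size_t i = 0; i < DATA_WORDS + BSS_WORDS; i++) {
        sim_ram[i] = 0xDEADBEEFu;
    }
    init_sections(rom_data_image, sim_ram, sim_ram + DATA_WORDS,
                  sim_ram + DATA_WORDS + BSS_WORDS);
}
```

After simulate_reset(), the first two words of sim_ram hold 123 and 456 and the remaining three are zero. On the real Blue Pill the four pointers would come from symbols defined by the Linker Script rather than from arrays.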
So reset_handler() (like the Vector Table) is essential for Blue Pill operation. Don’t tamper with it, just make sure it’s always in ROM. Be careful when coding C++ Constructor Methods: they are called before main().
💎 Can we override the default reset_handler() and use our own? Yes we can! I used a custom reset_handler() here to start either the Baseloader, Bootloader or Application, depending on the settings stored in battery-backed memory: https://github.com/lupyuen/codal-libopencm3/blob/master/stm32/hal/reset_handler.c
💎 Why does the Vector Table point to 0x0800 0255 instead of 0x0800 0254, the actual address of reset_handler()? Because Arm says “the least-significant bit of each vector must be 1, indicating that the exception handler is Thumb code”. This applies to the addresses of interrupt service routines in the Vector Table too.
💎 Can we use Multiple Vector Tables? Yes we can! Here’s the Baseloader Code that I created to flash the updated Bootloader Code in ROM. Before overwriting the Bootloader Code (which includes the Vector Table) it installs a temporary Vector Table in RAM by pointing the SCB_VTOR register to the new Vector Table (because we don’t want the Blue Pill to get confused if an exception occurs during flashing). This is also known as “Relocating the Vector Table”… https://github.com/lupyuen/codal-libopencm3/blob/master/stm32/baseloader/baseloader.c
Use Google Sheets To Analyse The Linker Map
As you recall, the Linker Map file firmware.map can be hard to analyse because it’s full of details. What if we used a Google Sheets spreadsheet to crunch the file and show us only the details that we need, to guide us in trimming down our RAM and ROM usage?
Check out this Blue Pill Memory Map that was generated from our sample blink program…
The spreadsheet contains formulas to parse the lines from firmware.map, extract the columns that we need, and sort the functions and variables by size. Here are the highlights of the spreadsheet…
You can use the Google Sheets Template to analyse your own Linker Map files…
1️⃣ Click this link to open the Memory Map template…
2️⃣ Click File → Make A Copy to copy the file into your Google Drive storage
3️⃣ Paste the contents of your Linker Map file firmware.map into the Import sheet
4️⃣ Click the Symbols sheet
5️⃣ Click Data → Filter Views → All Objects By Size
This video shows you the steps. Click “CC” to view the instructions in the video…
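If you’d rather script the analysis, the kind of parsing the spreadsheet performs can be approximated in C. The sketch below assumes the usual GNU Linker Map layout of section name, address, size and object file on one line. It’s not a translation of the spreadsheet formulas, and map_entry_t and parse_map_line are hypothetical names. Long section names are typically wrapped onto a line of their own in the map file, and this simple version just skips those.

```c
#include <stdio.h>

/* One parsed entry from the Linker Map. (Hypothetical type.) */
typedef struct {
    char section[64];        /* e.g. ".text.main" */
    unsigned long address;   /* where the Linker placed it */
    unsigned long size;      /* bytes of ROM or RAM it occupies */
    char object[256];        /* object file or library it came from */
} map_entry_t;

/* Parse one single-line entry of the form
       .section   0xADDRESS   0xSIZE   object-file
   Returns 1 on success, 0 if the line has a different shape. */
int parse_map_line(const char *line, map_entry_t *out)
{
    /* %lx accepts an optional "0x" prefix, as used in the map file. */
    int fields = sscanf(line, " %63s %lx %lx %255s",
                        out->section, &out->address, &out->size,
                        out->object);
    return fields == 4 && out->section[0] == '.';
}
```

Feeding every line of firmware.map through parse_map_line() and sorting the successful entries by size would give a rough equivalent of the “All Objects By Size” view.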
What Happens When We Add sprintf()?
sprintf() is a common C function used for formatting numbers as strings. What happens to the RAM and ROM usage when we add it to our Blue Pill program? Try it out yourself…
1️⃣ Edit main.c. Insert the lines below marked // Added line
2️⃣ Build the project in Visual Studio Code
3️⃣ Copy the contents of firmware.map into a new copy of the Memory Map Template
4️⃣ Click the Symbols sheet
5️⃣ Click Data → Filter Views → All Objects By Size
Before adding sprintf(), the build output shows this as the RAM and ROM usage…
Linking .pioenvs/bluepill_f103c8/firmware.elf
Memory region   Used Size   Region Size   %age Used
rom:            744 B       64 KB         1.14%
ram:            8 B         20 KB         0.04%
But after adding sprintf(), the ROM usage has increased from 1% to 33%!
Linking .pioenvs/bluepill_f103c8/firmware.elf
Memory region   Used Size   Region Size   %age Used
rom:            22,132 B    64 KB         33.77%
ram:            2,576 B     20 KB         12.58%
What caused the ROM usage to jump so much? Take a look at the Memory Map you have created. Or check out my copy of the Memory Map…
Did you notice the changes in the Memory Map after adding sprintf()? Here’s a summary…
Because of the added baggage, I don’t use sprintf() and the stdio library in my Blue Pill programs. For logging, I wrote a library that performs simple formatting of numbers: https://github.com/lupyuen/codal-libopencm3/tree/master/stm32/logger
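To show how little code simple number formatting needs, here’s a minimal sketch of an integer-to-text function that avoids stdio altogether. It’s not taken from the logger library linked above. format_int is a hypothetical name, and the sketch assumes int is at most 32 bits, as it is on the Blue Pill.

```c
#include <stddef.h>

/* Write value as decimal text into buf, NUL-terminated.
   Returns the number of characters written (excluding the NUL),
   or 0 if buf is missing or too small.
   Assumes int is at most 32 bits (up to 10 digits). */
size_t format_int(int value, char *buf, size_t buf_size)
{
    char digits[12];
    size_t count = 0;
    /* Work on the magnitude as unsigned so INT_MIN is handled too. */
    unsigned int magnitude = (value < 0) ? 0u - (unsigned int)value
                                         : (unsigned int)value;

    do {  /* collect digits, least significant first */
        digits[count++] = (char)('0' + magnitude % 10u);
        magnitude /= 10u;
    } while (magnitude != 0u);

    size_t needed = count + (value < 0 ? 1u : 0u) + 1u;  /* sign + NUL */
    if (buf == NULL || buf_size < needed) {
        return 0;
    }

    size_t pos = 0;
    if (value < 0) {
        buf[pos++] = '-';
    }
    while (count > 0) {  /* reverse the digits into the output buffer */
        buf[pos++] = digits[--count];
    }
    buf[pos] = '\0';
    return pos;
}
```

Calling format_int(-123, buf, sizeof buf) with a 16-byte buffer writes “-123” and returns 4. A function like this needs only a small fixed amount of Stack and no Heap.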
So now we have learnt how to use Google Sheets to analyse the Linker Map to find large functions and variables. This should help fix most of our Blue Pill memory problems… though sometimes we will need to use special tricks to cut down on memory usage. Read on for some tips…
Optimising A Huge Memory Map
The Memory Map above is from a complicated project that I’m doing in my spare time… Porting the MakeCode visual programming tool (used in the BBC micro:bit) to Blue Pill. I used this Memory Map to optimise the MakeCode runtime for Blue Pill. Here’s the file for the above Memory Map…
50 KB of code and read-only data - Why is it so huge? This happens when we port over code from a higher-capacity microcontroller (micro:bit) to the Blue Pill. (The same thing happened when I attempted to port MicroPython to Blue Pill.) Compare the BBC micro:bit specs to Blue Pill…
BBC micro:bit   256 KB ROM   16 KB RAM
Blue Pill       64 KB ROM    20 KB RAM
Can we really squeeze a micro:bit program into Blue Pill?
Yes we can! I used the Memory Map to identify the large objects to be fixed. After optimisation, the program is a lot smaller (shown above). Here is the optimised Memory Map file…
Using Google Sheets to analyse the Linker Map is perfect for such porting projects: it really helps to pinpoint the bloat so we can decide how to trim it down.
What’s Next
Porting the MakeCode visual programming tool to Blue Pill is incredibly complicated, but I learnt a lot from the process. This is the first article that documents one of the complicated porting tasks: memory optimisation. Coming up: More articles from my Blue Pill porting experience…
1️⃣ Replacing the standard math functions with the smaller qfplib library
2️⃣ Unit testing the math functions with the QEMU Blue Pill Emulator
3️⃣ Creating a WebUSB Bootloader for MakeCode
4️⃣ Updating the Bootloader with a tiny Baseloader
So don’t throw out the Blue Pill yet, there’s so much more you can do with it!
UPDATE: I have written a new series of articles on Blue Pill programming with Apache Mynewt realtime operating system, check this out…
I used the memory map spreadsheet to analyse the ROM usage of my Apache Mynewt application…