px-fwlib 0.10.0
Cross-platform embedded library and documentation for 8/16/32-bit microcontrollers generated with Doxygen 1.9.2
KEEP CALM and DON'T PANIC :)
As part of the journey to becoming a true embedded hero you have to take a peek under the hood to see what is really happening on the bare metal level. It may sound crazy to dive right into the deep end on the assembly / machine code level first, but it will provide insight on what the C code is trying to achieve and how to debug tricky situations like optimized code, hard faults, stack overflows, etc. You do not need to be proficient at writing assembly at an expert level, just know enough to single step debug and follow the assembly flow.
Reference(s):
This example flashes the LED at 1 Hz (500 ms on; 500 ms off). On the PX-HER0 Board, the LED is wired to Port H pin 0. PH0 must be configured as a digital output pin. Set PH0 high to enable the LED and set PH0 low to disable the LED.
Go right ahead and start a debugging session! There's no better teacher than single stepping through the code and observing the processor core and peripheral registers:
If you need more info regarding an instruction, search for it in [2]. For example, on line 121, the instruction "subs r0, #1" is used. Searching for "subs", an explanation is found in [2] "3.5.1 ADC, ADD, RSB, SBC, and SUB" (page 54).
The "sub" part of the instruction means that it is a subtraction operation and the suffix "s" means that the condition code flags will be updated on the result of the operation. For example, if the result is zero, the Zero Flag (Z) will be set.
The condition flags (N = Negative Flag, Z = Zero Flag, C = Carry Flag, V = Overflow Flag) are stored in the Application Program Status Register (APSR) and can also be viewed in the Registers window during debugging (you may need to scroll down to view it).
On line 122 the instruction "bne _delay_loop" is used. The "b" part means that it is a branch instruction and the suffix "ne" (not equal) means that the branch should be taken if the Zero Flag (Z) is not set. See [2] "Table 17. Condition code suffixes" (page 44) for a summary.
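To make the flag behaviour concrete, here is a minimal host-side sketch in C that models "subs" and "bne". It is not part of the example project; the type and function names (apsr_flags_t, subs, bne_taken, count_loop_iterations) are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition flags as held in the APSR (hypothetical model type). */
typedef struct {
    bool n; /* Negative: bit 31 of the result is set */
    bool z; /* Zero: the result is 0 */
    bool c; /* Carry: for a subtraction, set when NO borrow occurred */
    bool v; /* Overflow: the signed result does not fit in 32 bits */
} apsr_flags_t;

/* Model of "subs rd, #imm": subtract and update the condition flags. */
static uint32_t subs(uint32_t rd, uint32_t imm, apsr_flags_t *flags)
{
    uint32_t result = rd - imm;

    flags->n = (result >> 31) != 0u;
    flags->z = (result == 0u);
    flags->c = (rd >= imm);
    flags->v = (((rd ^ imm) & (rd ^ result)) >> 31) != 0u;
    return result;
}

/* Model of "bne": the branch is taken while the Zero Flag is clear. */
static bool bne_taken(const apsr_flags_t *flags)
{
    return !flags->z;
}

/* Count how often the body of a "subs r0, #1; bne loop" pair executes. */
static uint32_t count_loop_iterations(uint32_t r0)
{
    apsr_flags_t flags;
    uint32_t iterations = 0u;

    do {
        r0 = subs(r0, 1u, &flags);
        iterations++;
    } while (bne_taken(&flags));
    return iterations;
}
```

Starting with r0 = 3, the loop body runs three times: the Zero Flag is only set by the third "subs", at which point "bne" falls through.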
When inspecting the Extended Listing File "flashing_led.lss" generated by the disassembler you will notice that some instructions have a ".n" or ".w" suffix, for example:
800003e: e7f6 b.n 800002e <_main_loop>
The ".n" suffix simply means that the narrow 16-bit version of the branch instruction has been used. The ".w" means that the wide 32-bit version of the instruction has been used. Observe that the narrow version of the instruction is encoded in two bytes of machine code: 0xe7 and 0xf6.
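As a rough cross-check of the listing line above, the narrow branch can be decoded by hand. The sketch below assumes the standard 16-bit unconditional branch encoding (upper five bits 11100, followed by an 11-bit signed halfword offset relative to the instruction address plus 4); the helper names are hypothetical.

```c
#include <stdint.h>

/* Assemble a 16-bit Thumb instruction from its two little-endian bytes. */
static uint16_t thumb16_from_bytes(uint8_t low, uint8_t high)
{
    return (uint16_t)(low | ((uint16_t)high << 8));
}

/* Target of a narrow unconditional branch located at 'address'. */
static uint32_t branch_narrow_target(uint32_t address, uint16_t instruction)
{
    int32_t offset = (int32_t)(instruction & 0x07FFu); /* imm11 */

    if (offset & 0x0400) {
        offset -= 0x0800;   /* sign-extend the 11-bit field */
    }
    offset *= 2;            /* offset is counted in halfwords */
    return address + 4u + (uint32_t)offset;
}
```

In the binary file the two bytes appear in little-endian order (F6 then E7). Feeding 0xe7f6 at address 0x0800_003e into branch_narrow_target() gives 0x0800_002e, the address shown for _main_loop.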
The project is built with an introductory Makefile to demonstrate that it's not that hard to understand and use. For a gentle introduction to Make, see 7.2 How to understand and modify Makefiles.
The linker places the code, data and variables into the right memory locations using the introductory linker script "stm32l072xb.ld".
See [1] "2.2 Memory organization" (page 57) for more information.
The final executable "flashing_led.bin" is only 108 bytes of machine code:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000  00 50 00 20 15 00 00 08 10 00 00 08 12 00 00 08
00000010  FE E7 FE E7 0E 48 01 68 0E 4A 11 43 01 60 0E 48
00000020  01 68 0E 4A 11 40 0E 4A 11 43 01 60 0D 48 0C 49
00000030  01 60 00 F0 05 F8 0C 49 01 60 00 F0 01 F8 F6 E7
00000040  01 B5 04 20 00 04 C0 46 01 38 FC D1 01 BD 00 00
00000050  2C 10 02 40 80 00 00 00 00 1C 00 50 FD FF FF FF
00000060  01 00 00 00 18 1C 00 50 00 00 01 00
The example has been crafted to demonstrate specific concepts and it could have been made even smaller!
The first 4 bytes (00 50 00 20), assembled in little-endian order as a 32-bit value (0x2000_5000), tell the processor core to set the Stack Pointer register (SP / R13) to the end of SRAM on reset. The STM32L072RB has 20 kB of SRAM (20 x 1024 = 20480 = 0x5000), which starts at 0x2000_0000.
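The little-endian assembly and the end-of-SRAM arithmetic can be checked with a few lines of C. This is only an illustrative host-side sketch; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define SRAM_BASE  0x20000000u      /* start of SRAM */
#define SRAM_SIZE  (20u * 1024u)    /* 20 kB = 20480 = 0x5000 bytes */

/* Assemble four bytes, stored least significant byte first, into a word. */
static uint32_t read_le32(const uint8_t *bytes)
{
    return  (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* First address past the end of SRAM: the initial Stack Pointer value. */
static uint32_t sram_end(void)
{
    return SRAM_BASE + SRAM_SIZE;
}
```

Both read_le32() applied to the bytes 00 50 00 20 and sram_end() evaluate to 0x2000_5000.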
The next 32-bit value (bytes 15 00 00 08 in the listing above, i.e. 0x0800_0015) tells the processor core to set the Program Counter register (PC / R15) to the start of the main() function on reset. This is the address of the first instruction that will be executed after reset. Flash memory starts at 0x0800_0000, but is also mapped to 0x0000_0000.
See [2] "2.3.4 Vector table" (page 29) for more info.
The address of main() is actually 0x0800_0014, but the least significant bit is set to indicate to the processor core that it is jumping to a 16-bit Thumb instruction and not a 32-bit ARM instruction. This is done to keep the code compatible with other ARM processor cores that do support switching between ARM and Thumb instructions.
It will result in a Hard Fault if the least significant bit is not set.
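The role of the least significant bit can be sketched in C as follows. The helper names are hypothetical; the input value is the second word of the binary listing (bytes 15 00 00 08).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bit 0 is set, i.e. the vector points at Thumb code. */
static bool is_thumb_target(uint32_t vector)
{
    return (vector & 1u) != 0u;
}

/* Address of the first instruction: the vector with bit 0 cleared. */
static uint32_t instruction_address(uint32_t vector)
{
    return vector & ~(uint32_t)1u;
}
```

For the reset vector 0x0800_0015, is_thumb_target() returns true and instruction_address() returns 0x0800_0014.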
The communication interface with peripherals is memory mapped: the processor core communicates with a peripheral by writing to or reading from specific memory addresses.
[1] "9. General-purpose I/Os (GPIO)" (page 234) documents how to use the GPIO peripheral. Feel free to skip ahead and scan [1] "9.4 GPIO registers" (page 243) first. After all, this part documents the actual interface to the peripheral (the "buttons & lights" section).
The memory map of peripherals is listed in [1] "2.2.2 Memory map and register boundary addresses" (page 58). The base address of GPIOH is 0x5000_1C00. The offset of GPIOx_MODER register is 0x00, so the address of GPIOH_MODER is 0x5000_1C00. The address of GPIOH_BSRR register is 0x5000_1C18 (offset = 0x18).
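The base-plus-offset arithmetic above could be written in C along these lines. It is a sketch only: the macro names are invented here and are not claimed to match the vendor header files.

```c
#include <stdint.h>

#define GPIOH_BASE         0x50001C00u  /* GPIOH base address */
#define GPIO_MODER_OFFSET  0x00u        /* GPIOx_MODER offset */
#define GPIO_BSRR_OFFSET   0x18u        /* GPIOx_BSRR offset */

/* Address of a peripheral register = peripheral base + register offset. */
static uint32_t reg_address(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* On the target, a register is typically accessed through a volatile
 * pointer so that the compiler does not optimize the access away.
 * Only meaningful on the microcontroller itself, not on a PC. */
#define REG32(address)  (*(volatile uint32_t *)(uintptr_t)(address))
```

reg_address(GPIOH_BASE, GPIO_BSRR_OFFSET) evaluates to 0x5000_1C18, matching the address of GPIOH_BSRR given above.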
Assume that a peripheral's clock is disabled on start up, unless the documentation proves otherwise.
As a general rule you always need to enable a peripheral's clock before you can use it. Imagine an ARM Cortex microcontroller as a mansion with a large number of rooms. A frugal person would not leave all the lights on, but only switch the lights on in the rooms that are in use, otherwise the electricity bill would be huge. Likewise the clocks to peripherals are disabled by default to save power. It is also important to know that a higher clock frequency uses more power and therefore it is desirable to run a peripheral at the lowest acceptable frequency. The lower peripheral frequency may incur a communication penalty as the faster processor core has to wait for the slower peripheral register to return a valid value.
The first step is thus to enable the clock to GPIOH by setting bit 7 in the RCC_IOPENR register (address 0x4002_102C). See [1] "7.3.12 GPIO clock enable register (RCC_IOPENR)" (page 202).
PH0's mode must be changed from analog to digital output by setting GPIOH_MODER[1:0] to 01. See [1] "9.4.1 GPIO port mode register (GPIOx_MODER) (x =A..E and H)" on page 243.
The LED is enabled by setting PH0 high. This is accomplished by writing a 1 to GPIOH_BSRR[0].
The LED is disabled by setting PH0 low. This is accomplished by writing a 1 to GPIOH_BSRR[16].
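The three register operations described above (enable the clock, select output mode, set or reset the pin) might look like this in C. This is a hedged sketch rather than the project's code: the macro and function names are hypothetical, and the registers are passed in as pointers so that the logic can also be exercised on a PC with ordinary variables.

```c
#include <stdint.h>

#define RCC_IOPENR_GPIOH_BIT    (1u << 7)   /* clock enable bit for GPIOH */
#define GPIO_MODER_PIN0_MASK    (3u << 0)   /* MODER[1:0] */
#define GPIO_MODER_PIN0_OUTPUT  (1u << 0)   /* 01 = digital output */
#define GPIO_BSRR_SET_PIN0      (1u << 0)   /* BSRR[0]: set pin 0 */
#define GPIO_BSRR_RESET_PIN0    (1u << 16)  /* BSRR[16]: reset pin 0 */

/* Enable the GPIOH clock, then switch PH0 to digital output mode. */
static void led_init(volatile uint32_t *rcc_iopenr,
                     volatile uint32_t *gpioh_moder)
{
    *rcc_iopenr |= RCC_IOPENR_GPIOH_BIT;
    *gpioh_moder = (*gpioh_moder & ~GPIO_MODER_PIN0_MASK)
                 | GPIO_MODER_PIN0_OUTPUT;
}

/* Set PH0 high (LED on). */
static void led_on(volatile uint32_t *gpioh_bsrr)
{
    *gpioh_bsrr = GPIO_BSRR_SET_PIN0;
}

/* Set PH0 low (LED off). */
static void led_off(volatile uint32_t *gpioh_bsrr)
{
    *gpioh_bsrr = GPIO_BSRR_RESET_PIN0;
}
```

On the target the pointers would refer to 0x4002_102C (RCC_IOPENR), 0x5000_1C00 (GPIOH_MODER) and 0x5000_1C18 (GPIOH_BSRR). Note that BSRR is written with a plain store and needs no read-modify-write, since only bits written as 1 have an effect.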
A ~500 ms delay is achieved by wasting a large number of instruction clock cycles in an empty count down loop. The counter value can be calculated if the processor core clock frequency is known as well as how many clock cycles each instruction takes. See HERE for an instruction clock cycle summary.
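The counter calculation can be sketched as below. The function name is hypothetical, and the numbers used in the note that follows (a core clock of 2,097,152 Hz and 4 clock cycles per loop iteration) are assumptions chosen for illustration; check them against your clock configuration and the instruction timing summary.

```c
#include <stdint.h>

/* Number of loop iterations needed to waste 'delay_ms' milliseconds,
 * given the core clock and the cycle cost of one loop iteration. */
static uint32_t delay_loop_count(uint32_t core_clock_hz,
                                 uint32_t delay_ms,
                                 uint32_t cycles_per_iteration)
{
    /* 64-bit intermediate avoids overflow of clock * milliseconds */
    uint64_t total_cycles = ((uint64_t)core_clock_hz * delay_ms) / 1000u;

    return (uint32_t)(total_cycles / cycles_per_iteration);
}
```

With the assumed values, delay_loop_count(2097152, 500, 4) yields 262144 (0x4_0000) iterations.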
The Delay() function also provides an opportunity to observe the registers and stack when it is called using a "bl" instruction (branch link):
When the branch is taken, the return address is stored in the Link Register (LR / R14). The initial value of the Stack Pointer (SP / R13) is 0x2000_5000. The "push {r0, lr}" instruction stores the two register values on the stack (in little-endian order) and SP decreases to 0x2000_4FF8:
See [2] "2.1.2 Stacks" (page 12) and also [2] "3.6 Branch and control instructions" (page 65) for more info.
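The effect of "push {r0, lr}" on the stack can be modelled on a PC with a small array standing in for the top of SRAM. This is a simplified sketch with hypothetical names; it assumes a full descending stack where the lowest-numbered register ends up at the lowest address.

```c
#include <stdint.h>

#define STACK_TOP    0x20005000u  /* initial SP: end of SRAM */
#define STACK_WORDS  16u          /* modelled words just below STACK_TOP */

typedef struct {
    uint32_t sp;                  /* Stack Pointer (SP / R13) */
    uint32_t words[STACK_WORDS];  /* memory from STACK_TOP - 64 upwards */
} stack_model_t;

/* Map a word-aligned stack address to its slot in the model array. */
static uint32_t *stack_word(stack_model_t *s, uint32_t address)
{
    return &s->words[(address - (STACK_TOP - 4u * STACK_WORDS)) / 4u];
}

/* Model of "push {r0, lr}": SP drops by 8 bytes, r0 is stored at the
 * new SP and lr at SP + 4. */
static void push_r0_lr(stack_model_t *s, uint32_t r0, uint32_t lr)
{
    s->sp -= 8u;
    *stack_word(s, s->sp) = r0;
    *stack_word(s, s->sp + 4u) = lr;
}
```

Starting from SP = 0x2000_5000, a single push leaves SP at 0x2000_4FF8, with r0 at 0x2000_4FF8 and lr at 0x2000_4FFC.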
File(s):