kucu's notes

Dynamic CPU clock frequency scaling with FreeRTOS and STM32

04 May 2021

In this post I’ll show a relatively easy way to scale the MCU clock dynamically.

There is a section in the STM32L4 MCU’s datasheet called “Supply current characteristics”, and it has an interesting table detailing power consumption for various workloads in µA/MHz.

STM32L433 power consumption per MHz

Such a table suggests clocking the MCU at a reasonably high frequency and going to SLEEP mode as soon as the CPU is not needed. Unfortunately, having the CPU in SLEEP mode at 8 MHz consumes ~250 µA (range 2, 25 °C), while SLEEP mode at 48 MHz sets you back by ~1.34 mA (range 1). That is more than 1 mA of difference for the same task (doing nothing), and the gap increases significantly at higher ambient temperatures.

In general, the way to long battery life is through the STOPx and STANDBY modes. However, if your project needs fast clocks only for short periods and works fine at lower speeds the rest of the time, this trick will come in handy.

A typical scenario for frequency scaling

In my current project (environmental sensor):

  • During measurement the ADC has to run at as high a sample rate as possible (so the memory bus must be clocked high).
  • The measurement must be synced to the low-jitter (low phase noise) clock of the RF subsystem.
  • Most of the time I have to wait for transients to settle (1-5 ms, but several times).
  • The actual measurement takes ~1-2 ms, while setting up the measurement and reporting the results takes over 100 ms.
  • LoRaWAN communication requires long wait times (1000 ms) between TX and RX.

This problem begs for frequency scaling: during measurement and post-processing I can clock the MCU at a high frequency, while the init/deinit phases and LoRaWAN communication work fine at lower clock speeds.

The problem and a possible solution

It’s probably obvious, but changing the clock frequency behind FreeRTOS’s back causes the RTOS to lose track of time, because its tick is derived from that clock. Before showing my approach, here is a very generic solution.

The STM32 port of FreeRTOS uses the SYSTICK timer by default, which is a 24-bit down-counter clocked from the CPU clock (HCLK). Other timers (TIM1/2/…) run from either the APB1 or the APB2 peripheral clock. Unfortunately, all of these clock domains change as SYSCLK changes. Even if the configSYSTICK_CLOCK_HZ configuration key is turned into a run-time parameter, there are way too many moving parts that have to be re-configured as well. See vPortSetupTimerInterrupt and vPortSuppressTicksAndSleep to get some insight into the work needed…

However, there are the LPTIMx timers, which accept external clock sources (making them independent of HCLK).

Advantages:

  • LPTIMx keeps running in STOP modes (and can wake the MCU up from them), which makes sleep currents in the 1-10 µA neighborhood achievable (as opposed to regular sleep’s 200+ µA).
  • People have already shared implementations: JayKickliter and jefftenney

Disadvantages:

  • The clock source is either a fast clock that overflows the timer every ~4 ms (HSI16, 155 µA typ.), or an imprecise clock (LSI, somewhere around 32 kHz), or a clock that cannot be divided down to exactly 1000 ticks per second without timing issues (LSE with a 32768 Hz XO, which gives 32.768 counts per tick).
  • An LPTIM gets tied up for the 1 ms timebase, while LoRaWAN applications (LMIC) require 100 µs (or better) resolution and therefore need a timer of their own. Only LPTIM1 can wake up from STOP2 mode.
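The figures in the list above can be checked with a few lines of host-side C. This is only an arithmetic sketch, assuming a 16-bit LPTIM counter with prescaler /1; the function names are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: LPTIM is a 16-bit counter, so it wraps after 65536 counts. */
#define LPTIM_COUNTS 65536u

/* Time between two counter overflows in microseconds (prescaler /1). */
static uint32_t lptim_overflow_us(uint32_t clk_hz)
{
    assert(clk_hz > 0u);
    return (uint32_t)(((uint64_t)LPTIM_COUNTS * 1000000u) / clk_hz);
}

/* Error of a single tick in nanoseconds when the nominal tick period is
 * approximated by the nearest whole number of timer counts. */
static int32_t tick_error_ns(uint32_t clk_hz, uint32_t tick_hz)
{
    assert(clk_hz > 0u && tick_hz > 0u);
    uint32_t counts    = (clk_hz + tick_hz / 2u) / tick_hz;   /* round to nearest */
    int64_t actual_ns  = ((int64_t)counts * 1000000000) / clk_hz;
    int64_t nominal_ns = 1000000000 / tick_hz;
    return (int32_t)(actual_ns - nominal_ns);
}
```

With HSI16 the counter wraps every 4096 µs, which matches the ~4 ms above. With a 32768 Hz LSE the nearest whole count is 33, so every tick comes out about 7 µs (roughly 0.7 %) too long unless the tick code compensates for the fractional part.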

Workaround: SYSTICK prescaler

It’s not a particularly well documented feature, but the SYSTICK clock has a prescaler, which is either /1 or /8. If you can accept the limitation that you cannot pick arbitrary frequencies (only pairs with a ratio of 8), which my project can, this is a relatively pain-free workaround.

  • I use a 32768 Hz XO on LSE to run the RTC (~1 µA STANDBY current).
  • Normally, the project runs from MSI at 8 MHz (MSI 18.5 µA typ; RUN mode: ~1 mA; SLEEP mode: ~0.3 mA), synced to LSE. The SYSTICK prescaler is /1. Most of the time the MCU is in SLEEP mode (waiting for I2C, SPI, etc).
  • During measurement, I switch to an external 24 MHz XO, and tune PLLCLK to 64 MHz (PLLM:/3, PLLN:x16, PLLR:/2, VCO: 128 MHz). I change the prescaler to /8. The measurement and clock change take a few ms; MCU consumption is ~10 mA in RUN mode.
  • When the device is USB-powered, the CLK48 mux can use PLLSAI1Q (M:/3, N:x12, Q:/2, giving 48 MHz) and SYSCLK runs at 64 MHz the same way as above.
  • LPTIM1 is reserved for LoRaWAN, where sleep states are STOP2 (~2 µA for 1000 ms).
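The divider settings in the list above are easy to sanity-check on a PC. The following is a minimal sketch, not code from the project: `pll_calc` and `systick_clk_hz` are hypothetical helper names, and it assumes the AHB prescaler is /1, so that HCLK equals SYSCLK.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t vco_hz;   /* VCO frequency: input / M * N */
    uint32_t out_hz;   /* PLL output:    VCO / output divider */
} pll_freq_t;

/* Compute VCO and output frequency for one PLL output (R, Q or P). */
static pll_freq_t pll_calc(uint32_t in_hz, uint32_t m, uint32_t n, uint32_t div)
{
    assert(m > 0u && div > 0u);
    pll_freq_t f;
    f.vco_hz = (in_hz / m) * n;
    f.out_hz = f.vco_hz / div;
    return f;
}

/* SysTick input clock for the two possible prescaler settings. */
static uint32_t systick_clk_hz(uint32_t hclk_hz, uint32_t prescaler)
{
    assert(prescaler == 1u || prescaler == 8u);
    return hclk_hz / prescaler;
}
```

Feeding in 24 MHz with /3, x16, /2 gives a 128 MHz VCO and a 64 MHz output, and both operating points (8 MHz with /1, 64 MHz with /8) end up with the same 8 MHz SysTick input clock. That equality is the whole point of the workaround.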

Implementation details

Whenever I change the clock speed, I also call LL_SetSystemCoreClock(), which updates the SystemCoreClock CMSIS variable. It’s nice to have, as I can set up various peripherals (e.g. I2C) according to this global variable. configSYSTICK_CLOCK_HZ is always 8 MHz, and this is a great simplification: the tick timing FreeRTOS computes from it never has to change.

/* FreeRTOSConfig.h: */
extern uint32_t SystemCoreClock;        // this is a CMSIS variable

#define configUSE_TICKLESS_IDLE         1
#define configCPU_CLOCK_HZ              (SystemCoreClock)
#define configSYSTICK_CLOCK_HZ          8000000
#define configTICK_RATE_HZ              1000
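To see why a constant configSYSTICK_CLOCK_HZ is enough, it helps to look at what the port derives from it. The Cortex-M port computes the reload value in vPortSetupTimerInterrupt as (configSYSTICK_CLOCK_HZ / configTICK_RATE_HZ) - 1, and the longest tickless sleep from the 24-bit counter range. The host-side sketch below mirrors that arithmetic; the function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu   /* SysTick is a 24-bit counter */

/* Reload value for one RTOS tick. */
static uint32_t systick_reload(uint32_t systick_clk_hz, uint32_t tick_rate_hz)
{
    assert(tick_rate_hz > 0u && systick_clk_hz >= tick_rate_hz);
    uint32_t reload = (systick_clk_hz / tick_rate_hz) - 1u;
    assert(reload <= SYSTICK_MAX_RELOAD);
    return reload;
}

/* Largest number of whole ticks the counter can cover in one go,
 * which bounds how long tickless idle can sleep. */
static uint32_t systick_max_ticks(uint32_t systick_clk_hz, uint32_t tick_rate_hz)
{
    assert(tick_rate_hz > 0u && systick_clk_hz >= tick_rate_hz);
    uint32_t counts_per_tick = systick_clk_hz / tick_rate_hz;
    return SYSTICK_MAX_RELOAD / counts_per_tick;
}
```

At 8 MHz and 1000 ticks per second the reload value is 7999 and the counter covers up to 2097 ticks (about 2.1 s). If SysTick ran directly from 64 MHz instead, the same counter would cover only 262 ticks, so the /8 prescaler also keeps the tickless-idle range unchanged.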

Switching to 8 MHz clock:

LL_RCC_SetSysClkSource(LL_RCC_SYS_CLKSOURCE_MSI);
while (LL_RCC_GetSysClkSource() != LL_RCC_SYS_CLKSOURCE_STATUS_MSI);

// setup SYSTICK:
portNVIC_SYSTICK_CLK_BIT = SYSTICK_CLK_IS_HCLK;
SysTick->CTRL = (SysTick->CTRL & SYSTICK_CLK_MASK) | portNVIC_SYSTICK_CLK_BIT;

// update cmsis:
LL_SetSystemCoreClock(8000000);

// Disable the clock sources that are no longer used:
LL_RCC_PLL_Disable();
LL_RCC_HSE_Disable();

// Voltage regulator
// SCALE2 max: MSI:24M, HSI16:16M, HSE:26M, PLL:26M, VCO:128M
LL_PWR_SetRegulVoltageScaling(LL_PWR_REGU_VOLTAGE_SCALE2);

// Setup flash latency:
// SCALE2 <6M: 0 CPU cycles; <12M: 1 cycle; <18M: 2 cycles...
LL_FLASH_SetLatency(LL_FLASH_LATENCY_1);
if(LL_FLASH_GetLatency() != LL_FLASH_LATENCY_1) {
    Error_Handler();
}
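The latency values used in these listings follow the wait-state limits quoted in the code comments. As a rough sketch, the lookup could be written as below. The helper name and enum are hypothetical, the boundaries are treated as inclusive, and the table entries beyond those quoted in the comments (26, 64 and 80 MHz) are my reading of the STM32L4 reference manual, so verify them for your part.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { VOS_RANGE1 = 1, VOS_RANGE2 = 2 } vos_range_t;

/* Returns the number of flash wait states for the given voltage range
 * and HCLK, or -1 if the frequency is not allowed in that range. */
static int flash_wait_states(vos_range_t range, uint32_t hclk_hz)
{
    /* Upper HCLK limit in MHz for 0, 1, 2, ... wait states (assumed values). */
    static const uint32_t range1_mhz[] = { 16u, 32u, 48u, 64u, 80u };
    static const uint32_t range2_mhz[] = { 6u, 12u, 18u, 26u };

    assert(range == VOS_RANGE1 || range == VOS_RANGE2);
    const uint32_t *limits = (range == VOS_RANGE1) ? range1_mhz : range2_mhz;
    int count = (range == VOS_RANGE1) ? 5 : 4;

    for (int ws = 0; ws < count; ws++) {
        if (hclk_hz <= limits[ws] * 1000000u) {
            return ws;
        }
    }
    return -1;
}
```

For the two operating points of this post it returns 1 wait state for 8 MHz in range 2 and 3 wait states for 64 MHz in range 1, matching LL_FLASH_LATENCY_1 and LL_FLASH_LATENCY_3 in the listings.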

Switching to 64 MHz clock:

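// Voltage regulator
// SCALE1 max: MSI:48M, HSI16:16M, HSE:48M, PLL:80M, VCO:344M
LL_PWR_SetRegulVoltageScaling(LL_PWR_REGU_VOLTAGE_SCALE1);
// wait until the regulator has settled in the new range (VOSF cleared):
while (LL_PWR_IsActiveFlag_VOS());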

// SCALE1 <16M: 0 CPU cycles; <32M: 1 cycle; <48M: 2 cycles; <64M: 3 cycles...
LL_FLASH_SetLatency(LL_FLASH_LATENCY_3);
if(LL_FLASH_GetLatency() != LL_FLASH_LATENCY_3) {
    Error_Handler();
}

// enable 24MHz HSE:
LL_RCC_HSE_Enable();
while (!LL_RCC_HSE_IsReady());

// prepare PLL tuning:
if (LL_RCC_PLL_IsReady()) {
    LL_RCC_PLL_Disable();
    while (LL_RCC_PLL_IsReady());
}

// enable PLL:
// The PLL input frequency (after /M) must be between 4 and 16 MHz (24/3 = 8 MHz)
// The VCO output frequency must be between 64 and 344 MHz (8 * 16 = 128 MHz)
LL_RCC_PLL_ConfigDomain_SYS(LL_RCC_PLLSOURCE_HSE, LL_RCC_PLLM_DIV_3, 16, LL_RCC_PLLR_DIV_2);
LL_RCC_PLL_EnableDomain_SYS();
LL_RCC_PLL_Enable();
while (LL_RCC_PLL_IsReady() != 1U);

// set 64M as clock source:
LL_RCC_SetSysClkSource(LL_RCC_SYS_CLKSOURCE_PLL);
while (LL_RCC_GetSysClkSource() != LL_RCC_SYS_CLKSOURCE_STATUS_PLL);

// systick is 64M/8=8M
portNVIC_SYSTICK_CLK_BIT = SYSTICK_CLK_IS_HCLK_P8;
SysTick->CTRL = (SysTick->CTRL & SYSTICK_CLK_MASK) | portNVIC_SYSTICK_CLK_BIT;

// update CMSIS
LL_SetSystemCoreClock(64000000);
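The SYSTICK_CLK_* names used in both listings are not defined in this post, and the assignment to portNVIC_SYSTICK_CLK_BIT suggests the port was patched to make it a variable instead of the stock compile-time macro. Here is one possible set of definitions, purely as an assumption: CLKSOURCE is bit 2 of the SysTick control register, where 1 selects the processor clock and 0 selects the external reference, which is HCLK/8 on STM32. The register is modelled as a plain integer so the sketch runs on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definitions (not shown in the post). */
#define SYSTICK_CLKSOURCE_BIT   (UINT32_C(1) << 2)          /* CTRL bit 2 */
#define SYSTICK_CLK_MASK        (~SYSTICK_CLKSOURCE_BIT)    /* keeps all other bits */
#define SYSTICK_CLK_IS_HCLK     SYSTICK_CLKSOURCE_BIT       /* processor clock, /1 */
#define SYSTICK_CLK_IS_HCLK_P8  UINT32_C(0)                 /* external reference, /8 */

/* Return the CTRL value with only the clock source bit replaced;
 * ENABLE (bit 0) and TICKINT (bit 1) are left untouched. */
static uint32_t systick_ctrl_with_source(uint32_t ctrl, uint32_t clk_bit)
{
    assert((clk_bit & SYSTICK_CLK_MASK) == 0u);
    return (ctrl & SYSTICK_CLK_MASK) | clk_bit;
}
```

Starting from a running tick (CTRL = 0x7), selecting the /8 source yields 0x3, and switching back yields 0x7 again, so the timer stays enabled with its interrupt on throughout.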

Generalizing the workaround

You can try even harder by replacing the SYSTICK timer with a general-purpose one (TIMx) and using its prescaler at higher frequencies. However, I think the /8 prescaler is good enough for most projects, as STOP modes offer a better solution for long wait cycles than low-frequency SLEEP mode.