ARM bit banding feature for core M85

K y 2 months ago

We have designed an ARM compiler targeting the Cortex-M4 core, which has bit-banding; that made execution easier and faster. Now we want to port it to the Cortex-M85 core, which does not support bit-banding. So my question is: how can I port code that relies on ARM bit-banding to the M85?

Ben Clark 2 months ago

Although this is not AI-specific and is outside my domain, I've asked experts internally, and it appears we don't have a porting guide or any particular resources for this, sorry.

This feature was dropped after Cortex-M4 because it is not compatible with caches: it uses two different addresses to access the same data. It also increased interrupt latency, because a bit-band write has to be converted into a locked read-modify-write sequence. (If an interrupt arrives after a bit-band write has started, it must wait until the read-modify-write sequence completes.) The locked access sequence can also affect access latency for other bus managers on the bus.
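To make the "two different addresses for the same data" point concrete, here is a minimal sketch of the Armv7-M SRAM bit-band mapping as I understand it from the Armv7-M manual. The macro and function names are my own, not from any Arm header, and it only computes the alias address so that it can be checked on a host machine.

```c
#include <assert.h>
#include <stdint.h>

/* Armv7-M SRAM bit-band region (1 MB) and its 32 MB alias region.
 * These exist on Cortex-M3/M4 parts that implement bit-banding, not on M85. */
#define BB_SRAM_BASE   UINT32_C(0x20000000)
#define BB_SRAM_SIZE   UINT32_C(0x00100000)
#define BB_SRAM_ALIAS  UINT32_C(0x22000000)

/* Each bit in the bit-band region maps to one 32-bit word in the alias:
 *   alias = alias_base + byte_offset * 32 + bit_number * 4
 * (illustrative helper, name is hypothetical) */
static inline uint32_t bitband_alias_addr(uint32_t byte_addr, uint32_t bit)
{
    assert(byte_addr >= BB_SRAM_BASE);
    assert(byte_addr - BB_SRAM_BASE < BB_SRAM_SIZE);
    assert(bit < 8u);
    return BB_SRAM_ALIAS + (byte_addr - BB_SRAM_BASE) * 32u + bit * 4u;
}
```

So bit 2 of the byte at 0x20000300 is also reachable as a whole word at 0x22006008, which is exactly the kind of aliasing a data cache cannot keep coherent.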

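So the easiest fix is to convert each bit-band access into an ordinary read/write, either with one uint8 per flag (might be fast enough on M85) or with explicit mask operations on a packed word. Be careful where you need atomicity, because a masked read-modify-write can be interrupted between the read and the write.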
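As a rough illustration of those two replacements, here is a sketch in C. The flag name and helper names are made up for the example; neither variant adds any protection against interrupts, so the mask version is only suitable where nothing else touches the same word concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Variant A: one byte per flag (hypothetical flag name).
 * Setting or clearing is a single byte store, with no read-modify-write. */
static volatile uint8_t flag_rx_ready;

static inline void rx_ready_set(void)   { flag_rx_ready = 1u; }
static inline void rx_ready_clear(void) { flag_rx_ready = 0u; }
static inline bool rx_ready_get(void)   { return flag_rx_ready != 0u; }

/* Variant B: packed word with explicit masks.
 * This is a plain read-modify-write and is NOT atomic on its own. */
static inline void bits_set(volatile uint32_t *word, unsigned bit)
{
    assert(bit < 32u);
    *word |= UINT32_C(1) << bit;
}

static inline void bits_clear(volatile uint32_t *word, unsigned bit)
{
    assert(bit < 32u);
    *word &= ~(UINT32_C(1) << bit);
}

static inline bool bits_test(const volatile uint32_t *word, unsigned bit)
{
    assert(bit < 32u);
    return ((*word >> bit) & 1u) != 0u;
}
```

Variant A costs eight times the RAM of a packed word but keeps the "one store per flag" property that bit-banding gave you; variant B keeps the memory layout but gives up that property.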

But you should go through and look at the use cases. Those that were only a convenience and don't need concurrency protection can become normal bit operations; the rest split into SRAM “bit variables” and peripheral control bits (GPIO, UART control, etc.).

For peripheral control bits, best practice is to use dedicated SET/CLR/TOGGLE registers if the peripheral provides them. If there is only one data register, you again need to consider atomicity.
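A sketch of what the SET/CLR style looks like in code, assuming a made-up GPIO block: the struct layout and the names OUT, OUTSET and OUTCLR are hypothetical, and the real names and offsets have to come from your device's reference manual or vendor header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical GPIO register block, for illustration only. */
typedef struct {
    volatile uint32_t OUT;     /* port data register (whole port)        */
    volatile uint32_t OUTSET;  /* writing 1 sets the matching OUT bit    */
    volatile uint32_t OUTCLR;  /* writing 1 clears the matching OUT bit  */
} hypo_gpio_t;

/* One store, no read: there is no window for an interrupt to race with. */
static inline void gpio_pin_set(hypo_gpio_t *gpio, unsigned pin)
{
    assert(pin < 32u);
    gpio->OUTSET = UINT32_C(1) << pin;
}

static inline void gpio_pin_clear(hypo_gpio_t *gpio, unsigned pin)
{
    assert(pin < 32u);
    gpio->OUTCLR = UINT32_C(1) << pin;
}
```

On real hardware the peripheral updates OUT itself when OUTSET or OUTCLR is written; on a host build the struct is just memory, so only the written mask can be observed.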

Where you need atomicity, there are options:

1: Peripheral-provided atomic set/clear registers (preferred). If the hardware offers SET/CLR style registers, use those. They’re designed to avoid RMW races.

2: Critical section around the RMW (common in bare-metal/RTOS). Good when the protected region is tiny and timing allows.

3: Use exclusive access (LDREX/STREX) for atomic bit updates in SRAM (and sometimes peripherals). Armv8-M supports exclusive instructions; CMSIS exposes them as intrinsics (__LDREXW, __STREXW, __CLREX). This is the closest conceptual replacement for “atomic bit set/clear in a packed word”.
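Options 2 and 3 could look roughly like the sketch below. Two assumptions to flag: irq_save() and irq_restore() are hypothetical hooks that I have stubbed out so the code builds on a host (on a Cortex-M target they would wrap the CMSIS PRIMASK intrinsics), and option 3 is written with C11 atomics rather than the CMSIS intrinsics directly. Compilers targeting Armv8-M Mainline typically lower these atomics to an exclusive load/store loop, but it is worth checking the disassembly for your toolchain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical interrupt-mask hooks. On target, irq_save() would read
 * __get_PRIMASK() and call __disable_irq(), and irq_restore() would call
 * __set_PRIMASK(key). Here they are host stubs that do nothing. */
static uint32_t irq_save(void)            { return 0u; }
static void     irq_restore(uint32_t key) { (void)key; }

/* Option 2: critical section around a plain read-modify-write. */
static void bit_set_critical(volatile uint32_t *word, unsigned bit)
{
    assert(bit < 32u);
    uint32_t key = irq_save();
    *word |= UINT32_C(1) << bit;
    irq_restore(key);
}

/* Option 3: atomic read-modify-write on a packed word.
 * With CMSIS intrinsics the set operation could instead be written as:
 *   do { old = __LDREXW(addr); } while (__STREXW(old | mask, addr) != 0U);
 */
static void bit_set_atomic(atomic_uint_least32_t *word, unsigned bit)
{
    assert(bit < 32u);
    atomic_fetch_or(word, UINT32_C(1) << bit);
}

static void bit_clear_atomic(atomic_uint_least32_t *word, unsigned bit)
{
    assert(bit < 32u);
    atomic_fetch_and(word, ~(UINT32_C(1) << bit));
}
```

The critical-section version blocks interrupts for a few cycles, which is the same latency cost bit-banding had; the atomic version retries instead of blocking, so it does not add interrupt latency.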

Useful docs (sorry, not specific to this port):

- Armv8-M Architecture Reference Manual, which covers M85: https://developer.arm.com/documentation/ddi0553/by

- Armv7-M Architecture Reference Manual, for what bit-banding was doing: https://developer.arm.com/documentation/ddi0403/d?lang=en

- CMSIS Core docs for the exclusive access intrinsics such as __LDREXW and __STREXW: https://arm-software.github.io/CMSIS_6/latest/Core/index.html