The volatile modifier adds instructions that extend the variable to a 32-bit value (UXTB, UXTH, SXTB, SXTH). That doesn't make sense to me. Here is the code I checked in an online compiler:
#include <stdint.h>

struct _st {
    uint8_t a;
    volatile uint8_t b;
} st;

uint32_t test(uint8_t c)
{
    uint8_t out;
    if (st.a > c)
        out = st.a;
    else
        out = st.b;
    return out;
}
test:
        ldr     r2, .L3
        ldrb    r3, [r2]        // reading st.a
        cmp     r3, r0
        bhi     .L2
        ldrb    r3, [r2, #1]    // reading st.b
        uxtb    r3, r3          // <<<<<
.L2:
        mov     r0, r3
        bx      lr
.L3:
        .word   .LANCHOR0
st:
> it must do something with the "rest" of an uint8_t

ldrb already sets the high 24 bits of the register to zero; there is nothing left to be done as long as the result is unsigned.

The godbolt link shows the uxtb instruction used regardless of optimization level (well, -O0 does something different but awful, as expected).

There's an interesting comment that shows up in the disassembly:
ldrb r3, [r2, #1] @ zero_extendqisi2
uxtb r3, r3
And there's a clue here: stackoverflow.com/.../meaning-of-zero-extendqisi2

I guess the intermediate language has a generic "fetch and zero-extend" internal instruction, and it isn't smart enough to realize that when the fetch is from memory, the extend has effectively already happened. (If the source of the fetch were a register, it would have been a mov instruction, and the uxtb would be necessary because there is no mov variant that extends a byte to a 32-bit register.)
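To make the "nothing left to be done" point concrete, here is a minimal sketch in C that models the two instructions as plain functions and checks every byte value. The names model_ldrb, model_uxtb and check_uxtb_redundant are made up for illustration; this models the register contents only and is not what the compiler does internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of LDRB: load one byte; bits 31..8 of the result are zero. */
static uint32_t model_ldrb(const volatile uint8_t *addr)
{
    return (uint32_t)*addr;
}

/* Hypothetical model of UXTB: keep bits 7..0 and clear bits 31..8. */
static uint32_t model_uxtb(uint32_t reg)
{
    return reg & 0xFFu;
}

/* Checks that UXTB leaves the result of LDRB unchanged for all 256 byte values. */
static void check_uxtb_redundant(void)
{
    for (unsigned v = 0; v <= 0xFFu; ++v) {
        volatile uint8_t byte = (uint8_t)v;
        uint32_t loaded = model_ldrb(&byte);
        assert(model_uxtb(loaded) == loaded);
    }
}
```

Under this model the uxtb only matters when the source register may hold something in its upper bits, which is never the case right after an ldrb.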
I changed the example a little, and it turned out to be even more interesting. The structure now contains int8_t members, which are read in two different ways. The plain variable "a" is read by an ldrsb instruction, which sign-extends it to int32_t automatically. That is needed because the next instruction is the "cmp" comparison, which only works on full 32-bit registers and so acts as a natural barrier. Then "b", which has the "volatile" modifier, is read. An instruction that reads an unsigned byte is used, followed by a separate sign extension.
#include <stdint.h>
/// -Os -mcpu=cortex-m7

struct _st {
    int8_t a;
    volatile int8_t b;
} st;

int32_t test(int32_t c)
{
    int32_t out;
    if (st.a > c)
        out = st.a;
    else
        out = st.b;
    return out;
}
test:
        ldr     r2, .L3
        ldrsb   r3, [r2]
        cmp     r3, r0
        bgt     .L1
        ldrb    r3, [r2, #1]    @ zero_extendqisi2
        sxtb    r3, r3
.L1:
        mov     r0, r3
        bx      lr
.L3:
        .word   .LANCHOR0
st:
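For the signed case, the ldrb + sxtb pair should produce the same register value as a single ldrsb, so the volatile read costs one extra instruction without changing the result. A small sketch in C that models this, with hypothetical helper names (model_ldrsb, model_ldrb_raw, model_sxtb) that are not taken from any toolchain:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of LDRSB: load one byte and sign-extend it to 32 bits. */
static int32_t model_ldrsb(const volatile int8_t *addr)
{
    return (int32_t)*addr;
}

/* Hypothetical model of LDRB on the same byte: zero-extended raw bits. */
static uint32_t model_ldrb_raw(const volatile int8_t *addr)
{
    return (uint32_t)*(const volatile uint8_t *)addr;
}

/* Hypothetical model of SXTB: sign-extend bits 7..0 to 32 bits. */
static int32_t model_sxtb(uint32_t reg)
{
    /* Flip the sign bit, then subtract it back; avoids an out-of-range cast. */
    return (int32_t)((reg & 0xFFu) ^ 0x80u) - 0x80;
}

/* Checks that LDRB followed by SXTB matches LDRSB for every int8_t value. */
static void check_ldrb_sxtb_matches_ldrsb(void)
{
    for (int v = -128; v <= 127; ++v) {
        volatile int8_t byte = (int8_t)v;
        assert(model_sxtb(model_ldrb_raw(&byte)) == model_ldrsb(&byte));
    }
}
```

This only models the values left in the register; it says nothing about why the compiler picks the two-instruction form for the volatile member.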