The volatile modifier adds instructions

The volatile modifier adds instructions that extend the variable to a 32-bit value (UXTB, UXTH, SXTB, SXTH).
This doesn't make sense. I checked the code in an online compiler:

#include <stdint.h>

struct _st
{
    uint8_t          a;
    volatile uint8_t b;
} st;

uint32_t test(uint8_t c)
{
    uint8_t out;
    if (st.a > c) out = st.a;
    else out = st.b;
    return out;
}
Generated code:
test:
  ldr r2, .L3
  ldrb r3, [r2] //reading st.a
  cmp r3, r0
  bhi .L2
  ldrb r3, [r2, #1] //reading st.b
  uxtb r3, r3    // <<<<< 
.L2:
  mov r0, r3
  bx lr
.L3:
  .word .LANCHOR0
st:
  • Dear AVI_crak,

    On 32-bit Arm we work with 32-bit registers, so the compiler must do something with the "rest" of the register when you use a uint8_t (an 8-bit unsigned integer).

    Technically, if the compiler sees that you always use the variable as an 8-bit value, it can drop the forced extension to a 32-bit value, which is what happens when you return st.a. However, by using the volatile qualifier, you tell the compiler not to optimize accesses to this variable. Thus, the extension of the 8-bit variable to fill a 32-bit register remains.

    Best Regards,
    Willy

  • Hello, thanks for your attention to the problem.
    I just want to get this matter moving. You are saying that the old "volatile" semantics extend to the register holding the value, so that the code after the load operates safely.
    Okay, so ARM presumably has no load instructions that extend automatically? Oops, there is "ldrsb".
    New piece of code, old problems: godbolt.org/.../cWza83Wrz

    I am still confident that "volatile" adds extra operations that can be omitted. The Clang compiler successfully uses the correct load instructions, so GCC should be able to do the same.
    In the online compiler, you can play around with the compiler type and version, the target processor, and the compilation options. Everything is in one place, which is more convenient than copying to the forum.
    It would be great if this problem were resolved.

  • Hello, can you be more specific regarding the processor being used, and the optimization level?

    The default optimization level (-O0) for armclang produces highly unoptimized code. It is generally recommended to use at least -O2.

  • > it must do something with the "rest" of an uint8_t

    ldrb already sets the high 24 bits of the register to zero; there is nothing left to be done as long as the result is unsigned.

    The godbolt link shows the uxtb instruction being used regardless of optimization level (well, -O0 does something different but awful, as expected).

    There's an interesting comment that shows up in the disassembly:


      ldrb r3, [r2, #1] @ zero_extendqisi2
      uxtb r3, r3

    And there's a clue here: stackoverflow.com/.../meaning-of-zero-extendqisi2

    I guess the intermediate language has a generic "fetch and zero-extend" internal instruction, and the compiler isn't smart enough to realize that when the fetch is from memory, the extension has effectively already happened. (If the source of the fetch were a register, it would have been a mov instruction, and the uxtb would be necessary because there is no mov variant that extends a byte into a 32-bit register.)
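To make the redundancy argument above concrete, here is a minimal sketch in plain C that models the two instructions as functions on a 32-bit register value. The names `model_ldrb`, `model_uxtb`, `count_uxtb_changes` and `check_uxtb_is_redundant` are hypothetical helpers introduced only for this illustration; they are a simplified model of the instruction semantics, not anything taken from GCC or from the original posts.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of LDRB: load one byte from memory and zero-extend it
   into a 32-bit register (bits [31:8] become zero). */
static uint32_t model_ldrb(const uint8_t *addr)
{
    return (uint32_t)*addr;
}

/* Simplified model of UXTB: keep bits [7:0] of a register, clear bits [31:8]. */
static uint32_t model_uxtb(uint32_t reg)
{
    return reg & 0xFFu;
}

/* Count the byte values for which UXTB would change the result of LDRB. */
static unsigned count_uxtb_changes(void)
{
    unsigned changes = 0;
    for (unsigned v = 0; v <= 0xFFu; ++v) {
        uint8_t mem = (uint8_t)v;
        uint32_t loaded = model_ldrb(&mem);
        if (model_uxtb(loaded) != loaded)
            ++changes;
    }
    return changes;
}

/* In this model, UXTB after LDRB never changes the register. */
static void check_uxtb_is_redundant(void)
{
    assert(count_uxtb_changes() == 0);
}
```

Under this model the extension only matters when the source is an arbitrary register value, for example `model_uxtb(0x1234ABCDu)`, which is the register-to-register case mentioned in the parenthesis above.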

  • I changed the example a little, and it turned out to be even more interesting.
    The structure now contains int8_t members, which are read in two different ways.
    The simple variable "a" is read with the ldrsb instruction, which automatically sign-extends it to int32_t. This is because it is followed by the "cmp" comparison instruction, which cannot work any other way and acts as a natural barrier.
    Then "b", which has the "volatile" modifier, is read. An instruction that reads an unsigned byte is used, followed by a separate sign extension.

    #include <stdint.h>
    /// -Os -mcpu=cortex-m7
    struct _st
    {
        int8_t          a;
        volatile int8_t b;
    } st;

    int32_t test(int32_t c)
    {
        int32_t out;
        if (st.a > c) out = st.a;
        else out = st.b;
        return out;
    }
    Generated code:
    test:
      ldr r2, .L3
      ldrsb r3, [r2]
      cmp r3, r0
      bgt .L1
      ldrb r3, [r2, #1] @ zero_extendqisi2
      sxtb r3, r3
    .L1:
      mov r0, r3
      bx lr
    .L3:
      .word .LANCHOR0
    st:
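
For the signed example, a small C model can show why the `ldrb` + `sxtb` pair could in principle be replaced by a single `ldrsb`. This is only a sketch under the assumption that the instructions behave as simple zero-extending and sign-extending byte operations; the names `ldrb_model`, `sxtb_model`, `ldrsb_model` and `count_signed_mismatches` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of LDRB: load a byte and zero-extend it to 32 bits. */
static uint32_t ldrb_model(const uint8_t *addr)
{
    return (uint32_t)*addr;
}

/* Simplified model of SXTB: sign-extend bits [7:0] of a register to 32 bits. */
static uint32_t sxtb_model(uint32_t reg)
{
    uint32_t low = reg & 0xFFu;
    return (low & 0x80u) ? (low | 0xFFFFFF00u) : low;
}

/* Simplified model of LDRSB: load a byte and sign-extend it to 32 bits. */
static uint32_t ldrsb_model(const uint8_t *addr)
{
    uint32_t value = *addr;
    return (value & 0x80u) ? (value | 0xFFFFFF00u) : value;
}

/* Count the byte values for which LDRB followed by SXTB differs from LDRSB. */
static unsigned count_signed_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned v = 0; v <= 0xFFu; ++v) {
        uint8_t mem = (uint8_t)v;
        if (sxtb_model(ldrb_model(&mem)) != ldrsb_model(&mem))
            ++mismatches;
    }
    /* In this model the two sequences agree for every byte value. */
    assert(mismatches == 0);
    return mismatches;
}
```

In this model both sequences perform exactly one memory read, so the number of accesses to the volatile object is the same either way; whether a compiler is willing to merge them is a separate question that the model does not answer.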