Weird SPSR behaviour

I was trying to write register context save/restore code when I came across some weird behaviour.

My code (sorry, tried to format tens of times, but the editor WANTS to make asm a table):

asm volatile (
...
"pop {r0 - r3}"
"push {r0 - r3}"
"mov r0, r3"
"bl dbg_out" - outputs 60000013
"pop {r0 - r3}"
"msr cpsr_fsxc, r2"
"@dsb"
"@isb"
"msr spsr_fsxc, r3" - set value
"@dsb"
"@isb"
"mov lr, r1"
"mov sp, r0"
"push {r0 - r4, lr}"
"mov r0, lr"
"bl dbg_out"
"mov r0, sp"
"bl dbg_out"
"mrs r2, cpsr"
"mrs r3, spsr" - read value
"mov r0, r2"
"bl dbg_out"
"mov r0, r3" - outputs 00000002
"bl dbg_out"
...
);

When the exception returns, the calling function continues with:

asm volatile ("svc #0\n\t");

    msg = "returned from SVC\r\n";

    serial_io.put_string(msg, util_str_len(msg)+1);

    asm volatile (

"mrs %[retreg], cpsr\n\t"
:[retreg] "=r" (tmp1) ::

    );

    msg = "cpsr = ";

    serial_io.put_string(msg, util_str_len(msg)+1);

    util_word_to_hex(scratchpad, tmp1);

    serial_io.put_string(scratchpad, 9);

    serial_io.put_string("\r\n", 3);

outputs "returned from SVC" and "cpsr = 60000013".

Why the "00000002"? The barriers don't seem to have any effect.

  • It's hard to believe that LDM saves clock cycles compared to POP - at least on Cortex-A7:

    cccc 1000 10W1 nnnn rrrr rrrr rrrr rrrr    arm_cldstm_ldm    arm_core_ldstm    LDM<c> <Rn>{!}, <registers>    A1    A8.8.58

    cccc 1000 1011 1101 rrrr rrrr rrrr rrrr    arm_cldstm_pop    arm_core_ldstm    POP<c> <registers>

      (The 'W' is writeback, and nnnn is the base register - SP is 1101.)

    The pushing and popping is about debug.

    I have stuff on the stack, and if I suspect the registers may have been corrupted, I pop and push back to "fix" the registers.

    And actually, the 'dbg_out' corrupts r0 and r1.

    Another weirdness in today's debug:

       "@ align SP\n\t"
       "mov r0, sp\n\t"
       "bl dbg_out\n\t"
       "mov r0, sp\n\t"
       "and r1, r0, #7\n\t"
       "push {r0, r1}\n\t"
       "mov r0, r1\n\t"
       "bl dbg_out\n\t"
       "pop {r0, r1}\n\t"
       "sub r0, r1\n\t"
       "mov sp, r0\n\t"
       "push {r0,r1} @ stack correction"
       "mov r0, r1\n\t"
       "bl dbg_out\n\t"

    The first 'dbg_out' prints 1f012658, the second prints 00000000 and the third 1f012658...

    The stack should be 'corrected' because when you call assembly from C, the stack may be 4-byte aligned but not 8-byte aligned, and GCC requires the stack to be 8-byte aligned when you call C code from 'outside' (like assembly).
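
    The arithmetic behind that 'and ... #7' / 'sub' pair can be checked on a host machine. The following is only a sketch in plain C; the helper names sp_correction and sp_align8 are made up for illustration and are not part of the code above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes sp lies above the previous
 * 8-byte boundary (the value the 'and r1, r0, #7' step computes). */
static uint32_t sp_correction(uint32_t sp)
{
    return sp & 7u;
}

/* Hypothetical helper: round sp down to an 8-byte boundary.
 * Rounding down is the safe direction on a descending stack. */
static uint32_t sp_align8(uint32_t sp)
{
    return sp - sp_correction(sp);
}

/* Hypothetical helper: undo the alignment by adding the saved
 * correction back, giving the original sp again. */
static uint32_t sp_restore(uint32_t aligned_sp, uint32_t correction)
{
    return aligned_sp + correction;
}
```

    For the value printed above, 1f012658, the correction is 0 because the address is already 8-byte aligned, which agrees with the second output. An address such as 1f01265c would give a correction of 4.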

  • Uhm, what I mean is that a single LDM is approximately twice as fast as the two instructions POP + PUSH.

    Thus you do not have to use POP+PUSH; you can just read the contents of the stack and avoid writing to it.

    The registers need to be loaded from the stack, that is correct, because they do not contain any defined value on interrupt entry.

    Thus instead of ...

       pop {r0-r3}

       push {r0-r3}

    ... you can write ...

       ldm sp,{r0-r3}

    ... which only reads the registers.

    dbg_out is allowed to corrupt r2 and r3 as well, which I think is why you see the strange value you mentioned earlier.

    I did not know about the 8-byte alignment requirement.

    But remember to restore SP to its original value before you return, either by saving the entire value of SP or by adding the difference back; otherwise you'll get a crash.
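
    The difference between POP+PUSH and an LDM without writeback can be modelled in plain C. This is a toy model only, assuming a full-descending stack of 32-bit words; the type toy_stack and the toy_* functions are hypothetical names, and the model says nothing about cycle counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* Toy full-descending stack (hypothetical, for illustration only). */
typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t   sp;      /* index of the most recently pushed word */
    unsigned writes;  /* number of stores made to mem */
} toy_stack;

/* Models 'pop {r0-r3}': load four words, then move sp up. */
static void toy_pop4(toy_stack *s, uint32_t r[4])
{
    for (size_t i = 0; i < 4; i++)
        r[i] = s->mem[s->sp + i];
    s->sp += 4;
}

/* Models 'push {r0-r3}': move sp down, then store four words. */
static void toy_push4(toy_stack *s, const uint32_t r[4])
{
    s->sp -= 4;
    for (size_t i = 0; i < 4; i++) {
        s->mem[s->sp + i] = r[i];
        s->writes++;
    }
}

/* Models 'ldm sp, {r0-r3}': load four words, sp and memory untouched. */
static void toy_ldm4(const toy_stack *s, uint32_t r[4])
{
    for (size_t i = 0; i < 4; i++)
        r[i] = s->mem[s->sp + i];
}
```

    In this model both variants leave the same values in r[0..3] and the same sp, but only the POP+PUSH pair performs stores.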

  • Ah, stupid me. That's what you meant - no writeback -> no write back...

    "dbg_out is allowed to corrupt r2 and r3 as well"

    Yes, allowed, but it doesn't - checked the disassembly.

    "But remember to restore SP..."

    I wrote them as a pair. That way you don't forget, and it's easier to write one as a "mirror image" of the other.

  • If you wrote the entire dbg_out yourself, it probably won't change r2 and r3.

    What I'm most worried about is that if it's written in C, the C-compiler will optimize it at some later point, so it changes r2 and r3.

    If dbg_out calls another C-routine, which you didn't write, then it's very much in danger of being unpredictable, regarding which registers it uses.

    If you've written dbg_out in assembly language, then begin the routine by saving r0-r3 and r12 (since it's a debugging routine); then you won't have the problem at a later point.

    But it's great to see you found the error; this one is difficult to spot.

  • The dbg_out was written in C, but only used for debugging that situation. It's already removed.