I was trying to write register context save/restore code when I came across some weird behaviour.
My code (sorry, tried to format tens of times, but the editor WANTS to make asm a table):
When the exception returns, the calling function continues with:
msg = "returned from SVC\r\n";
serial_io.put_string(msg, util_str_len(msg)+1);
asm volatile (
);
msg = "cpsr = ";
util_word_to_hex(scratchpad, tmp1);
serial_io.put_string(scratchpad, 9);
serial_io.put_string("\r\n", 3);
outputs "returned from SVC" and "cpsr = 60000013".
Why the "00000002"? The barriers don't seem to have any effect.
I think that 'bl dbg_out' is free to change r0-r3, r12 and lr.
Popping and then pushing r0-r3 looks strange to me, probably because an LDM sp,{r0-r3} would do the job just as well and save a bunch of clock cycles.
I recommend not pushing the values back onto the stack; if you need to change a single register, I think it would be better to just do an indexed store (STR rN,[sp,#offset]).
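To make the indexed-store idea concrete, here is a small C model of it (a sketch, not code from this thread). It assumes the frame was created with push {r0-r3}, so r0 sits at the lowest address and rN lives at byte offset 4*N from SP. The names saved_reg_offset and patch_saved_reg are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of saved register rN (N = 0..3) in a frame created by
 * push {r0-r3}: the lowest-numbered register ends up at the lowest address. */
static uint32_t saved_reg_offset(unsigned n)
{
    assert(n < 4);
    return 4u * n;
}

/* Model of "str rN, [sp, #offset]": overwrite one saved slot in place.
 * 'frame' stands in for the memory SP points at; SP itself is not changed. */
static void patch_saved_reg(uint32_t *frame, unsigned n, uint32_t value)
{
    frame[saved_reg_offset(n) / 4u] = value;
}
```

Only the one slot is written; the other saved registers and SP stay as they were, which is the point of using STR instead of POP + PUSH.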
It's hard to believe that LDM saves clock cycles compared to POP - at least on Cortex-A7:
c c c c 1 0 0 0 1 0 W 1 n n n n r r r r r r r r r r r r r r r r arm_cldstm_ldm arm_core_ldstm LDM<c> <Rn>{!}, <registers> A1 A8.8.58
c c c c 1 0 0 0 1 0 1 1 1 1 0 1 r r r r r r r r r r r r r r r r arm_cldstm_pop arm_core_ldstm POP<c> <registers>
(The 'W' is writeback, and nnnn is the base register - SP is 1101.)
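The two bit patterns above can be checked with a few lines of C. This is only a sketch that builds the A32 instruction word from the fields shown (condition, W, Rn, register list); the function names encode_ldmia and encode_pop are made up here. It shows that POP with several registers is the same encoding as LDM with Rn = SP and W = 1.

```c
#include <assert.h>
#include <stdint.h>

#define REG_SP 13u

/* Build an ARM (A32) LDM (increment-after) instruction word, encoding A1:
 *   cond | 1 0 0 0 1 0 | W | 1 | Rn | register_list
 * Bit N of 'reglist' selects rN. */
static uint32_t encode_ldmia(uint32_t cond, int writeback, uint32_t rn,
                             uint16_t reglist)
{
    assert(cond < 16u && rn < 16u);
    return (cond << 28) | (0x22u << 22) | ((writeback ? 1u : 0u) << 21)
         | (1u << 20) | (rn << 16) | reglist;
}

/* POP with more than one register is LDM with Rn = SP and W = 1. */
static uint32_t encode_pop(uint32_t cond, uint16_t reglist)
{
    return encode_ldmia(cond, 1, REG_SP, reglist);
}
```

With condition AL (0xE) and register list 0x000F, pop {r0-r3} comes out as 0xE8BD000F and ldm sp,{r0-r3} as 0xE89D000F; the only difference is the writeback bit.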
The pushing and popping is there for debugging.
I have stuff on the stack, and if I suspect the registers may have been corrupted, I pop and push back to "fix" the registers.
And actually, the 'dbg_out' corrupts r0 and r1.
Another weirdness in today's debug:
The first 'dbg_out' prints 1f012658, the second prints 00000000 and the third 1f012658...
The stack should be 'corrected', because when you call assembly from C, the stack can be 4-byte but not 8-byte aligned, and GCC requires that when you call C code from 'outside' (like assembly), the stack is 8-byte aligned.
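The arithmetic behind that 'correction' is simple enough to sketch in C. This is an illustration only, assuming a 32-bit SP that is at least 4-byte aligned and a stack that grows downwards; the helper names align_sp_down_8 and align_adjustment are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Round a stack pointer down to an 8-byte boundary (the stack grows down,
 * so rounding down never touches data that is already on the stack). */
static uint32_t align_sp_down_8(uint32_t sp)
{
    return sp & ~7u;
}

/* Number of bytes skipped by the alignment: 0 or 4 for a 4-byte aligned SP.
 * The same amount has to be added back to SP before returning. */
static uint32_t align_adjustment(uint32_t sp)
{
    assert((sp & 3u) == 0u);   /* assumption: SP is at least word aligned */
    return sp - align_sp_down_8(sp);
}
```

For example, an SP of 0x1f012654 would be rounded down to 0x1f012650 with an adjustment of 4, while 0x1f012658 is already aligned and needs no adjustment.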
Uhm, what I mean is that a single LDM is approximately twice as fast as the two instructions POP + PUSH.
-Thus you will not have to use POP+PUSH, but can just read the contents of the stack and avoid writing to it.
The registers need to be loaded from the stack, that is correct, because they do not contain any defined value on interrupt entry.
Thus instead of ...
pop {r0-r3}
push {r0-r3}
... you can write ...
ldm sp,{r0-r3}
... which only reads the registers.
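The difference between the two variants can be shown with a tiny C model of the stack. It is a sketch under simplifying assumptions: it only counts memory reads and writes and says nothing about real cycle timings on any particular core, and the struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u

/* Tiny model of a full-descending stack: 'sp' is a word index into 'mem'. */
struct cpu {
    uint32_t r[4];            /* r0-r3 */
    uint32_t sp;              /* word index of the current top of stack */
    uint32_t mem[STACK_WORDS];
    unsigned reads, writes;   /* memory accesses performed so far */
};

/* ldm sp, {r0-r3}: read four words, leave SP and memory untouched. */
static void ldm_sp_r0_r3(struct cpu *c)
{
    assert(c->sp + 4u <= STACK_WORDS);
    for (unsigned i = 0; i < 4; i++) {
        c->r[i] = c->mem[c->sp + i];
        c->reads++;
    }
}

/* pop {r0-r3}: the same four reads, then SP moves up by four words. */
static void pop_r0_r3(struct cpu *c)
{
    ldm_sp_r0_r3(c);
    c->sp += 4u;
}

/* push {r0-r3}: SP moves down by four words, then four words are written. */
static void push_r0_r3(struct cpu *c)
{
    assert(c->sp >= 4u);
    c->sp -= 4u;
    for (unsigned i = 0; i < 4; i++) {
        c->mem[c->sp + i] = c->r[i];
        c->writes++;
    }
}
```

Running pop_r0_r3 followed by push_r0_r3 on one copy of the model and ldm_sp_r0_r3 on another leaves both with the same register values and the same SP, but the first copy has done four extra memory writes.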
dbg_out is allowed to corrupt r2 and r3 as well, which I think is why you see the strange value you mentioned earlier.
I did not know about the 8-byte alignment requirement.
-But remember to restore SP to its original value before you return; either by saving the entire value of SP or by adding the difference back, otherwise you'll get a crash.
Ah, stupid me. That's what you meant - no writeback -> no write back...
"dbg_out is allowed to corrupt r2 and r3 as well"
Yes, allowed, but it doesn't - checked the disassembly.
"But remember to restore SP..."
I wrote them as a pair. That way you don't forget, and it's easier to write one as a "mirror image" of the other.
If you wrote the entire dbg_out yourself, it probably won't change r2 and r3.
What I'm most worried about is that if it's written in C, the C compiler may optimize it differently at some later point, so that it changes r2 and r3.
If dbg_out calls another C-routine, which you didn't write, then it's very much in danger of being unpredictable, regarding which registers it uses.
If you've written dbg_out in assembly language, then begin the routine by saving r0-r3 and r12 (since it's a debugging routine); then you won't have the problem at a later point.
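For reference, the set of registers a called routine is allowed to overwrite under the ARM procedure call standard (r0-r3, r12 and lr) can be written down as a bit mask. This is only a sketch for checking which registers need saving around a call like 'bl dbg_out'; the macro and function names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Registers a called function may overwrite under the ARM procedure call
 * standard (AAPCS): r0-r3, r12 (ip) and r14 (lr). Bit N stands for rN. */
#define CALL_CLOBBERED_MASK ((uint16_t)(0x000Fu | (1u << 12) | (1u << 14)))

/* True if register rN may hold a different value after a call returns,
 * regardless of what the current build of the callee happens to do. */
static bool may_be_clobbered_by_call(unsigned reg)
{
    assert(reg < 16u);
    return ((CALL_CLOBBERED_MASK >> reg) & 1u) != 0u;
}
```

Anything for which may_be_clobbered_by_call returns true should be treated as lost across the call, even if today's disassembly shows the callee leaving it alone.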
-But it's great to see you found the error; this one is difficult to spot.
The dbg_out was written in C, but only used for debugging that situation. It's already removed.