I came across a weird behaviour when trying out my program on Raspberry Pi 2b (Cortex-A7):
When I try my PABT-handler using BKPT, the handler is entered fine, but on return the program restarts.
The restarted program returns fine from the BKPT and continues as expected.
Any idea what could cause the restart once?
I have MMUs, caches and predictors off, so the memory should be strongly ordered?
And anyway, it's the same code.
void rpi2_pabt_handler() __attribute__ ((naked));
...
void rpi2_pabt_handler()
{
    asm volatile (
        "push {r0 - r12, lr}\n\t"
        "mov r5, lr\n\t"
        "bl rpi2_pabt_handler2\n\t"
        "pop {r0 - r12, lr}\n\t"
        "subs pc, lr, #0 @ to skip bkpt\n\t"
    );
}

#define DEBUG_PABT

void rpi2_pabt_handler2()
{
    uint32_t exc_addr;
    uint32_t exc_cpsr;
#ifdef DEBUG_PABT
    int i;
    char *pp;
    static char scratchpad[16];
#endif
    asm volatile (
        "mov %[var_reg], r5\n\t"
        :[var_reg] "=r" (exc_addr) ::
    );
    asm volatile (
        "mrs %[var_reg], spsr\n\t"
        :[var_reg] "=r" (exc_cpsr) ::
    );
#ifdef DEBUG_PABT
    pp = "\r\nPABT EXCEPTION\r\n";
    do {i = serial_raw_puts(pp); pp += i;} while (i);
    serial_raw_puts("exc_addr: ");
    util_word_to_hex(scratchpad, exc_addr);
    serial_raw_puts(scratchpad);
    serial_raw_puts("\r\nSPSR: ");
    util_word_to_hex(scratchpad, exc_cpsr);
    serial_raw_puts(scratchpad);
    serial_raw_puts("\r\n");
#endif
    // rpi2_trap_handler();
}
Just a quick wild guess without thinking too much about it.
I can see that the function prototype already has __attribute__((naked)), have you checked the disassembly ?
Maybe you need __attribute__((naked)) ... like this:
__attribute__((naked)) void rpi2_pabt_handler()
{
    asm volatile (
        "push {r0 - r12, lr}\n\t"
        "mov r5, lr\n\t"
        "bl rpi2_pabt_handler2\n\t"
        "pop {r0 - r12, lr}\n\t"
        "subs pc, lr, #0 @ to skip bkpt\n\t"
    );
}
-That should remove any prologue/epilogue code, which is normally generated by the compiler.
Thinking a bit more about it: how does the Cortex-A7 return from an exception? E.g. the processor's state needs to be restored, right?
-If the mechanism is similar to Cortex-M, then you can't subtract anything from LR. Something's a bit strange about that subs anyway; wouldn't you get the same effect by changing to pop {r0-r12,pc}?
Yes, I checked, and IRQ and SVC work fine - and so does PABT (BKPT) on the "second try".
(snippets from disassembly of the ELF using objdump)
00009850 <rpi2_pabt_handler>:
9850: e92d5fff push {r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
9854: e1a0500e mov r5, lr
9858: ebffffa9 bl 9704 <rpi2_pabt_handler2>
985c: e8bd5fff pop {r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
9860: e25ef000 subs pc, lr, #0
I wonder if I need some barrier, but then, the memory should be strongly ordered with MMUs, caches and branch prediction off.
For comparison: the IRQ handler that works fine (many times during one run):
00009654 <rpi2_irq_handler>:
9654: e92d5fff push {r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
9658: e1a0400d mov r4, sp
965c: e1a0500e mov r5, lr
9660: ebffffea bl 9610 <rpi2_irq_handler2>
9664: e8bd5fff pop {r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, sl, fp, ip, lr}
9668: e25ef004 subs pc, lr, #4
The sp and lr are printed if the IRQ is not a serial IRQ - that hasn't happened...
The SVC handler is already modified for the next step (processor context saving and manipulation in the handler before return). It's not quite ready yet, so it's not comparable.
And yes, the offsets are supposed to be subtracted from lr. For PABT the offset given is 4, but then PABT is meant for prefetch aborts, where the idea is to re-execute the aborted instruction. I use BKPT, so I don't want to re-execute it. The return instruction is the same as in the other exception handlers (just the offsets differ) for symmetry. And when the S suffix is used on an arithmetic instruction whose destination is PC, it means CPSR is restored from SPSR at the same time.
Comment in the code (where I kind of wrote a reminder):
// for fix for the exception return address
// see ARM DDI 0406C.c, ARMv7-A/R ARM issue C p. B1-1173
// Table B1-7 Offsets applied to Link value for exceptions taken to PL1 modes
// UNDEF: 4, SVC: 0, PABT: 4, DABT: 8, IRQ: 4, FIQ: 4
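To make the arithmetic behind that comment concrete, here is a minimal sketch in plain C. The offsets are just the ones quoted in the comment above (ARM state); the names exc_type, link_offset and preferred_return are made up for this illustration and are not from the original program.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum exc_type { EXC_UNDEF, EXC_SVC, EXC_PABT, EXC_DABT, EXC_IRQ, EXC_FIQ };

/* Offsets as quoted in the code comment (Table B1-7, ARM state). */
static const uint32_t link_offset[] = {
    [EXC_UNDEF] = 4, [EXC_SVC] = 0, [EXC_PABT] = 4,
    [EXC_DABT]  = 8, [EXC_IRQ] = 4, [EXC_FIQ]  = 4,
};

/* Address that "subs pc, lr, #offset" would return to. */
static uint32_t preferred_return(enum exc_type type, uint32_t lr)
{
    return lr - link_offset[type];
}
```

With the lr value printed for the PABT in this thread (0x000090d0), preferred_return(EXC_PABT, 0x000090d0) gives 0x000090cc, which would be the BKPT instruction itself. That is why the handler uses "subs pc, lr, #0" instead: it continues at 0x000090d0, the instruction after the BKPT.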
Cortex A is somewhat more weird than Cortex M.
I think I could do the 'pop', but a caret is needed too - not sure about the syntax of pop, but I guess "pop {r0 - r12, pc}^", the caret indicating restoring of CPSR from SPSR as well. At least the caret is used with ldm.
I'm not sure, but the place of '__attribute__((naked))' probably depends on whether it's in the function definition or declaration. Should check that...
turboscrew wrote: I think I could do the 'pop', but a caret is needed too - not sure about the syntax of pop, but I guess "pop {r0 - r12, pc}^", the caret indicating restoring of CPSR from SPSR as well. At least the caret is used with ldm.
OK, then Cortex-A acts much like the previous ARM architectures (eg. ARM7TDMI and ARM9)
The previous architectures also used the S postfix on SUBS with PC as destination for exception return, plus the ^ at the end of the POP instruction.
-Sometimes it's good to do a 'make clean' before you flash your program.
Note on the community editor(s): You can click on "Use Advanced Editor" in the upper right-hand corner to change to a different editor; this editor allows you to format blocks of text as code, for instance. But you'll have to click on the topic (in this case, "Funny PABT behaviour - why?") in order to be able to switch editor, before you actually click "Reply".
I can't find 'code'-selection. Did you mean the syntax highlighting? I guess C++ is good for C?
And was there a way to set it as default?
This is the "main function":
void test_main()
{
    int i;
    uint32_t tmp1, tmp2; // for debugging
    char *msg; // for debugging
    io_device serial_io; // for debugging
    int len;
    const int dbg_buff_len = 512;
    static char scratchpad[16]; // scratchpad
    static char dbg_buff[512]; // message buffer

    /* initialize rpi2 */
    rpi2_init();
    /* initialize serial for debugger */
    serial_init(&serial_io);

    // debug-line
    msg = "Finally! Got into main()\r\n";
    i=0;
    do {i = serial_raw_puts(msg); msg += i;} while (i);

    rpi2_led_blink(1000, 1000, 3);
    rpi2_delay_loop(3000);

    msg = "trying SVC\r\n";
    serial_io.put_string(msg, util_str_len(msg)+1);
    // a little delay for serial output
    rpi2_led_blink(1000, 1000, 3);
    asm volatile ("svc #0\n\t");
    msg = "returned from SVC\r\n";
    serial_io.put_string(msg, util_str_len(msg)+1);

    msg = "trying BKPT\r\n";
    serial_io.put_string(msg, util_str_len(msg)+1);
    // a little delay for serial output
    rpi2_led_blink(1000, 1000, 3);
    asm volatile ("bkpt #0\n\t");
    msg = "returned from BKPT\r\n";
    serial_io.put_string(msg, util_str_len(msg)+1);

    msg = "\r\nentering main loop\r\n";
    serial_io.put_string(msg, util_str_len(msg)+1);

    while (1)
    {
        // echo
        len = serial_io.read(dbg_buff, 512);
        if (len > 0)
        {
            i = 0;
            while (i < len)
            {
                tmp1 = serial_io.write(dbg_buff+i, len-i);
                i += tmp1;
            }
        }
        i = (int) serial_get_rx_dropped();
        if (i > 0)
        {
            util_word_to_hex(scratchpad, i);
            scratchpad[8]='\0'; // end-nul
            serial_raw_puts("\r\ndropped: ");
            serial_raw_puts(scratchpad);
            serial_raw_puts("\r\n");
        }
    }
}
and the output was
Finally! Got into main()
trying SVC

SVC EXCEPTION
exc_addr: 0000907c
SPSR: 68000013
returned from SVC
trying BKPT

PABT EXCEPTION
exc_addr: 000090d0
SPSR: 60000013
Finally! Got into main()
trying SVC

SVC EXCEPTION
exc_addr: 0000907c
SPSR: 6800001b
returned from SVC
trying BKPT

PABT EXCEPTION
exc_addr: 000090d0
SPSR: 6000001b
returned from BKPT

entering main loop
kögvnzkljdb lärtsnb ltjn b ljbv,mdfbn,mxnblmdfz b,mdzf b
The last line is just arbitrary key presses to see that the echoing still works.
Darn, it's past midnight, and I have to wake up early tomorrow.
I'll go to my parent's cottage to do help in some fixing. I'll be there for a couple of days, but I'll be back, like the big Arnold used to say...
turboscrew wrote: I can't find 'code'-selection. Did you mean the syntax highlighting? I guess C++ is good for C? And was there a way to set it as default?
Uhm, that was actually what I meant. Not being an administrator, I have the privilege of being sloppy when I explain things.
Your code appears to work now; perhaps it was an old object-file not being rebuilt ?
turboscrew wrote: I'll go to my parent's cottage to do help in some fixing. I'll be there for a couple of days, but I'll be back, like the big Arnold used to say...
He still says so; and also: "Old, but not obsolete."
No, it doesn't.
See the output from line 13 on, where "Finally! Got into main()" appears a second time. Déjà vu?
The SVC and PABT should happen only once.
(Uh, should finish packing and go...)
Last time I was answering while being very sleepy, so no ideas really came to mind.
Uhm, perhaps a double-fault happens, which causes the CPU to reset?
-A double-fault could happen if the stack pointer is pointing somewhere strange when an exception occurs.
Since there are no problems when trying SVC, the stack might be fine at that point; but what if a different stack is used after SVC returns?
My suggestion:
Try
SVC
run it and see if you get any funny behaviour
then replace by ...
BKPT
.. rebuild and run; see if it still behaves like earlier.
Hello,
according to the second SPSR value (mode bits 0x1b), the second SVC and PABT were taken from Undefined mode.
This means that the first return from PABT caused an Undefined Instruction exception.
I guess that BKPT puts the CPU into debug state, and that affected the execution.
Best regards,
Yasuhiko Koumoto.
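The mode reading above can be checked by masking the low five bits of the printed SPSR values. A minimal sketch in plain C follows; the function name arm_mode_name is made up for this illustration, and the encodings are the standard ARMv7-A mode field values.

```c
#include <stdint.h>

/* Return the name of the processor mode encoded in CPSR/SPSR bits [4:0].
   arm_mode_name is a hypothetical helper, not part of the original program. */
static const char *arm_mode_name(uint32_t psr)
{
    switch (psr & 0x1fu) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x16: return "Monitor";
    case 0x17: return "Abort";
    case 0x1a: return "Hyp";
    case 0x1b: return "Undefined";
    case 0x1f: return "System";
    default:   return "reserved";
    }
}
```

For the values in the output, 0x68000013 and 0x60000013 decode to Supervisor, while 0x6800001b and 0x6000001b decode to Undefined, so the second pass through main ran in Undefined mode.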
I totally missed the modes in the last round! You could call it a double fault (in a sense).
Funny - I also had UNDEFINED exception "mined", but no notifications from there.
Anyway, that's back to the manuals about BKPT and debug state (or is it mode...).
#define DEBUG_UNDEF

void rpi2_undef_handler2()
{
    uint32_t stack_frame_addr;
    uint32_t exc_addr;
    uint32_t exc_cpsr;
#ifdef DEBUG_UNDEF
    static char scratchpad[16]; // scratchpad
    char *p;
    int i;
#endif
    // fetch the parameter
    asm volatile (
        "mov %[var_reg], r4\n\t"
        :[var_reg] "=r" (stack_frame_addr) ::
    );
    asm volatile (
        "mov %[var_reg], r5\n\t"
        :[var_reg] "=r" (exc_addr) ::
    );
    asm volatile (
        "mrs %[var_reg], spsr\n\t"
        :[var_reg] "=r" (exc_cpsr) ::
    );
#ifdef DEBUG_UNDEF
    p = "\r\nUNDEFINED EXCEPTION\r\n";
    do {i = serial_raw_puts(p); p += i;} while (i);
    serial_raw_puts("exc_addr: ");
    util_word_to_hex(scratchpad, exc_addr);
    serial_raw_puts(scratchpad);
    serial_raw_puts("\r\nSPSR: ");
    util_word_to_hex(scratchpad, exc_cpsr);
    serial_raw_puts(scratchpad);
    serial_raw_puts("\r\n");
#endif
    // exception_info = RPI2_EXC_UNDEF;
    // rpi2_trap_handler();
    asm volatile (
        "push {r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10,lr}\n\t"
        "1:\n\t"
        "mov r0, #0x100\n\t"
        "mov r1, #0x100\n\t"
        "mov r2, #5\n\t"
        "bl debug_blink\n\t"
        "mov r3, #0x2000 @ 2 s pause\n\t"
        "bl debug_wait\n\t"
        "mov r0, #0x1000\n\t"
        "mov r1, #0x1000\n\t"
        "mov r2, #2\n\t"
        "bl debug_blink\n\t"
        "mov r3, #0x5000 @ 5 s pause\n\t"
        "bl debug_wait\n\t"
        "b 1b\n\t"
        "pop {r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, lr}\n\t"
        "@ for now we can't return\n\t"
        "push {r0, r1}\n\t"
        "ldr r0, =exception_info\n\t"
        "mov r1, #1 @ RPI2_EXC_UNDEF\n\t"
        "str r1, [r0]\n\t"
        "pop {r0, r1}\n\t"
        "bl rpi2_trap_handler\n\t"
    );
}

void rpi2_undef_handler()
{
    // rpi2_undef_handler2()
    // - No C in naked function
    asm volatile (
        "push {r0 - r12, lr}\n\t"
        "mov r5, lr\n\t"
        "mov r4, sp\n\t"
        "bl rpi2_undef_handler2\n\t"
        "pop {r0 - r12, lr}\n\t"
        "subs pc, lr, #4\n\t"
    );
}
And thanks, guys, I was running in circles...
Looks complicated - this BKPT/debug-state/UNDEF-thing...
Just for the record (readers), the debug state was probably not entered at all, and the problem was in serial I/O.
When the transmitter is idle, the UART interrupts need to be disabled. They are turned back on when data is written to the transmit buffer.
At the end of a write, the transmitter FIFO is filled from the transmit buffer, and the UART interrupts are turned on. Interrupts need to be disabled before that and re-enabled afterwards, so that the fill doesn't mess with the transmit interrupt.
The problem was restoring the interrupts:
static inline void restore_ints(uint32_t status)
{
    asm volatile (
        "msr cpsr_fsxc, %[var_reg]\n\t"
        :[var_reg] "r" (status)::
    );
}
The inline assembly lists status as an output operand, so it tries to return a value into the parameter, although the function only needs to read it and is supposed to be void.
That seems to have messed up the stack.
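For readers who hit the same thing, here is a sketch of what a corrected version might look like, assuming the fix is simply to move the operand from the output list to the input list. The "memory" and "cc" clobbers are my addition, not from the original code, and the non-ARM branch with fake_cpsr is a hypothetical stand-in so the snippet also compiles on a PC.

```c
#include <stdint.h>

#if defined(__arm__)
/* status is only read, so it belongs in the input operand list
   (after the second colon), leaving the output list empty. */
static inline void restore_ints(uint32_t status)
{
    asm volatile (
        "msr cpsr_fsxc, %[var_reg]\n\t"
        : /* no outputs */
        : [var_reg] "r" (status)
        : "memory", "cc" /* assumption: added as a precaution */
    );
}
#else
/* Hypothetical host stand-in: records the value instead of writing CPSR. */
static uint32_t fake_cpsr;

static inline void restore_ints(uint32_t status)
{
    fake_cpsr = status;
}
#endif
```

The only functional difference from the version above is the extra colon in front of the operand: GCC extended asm takes outputs after the first colon, inputs after the second and clobbers after the third.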