Does the "LDRD" instruction cause an "UNDEFINSTR" error on Cortex-M4?

Dear Experts,

    I'm working on a FreeRTOS project running on a Cortex-M4, and I'm troubled by a problem: a hard fault.

The following is my debugging process:

    I dumped the stacked registers when the hard fault happened.

 [Hard fault handler]
R0 = 0x1d8
R1 = 0x2001091f
R2 = 0x398dcd
R3 = 0x0
R4:0x0
R5:0xe4a0
R6:0x0
R7:0x20010928
R8:0xa5a5a5a5
R9:0xa5a5a5a5
R10:0xa5a5a5a5
R11:0xa5a5a5a5
R12 = 0x0
LR = 0x1ffe1987
PC = 0x1ffe199a
PSR = 0x61000000
- FSR/FAR
BFAR = 0xe000ed38
MMFAR = 0xe000ed34
CFSR = 0x10000
HFSR = 0x40000000
DFSR = 0x0
AFSR = 0x0
- Misc
LR/EXC_RETURN = 0xfffffffd
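
For readers following along, here is a minimal sketch of how two of the dumped values can be interpreted. The names `stacked_frame_t`, `frame_is_on_psp` and `thumb_bit_set` are hypothetical helpers made up for illustration and are not taken from the poster's handler. The sketch assumes the basic eight-word frame without floating-point state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Basic exception frame pushed by the core on exception entry,
 * lowest address first (assumes no floating-point state was stacked). */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} stacked_frame_t;

static_assert(sizeof(stacked_frame_t) == 32, "basic frame is eight words");

/* EXC_RETURN bit 2 tells which stack holds the frame: 1 = PSP, 0 = MSP. */
static bool frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0;
}

/* xPSR bit 24 is the Thumb (T) bit; it must be 1 on Cortex-M. */
static bool thumb_bit_set(uint32_t xpsr)
{
    return (xpsr & (1u << 24)) != 0;
}
```

With the values above, `frame_is_on_psp(0xfffffffd)` is true, so the frame was pushed on the process stack of the running task, and `thumb_bit_set(0x61000000)` is true, so the stacked state itself looks valid.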

     From the HFSR (FORCED bit set) and CFSR (bit 16 set), I can tell that the hard fault is caused by an "UNDEFINSTR" error.
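
The decoding can be sketched as below. The bit positions are the ARMv7-M ones (UsageFault status in CFSR[31:16], FORCED in HFSR bit 30); the macro and function names are hypothetical, chosen for this example rather than copied from CMSIS or from the poster's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* UsageFault status flags, located in CFSR[31:16]. */
#define CFSR_UNDEFINSTR (1u << 16) /* undefined instruction executed */
#define CFSR_INVSTATE   (1u << 17) /* invalid EPSR state (T or IT bits) */
#define CFSR_INVPC      (1u << 18) /* invalid PC load on exception return */
#define CFSR_NOCP       (1u << 19) /* coprocessor access not available */
#define CFSR_UNALIGNED  (1u << 24) /* unaligned access trap */
#define CFSR_DIVBYZERO  (1u << 25) /* divide-by-zero trap */
#define HFSR_FORCED     (1u << 30) /* configurable fault escalated to HardFault */

static_assert(CFSR_UNDEFINSTR == 0x00010000u, "same value as the dumped CFSR");

/* Return the name of the lowest UsageFault flag set in cfsr, or "none". */
static const char *usage_fault_name(uint32_t cfsr)
{
    static const struct { uint32_t mask; const char *name; } flags[] = {
        { CFSR_UNDEFINSTR, "UNDEFINSTR" }, { CFSR_INVSTATE,  "INVSTATE"  },
        { CFSR_INVPC,      "INVPC"      }, { CFSR_NOCP,      "NOCP"      },
        { CFSR_UNALIGNED,  "UNALIGNED"  }, { CFSR_DIVBYZERO, "DIVBYZERO" },
    };
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if (cfsr & flags[i].mask) {
            return flags[i].name;
        }
    }
    return "none";
}

/* True when the HardFault is an escalation of another fault. */
static bool hard_fault_was_escalated(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0;
}
```

For the dump above, `usage_fault_name(0x10000)` gives "UNDEFINSTR" and `hard_fault_was_escalated(0x40000000)` is true, which matches the conclusion drawn from the registers.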

And the PC is 0x1FFE199A.

Just before 0x1FFE199A, an LDRD instruction is executed, because a uint64_t variable is used. If I change the variable to uint32_t, no "LDRD" instruction is generated and the program runs well.

So I suspect the issue is related to the "LDRD", but I don't know the root cause.

One more thing: the hard fault usually happens only after the program has been running for several hours, sometimes 1 hour, sometimes 5 hours.

I'm sure the stack is not full, and the project is compiled with arm-none-eabi-gcc (2018-q4-major).

Can anyone give some suggestions for the next debugging step?

Thanks a lot!

  • Hi Expert,

    Thanks for your reply, and sorry for my late response!

    Following your suggestion, I added __asm("nop") before the LDRD. The program has now been running for over 20 hours and is in good status.

    But I still have the following questions about the points you raised:

    1. Are you sure the instruction is called in the normal flow?
    [Gavin]: This is a good question, but how do I know or check whether the instruction is called in the normal flow? Any suggestions?

    2. Could it be that there was a jump into the middle of the LDRD (which is 4 bytes), whereas the normal LDR is only 2 bytes?
    [Gavin]: If that is the case, have you met a similar issue before? Is it a potential problem that we need to pay attention to
    during our programming? How do we avoid this issue in our development?


    3. Why does the issue (maybe) not happen after adding the NOP before the LDRD?

  • >how do I know or check if the instruction is called in the normal flow?

    If the device you are using has ETM instruction trace, the board has a trace port connection, and you have a debug probe that supports trace, then you can use instruction trace to see what happened to the instruction flow before the crash.

    >3. Why the issue doesn't happen (maybe) after adding the NOP before LDRD?

    The NOP instruction produces an address offset of two bytes. Assume there is stack corruption somewhere (not necessarily a stack overflow; it could be an array with an unbounded index or something else) that causes an incorrect jump into the middle of the LDRD. After the change, the jump no longer lands in the middle of an instruction. The program execution is still wrong, but it doesn't necessarily cause a crash.

    regards,

    Joseph
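
    The effect of the two-byte shift can be illustrated with a small host-side sketch. It assumes a made-up code sequence (the encoding of `ldrd r2, r3, [r7, #8]` followed by `bx lr`), not the poster's actual binary, and the function and array names are hypothetical. It relies only on the Thumb rule that a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if this halfword is the first half of a 32-bit Thumb instruction. */
static bool thumb_starts_32bit(uint16_t hw)
{
    uint16_t top5 = (uint16_t)(hw >> 11);
    return top5 == 0x1D || top5 == 0x1E || top5 == 0x1F;
}

/* Walk n halfwords of code (assumed to begin on an instruction boundary)
 * and report whether byte offset target is the start of an instruction. */
static bool is_instruction_start(const uint16_t *code, size_t n, size_t target)
{
    assert(code != NULL);
    size_t i = 0;
    while (i < n) {
        if (i * 2 == target) {
            return true;
        }
        i += thumb_starts_32bit(code[i]) ? 2 : 1;
    }
    return false;
}

/* Illustrative sequences only: ldrd r2, r3, [r7, #8] ; bx lr */
static const uint16_t without_nop[] = { 0xE9D7, 0x2302, 0x4770 };
/* Same code with a 16-bit nop (0xBF00) placed in front of the LDRD. */
static const uint16_t with_nop[]    = { 0xBF00, 0xE9D7, 0x2302, 0x4770 };
```

    In this sketch, byte offset 2 is the second half of the LDRD in `without_nop`, so `is_instruction_start(without_nop, 3, 2)` is false. In `with_nop` the same offset is the start of the LDRD, so `is_instruction_start(with_nop, 4, 2)` is true. A stray jump to that fixed address would therefore decode a different halfword after the change, which is consistent with the fault disappearing while the underlying corruption remains.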

  • Hi Joseph,

        Thanks for your quick reply. I'll check the instruction flow first.