ARM LPC2378: data/prefetch abort after interrupt

I have a problem with an ARM LPC2378 revision B, configured with a CPU clock frequency of 72 MHz.
I have the following configuration:
- an IRQ interrupt function linked to CPU timer 0 (vectored interrupt). It is invoked with a period of 1 ms
- an FIQ interrupt function linked to the half-empty flag of the RX queue of the MCI bus controller (the bus used to communicate with SD/MMC cards).

While reading a file from an SD card, the CPU switches to "Data Abort" or "Prefetch Abort". While the CPU is in this error condition I very often see that the IRQ stack pointer is corrupted, i.e. it points to an invalid address. This doesn't happen every time, but it does happen very often.

I've checked that the stack of each mode is adequately sized.

Some other functions are invoked inside the IRQ function. If I rewrite the code of these inner functions inline inside the IRQ function, the system doesn't crash!
The problem seems to be the call of a function inside the IRQ function, but I'm not sure of this conclusion.

The problem happens both with MAM fully enabled and MAM partially enabled.

Any suggestions?

Thanks and have a nice day!

Demis

  • Demis,
    It is hard to say anything without seeing some code.

  • Here is the code!

    /***********************************************
    ** This is the IRQ function invoked every 1 ms
    ************************************************/
    void __IRQ__ irq_base_timer(void){
       T0IR = 1; /* reset interrupt flag */
    
       test_fun();
    
    
       VICVectAddr = 0; /* dummy write */
    }
    
    /***********************************************
    ** This is the function invoked inside
    ** irq_base_timer
    **
    ** long sd[32];            // global array
    ** char buffer_rx_fw[512]; // global array
    **
    ************************************************/
    void test_fun(void){
       unsigned long cont;
    /* dummy cycle */
       for(cont = 0; cont < 500; cont++){
          if(sd[31] == cont){
             sd[30]++;
             if(sd[29] == buffer_rx_fw[cont])
                sd[28]++;
          }
       }
    }
    
    
    /***********************************************
    ** This is the FIQ function used to deploy the
    ** MCI RX FIFO
    **
    ** long* p_sd_fifo; // global pointer to FIFO
    ** char* p_sd_data_rx; // global pointer to RAM buffer
    ** long sd_long_count; // global long counter
    **
    ************************************************/
    void __FIQ__ irq_sd_rx(void){
       unsigned char i;
       union{unsigned long L; unsigned int W[2]; unsigned char B[4];} tmp;
    
       for(i = 0; i < 8; i++){
          tmp.L = *p_sd_fifo++;
          *p_sd_data_rx++ = tmp.B[0];
          *p_sd_data_rx++ = tmp.B[1];
          *p_sd_data_rx++ = tmp.B[2];
          *p_sd_data_rx++ = tmp.B[3];
       }    /* end for */
       if(sd_long_count & 1)
          p_sd_fifo = (long*)0xE008C080; // FIFO start address
       sd_long_count--;
    }   /* end irq_sd_rx */
    
    

    If I run this software the crash happens.
    On the other hand, if I copy the body of test_fun into irq_base_timer (and comment out the call to test_fun), the crash doesn't happen.

    I don't know why...

    Thank you,

    Demis

  • Demis,
    First, the loop inside your ISR is very loop - having an ISR execute such code is not a good idea. It is better to only set a flag inside the ISR and do the processing in the main loop instead, freeing the processor to handle other interrupts.
    Say, what happens if you allow your loop to do 5 cycles rather than 500?
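
The flag approach suggested in the reply above could look roughly like the following. This is only a minimal, hardware-independent sketch: the names tick_pending, ticks_handled, timer_isr_body and poll_tick are hypothetical and do not appear in the original code, and the register writes from irq_base_timer (T0IR, VICVectAddr) are only mentioned in a comment.

```c
#include <stdbool.h>

/* Hypothetical flag shared between the ISR and the main loop.
   volatile, because it is written in interrupt context. */
static volatile bool tick_pending = false;

/* Hypothetical counter standing in for the deferred work. */
static volatile unsigned long ticks_handled = 0;

/* Body of the 1 ms timer ISR: only record that a tick occurred.
   On the real target this is also where T0IR would be cleared and
   VICVectAddr written, as in irq_base_timer. */
void timer_isr_body(void)
{
    tick_pending = true;
}

/* Called repeatedly from the main loop. Runs the long processing
   outside interrupt context. Returns true if a tick was processed. */
bool poll_tick(void)
{
    if (!tick_pending)
        return false;

    tick_pending = false;
    ticks_handled++;   /* stand-in for the 500-iteration loop in test_fun */
    return true;
}
```

With this structure the ISR itself calls no other function; the main loop calls poll_tick() on each pass. A single flag cannot count ticks that arrive before the main loop gets around to polling, so a counter would be needed if no tick may be missed.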

  • I meant that the loop is long, of course...

    I understand your observation, but the function's duration is not the cause of the problem.
    The loop is only a dummy loop. I mean that I've written that loop only to waste time inside the IRQ function.
    I've measured the duration of that function with my oscilloscope and found that it lasts 100 microseconds, while its period is 1 millisecond.

    There are no nested calls of that routine...

    As already noted, a huge loop in an ISR isn't nice. The ISR should be quick. Very quick. Extremely quick. You must always guarantee that you have exited your ISR before the hardware needs to generate the next IRQ of the same type. If you can't promise that, then you can no longer guarantee that you do not lose interrupts.

    When you have multiple ISRs, you must make sure that the total runtime of all ISRs does not need more than 100% of your CPU capacity, and that no single interrupt source is ever delayed so long that the processor triggers a new interrupt before the previous interrupt from that source has been detected and its ISR called.
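
The total-runtime condition above can be checked with simple arithmetic. The sketch below is an illustration only: the isr_budget_t type and isr_load_permille function are hypothetical, and the only figures taken from the thread are the 100 microsecond duration and 1 ms period of the timer IRQ. The worst-case duration and minimum period of the FIQ are not given in the thread and would have to be measured.

```c
/* Hypothetical description of one interrupt source:
   worst-case ISR duration and minimum time between two interrupts. */
typedef struct {
    unsigned long duration_us;
    unsigned long period_us;
} isr_budget_t;

/* Sum of the CPU load of all ISRs, in tenths of a percent (per mille).
   A result of 1000 or more means the ISRs alone need the whole CPU. */
unsigned long isr_load_permille(const isr_budget_t *isrs, unsigned count)
{
    unsigned long total = 0;
    unsigned i;

    for (i = 0; i < count; i++)
        total += (isrs[i].duration_us * 1000ul) / isrs[i].period_us;

    return total;
}
```

For the timer IRQ alone, { 100, 1000 } gives 100 per mille, i.e. 10% of the CPU. The total load staying under 1000 is necessary but not sufficient: each source also needs its own latency checked against the ISRs that can delay it.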

    Another thing: the fast-interrupt handler makes use of a pointer, p_sd_data_rx. Where is it initialized? How do you prove that it doesn't overflow the destination buffer? Your code never tests its value, so every interrupt will step it further and further forward, possibly overwriting unknown data in memory. If the stack (or another pointer) is overwritten, your program is lost.
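
One way to make the overflow concern above checkable is to track an index into the destination buffer instead of a free-running pointer. The following is a hardware-independent sketch, not the original handler: rx_buffer, rx_index, rx_overflows and drain_fifo_checked are hypothetical names, the 512-byte size is taken from buffer_rx_fw in the posted code, and the FIFO is passed in as a parameter so the routine can be exercised off-target. It assumes a little-endian target, where tmp.B[0] in the original is the least significant byte.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_BUFFER_SIZE 512u   /* same size as buffer_rx_fw in the thread */
#define WORDS_PER_IRQ  8u     /* words read per interrupt, as in irq_sd_rx */

static uint8_t  rx_buffer[RX_BUFFER_SIZE];
static size_t   rx_index = 0;       /* next free byte in rx_buffer */
static unsigned rx_overflows = 0;   /* words dropped because the buffer was full */

/* Copy WORDS_PER_IRQ words from fifo into rx_buffer without ever
   writing past the end of the buffer. Returns the bytes stored. */
size_t drain_fifo_checked(const volatile uint32_t *fifo)
{
    size_t stored = 0;
    unsigned i;

    for (i = 0; i < WORDS_PER_IRQ; i++) {
        uint32_t word = fifo[i];   /* always read, so the FIFO is emptied */

        if (rx_index + 4u > RX_BUFFER_SIZE) {
            rx_overflows++;        /* drop the word instead of corrupting memory */
            continue;
        }
        rx_buffer[rx_index++] = (uint8_t)(word);
        rx_buffer[rx_index++] = (uint8_t)(word >> 8);
        rx_buffer[rx_index++] = (uint8_t)(word >> 16);
        rx_buffer[rx_index++] = (uint8_t)(word >> 24);
        stored += 4u;
    }
    return stored;
}
```

A non-zero rx_overflows after a transfer would indicate that the handler ran more often than the buffer allows, which is the situation that could otherwise overwrite a stack or another pointer.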

  • Do you have many interrupt sources firing at the same time? Do you support nested interrupts? I think your stack is 15 levels deep.

  • Are you saying that you intentionally want to lose 10% of your CPU capacity (while running the processor at maximum power consumption)?

    Note that the fast interrupt has higher priority than your delay, so 100us may be the minimum delay time. But if the FIQ happens during the delay, you may get a longer delay. Have you looked into running a second interrupt with 100us offset from the first timer IRQ?