On Cortex-M cores (the M3/M4, for instance) you have 3-stage pipelines, while on Cortex-R, from the R4 up, you have 8 stages, and on the R7 even 11.
I don't understand: isn't a longer pipeline worse for real-time interrupt response? I did read that Cortex-R can interrupt long loads/stores and jump straight to the vector address with the IRQ number passed along, which is great. But doesn't the pipeline get flushed on every hardware interrupt or exception/fault, and then need to be refilled with instructions? So more cycles are spent. And it's not as if the pipeline contents get saved for the return from the interrupt, is it?
A longer pipeline still takes longer to fill, no matter what frequency you run the core at. Ignore whether that works out to 1 us or 100 ms; let's say it takes N cycles to fill an N-stage pipeline.
On the other hand, I also can't quite see through this: with longer pipelines you get better overall throughput, and that works best when the pipeline is fully busy, with no bubbles, hazards, or flushes.
Now, if my chip has many interrupts to handle, and therefore many flushes, that overall throughput goes down. So I end up with a longer pipeline to fill before responding in an ISR, and a shrinking benefit from the long pipeline because of all these asynchronous events and pipeline flushes...
Is there some way to calculate how the ISR load affects the pipeline benefit?
Sure, the pipeline gets flushed and must be refilled. But that does not mean the core stalls. Your worst-case interrupt latency depends on the instruction currently executing and on the number of cycles the first instruction of the handler needs to reach the final stage.
If this takes 10 cycles, then you need to check whether that meets your requirement at 30 MHz; if not, then maybe it does at 60 MHz.
If you have a Cortex-A9 with a 12-stage pipeline, running it at 1 GHz is sufficient for a 1 us interrupt deadline.
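To make that arithmetic concrete, here is a minimal sketch that converts a cycle count into time at a given clock. The function name latency_us is made up for illustration, and the cycle counts are just the figures quoted above, not measured values; a real worst case also includes finishing the current instruction, memory wait states, and so on.

```python
def latency_us(cycles: int, freq_hz: float) -> float:
    """Convert a cycle count into microseconds at the given clock frequency."""
    return cycles / freq_hz * 1e6


if __name__ == "__main__":
    # (cycles, clock in Hz) pairs taken from the examples in the text.
    cases = [(10, 30e6), (10, 60e6), (12, 1e9)]
    for cycles, freq_hz in cases:
        print(f"{cycles} cycles at {freq_hz / 1e6:g} MHz = "
              f"{latency_us(cycles, freq_hz):.3f} us")
```

So 10 cycles is roughly 0.33 us at 30 MHz and roughly 0.17 us at 60 MHz, while 12 cycles at 1 GHz is 12 ns, far below a 1 us budget.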
The only catch, and this might be a problem, is that the longer the pipeline, the larger the jitter.
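As for putting a number on how ISR load eats into the pipeline benefit, a rough first-order estimate is possible. The sketch below assumes (my assumptions, not anything from a datasheet) that each interrupt costs two full refills, one on entry and one on return, and that a refill costs as many cycles as the pipeline has stages. The function name flush_overhead and the example numbers are hypothetical.

```python
def flush_overhead(irq_per_s: float, refill_cycles: int, freq_hz: float,
                   flushes_per_irq: int = 2) -> float:
    """Fraction of all core cycles spent refilling the pipeline after flushes.

    Assumes every interrupt causes `flushes_per_irq` flushes (entry and
    return by default) and each flush costs `refill_cycles` cycles.
    """
    return irq_per_s * flushes_per_irq * refill_cycles / freq_hz


if __name__ == "__main__":
    # Hypothetical load: 10,000 interrupts per second on a 100 MHz core
    # with an 8-stage pipeline.
    fraction = flush_overhead(10_000, 8, 100e6)
    print(f"cycles lost to refills: {fraction:.2%}")
```

With those example numbers the refills cost about 0.16% of the available cycles, so under this simple model the flushes only start to matter for throughput at very high interrupt rates or very low clocks.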