Longer pipelines on Cortex R's vs real-time & performance

On Cortex-M cores you typically have a 3-stage pipeline, while on Cortex-R cores from the R4 up you have 8 stages, and on the R7 even 11.

I don't understand: isn't a longer pipeline worse for real-time interrupt response? I did read that Cortex-R can interrupt long loads/stores and jump straight to the vector address with the IRQ number passed, which is great. But doesn't the pipeline get flushed on every hardware interrupt or exception/fault, and then need to be refilled with instructions? That means more cycles spent. And it's not as if the pipeline contents get saved for the return from interrupt, is it?

  • The Cortex-R and Cortex-M series target different requirements and different applications. Performance Monitor Unit: Cortex-R yes, Cortex-M no. This is the module that makes Cortex-R suitable for real-time applications, along with the abort mask bit in a register and the number of pipeline stages.

    Walgreenslistens

  • The Cortex-R and Cortex-M series target different requirements and different applications.

    Nothing new so far.

    Why should the PMU make a Cortex-R a "real-time" CPU? Cortex-A cores and even an Intel Xeon have performance monitoring units, and nobody would honestly call either of those a real-time CPU.

    So why Arm calls it "real-time" is, and maybe forever will be, a mystery :-)

  • Stumbled upon this post on a TI forum:

    TI forum link R4 vs M4

    Quoting the interesting part of the TI reply:

    " ...

    The M4 uses a simple 32b AHB interface to access peripherals.  A round-trip access is possible in three clock cycles best case (not considering pipelining or device level architecture impact).  

    The R4 uses a more complex 64b AXI interface to access peripherals.  A round-trip access is possible in around seven clock cycles best case.  This interface is more optimized for bursting access, parallel access by multiple bus masters, and for cache operations.  As a result, it can move quite a bit more data than the M4 on average, but it does so by sacrificing some latency in its design."

    Which brings me right back to my question (or almost): why is the R4 (or Cortex-R in general) "real-time" when you can get a faster response from an M4?

    Most of what I'm using the R4/R5 for now does not move large amounts of data; it's all small messages/packets, like CAN or sensor readings, and I think latency matters much more there than moving more data in bursts.

  • Which brings me right back to my question (or almost): why is the R4 (or Cortex-R in general) "real-time" when you can get a faster response from an M4?

    I have seen one "explanation": the Cortex-R can accept interrupts during multi-cycle instructions (like STM, LDM and maybe xDIV).

    But honestly, I never understood why the Cortex-M concept of automatically saving registers never made it to Cortex-R.

    The term "real time" is something the application defines, never the CPU.

  • I have seen one "explanation": the Cortex-R can accept interrupts during multi-cycle instructions (like STM, LDM and maybe xDIV).

    We already have that above, in my first post/question of the thread.