Five things you may not know about Cortex-R Series processors

Neil Werdmuller
March 14, 2014

1) Cortex-R processors are widely used across many embedded applications


Cortex-R Series processors are often used in devices such as storage controllers, LTE modems, and industrial and automotive systems, where three key attributes are needed:

    • Fast: High processing performance at high clock frequencies
    • Real-time: Deterministic processing always meets real-time constraints
    • Reliable: Dependable with safety features and high error resistance

Cortex-R Series processors are not always as visible as the Cortex-A Series application processors or the Cortex-M microcontrollers, where the ARM brand adds value to our partners’ products and demonstrates that there is a wide ecosystem of engineers with the skills to program them.


The safety features are especially important when implementing automotive and industrial embedded control systems, where features such as memory protection, error-correcting codes and lock-step (using a redundant copy of the processor to detect errors) deliver high error resistance.


Many LTE modems use Cortex-R processors, and Cortex-R processors are also very popular in storage. To date (3Q13), more than 900 million devices incorporating Cortex-R processors have shipped, which shows how mature and reliable the processors are.


2) Tightly Coupled Memory (TCM) for performance and determinism


TCM is memory connected closely to the processor core, so the processor can access it very quickly. Typically it holds interrupt service routines and data tables that need to be accessed quickly. As soon as an interrupt arrives, the Cortex-R processor can switch to the privileged interrupt mode and start executing the interrupt code held in TCM. Without TCM, if the interrupt service routine code or any data it needs is not already in the cache, it has to be fetched from main memory, and the processor may have to wait many clock cycles until the code and data are available. With TCM, the worst-case number of cycles to start running the interrupt code is known, which is what makes the Cortex-R processors deterministic.

Memory access above the dotted line is always fast and deterministic for the Cortex-R processor


In a system with a Memory Management Unit, if the code or data is not available in the cache then a page table walk may also be required, and this could take hundreds of cycles. TCM enables a fast, deterministic response to interrupts, which makes the Cortex-R Series ideal for real-time systems.
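With a GCC-style toolchain, one common way to get an interrupt handler and its data into TCM is to tag them with a section attribute and let the linker script map those sections onto the TCM address ranges. The following is a minimal sketch only: the section names `.itcm_text` and `.dtcm_data`, the handler `timer_isr` and the table `gain_table` are all hypothetical, and the linker script and any startup code that copies the sections into TCM are device-specific and not shown.

```c
#include <stdint.h>

/* Hypothetical section names: a real project's linker script must map
 * them onto the instruction TCM and data TCM address ranges. */
#define ITCM_CODE __attribute__((section(".itcm_text")))
#define DTCM_DATA __attribute__((section(".dtcm_data")))

/* Data the handler touches is placed in data TCM, so its access time
 * does not depend on what happens to be in the cache. */
DTCM_DATA volatile uint32_t tick_count = 0;
DTCM_DATA uint16_t gain_table[4] = { 256, 512, 1024, 2048 };

/* The handler itself is placed in instruction TCM (hypothetical ISR). */
ITCM_CODE void timer_isr(void)
{
    tick_count++;
}

/* A helper that reads the TCM-resident table, indexed by the tick count. */
ITCM_CODE uint16_t current_gain(void)
{
    return gain_table[tick_count & 3u];
}
```

On a hosted build the attributes only rename the output sections; the timing benefit appears only once the linker script places those sections in real TCM.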

3) SIMD instructions and CMSIS-DSP Library functions add DSP capabilities


The Cortex-R Series processors can natively execute Single Instruction Multiple Data (SIMD) and Multiply and Accumulate (MAC) instructions. These allow multiple operations to be performed in a single clock cycle, and include saturating maths, which clips results that are too large rather than letting them overflow.
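The difference between wrapping and saturating arithmetic is easiest to see on 16-bit samples. The sketch below is portable C written purely for illustration, with hypothetical function names; on the processor itself this kind of clipping is done by a single saturating instruction (for example `QADD16`, which operates on two 16-bit lanes at once) rather than by comparisons.

```c
#include <stdint.h>

/* Plain two's-complement add: the result wraps around on overflow. */
int16_t add_wrap_q15(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Saturating add: the result is clipped to the int16_t range. */
int16_t add_sat_q15(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}
```

Adding 30000 and 10000 wraps to a large negative value with the plain add, but clips to 32767 with the saturating add, which is usually the less harmful outcome for audio or control signals.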


The CMSIS-DSP library is a collection of 61 algorithms that utilise the SIMD capabilities and include:

  • Basic maths: vector multiply, add, subtract, scale, shift, negate...
  • Statistics: root mean square, mean, standard deviation...
  • Fast maths: sine, cosine, square root...
  • Complex maths: conjugate, dot product, magnitude, multiply by real...
  • Filters: FIR, IIR, convolution, correlation...
  • Matrix algebra: addition, multiplication, scale...
  • Transforms: Fast Fourier, discrete cosine...
  • Controller: PID motor control, (Inverse)Park transform, (Inverse)Clarke transform...
  • Interpolation: linear and bilinear...
  • Support functions: type conversion, copy, fill...

By including these capabilities in the processor, a much simpler, more cost-effective and easier-to-debug system can be created than with a separate DSP. The performance and SIMD data width are not as advanced as those of some very high-end standalone DSPs, but in many applications these capabilities can make the system more efficient and lower power.


Example motor control application where Park and Clarke transforms are handled by the SIMD/DSP capabilities through the CMSIS-DSP library
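To give a feel for what the Clarke and Park transforms compute, here is a minimal plain-C sketch of the standard formulas for a balanced three-phase system. It is not the CMSIS-DSP source: the library provides its own `arm_clarke_f32` and `arm_park_f32` functions, and the type and function names below are hypothetical. As in the library, the Park step takes precomputed sine and cosine values of the rotor angle.

```c
/* Stationary two-axis (alpha/beta) and rotating two-axis (d/q) currents. */
typedef struct { float alpha, beta; } ab_t;
typedef struct { float d, q; } dq_t;

/* Clarke transform: two of the three phase currents -> alpha/beta.
 * Assumes a balanced system, i.e. ia + ib + ic = 0. */
ab_t clarke(float ia, float ib)
{
    ab_t out;
    out.alpha = ia;
    out.beta  = (ia + 2.0f * ib) * 0.57735026919f; /* 1/sqrt(3) */
    return out;
}

/* Park transform: rotate alpha/beta into the rotor frame, given the
 * sine and cosine of the rotor angle. */
dq_t park(ab_t in, float sin_theta, float cos_theta)
{
    dq_t out;
    out.d =  in.alpha * cos_theta + in.beta * sin_theta;
    out.q = -in.alpha * sin_theta + in.beta * cos_theta;
    return out;
}
```

For unit-amplitude phase currents aligned with the rotor angle, the result is d close to 1 and q close to 0, which is the property that lets a PID controller regulate these values as if they were DC quantities.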

4) Branch shadow and branch prediction

The Cortex-R Series enhance performance through advanced branch prediction techniques. In a pipelined processor, multiple actions happen in each clock cycle. In Cortex-R, both instruction fetch and data read/write access are extended to two cycles. This allows a longer memory access time, which enables either larger memories or slower memories that can be denser or lower power, and removes memory system limitations on the processor clock frequency. There is also an additional decode stage that accommodates branch prediction (conditionals, loops and function returns), and an instruction queue that keeps the data processing unit fed with instructions.

If a branch is taken without prediction, the processor must stall until the pipeline has been refilled with instructions from the new address and those instructions reach the data processing unit. Branch prediction determines the most likely outcome of a branch instruction and either continues as normal, if it predicts the branch will not be taken, or starts loading the pipeline with instructions from the branch target address, so that the data processing unit does not stall. Branch prediction can significantly improve processor performance. The Cortex-R7 approaches 100% branch prediction accuracy, compared to ~80% for Cortex-R4/R5.
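The idea of predicting "the most likely outcome" can be illustrated with a textbook two-bit saturating counter. This is a generic scheme chosen for illustration, not a description of the predictor actually implemented in any Cortex-R processor, and all names in the sketch are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Two-bit counter: states 0 and 1 predict "not taken", 2 and 3 "taken". */
typedef struct { unsigned state; } bp_counter_t;

bool bp_predict(const bp_counter_t *p)
{
    return p->state >= 2;
}

/* Move one step towards the actual outcome, saturating at 0 and 3. */
void bp_update(bp_counter_t *p, bool taken)
{
    if (taken) {
        if (p->state < 3) p->state++;
    } else {
        if (p->state > 0) p->state--;
    }
}

/* Replay a recorded sequence of outcomes for one branch and count how
 * often the prediction was wrong (each miss would cost a pipeline refill). */
size_t bp_count_mispredictions(const bool *outcomes, size_t n)
{
    bp_counter_t p = { 1 }; /* start in "weakly not taken" */
    size_t misses = 0;
    for (size_t i = 0; i < n; i++) {
        if (bp_predict(&p) != outcomes[i]) misses++;
        bp_update(&p, outcomes[i]);
    }
    return misses;
}
```

For a loop branch that is taken nine times and then falls through, this counter mispredicts only the first iteration and the exit, and on a second run of the same loop only the exit.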

5) Error Correcting Code (ECC) generation/checking is built into the processor pipeline


ECC is a method of checking that the data in a memory location is correct and has not been corrupted. The memory has additional bits added, and whenever information is written to memory a code is generated and stored in these additional bits. When the memory is read back, the code is checked to ensure that the data and code still match. A mismatch could occur if there has been a Single Event Upset (SEU), such as radiation hitting the memory location and flipping a bit, or if there is a physical fault in the memory. If a single-bit error is detected, it can be automatically corrected and written back to the memory location. In the Cortex-R Series, ECC generation and checking is done automatically and does not cause any performance impact, unless of course an error is detected. ECC is an optional feature on all of the Cortex-R Series processors.


Example of ECC on TCM as part of the Cortex-R Series pipeline
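The generate-on-write, check-on-read principle can be shown with a toy Hamming(7,4) code, which protects 4 data bits with 3 check bits and corrects any single flipped bit. This is only an illustration of the principle in software, with hypothetical function names; it is not the code or word width used by the Cortex-R hardware, which works on much wider memory words.

```c
#include <stdint.h>

/* Encode 4 data bits into a 7-bit codeword. Bit i of the result is
 * codeword position i+1; check bits sit at positions 1, 2 and 4. */
uint8_t hamming74_encode(uint8_t data)
{
    uint8_t d0 = data & 1u, d1 = (data >> 1) & 1u;
    uint8_t d2 = (data >> 2) & 1u, d3 = (data >> 3) & 1u;
    uint8_t p1 = d0 ^ d1 ^ d3; /* covers positions 3, 5, 7 */
    uint8_t p2 = d0 ^ d2 ^ d3; /* covers positions 3, 6, 7 */
    uint8_t p4 = d1 ^ d2 ^ d3; /* covers positions 5, 6, 7 */
    return (uint8_t)(p1 | (p2 << 1) | (d0 << 2) | (p4 << 3) |
                     (d1 << 4) | (d2 << 5) | (d3 << 6));
}

/* Check a codeword, fix a single-bit error if present, and return the
 * 4 data bits. *corrected is set to 1 if a bit had to be flipped back. */
uint8_t hamming74_decode(uint8_t code, int *corrected)
{
    /* The syndrome is the XOR of the positions of all set bits: zero for
     * a valid codeword, otherwise the position of the flipped bit. */
    uint8_t syndrome = 0;
    for (int pos = 1; pos <= 7; pos++) {
        if ((code >> (pos - 1)) & 1u) syndrome ^= (uint8_t)pos;
    }
    *corrected = (syndrome != 0);
    if (syndrome != 0) code ^= (uint8_t)(1u << (syndrome - 1));
    return (uint8_t)(((code >> 2) & 1u) | (((code >> 4) & 1u) << 1) |
                     (((code >> 5) & 1u) << 2) | (((code >> 6) & 1u) << 3));
}
```

A code like this cannot tell a double-bit error from a single-bit one; practical memory ECC schemes typically add an extra parity bit so that double-bit errors are at least detected.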

    bagherian_99 over 4 years ago

    as far as i see Cmsis DSP library wrote in single precision, is there any way to calculate in double precision?
