On Thursday 18th February we announced the latest real-time processor IP, the ARM® Cortex®-R8. Cortex-R8 is a quad-core high-performance real-time processor, building on the R profile ARMv7-R architecture, already firmly established by Cortex-R4, R5 and R7.
With Cortex-R8 we’re delivering increased performance and introducing new features to meet the demands of next-generation storage device controllers and mobile communications with a particular focus on the forthcoming 5G cellular wireless standards. Though this blog focuses on the application of Cortex-R8 in storage and modem products, the Cortex-R8 is also applicable to many other markets where the fastest real-time performance is required.
To give some orientation: a real-time Cortex-R processor is one of three variants of the ARM architecture family, the others being Cortex-A for applications and Cortex-M for microcontrollers. These three architectures have a lot in common, in terms of instruction set, programming model and support in the wider ARM ecosystem, but they are each specifically equipped for their intended application spaces.
For Cortex-A that is primarily running a high level operating system such as Linux or Android. You’ll find our rich portfolio of Cortex-A processors in just about every mobile phone as well as tablets, servers, enterprise systems, networking equipment, industrial controllers and so on. Cortex-M is profiled to enable our partners to build the very lowest power and lowest cost microcontrollers and edge devices in the Internet of Things, such as remote sensors and embedded wireless chips for standards like Bluetooth.
Cortex-R sits in between, with a range of processors and multi-core configuration options offering high performance with cached memory systems and a tightly-coupled memory system for fast and deterministic response to system events. This is what you need, for example, in a System on Chip that controls a storage device, especially magnetic storage with hard disks, which is typical of a hard real-time system where deadlines are measured in micro-seconds or less.
In storage devices, and in particular for hard disk drives, the Cortex-R processors have long been established as the number one choice for performance and response when it comes to controlling heads and motors and controlling the host interface. All the major hard disk manufacturers use Cortex-R processors, and they also ship in increasingly high volume in the solid state flash drive space, in both consumer and enterprise class storage.
Storage device capacities and interface data rates are still increasing rapidly for both magnetic and solid state devices. We see ever higher input-output operations per second and increasingly complex algorithms for keeping track of data and managing errors as the physical limits of storage media are challenged.
Our new Cortex-R8 processor offers storage controller designers both additional performance and new AMBA® bus ports with error correcting code protection, amongst other things. Some of these features are a direct outcome of our close engineering relationships with storage System on Chip architects in the ARM silicon partnership, where we've worked together to optimise the technology boundary between the ARM processor IP and the rest of the system.
Now, turning to cellular modems, here we see ARM’s processors already used in very high numbers within the modem sub-system of a mobile phone SoC, either as a stand-alone modem chip or, as is now usually the case, in highly-integrated modem plus application processor chips.
Cortex-R processors are well-suited to the modem task where they both manage scheduling of data flows through the signal processing for reception and transmission and run the protocol stack software tasks to establish and manage connections whilst a data, voice or video call is taking place. Once again, these are hard real-time tasks where the processor must respond to events in the communication channel with micro-second granularity. Otherwise data is dropped and has to be re-transmitted over and over. The data rates and complexity are increasing, thus placing higher workload and feature set demands on the modem processor.
Third generation, and now fourth generation cellular communications using the LTE set of standards, are established worldwide with over a billion subscribers to mobile services. LTE and LTE-Advanced are providing data rates upwards of 300 Mbits per second and importantly LTE-Advanced enables operators to maximise use of their spectrum allocation by aggregating transmission over a number of carrier frequencies. This flexibility of spectrum usage is very valuable to operators who license it from governments but it places significant additional workload on the modem processor as it requires multiple instances of some tasks in the protocol software stack for each carrier.
Now the future of cellular communications is becoming clearer, as outline standards and schedules have been set for the introduction of the fifth generation and of LTE-Advanced Pro, the last of the fourth generation. Substantially higher data rates of a gigabit per second and beyond, even more carrier frequencies, multiple antenna arrays and new features for emergency services and the like all contribute to increasing workload and feature set requirements for the modem processor, and this is where the Cortex-R8 comes in.
The bars in the image above show the period when we’ll see our partners designing and testing their chips up until their market introduction in a mobile phone. The service roll-out happens after that. You can see we anticipate the second wave of 5G to be more challenging, as new air interfaces using very high millimetre-wave frequencies have to be developed.
The next generation LTE-Advanced Pro standard brings both WiFi and also new unlicensed band cellular technology together with existing LTE-Advanced in the same modem. This creates a further substantial increase in the modem processing tasks.
Then, with 5G, even more carriers are planned. There will be higher data rates, multi-dimensional antennas, direct phone to phone services, mission-critical services designed for first responders, low-latency services for vehicles and highways and new narrow band communications designed for the IoT and other capabilities for the 2020s.
All of this requires a real-time multi-core processor that can cope with increasing software workloads in the protocol stack layers and manage data scheduling through the modem signal processing and its various dedicated hardware accelerators for security, compression and the like.
In addition, modem designers are asking for even more layer-1 scheduling activities to be managed by software in the ARM processor instead of dedicated hardware as this allows more flexibility when switching between all the different communication standards.
And, like the storage use case, here again ARM has worked very closely with modem teams around the silicon partnership to understand their requirements and deliver a processor that both integrates neatly into the SoC hardware design and executes their software efficiently.
The modem processor and associated DSP and hardware accelerators are key parts of a mobile phone SoC, along with the applications and graphics processing. ARM has introduced the Cortex-R8 to support this next set of LTE-Advanced Pro modems and the initial 5G design cycles. It is a very fast processor and can deliver a total of 15,000 Dhrystone MIPS from a quad-core configuration running at 1.5 GHz on a 28 or 16 nm silicon process.
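To put that headline figure in per-core terms, here is a rough back-of-the-envelope calculation. It assumes the 15,000 Dhrystone MIPS total is simply the sum of four identical cores all clocked at 1.5 GHz; the variable names are purely illustrative.

```python
# Figures quoted in the text above.
TOTAL_DMIPS = 15_000   # aggregate Dhrystone MIPS, quad-core configuration
CORES = 4
CLOCK_MHZ = 1_500      # 1.5 GHz

# Assumption: the aggregate is a plain sum over identical cores.
per_core_dmips = TOTAL_DMIPS / CORES
dmips_per_mhz = per_core_dmips / CLOCK_MHZ

print(f"Per-core throughput: {per_core_dmips:.0f} DMIPS")
print(f"Per-core efficiency: {dmips_per_mhz:.2f} DMIPS/MHz")
```

Under those assumptions each core contributes 3,750 DMIPS, or 2.5 DMIPS per MHz.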
Like our top-line application processors, it can execute instructions out of order. This can be key to success in real-time applications like modems because it enables the processor to continue execution whilst outstanding memory or peripheral transactions are in flight.
Reliability is very important in storage applications, where error detection and correction features must ensure that soft errors do not propagate through the control processor into the storage medium.
Like most ARM processors the Cortex-R8 scales in terms of size and capability. Chip designers can optimise it for their application by selecting configurations from one to four CPU cores, level-1 memory sizes, a choice of bus interfaces, error handling features and so on. Also, once a chip is running, the software can power cores up and down depending on workload. For example, a modem may run all four cores during a video call but drop down to a single core when the phone is almost asleep in your pocket.
In common with all the Cortex-R processors, the Cortex-R8 takes interrupts into its pipeline as quickly as possible and then services them with code and data stored in a tightly-coupled memory, thus avoiding the longer and non-deterministic latency cycles you get when fetching interrupt service routines into the cached memory system. Cortex-R8 supports eight times as much TCM as the R7 did, all the way up to 2 MB for each CPU core.
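As a minimal sketch of the kind of workload-driven policy described above, the function below picks how many cores to keep powered for a given demand. Everything here is hypothetical: the function name, the idea of expressing demand in DMIPS, and the 3,750 DMIPS per-core capacity (taken as 15,000 divided by four) are assumptions for illustration, not an ARM API or a real power-management scheme.

```python
import math

MAX_CORES = 4  # Cortex-R8 is configurable from one to four cores


def cores_needed(workload_dmips, per_core_dmips=3750.0, max_cores=MAX_CORES):
    """Return a hypothetical number of cores to keep powered.

    At least one core always stays on so the system can still respond
    to events; the result is capped at max_cores.
    """
    if workload_dmips <= 0:
        return 1
    wanted = math.ceil(workload_dmips / per_core_dmips)
    return max(1, min(max_cores, wanted))


# Illustrative demand levels only; these are made-up numbers.
for label, demand in [("idle in pocket", 100), ("video call", 14_000)]:
    print(label, "->", cores_needed(demand), "core(s)")
```

A real design would also weigh wake-up latency and hysteresis so that cores are not switched on and off too frequently.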
In the block diagram below you can see the pipeline, TCM and error handling features:
Why is the Cortex-R8 attractive to System on Chip designers? Firstly, there is a lot of software already out there, for example modem protocol stacks and drivers going all the way back to 2G, then GPRS, HSPA etc., followed by first generation LTE.
All this software and the associated electronic system-level design, simulation and verification equipment and know-how represents a huge investment for the modem design teams at our silicon partners, so we must protect that investment by offering them scalability and forward compatibility, which the Cortex-R8 of course does.
In addition, the complexity of this software has increased dramatically and the rest of the modem hardware is also very complex. The Cortex-R8’s quad CPU cores and coherent memory system allow software execution to be parallelised across the four cores, while its various interfaces into the modem hardware can be used to achieve the best performing and lowest power overall design.
Another characteristic of all the ARM Cortex real-time processors is that they are designed to minimise latency for memory reads and writes using a protected memory system. This is different to the virtual memory system that is needed for a high level OS like Linux or Android and, for the Cortex-R processors running a Real Time Operating System, it is key to their ability to start responding to hard real-time events in 1/10th of a microsecond or less.
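To get a feel for what a response time of 1/10th of a microsecond means in processor terms, the short calculation below converts it into clock cycles. It assumes the 1.5 GHz clock quoted earlier in this post; an actual implementation may run at a different frequency.

```python
RESPONSE_NS = 100    # 1/10th of a microsecond, from the text
CLOCK_MHZ = 1_500    # assumption: the 1.5 GHz clock quoted earlier

# cycles = time * frequency; integer arithmetic keeps the result exact.
cycles_available = RESPONSE_NS * CLOCK_MHZ // 1000

print(f"{RESPONSE_NS} ns at {CLOCK_MHZ} MHz = {cycles_available} clock cycles")
```

On that assumption the processor has a budget of about 150 clock cycles in which to begin responding, which is why avoiding cache misses and address translation on the interrupt path matters.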
The diagram below demonstrates how we’ve evolved our real-time processor line-up with a combination of micro-architectural and multi-core developments to keep up the pace of innovation in communications through 3G, 4G and onward to 5G.
Of course, this has followed the semiconductor process technology roadmap. Broadly speaking, the Cortex-R4 designs were on 65 nm, Cortex-R5s on 40, Cortex-R7s on 28 and the Cortex-R8 products will be on 16, 14 and 10 nm, maybe even 7 nm. The process technology is a key enabler for increasing data rate and modem capabilities as it allows us more transistors within the same cost and power budget. ARM develops its processors to take best advantage of this and overall the phone user gains an ever-improving mobile experience.
With four cores to spread the software over, there are more advantages as it’s less likely that implementation techniques like voltage overdrive or very high frequency operation using low threshold transistors will be needed. That of course can save a considerable amount of power.
So, we believe that Cortex-R8 is by far the best performing real-time processor IP available and it offers all the right features for its intended applications. Its aggregate multi-core performance up to 28,000 CoreMarks is more than sufficient for the next set of LTE-Advanced Pro and first 5G modem designs, and testing with silicon partners’ software has already demonstrated this to be the case.
Combine that with large TCM memories, out-of-order execution, the rich set of interface ports, error management etc. and you have the best solution for any modem or storage controller or similar high-performance deeply-embedded hard real-time application.
To conclude, we know that our processors are already the best and most widely used in deeply embedded hard real-time applications such as modems and storage controllers. And now, with the new Cortex-R8, we have delivered the next step in hard real-time performance and features to extend that leadership as we enable the next design cycles for high performance storage controllers and cellular modems targeting LTE-Advanced Pro and thereafter 5G communications standards.
To find out more about LTE-Advanced Pro and 5G, read ARM’s whitepaper “The Route to 5G”.
Great news! There is also a report at AnandTech: “ARM Announces New Cortex-R8 Real-Time Processor”.