In September 2018, Arm donated Arm NN software to Linaro as part of Linaro’s Machine Intelligence Initiative. The donation was significant, with Arm NN being the product of 100 man-years of effort. The uptake of Arm NN has already been huge, with our own estimates showing that it has been shipped in 250 million Android devices.
Before going into detail about where Arm NN has been shipped, it’s important to outline why the Arm NN software development kit (SDK), and software in general, is so important to machine learning (ML) on devices and the overall user experience.
Our SDK includes not only Arm NN but also the Arm Compute Library. Arm NN is a software platform and set of tools that enable ML workloads on power-efficient devices, whereas the Compute Library is a collection of low-level functions optimized for our CPU and GPU architectures, targeting image processing, computer vision and ML. Both optimize ML features by bridging the gap between existing neural network (NN) frameworks, such as TensorFlow and Caffe, and the underlying hardware computing elements, such as our power-efficient Arm Cortex CPUs, Arm Mali GPUs and Arm ML processors. This allows developers to continue using their preferred frameworks and tools, while the software seamlessly converts the result to run on the underlying platform.
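To make the idea of bridging frameworks and hardware a little more concrete, here is a minimal sketch of how a runtime might hand each layer of a network to the first preferred computing element that supports it. It is a toy model, not the Arm NN API: the function `assign_backends` and the capability table `SUPPORTED` are hypothetical, and while the backend names echo identifiers used by Arm NN, the layer support listed for each one is invented purely for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical capability table: which layer types each backend can run.
# The contents are invented for illustration and do not describe real
# Arm NN backend support.
SUPPORTED: Dict[str, Set[str]] = {
    "GpuAcc": {"Convolution2d", "Pooling2d"},
    "CpuAcc": {"Convolution2d", "Pooling2d", "Softmax"},
    "CpuRef": {"Convolution2d", "Pooling2d", "Softmax", "CustomOp"},
}


def assign_backends(layers: List[str], preferences: List[str]) -> List[Tuple[str, str]]:
    """Pair each layer with the first backend in `preferences` that supports it."""
    assignment: List[Tuple[str, str]] = []
    for layer in layers:
        for backend in preferences:
            if layer in SUPPORTED.get(backend, set()):
                assignment.append((layer, backend))
                break
        else:
            # No backend in the preference list can run this layer.
            raise ValueError(f"no backend in {preferences} supports {layer}")
    return assignment


if __name__ == "__main__":
    network = ["Convolution2d", "Softmax", "CustomOp"]
    for layer, backend in assign_backends(network, ["GpuAcc", "CpuAcc", "CpuRef"]):
        print(f"{layer} -> {backend}")
```

With the invented table above, the convolution lands on the GPU backend, the softmax falls back to the accelerated CPU backend, and the custom operation falls back again to the reference CPU backend. The point of the ordered preference list is that the developer states an intent once and the software resolves it per layer.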
While hardware is an important baseline, it is only one part of the story. The true power of hardware can only be fully utilized with decent software. In this regard, a car’s speed and performance make a good analogy for the ML optimization process: a car needs not only an engine (the hardware) but also a drivetrain (the software) to transform that power into real speed. Software optimizes the hardware’s power to maximize ML performance on devices. It also optimizes the ML workloads across different computing elements on devices. As a result, the speed, performance and energy efficiency of key ML-based use cases on devices, such as face unlock, object detection, predictive text and voice assistants, are all enhanced.
ML on devices is something I’m personally very interested in. Previously, we’ve talked about the different ML use cases that are enabled by hardware, such as this blog exploring the different ML use cases in the average person’s everyday life. Moreover, our Cortex CPUs and Mali GPUs offer a range of dedicated features to enhance ML performance: compared to the previous generation of products, the Cortex-A76 CPU delivers a 4x increase in ML performance and the Mali-G76 GPU provides a 3x increase. However, the role of software should not be forgotten.
Nowadays, most IP vendors use software to fine-tune the performance of their hardware blocks. However, Arm’s ML software, Arm NN and the Compute Library, aims to bring greater value to partners and the industry by enabling heterogeneous computing over multiple IPs, such as CPUs, GPUs and ML processors. Our partners are already adopting Arm NN software in smartphones and other devices, which has led to our estimate that Arm NN has been shipped in 250 million Android devices¹. Examples of partners using Arm NN and the Compute Library range from a flagship premium-tier smartphone to numerous cost-efficient mid-tier devices at the chip level.
It’s not surprising that our partners are using Arm NN and the Arm Compute Library as part of their device or chip offerings. Performance analysis is helping to prioritize optimization work and is already bringing about impressive improvements, as revealed in this blog by Steve Roddy, VP of Product Marketing for Arm’s ML division. These include a performance uplift (up to 9.2x faster) over a period of just six months for a big Cortex-A CPU, a LITTLE Cortex-A CPU and a Mali GPU. With Arm NN and the Compute Library now being open-source software, these performance figures are likely to keep improving through the continuous efforts of the open-source community, providing further ML benefits on devices. This momentum is already evident in the multiple releases that have taken place since the Arm NN and Arm Compute Library open-source donation.
Both Arm NN and the Compute Library are part of Project Trillium, our heterogeneous ML compute platform, which provides a suite of Arm IP to deliver scalable ML and NN functionality across different devices. Project Trillium helps to manage the increasing number of ML workloads now performed at the edge, that is, on devices. The benefits of ML at the edge include better performance through lower latency, better protection of personal data, and an experience tailored to each user. Constantly interacting with the cloud introduces a time delay, so tasks on a device run more slowly and can only take place in areas with a network connection. Moreover, sending data back and forth to the cloud creates a system more vulnerable to security threats. Being able to use devices in a private and secure manner allows users to fully realize the benefits of ML at the edge, providing exactly what they need when they need it.
As we move towards ML at the edge, software improvements are as important as the hardware in the device. This is shown by the fact that ML optimizations are already taking place through Arm NN and Compute Library software on hundreds of millions of devices worldwide. We expect this number to grow further as ML continues to play a vital role in the evolution of consumer devices.
Download Arm NN here
¹ Source: Arm estimates