Yesterday, at Google I/O, Google announced that they are partnering with Arm to develop TensorFlow Lite Micro and that uTensor – an inference library based on Arm Mbed and TensorFlow – is becoming part of this new project. (See the Mbed blog for more details.)
ML developers will likely know that TensorFlow Lite is an open-source deep learning framework for on-device ML inference with low latency. Its new sibling, TensorFlow Lite Micro – or TF Lite Micro for short – takes efficiency to another level, targeting microcontrollers and other devices with just kilobytes of memory.
If you have an interest in embedded machine learning, or simply have an ear to the ground in the tech world, you’re likely to have seen the recent announcement from Google’s Pete Warden about the project’s launch. Speaking at the TensorFlow Developer Summit, Pete demonstrated the framework running on an Arm Cortex-M4-based developer board and successfully handling simple speech keyword recognition.
So, why is this project a game changer? Well, because Arm and Google have just made it even easier to deploy edge ML in power-conscious environments. And a benefit of adding uTensor to the project is its extensibility for optimized kernels, such as Arm’s CMSIS-NN, and discrete accelerators – which helps neural networks run faster and more energy-efficiently on Arm Cortex-M MCUs.
On-device inference has been gaining traction in recent years, with more and more functionality migrating from the cloud to the edge device. The benefits of edge ML are well documented: reliability, with consistent performance that doesn’t depend on a stable internet connection; reduced latency, since data no longer needs to travel back and forth to the cloud; and privacy, since data may be less exposed to risk when it stays on the device.
But even where the inference is cloud-based, devices tend to rely on edge-based ML – typically on small, super-efficient processors such as Cortex-M – to wake up the rest of the system. Keyword spotting, as used by Pete to demo this new capability, is a good example of this. By allowing the main system to sleep and keeping the power requirements of the always-on element exceptionally low, embedded devices can achieve the efficiency they need to provide great performance as well as great battery life.
The other notable thing about TF Lite Micro is that, like our very own Arm NN, it’s open source, which means that you can customize the example code or even train your own model if you so desire. (While TF Lite Micro is the framework of choice for Cortex-M, Arm NN provides a bridge between existing neural network frameworks and power-efficient Arm Cortex-A CPUs, Arm Mali GPUs, the Arm Machine Learning processor and other third-party IP.)
The project is still in its infancy, but as more and more ML moves to the edge, this kind of open-source approach will become increasingly important.
The technological challenges that once limited ‘tiny’ edge ML are rapidly evaporating. The recent launch of Arm Helium – the new vector extension of the Armv8.1-M architecture, used for future Cortex-M processors – was great news for developers of small, embedded devices. It’s set to bring up to 15 times performance uplift to ML functions and up to five times uplift to signal processing functions, compared to existing Armv8-M implementations.
Increasing the compute capabilities in these devices enables developers to write ML applications for decision-making at the source, enhancing data security while cutting down on network energy consumption, latency and bandwidth usage.
As we move towards a world in which a trillion connected devices is fast becoming a reality, Cortex-M-based microcontrollers – which can deliver on-device intelligence with just milliwatts of power – are poised to drive the edge revolution.
If you’d like to know more about ML on Arm Cortex-M, watch our on-demand technical webinar, 'Machine Learning on Arm Cortex-M Microcontrollers', below.
Watch Technical Webinar