Being over four months into my new job at Arm, I'm still learning about the company and the many areas of technology that we're involved in. One of these areas is machine learning (ML). In my infinite naivety and slight ignorance, this is something that I thought would be part of a dystopian future where the robots begin to slowly take over. However, it's not that scary or far away. In fact, ML is very much a present-day phenomenon, playing a role in today's mobile devices and helping them to achieve a level of 'smartness' that is already an intrinsic part of our everyday lives.
Looking at how my life interacts with mobile and laptop devices, I can already see that ML is playing a very significant role. Not only is it making my life more efficient, but it's also helping me to make important everyday decisions without me even realizing it.
Security is one modern-day smartphone feature supported by ML. It has evolved from typing in a PIN to a variety of ML-based features, such as face recognition and fingerprint identification. I do have fairly distinctive features that help my smartphone recognize me, but even users who lack them are protected from spoofing attacks by people pretending to be them. The face recognition algorithm scans for unique facial characteristics - the distance between the eyes, the shape of the face, facial hair density and eye color - and stores them on the phone, ready for unlocking. If the face fits, the phone unlocks. But if you don't look enough like you, then access is denied. The accuracy of these features has improved significantly over time.
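To picture how that matching step might work, here's a toy Python sketch. The 'template' is a made-up feature vector - real systems use learned embeddings rather than four hand-picked measurements - but the unlock decision boils down to a distance check against a threshold:

```python
import numpy as np

# Hypothetical enrolled template: a feature vector captured when the
# owner set up face unlock (illustrative values only).
enrolled = np.array([0.42, 0.91, 0.33, 0.78])

THRESHOLD = 0.25  # maximum allowed distance between template and new scan

def unlock(scan: np.ndarray) -> bool:
    """Return True if the new face scan is close enough to the template."""
    distance = np.linalg.norm(enrolled - scan)
    return distance < THRESHOLD

print(unlock(np.array([0.40, 0.93, 0.31, 0.80])))  # owner -> True
print(unlock(np.array([0.90, 0.20, 0.75, 0.10])))  # stranger -> False
```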
A less-modern ML feature is predictive text. This is a feature that I - and most other people who own a mobile phone - use every day for communicating with my wife, friends (all two of them) and family via the various messaging services that are now commonplace on mobile. ML not only helps to find the words that users actually meant to type, it also personalizes the suggestions by learning which words each user is most likely to write next.
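At its simplest, that personalization can be pictured as counting which word you most often type after another. Here's a toy Python sketch of a bigram predictor built from a made-up message history - real keyboards learn from far more text and far cleverer models:

```python
from collections import Counter, defaultdict

# Toy personal message history (invented for illustration).
history = "see you at the pub . see you at the match . see you soon"

# Count which word follows which (a bigram model).
counts = defaultdict(Counter)
words = history.split()
for prev, nxt in zip(words, words[1:]):
    counts[prev][nxt] += 1

def suggest(prev_word: str, k: int = 3):
    """Return the k words this user most often types after prev_word."""
    return [w for w, _ in counts[prev_word].most_common(k)]

print(suggest("you"))  # ['at', 'soon']
```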
When using my smartphone or laptop, I'm not just tapping away on the screen or keyboard - I'm now talking to my devices. This might look and sound odd in public, but it helps me improve the efficiency of daily tasks and also means I don't have to type anything. Voice assistants send voice data to their respective servers, where the input is semantically processed to understand what users are asking for. In fact, a number of Arm Cortex-M-based processors are used for keyword spotting within devices with voice assistants, such as Amazon's Alexa. Cortex-M spots a keyword or phrase such as 'Hey Alexa' to wake up the system; more powerful IP then uses ML to work out what the user is saying and perform the required actions. Throughout, voice assistants use ML and deep neural networks to imitate human conversation.
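Conceptually, that keyword-spotting stage is just a tiny always-on loop. Here's a toy Python sketch - the scoring function below is a placeholder for the small neural network (typically run over audio features such as MFCCs) that a real Cortex-M system would execute:

```python
import numpy as np

def kws_score(frame: np.ndarray) -> float:
    """Stand-in for a tiny model that scores one audio frame for the
    wake phrase; a real keyword spotter runs a small neural network
    over extracted audio features, entirely on the microcontroller."""
    return float(frame.mean())  # placeholder scoring only

WAKE_THRESHOLD = 0.8

def listen(frames):
    """Scan the incoming audio stream and wake the system on a hit."""
    for i, frame in enumerate(frames):
        if kws_score(frame) > WAKE_THRESHOLD:
            print(f"Wake word detected at frame {i}: waking main processor")
            return i
    return None

# Simulated stream: quiet frames, then one that looks like the keyword.
stream = [np.full(40, 0.1)] * 5 + [np.full(40, 0.9)]
listen(stream)
```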
I might not have as many apps on my smartphone as the average smartphone user - according to a 2017 report from App Annie this is over 80 - but I still want the ones that I do have to work. Today's modern mobile devices, such as the Huawei Mate 9 smartphone, which is powered by Cortex-A CPUs, a Mali GPU and Arm Interconnect, use ML to figure out which apps users open the most and then optimize those apps to run faster and more smoothly than the less-used ones.
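As a toy illustration of the idea, imagine the OS simply counting app launches and keeping the most-used apps ready to go - a real system would also weigh recency, time of day and much more:

```python
from collections import Counter

# Invented launch log for illustration.
launches = ["mail", "maps", "mail", "camera", "mail", "maps"]

usage = Counter(launches)

def apps_to_keep_warm(k: int = 2):
    """Most-used apps get priority (e.g. kept resident in memory)."""
    return [app for app, _ in usage.most_common(k)]

print(apps_to_keep_warm())  # ['mail', 'maps']
```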
Social media apps are an important feature of my smartphone. Thanks to ML, Facebook is now using algorithms to recognize familiar faces from users' contact lists and tag them automatically. This could be potentially embarrassing if I post photos from the stag-do I recently went on, but is helpful nonetheless.
As a 32-year-old, I feel that I may have missed the boat slightly with Snapchat - what I call a 'young person's social media app'. That being said, I do partake in the occasional cartoon-style selfie that makes my distinctive features either less or more distinctive. Snapchat uses a combination of augmented reality (AR) and low-level ML to transform selfies into fun photos with cartoon features such as dog ears or deer eyes.
Being highly professional, the social apps more relevant to me are LinkedIn and email - the modern-day choices of business communication. LinkedIn uses ML in its 'people you may know' feature, which suggests the people I should connect with in the world of business based on my previous experience and current role. In fact, ever since I joined Arm, it has recognized a variety of people at the organization who it feels I should add - a handy feature for navigating the thousands of people who work here. On email, Gmail has introduced a smart reply function that uses ML to work out a handful of likely responses to the emails in your inbox, which you can send with one click of a button - far more efficient than typing a generic reply!
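For a flavor of how a 'people you may know' feature might work underneath, here's a toy sketch that simply counts mutual connections - almost certainly a huge simplification of whatever LinkedIn actually does:

```python
# Toy connection graph, invented for illustration.
graph = {
    "me":    {"alice", "bob"},
    "alice": {"me", "carol", "dave"},
    "bob":   {"me", "carol"},
    "carol": {"alice", "bob"},
    "dave":  {"alice"},
}

def people_you_may_know(user: str):
    """Rank non-connections by how many connections they share with us."""
    scores = {}
    for friend in graph[user]:
        for fof in graph[friend]:
            if fof != user and fof not in graph[user]:
                scores[fof] = scores.get(fof, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)

print(people_you_may_know("me"))  # ['carol', 'dave'] - carol shares two mutuals
```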
An everyday task that I'm personally not a huge fan of, but one that is made easier - and less painful - through ML, is shopping. Amazon uses ML algorithms to offer a highly personalized service for customers, with recommendations based on their previous purchases and activities. In my case, my Amazon recommendations would probably be a combination of football-based autobiographies and DVDs, mixed in with random books that reflect what I've bought my mum in the past for birthdays and Christmases!
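A simple way to picture this kind of recommendation is 'customers who bought X also bought Y'. Here's a toy Python sketch based on item co-occurrence - illustrative only, with made-up shopping baskets:

```python
from collections import Counter
from itertools import combinations

# Toy purchase histories: one set of items per customer.
baskets = [
    {"football autobiography", "football DVD"},
    {"football autobiography", "gardening book"},
    {"football DVD", "football autobiography"},
    {"gardening book", "cookbook"},
]

# Count how often each pair of items is bought together.
co_bought = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_bought[(a, b)] += 1
        co_bought[(b, a)] += 1

def recommend(item: str, k: int = 2):
    """Items most often bought alongside `item`."""
    pairs = [(other, n) for (i, other), n in co_bought.items() if i == item]
    return [other for other, _ in sorted(pairs, key=lambda p: -p[1])][:k]

print(recommend("football autobiography"))  # ['football DVD', 'gardening book']
```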
Another ML-powered recommendation engine is Netflix's. Being a newly married man in my early 30s, I'm beginning the steady shift away from going out in the evening towards staying in and watching programmes on Netflix. The platform uses ML algorithms to recommend new programmes to users based on the type of shows that they have watched previously. These ML algorithms are so integral to the company that their return on investment (ROI) is valued at around £1 billion a year.
When I do venture out, how I get to the destination is often powered by ML. Navigation while driving is handled by Google Maps, which uses ML to analyze traffic speeds from anonymized smartphone location data. This enables Google to reduce travel time by suggesting alternative routes. If I'm looking to have a few drinks, then an Uber will be required, with the service using ML algorithms to determine arrival times and pick-up locations.
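The traffic part can be pictured as averaging the speed reports for each road segment and picking the quickest route. Here's a toy sketch with invented road names and numbers:

```python
# Anonymized speed reports per road segment (km/h), invented for illustration.
speed_reports_kmh = {
    "high_street": [18, 12, 15],
    "bypass": [55, 60, 58],
}

# Candidate routes: lists of (segment, length in km).
routes = {
    "direct":     [("high_street", 3.0)],
    "via_bypass": [("bypass", 5.0)],
}

def eta_minutes(route: str) -> float:
    """Estimate travel time from crowd-sourced average segment speeds."""
    total = 0.0
    for segment, km in routes[route]:
        avg = sum(speed_reports_kmh[segment]) / len(speed_reports_kmh[segment])
        total += km / avg * 60
    return total

best = min(routes, key=eta_minutes)
print(best, round(eta_minutes(best), 1), "min")  # via_bypass wins despite being longer
```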
Looking ahead, one exciting future use case for ML will be around health. Cortex-M-based processors are at the heart of the sensor hub in wearables, delivering advanced signal-processing features for new and more complex use cases. ML is likely to advance the Cortex-M ecosystem even further by becoming a more ingrained feature within wearable technologies, leading to personalized health monitoring. This will allow doctors and relatives to monitor the health of elderly family members and spot potential anomalies earlier. My parents live over 50 miles away, so this could help provide me with peace of mind as they both get older. In many ways, ML could potentially be a game-changer for healthcare in the future.
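One simple form such monitoring could take is flagging readings that sit far outside someone's normal baseline. Here's a toy Python sketch using made-up heart-rate data and a basic z-score test - real systems would be far more sophisticated:

```python
import statistics

# Toy resting heart-rate readings from a wearable (beats per minute).
readings = [62, 64, 61, 63, 65, 62, 60, 64, 63, 96]  # last value is odd

# Baseline from the earlier readings.
mean = statistics.mean(readings[:-1])
stdev = statistics.stdev(readings[:-1])

def is_anomaly(bpm: float, z: float = 3.0) -> bool:
    """Flag a reading more than z standard deviations from the baseline."""
    return abs(bpm - mean) > z * stdev

print(is_anomaly(readings[-1]))  # True: worth a closer look
```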
Above is how ML has already helped to shape my life and my mobile and laptop experiences, but how does this link to the company I work for? Well, Arm recently launched a brand-new suite of premium IP designed to make mobile devices ready for future innovations around emerging use cases, with ML as one of the top priority areas. The new Arm Cortex-A76 delivers a 4x compute performance improvement for ML at the edge, leading to responsive and secure experiences on PCs and mobile.
Enabling devices to perform ML at the edge - within the device, rather than sending data to the cloud - is important, as the world does not have the bandwidth or financial resources to cope with the vast amount of data that cloud-based ML requires. As Jem Davies, the GM of Arm's ML division, wrote earlier this year at the launch of Project Trillium, Arm's heterogeneous ML compute platform, Google has realized that if every Android device in the world performed three minutes of voice recognition each day, the company would need twice as much computing power to cope. In other words, its computing infrastructure would have to double in size.
ML at the edge also improves the latency of the device and eases security concerns. Tasks on mobile devices are often completed at slower speeds if they constantly need to interact with the cloud. Consumers, including myself (but I am admittedly very impatient), demand a seamless experience on mobile, so they are unlikely to accept any latency issues when performing tasks that require ML processing in the cloud. Moreover, the cloud is more vulnerable to security threats than a device, especially if data is continually being shifted back and forth. Users are also far more comfortable with their data being on their device than in the cloud due to privacy and security concerns.
Currently, there are plenty of examples of ML being used every day that go far beyond my own daily encounters. However, we are still only scratching the surface of its true potential. That's why devices need scalable IP that is flexible enough to allow OEMs to develop a range of devices that can fulfil that potential. We don't yet know everything that ML on devices will make possible, but it promises to be very exciting.
Learn more about Project Trillium and how it's enabling a new era of ultra-efficient ML inference.