It’s that time of year again, with all the excitement and expectation that come with Mobile World Congress... Come and see us: ARM is located in hall 6, stand 6C10. And if you are going to the Game Developers Conference, that’s also cool: here is a list of what not to miss.
This year the MWC theme is “The Next Element”, which is meant to be “a reflection of how elemental mobile technology has become in our everyday lives” [1]. Makes you think, doesn't it? You can expect the usual wave of new smartphone announcements – and no doubt many will come with catchy new features, with plenty of them optimized for VR. Going forward, the new battleground for innovation is centred around intelligent technologies (such as Artificial Intelligence) and their integration with all sorts of mobile devices. This includes smartphones, but naturally extends to wearables, IoT appliances and adjacent segments such as automotive, drones and robotics.
You will have heard a lot about computer vision and machine learning recently: these technologies are playing a major role in making your mobile handset more intelligent, personal and useful.
A key part of my role at ARM is to ensure that new technologies such as computer vision and deep learning are enabled and optimised on ARM architectures, and to provide the components the developer ecosystem needs to deploy solutions on ARM processors that are both fast and energy-efficient. To this end, our team has been developing a software library of computer vision and machine learning functions for ARM CPUs and GPUs, and we will showcase solutions our partners have built on this library at our MWC booth (hall 6, stand 6C10).
ThunderSoft is a leading mobile ISV that has been working with ARM for many years (and is also a member of Linaro - I encourage you to attend Linaro Connect in Budapest in a few weeks). ARM and ThunderSoft recently joined forces to announce an IoT ecosystem accelerator. Over the last few months our ecosystem engineering team has been working with a division of ThunderSoft called ThunderView. At MWC, ThunderView will be demonstrating on the ARM stand an advanced mobile application that combines computer vision and machine learning, illustrating how these technologies can bring a tangible benefit to everyday tasks, such as monitoring the calorie count of the food you are about to eat.
Here is a screenshot taken on a Huawei P9 device showing what the demo looks like:
A screenshot of the ThunderView calorie count demo on a Huawei P9
Let's take a look at how the application works. First, the application uses the phone's camera to detect the food. The first part of the algorithm segments the food from the background (for example, the table surface and the plate) and works out its volume. On devices featuring a dual camera this can be done very accurately using a stereo-generated depth map; for the demo at MWC we are using a single-camera estimate generated by an advanced proprietary ThunderView algorithm, which does a pretty good job. Next, a trained neural network identifies the food (for example: is it an apple, some noodles, or pumpkin seeds?). By combining this information and cross-referencing an internal database, the application produces an estimate of the calorie count. Everything is done locally, so there is no need to worry about data bandwidth or latency-inducing communication with the cloud.
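To make the flow concrete, here is a rough sketch of that pipeline in C++. Every type, function name and placeholder value below is a hypothetical illustration of the stages described above, not ThunderView's actual (proprietary) implementation.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch of the on-device pipeline; the names are made up for
// illustration and the placeholder bodies stand in for ThunderView's
// proprietary segmentation, volume-estimation and classification algorithms.
struct Frame { int width = 0, height = 0; std::vector<uint8_t> rgb; };
struct Mask  { std::vector<uint8_t> pixels; };  // 1 = food, 0 = background

Mask segment_food(const Frame&)                      { return {}; }       // remove plate/surface
float estimate_volume_ml(const Frame&, const Mask&)  { return 150.0f; }   // single- or dual-camera estimate
std::string classify_food(const Frame&, const Mask&) { return "apple"; }  // trained neural network

float estimate_calories(const Frame& frame,
                        const std::map<std::string, float>& kcal_per_ml)  // on-device food database
{
    Mask food         = segment_food(frame);              // 1. segment food from background
    float volume      = estimate_volume_ml(frame, food);  // 2. work out the volume
    std::string label = classify_food(frame, food);       // 3. identify what the food is
    auto it           = kcal_per_ml.find(label);          // 4. cross-reference the database
    return (it != kcal_per_ml.end()) ? volume * it->second : 0.0f;
}
```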
I am on a low-carb diet, so this is pretty good for me. And perhaps, now that you know its calorie count, you will choose not to eat the whole bowl of pumpkin seeds.
This aligns very well with our imaging and vision product line, which originates from ARM’s acquisition of Apical in May 2016. A key part of the Apical portfolio was Spirit, a piece of highly efficient silicon designed for object detection. What it detects depends on the model it is loaded with (typically people). Spirit can detect and track any number of people in a scene, all from a full-HD, full-frame-rate camera feed, detecting subjects at a wide variety of sizes, whilst consuming minimal power and placing minimal overhead on the other processors in the system. As well as knowing where the people are, Spirit and its accompanying software can tell which direction they are facing, track them as they move, identify them from a database of known faces and generally understand the context of what they are doing. You can find a video of Spirit in action here.
The Spirit object detection product can be used to detect a person’s location, trajectory and pose
In mobile devices this opens up all sorts of possibilities. Imagine your phone automatically cropping and zooming to keep the subjects properly framed. Or being able to automatically index a video as it records, making it possible to search for clips featuring specific people.
The other two key products from Apical (now the Imaging & Vision Group within ARM) are Assertive Display and Mali Camera. Assertive Display is a display management subsystem that cleverly applies local tone mapping to adjust display clarity, extending the indoor viewing experience so that your phone screen is just as easy to read in bright, sunny conditions. Mali Camera is an HDR-capable Image Signal Processor (ISP) that cleans up and enhances images and video coming from multiple mobile sensors.
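As a purely illustrative example of what "local tone mapping" means (a toy sketch of the general technique, not ARM's Assertive Display algorithm): estimate how bright each neighbourhood of the image is, then lift dark regions more aggressively when the ambient light is strong, so shadow detail stays visible.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy local tone mapping: NOT ARM's Assertive Display algorithm, just an
// illustration of the general idea. Dark neighbourhoods are lifted more when
// the ambient light is strong, so on-screen detail survives bright sunlight.
// pixels:  luminance in [0, 1], row-major, width * height
// ambient: 0 = dark room, 1 = direct sunlight
std::vector<float> tone_map_local(const std::vector<float>& pixels,
                                  int width, int height, float ambient)
{
    // 1. Estimate local brightness with a crude box blur (real implementations
    //    use multi-scale filtering to avoid halo artefacts).
    const int r = 8;
    std::vector<float> local(pixels.size(), 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            int   n   = 0;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int yy = std::clamp(y + dy, 0, height - 1);
                    const int xx = std::clamp(x + dx, 0, width - 1);
                    sum += pixels[yy * width + xx];
                    ++n;
                }
            }
            local[y * width + x] = sum / n;
        }
    }

    // 2. Per-pixel gain curve: the darker the neighbourhood and the brighter
    //    the ambient light, the more the pixel is lifted.
    std::vector<float> out(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        const float gain = 1.0f + ambient * (1.0f - local[i]);
        out[i] = std::min(1.0f, std::pow(pixels[i], 1.0f / gain));
    }
    return out;
}
```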
You can find out more information about all three product areas by visiting our stand.
When developing complex and demanding algorithms on mobile platforms, it is essential to get the most out of your silicon. Furthermore, OEMs and developers like ThunderView want to focus their efforts on innovation and differentiation, not on re-implementing common low-level primitives. We understand this, and at ARM we have been investing in ecosystem tools for some time.
ThunderView is one of many partners that have been making use of this new library of functions, optimised for ARM CPUs and GPUs: the ARM Compute Library.
“The ARM Compute Library is an excellent tool,” said Yu Yang, CEO of ThunderView. “Our engineers were able to improve the performance of the critical path of our software, in particular convolution, using the ARM-optimised CPU and GPU routines, and this has helped us achieve our performance targets and get our solution to market sooner.”
The library currently includes over 50 functions, with many more being added later in the year, and provides implementations for CPU and GPU. Key categories include: arithmetic, image manipulation, convolutions, histograms, feature detection, SGEMM (a cornerstone of machine learning) and more.
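To give a flavour of what calling the library looks like, here is a minimal sketch that runs a 3x3 Gaussian convolution on the CPU (NEON) path. The library is not yet public at the time of writing, so treat the exact interface shown here as indicative rather than definitive.

```cpp
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    // Source and destination images: 640x480, single-channel 8-bit.
    Tensor src, dst;
    src.allocator()->init(TensorInfo(640, 480, Format::U8));
    dst.allocator()->init(TensorInfo(640, 480, Format::U8));

    // 3x3 Gaussian kernel; the scale (16) normalises the kernel sum.
    const int16_t gaussian[] = { 1, 2, 1,
                                 2, 4, 2,
                                 1, 2, 1 };

    NEConvolution3x3 conv;
    conv.configure(&src, &dst, gaussian, 16 /* scale */, BorderMode::UNDEFINED);

    // Allocate backing memory, then fill src (e.g. from a camera frame) before running.
    src.allocator()->allocate();
    dst.allocator()->allocate();

    conv.run();   // NEON-optimised execution on the CPU
    return 0;
}
```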
The ARM Compute Library provides a comprehensive set of functions, similar to what you would find in industry-standard specifications such as OpenVX. It also provides a more complete set of functions, with much better performance, than developers will find in existing open-source solutions such as OpenCV.
In general (grouping by category), the performance improvement of common imaging and vision functions in the ARM Compute Library compared to the equivalent NEON functions found in OpenCV is as follows (here we show data from runs on the Kirin-based Huawei P9 and Huawei Mate 8 smartphones):
The data for convolutions and SGEMM (that cornerstone of machine learning algorithms) is even more impressive:
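Since SGEMM keeps coming up as the cornerstone of machine learning workloads, here is an equally minimal sketch of the single-precision matrix multiply path. Again, the exact interface should be treated as indicative until the library is published.

```cpp
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    // A: MxK, B: KxN, C/D: MxN, all 32-bit float (SGEMM).
    const unsigned int M = 256, N = 256, K = 256;

    Tensor a, b, c, d;
    a.allocator()->init(TensorInfo(TensorShape(K, M), 1, DataType::F32));
    b.allocator()->init(TensorInfo(TensorShape(N, K), 1, DataType::F32));
    c.allocator()->init(TensorInfo(TensorShape(N, M), 1, DataType::F32));
    d.allocator()->init(TensorInfo(TensorShape(N, M), 1, DataType::F32));

    // D = 1.0 * A * B + 0.0 * C
    NEGEMM gemm;
    gemm.configure(&a, &b, &c, &d, 1.0f, 0.0f);

    a.allocator()->allocate();
    b.allocator()->allocate();
    c.allocator()->allocate();
    d.allocator()->allocate();
    // ... fill a and b with weights/activations here ...

    gemm.run();   // NEON-optimised single-precision matrix multiply
    return 0;
}
```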
This library will be made publicly available by the end of March. For advance notification and news, register your interest by emailing: developer@arm.com
References:
[1] https://goo.gl/xvAU3F