Using Arm v8 for Vision at the Edge

Mary Bennion
October 31, 2019

When developing vision applications, the most common knowledge gap we encounter is a lack of understanding of the performance required and what can be achieved with a given hardware architecture. The confusion partly stems from the dissimilar benchmarks used to measure the performance of GPUs and AI accelerators, and it is compounded by a rapidly evolving software ecosystem of networks, tools, and frameworks. To truly determine performance, considerable experimentation is required.

Instead, let’s take a reductionist approach. Let’s start with the CPU core and understand what can be achieved using this unit of processing. If we can understand the type of vision detection pipeline that can be created using an Arm v8 architecture, then we can apply this knowledge to a broad swath of existing hardware and, in the process, define the point at which the cost and complexity of specialized acceleration is justified.

Fortunately, a lot of work has already been done to optimize vision algorithm primitives and inference to maximize performance on the Arm v8 architecture, specifically by taking advantage of NEON (Arm's SIMD/DSP instructions) and the Floating-Point Unit (FPU).

Similarly, in the area of deep learning, significant advances have produced network models such as MobileNet_v2 SSD. This new generation of detectors is significantly more efficient than its predecessors, while retaining a similar level of accuracy.

This improvement in performance makes it possible to apply these state-of-the-art models to the Arm architecture and optimize them further by using quantization and ArmNN. The efficiency from quantization is a result of converting the model inputs and weights from float32 to uint8. To do this we use TensorFlow Lite to create the quantized version of the MobileNet_v2 SSD model. ArmNN provides parsers to read TensorFlow Lite flatbuffer models, optimize them, and execute them on available compute devices. Since we're using CpuAcc (Arm v8 CPU with NEON), matrix and vector math is supported by the underlying ArmCompute library with NEON SIMD instructions.
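The float32-to-uint8 conversion at the heart of quantization is a simple affine mapping. The sketch below illustrates the arithmetic with a hand-derived scale and zero-point; in practice TensorFlow Lite computes per-tensor (or per-axis) parameters from calibration data, so the numbers here are purely illustrative.

```python
def quantize_uint8(xs, scale, zero_point):
    # q = clamp(round(x / scale) + zero_point, 0, 255)
    return [max(0, min(255, round(x / scale) + zero_point)) for x in xs]

def dequantize(qs, scale, zero_point):
    # x is recovered only approximately: x ~ (q - zero_point) * scale
    return [(q - zero_point) * scale for q in qs]

# Derive scale/zero-point from an observed value range, as a
# post-training quantizer would do from calibration data.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
x_min, x_max = min(xs), max(xs)
scale = (x_max - x_min) / 255.0
zero_point = round(-x_min / scale)

qs = quantize_uint8(xs, scale, zero_point)
xs_hat = dequantize(qs, scale, zero_point)
```

The round trip loses at most one quantization step of precision, which is why a well-calibrated uint8 model stays close to its float32 baseline in accuracy while needing a quarter of the memory bandwidth per weight.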

To illustrate the efficiency of ArmNN, the figure below compares ArmNN to other common inference methods such as OpenCV, which does not support quantized models.

Figure 1: Performance comparison of ArmNN and other common inference methods such as OpenCV

Methodology summary for each configuration:

  • 6 images, each resized to 300 x 300
  • OpenCV and ArmNN, both using 4 threads
  • ArmNN is using CpuAcc (i.e. with NEON acceleration)
  • Model is ssd_mobilenet_v2; OpenCV loads the TensorFlow .pb/.pbtxt while ArmNN uses .tflite (for both quantized and non-quantized)
  • Model is pretrained on MS-COCO, taken directly from the TensorFlow Model Zoo
  • Tests run using NXP i.MX 8M Mini (4 x Arm Cortex-A53 @ 1.8GHz)
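The measurement pattern behind these numbers can be reproduced with a small harness: a few warm-up runs to exclude one-time setup cost, then the average wall-clock time per image. The `fake_infer` stub below is a hypothetical stand-in for a real backend call (an ArmNN workload execution or an OpenCV dnn forward pass); only the timing structure is the point.

```python
import time

def benchmark(infer, images, warmup=2):
    """Average wall-clock inference time in ms per image,
    after warm-up runs that exclude one-time setup costs."""
    for _ in range(warmup):
        infer(images[0])
    start = time.perf_counter()
    for img in images:
        infer(img)
    return (time.perf_counter() - start) * 1000.0 / len(images)

# Hypothetical stand-in for a real inference call; it just simulates work.
def fake_infer(img):
    return sum(img) % 255

# Six 300x300 "frames", mirroring the methodology above.
images = [list(range(300 * 300)) for _ in range(6)]
ms_per_image = benchmark(fake_infer, images)
```

Averaging over several images and discarding warm-up runs matters on embedded targets, where first-run costs (memory allocation, kernel selection) can dwarf steady-state inference time.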

By combining inference with algorithmic processing, such as tracking, we have the fundamentals of a vision system capable of detecting basic safety and security events such as intrusion, zone incursion, and boundary crossing. These capabilities have broad application to many vision tasks. When they are applied together with higher-level logic, they form a vision pipeline. Consider a simple use case: detecting the theft of a package from outside your house.
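The tracking step that links per-frame detections into persistent objects can be as simple as nearest-centroid association. The sketch below is a minimal, illustrative version (real pipelines typically add motion prediction and track-loss handling), assuming detections arrive as (x, y) centroids:

```python
import math

def track(prev, detections, max_dist=50.0):
    """Greedy nearest-centroid association: match each new detection to the
    closest previously tracked object within max_dist pixels, or start a
    new track. `prev` maps track id -> (x, y); returns the updated mapping."""
    tracks = {}
    unmatched = dict(prev)
    next_id = max(prev, default=-1) + 1
    for (x, y) in detections:
        best, best_d = None, max_dist
        for tid, (px, py) in unmatched.items():
            d = math.hypot(x - px, y - py)
            if d < best_d:
                best, best_d = tid, d
        if best is not None:
            del unmatched[best]        # each track matches at most once
            tracks[best] = (x, y)
        else:
            tracks[next_id] = (x, y)   # no nearby track: new object appeared
            next_id += 1
    return tracks

frame1 = track({}, [(100, 100), (200, 50)])      # two new objects: ids 0, 1
frame2 = track(frame1, [(105, 102), (210, 55)])  # same objects, moved slightly
```

Stable track ids are what let the higher-level logic reason about direction of travel, rather than treating every frame's detections as unrelated.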

Figure 2: Video of package theft using the vision pipeline

In the example video, we use ArmNN to detect and classify people and packages, and an algorithmic pipeline to track the important objects in the field of view. This allows us to answer the following questions:

  • Is there a package?
  • Is there a person?
  • What direction is the person moving?
  • What direction is the package moving?

To aid our higher-level logic, we add an incursion line to form a boundary. A person crossing this boundary in the wrong direction creates an intrusion event. An object crossing the boundary in the wrong direction creates a removed-object event. Together, this event sequence provides enough situational awareness to determine (with high probability) that your package won’t be there when you get home.
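The direction-sensitive boundary check reduces to a sign test: which side of the incursion line a tracked centroid is on, before and after. A minimal sketch using the 2D cross product (coordinates and event names are illustrative, not the pipeline's actual API):

```python
def side(a, b, p):
    """Sign of the cross product (b - a) x (p - a):
    positive on one side of the directed line a->b, negative on the other."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def crossed(a, b, prev_pos, pos):
    """Return 'in', 'out', or None depending on whether a tracked centroid
    moved across the boundary a->b between frames, and in which direction."""
    s0, s1 = side(a, b, prev_pos), side(a, b, pos)
    if s0 > 0 and s1 < 0:
        return "in"
    if s0 < 0 and s1 > 0:
        return "out"
    return None   # stayed on the same side (or on the line)

# Horizontal incursion line across a 300-pixel-wide frame
a, b = (0, 100), (300, 100)
event = crossed(a, b, (150, 120), (150, 80))
```

Because the test is directional, the same boundary distinguishes a person approaching the door ("in") from a package being carried away ("out"), which is exactly the asymmetry the theft logic needs.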

Arcturus specializes in developing vision pipelines and can make use of additional functions such as background subtraction, optical flow, and specialized neural networks for re-identification or segmentation. This capability is supported by comprehensive video pre-processing, post-processing, streaming, and storage subsystems, combined with IoT-like event notifications and a UI/UX.

This makes it possible to create powerful edge-based vision analytics systems that eliminate the need to continuously stream pixel data for external processing. The result is better use of the data network, fewer privacy concerns, and a premises-based system from which local actions can take place.

Arcturus develops full-stack solutions for smart city and smart building applications. You can check out more of our work, including how we are helping to bring intelligence to public transportation networks.

Watch the demo of Arcturus here.

Big thanks go out to David Steele, Director of Innovation at Arcturus Networks (www.arcturusnetworks.com), who provided the content for this blog.

Get started with ArmNN
