Using Arm v8 for Vision at the Edge

Mary Bennion
October 31, 2019
4 minute read time.

When developing vision applications, the most common knowledge gap we encounter is a lack of understanding of the performance required and of what can be achieved with a given hardware architecture. The confusion partly stems from the dissimilar benchmarks used to measure the performance of GPUs and AI accelerators, and is compounded by a rapidly evolving software ecosystem of networks, tools and frameworks. Truly determining performance requires considerable experimentation.

Instead, let’s take a reductionist approach. Let’s start with the CPU core and understand what can be achieved using this unit of processing. If we can understand the type of vision detection pipeline that can be created using the Arm v8 architecture, then we can apply this knowledge to a broad swath of existing hardware and, in the process, define the point at which the cost and complexity of specialized acceleration is justified.

Fortunately, a lot of work has already been done to optimize vision algorithm primitives and inference to maximize performance on the Arm v8 architecture, specifically by taking advantage of NEON (Arm’s SIMD/DSP instruction set) and the Floating-Point Unit (FPU).

Similarly, in the area of deep learning, significant advances have produced network models such as MobileNet_v2 SSD. This new generation of detectors is significantly more efficient than its predecessors, while retaining a similar level of accuracy.

This improvement in performance makes it possible to run these state-of-the-art models on the Arm architecture and to optimize them further using quantization and ArmNN. The efficiency gain from quantization comes from converting the model inputs and weights from float32 to uint8. To do this, we use TensorFlow Lite to create a quantized version of the MobileNet_v2 SSD model. ArmNN provides parsers that read TensorFlow Lite flatbuffer models, optimize them and execute them on the available compute devices. Since we are using CpuAcc (an Arm v8 CPU with NEON), matrix and vector math is handled by the underlying Arm Compute Library using NEON SIMD instructions.
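As a rough illustration, here is a minimal sketch of post-training quantization with the TensorFlow Lite converter. The SavedModel path and the random calibration data are placeholders, not from the original post; a real conversion would feed representative camera frames, and an SSD detection model needs its post-processing op handled appropriately:

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Placeholder calibration data; real use would yield preprocessed
    # 300x300 frames sampled from the target camera feed.
    for _ in range(100):
        yield [np.random.rand(1, 300, 300, 3).astype(np.float32)]

# Hypothetical SavedModel export of ssd_mobilenet_v2.
converter = tf.lite.TFLiteConverter.from_saved_model("ssd_mobilenet_v2/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8   # uint8 inputs, as described above
converter.inference_output_type = tf.uint8

with open("ssd_mobilenet_v2_quant.tflite", "wb") as f:
    f.write(converter.convert())
```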

To illustrate the efficiency of ArmNN, the chart below compares ArmNN to other common inference methods such as OpenCV, which does not support quantized models.


Figure 1: Performance comparison of ArmNN and other common inference methods such as OpenCV

Methodology summary for each configuration:

  • 6 images, each resized to 300 x 300
  • OpenCV and ArmNN, both using 4 threads
  • ArmNN is using CpuAcc (i.e. with NEON acceleration)
  • Model is ssd_mobilenet_v2; OpenCV loads the TensorFlow .pb/.pbtxt while ArmNN uses .tflite (for both the quantized and non-quantized runs)
  • Model is pretrained on MS-COCO, taken directly from the TensorFlow Model Zoo
  • Tests run on an NXP i.MX 8M Mini (4 x Arm Cortex-A53 @ 1.8 GHz)
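
For a sense of what the ArmNN side of this setup looks like, here is a sketch using the pyarmnn bindings to load a .tflite model and run it on the CpuAcc backend. The file name and the dummy input frame are illustrative:

```python
import numpy as np
import pyarmnn as ann

# Parse the TensorFlow Lite flatbuffer model.
parser = ann.ITfLiteParser()
network = parser.CreateNetworkFromBinaryFile("ssd_mobilenet_v2_quant.tflite")

# Optimize for CpuAcc (NEON-accelerated CPU), falling back to CpuRef.
options = ann.CreationOptions()
runtime = ann.IRuntime(options)
backends = [ann.BackendId("CpuAcc"), ann.BackendId("CpuRef")]
opt_network, _ = ann.Optimize(network, backends,
                              runtime.GetDeviceSpec(), ann.OptimizerOptions())
net_id, _ = runtime.LoadNetwork(opt_network)

# Bind the input tensor and all output tensors (boxes, classes, scores, count).
graph_id = 0
input_name = parser.GetSubgraphInputTensorNames(graph_id)[0]
input_binding = parser.GetNetworkInputBindingInfo(graph_id, input_name)
output_bindings = [parser.GetNetworkOutputBindingInfo(graph_id, name)
                   for name in parser.GetSubgraphOutputTensorNames(graph_id)]

image = np.zeros((1, 300, 300, 3), dtype=np.uint8)  # placeholder 300x300 frame
input_tensors = ann.make_input_tensors([input_binding], [image])
output_tensors = ann.make_output_tensors(output_bindings)

runtime.EnqueueWorkload(net_id, input_tensors, output_tensors)
results = ann.workload_tensors_to_ndarray(output_tensors)
```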

By combining inference with algorithmic processing, such as tracking, we have the fundamentals of a vision system capable of detecting basic safety and security events such as intrusion, zone incursion and boundary crossing. These capabilities have broad application to many vision tasks, and when applied together with higher-level logic they form a vision pipeline. Consider a simple use case: detecting the theft of a package from outside your house.

Figure 2: Video of package theft using the vision pipeline

In the example video, we use ArmNN to detect and classify people and packages, and an algorithmic pipeline to track the important objects in the field of view (a simple direction estimate is sketched after the list below). This allows us to answer the following questions:

  • Is there a package?
  • Is there a person?
  • What direction is the person moving?
  • What direction is the package moving?
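
A direction estimate can be as simple as averaging displacement over a track’s recent centroids. This sketch assumes a hypothetical track history produced by any basic tracker (for example, centroid or IoU matching over the SSD detections):

```python
import numpy as np

def movement_direction(history, window=5):
    # history: list of (x, y) centroids for one tracked object, oldest first.
    if len(history) < 2:
        return np.zeros(2)
    pts = np.asarray(history[-window:], dtype=float)
    # Average per-frame displacement over the window gives a smoothed
    # direction vector; its sign answers "which way is it moving?".
    return (pts[-1] - pts[0]) / (len(pts) - 1)
```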

To aid our higher-level logic, we add an incursion line to form a boundary. A person crossing this boundary the wrong way creates an intrusion event; an object crossing it the wrong way creates a removed-object event. Together, this event sequence provides enough situational awareness to determine, with high probability, that your package won’t be there when you get home.
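
One way to implement the boundary check is a side-of-line test on consecutive tracked centroids; a sign change means the track crossed the line. The event names and the choice of which direction counts as the "wrong way" are illustrative, not taken from the post:

```python
def side_of_line(p, a, b):
    # Cross-product sign: > 0 if p is left of the directed line a -> b,
    # < 0 if right, 0 if exactly on the line.
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def boundary_event(prev_pt, curr_pt, line_a, line_b, label):
    before = side_of_line(prev_pt, line_a, line_b)
    after = side_of_line(curr_pt, line_a, line_b)
    if before > 0 >= after:  # crossed from left to right: the "wrong way"
        return "intrusion" if label == "person" else "removed object"
    return None
```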

Arcturus specializes in developing vision pipelines and can make use of additional functions such as background subtraction, optical flow and specialized neural networks for re-identification or segmentation. This capability is supported by comprehensive video pre-processing, post-processing, streaming and storage subsystems, combined with IoT-like event notifications and a UI/UX.

This makes it possible to create powerful edge-based vision analytics systems that eliminate the need to continuously stream pixel data for external processing. The result is better use of the data network, fewer privacy concerns and a premises-based system from which local actions can take place.

Arcturus develops full-stack solutions for smart city and smart building applications. You can check out more of our work, including how we are helping to bring intelligence to public transportation networks.

Watch the demo of Arcturus here.

Big thanks goes out to David Steele, Director of Innovation at Arcturus Networks (www.arcturusnetworks.com), who provided the content for this blog.

Get started with ArmNN
