Arm Community

Ne10 Library Getting Started

Yang Zhang 张洋
September 26, 2013
  • This blog was originally posted on 9 January 2013

1 Introduction

ARM® NEON™ technology is a SIMD (single instruction, multiple data) architecture extension for the ARM Cortex™-A series processors. It can accelerate multimedia and signal processing algorithms such as video encode/decode, 2D/3D graphics, gaming, audio and speech processing, and image processing. Over the past three years, many multimedia applications have used NEON to deliver a significantly enhanced user experience. Some application developers may not be familiar with NEON assembly coding, so the Ne10 library was created to let developers get the most out of ARMv7/NEON without arduous assembly coding.
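
To make "single instruction, multiple data" concrete, here is a minimal sketch of an add-a-constant loop written with NEON intrinsics. The function name simd_addc_sketch is hypothetical and is not part of Ne10; the intrinsics come from the standard arm_neon.h header. On a compiler without NEON support, the code falls back to the plain scalar loop.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not a Ne10 function): dst[i] = src[i] + cst. */
void simd_addc_sketch(float *dst, const float *src, float cst, size_t count)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Broadcast the constant into all four lanes of a 128-bit register. */
    float32x4_t vcst = vdupq_n_f32(cst);
    for (; i + 4 <= count; i += 4) {
        float32x4_t v = vld1q_f32(src + i);      /* load four floats */
        vst1q_f32(dst + i, vaddq_f32(v, vcst));  /* four additions at once */
    }
#endif
    /* Scalar tail on NEON builds; the whole loop on other builds. */
    for (; i < count; i++) {
        dst[i] = src[i] + cst;
    }
}
```

Ne10 wraps this kind of code behind ordinary C functions, so an application does not have to write it by hand.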


The Ne10 library provides a set of the most commonly used functions, all heavily optimized. It was first announced in March 2012. The initial set of functionality in the library focuses on matrix/vector algebra and signal processing. Ne10 will evolve over time to encompass more compute-heavy tasks in a variety of domains, such as image processing.

This article introduces how to compile and use the Ne10 library.

2 Ne10 Overview

When you check out the Ne10 source code from https://github.com/projectNe10/Ne10, you will notice a number of directories. The following figure illustrates the use of each directory.

[Figure: the use of each directory in the Ne10 source tree]

3 Environment

First, let’s prepare the whole development environment.

3.1 Hardware environment

You need an ARM Cortex-A series development platform. If you have no hardware development platform, you can use an emulated environment such as Google’s Android Emulator. I’m using the PandaBoard (http://pandaboard.org/) with Ubuntu 11.10.

Alternatively, you can use a traditional desktop environment for cross compiling.

3.2 Software environment

For the desktop environment you will also need the following tools:

  • CMake (http://www.cmake.org/): the cross-platform, open-source build system
  • Toolchain: I’m using Ubuntu/Linaro gcc 4.6.1. It is also possible to use AOSP tools or Google’s Android NDK Tools.


4 Compiling and using Ne10 library

Now we can download the Ne10 source code and compile it.

4.1 Compiling Ne10

Ne10 uses CMake for its whole build system. The benefit of CMake is that it makes cross-platform builds easy.

1) Native compiling (compiling on an ARM platform).

For UNIX platforms, use the following commands in a terminal (replace $NE10PATH with the directory where the source code is located):

$cd $NE10PATH
$mkdir build
$cd build
$cmake ..
$make

libNE10.a is placed in $NE10PATH/build/modules/ and a test program "NE10_test_static" is placed in $NE10PATH/build/samples/. You can run it. To also generate the dynamic library and the test program "NE10_test_dynamic", add -DNE10_BUILD_SHARED=ON to the cmake call.

2) Cross compiling (compiling on a non-ARM platform for ARM-powered devices).

The process of cross compiling is similar to native compiling. You just need to configure the correct toolchain by creating a config.cmake file and placing it in $NE10PATH/.

set( CMAKE_C_COMPILER arm-linux-gnueabi-gcc )
set( CMAKE_CXX_COMPILER arm-linux-gnueabi-g++ )
set( CMAKE_ASM_COMPILER arm-linux-gnueabi-as )

find_program(CMAKE_AR NAMES "arm-linux-gnueabi-ar")
mark_as_advanced(CMAKE_AR)
find_program(CMAKE_RANLIB NAMES "arm-linux-gnueabi-ranlib")
mark_as_advanced(CMAKE_RANLIB)

Then you can use the following commands to compile.

$mkdir build
$cd build
$cmake -DCMAKE_TOOLCHAIN_FILE=../config.cmake ..
$make

The Ne10 library and test sample are placed in the same directories as in the native build above. You can copy them to the target and run them.

Note:

When you run NE10_test_dynamic on the target, you might receive the error: "NE10_test_dynamic: error while loading shared libraries: libNE10_shared.so.10: cannot open shared object file: No such file or directory"

In that case, add the directory containing the shared library to the loader search path:

$export LD_LIBRARY_PATH=$NE10PATH/build/modules

4.2 Using Ne10

After the steps above, the Ne10 library is ready. I will show how to use it with a sample.

1) Source code

You can call Ne10 functions directly, as follows.

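#include <stdio.h>
#include <stdlib.h>
#include "NE10.h"

int main(void)
{
    ne10_int32_t i;
    ne10_float32_t thesrc[5];    /* input vector */
    ne10_float32_t thecst;       /* constant added to every element */
    ne10_float32_t thedst1[5];   /* output of the C version */
    ne10_float32_t thedst2[5];   /* output of the NEON version */

    /* Fill the input and the constant with random values in [0, 5]. */
    for (i = 0; i < 5; i++)
    {
        thesrc[i] = (ne10_float32_t) rand() / RAND_MAX * 5.0f;
    }
    thecst = (ne10_float32_t) rand() / RAND_MAX * 5.0f;

    /* Call the plain C and the NEON implementations directly. */
    ne10_addc_float_c(thedst1, thesrc, thecst, 5);
    ne10_addc_float_neon(thedst2, thesrc, thecst, 5);
    printf("==========end=========\n");
    return 0;
}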
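
The sample computes the same result twice but never compares the two outputs. As a rough sketch of how such a check could look, the helpers below give a scalar reference for "add a constant to every element" and measure the largest element-wise difference between two arrays. Both ref_addc_float and max_abs_diff are hypothetical names introduced for illustration; they are not Ne10 functions and use plain float so that they compile without NE10.h.

```c
#include <stddef.h>

/* Hypothetical scalar reference (not part of Ne10): dst[i] = src[i] + cst. */
void ref_addc_float(float *dst, const float *src, float cst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i] + cst;
    }
}

/* Hypothetical checker: largest absolute element-wise difference of a and b. */
float max_abs_diff(const float *a, const float *b, size_t count)
{
    float worst = 0.0f;
    for (size_t i = 0; i < count; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f) {
            d = -d;  /* absolute value without needing libm */
        }
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}
```

In the sample above, a call such as max_abs_diff(thedst1, thedst2, 5) before the final printf would report how far the C and NEON outputs are apart. For a simple addition they would be expected to match closely, though the tolerance to accept is a choice left to the application.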

Ne10 can also detect NEON hardware automatically. After initialization, each function pointer points to the correct version (C or NEON).

ne10_init();
ne10_addc_float(thedst, thesrc, thecst, 5);
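
The function-pointer mechanism described above can be sketched in a few lines of standalone C. This is an illustration of the general dispatch pattern only, not Ne10's actual implementation: every name here (addc_fn, my_addc_c, my_addc_fast, my_addc, my_init) is hypothetical, and the has_simd argument stands in for whatever hardware probe a real library performs.

```c
#include <stddef.h>

/* Signature shared by every implementation of the routine. */
typedef void (*addc_fn)(float *dst, const float *src, float cst, size_t count);

/* Portable fallback; plays the role of a "_c" variant. */
void my_addc_c(float *dst, const float *src, float cst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i] + cst;
    }
}

/* Stand-in for an accelerated "_neon" variant. It is written in plain C
 * here so the sketch compiles on any machine. */
void my_addc_fast(float *dst, const float *src, float cst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i] + cst;
    }
}

/* Callers only ever use this pointer; it defaults to the safe version. */
addc_fn my_addc = my_addc_c;

/* Hypothetical init: has_simd would come from a real CPU feature probe. */
int my_init(int has_simd)
{
    my_addc = has_simd ? my_addc_fast : my_addc_c;
    return 0;
}
```

The design choice is that application code calls my_addc(...) everywhere and never names a specific variant, so the same binary can run on hardware with or without the accelerated path.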

2) Compiling the program

Replace $NE10_INC_PATH and $NE10_LIB_PATH with the directories where the Ne10 headers and libraries are located.

  • Using the static library
$gcc -O2 -o sample sample.c -I$NE10_INC_PATH -l:$NE10_LIB_PATH/libNE10.a
  • Using the dynamic library
$gcc -O2 -o sample sample.c -I$NE10_INC_PATH -l:$NE10_LIB_PATH/libNE10.so -lm

Note: If you link against the dynamic library without the "-lm" option, the build fails with the error "undefined reference to `sqrtf'".

Then you can run this sample.

5 Conclusion

Ne10 is a useful library for application developers. You can get the most out of NEON without arduous assembly coding. I hope this article helps you learn how to use Ne10 to accelerate your applications. If you want to learn more about Ne10, please visit http://projectne10.github.com/Ne10/

Yang Zhang is a Home Software Engineer in the Home Software Enabling team at ARM. Yang has several years of experience working on video codec projects, including H.264/AVC, H.263, MPEG4, MPEG2, VC-1 and AVS, and has a deep understanding of video codec algorithms. As a Home Software Engineer, she specializes in digital multimedia systems for ARM Home. Yang graduated from Zhejiang University with a master's degree. She is currently based in Shanghai, China.

  • Half past nine, over 2 years ago

    Is there a function library for complex matrix operations?
  • yang_star, over 3 years ago

    When will the math module be optimized for AArch64?
  • Uladzimir, over 4 years ago

    I'm considering using Ne10 for a new project that needs basic DSP: filtering, correlation, FFT. What are the benefits of using Ne10 compared to FFTW + iteration loops + autovectorization?