This content was initially posted 28 March 2012 on blogs.arm.com
Over the past three years we have seen explosive growth in the use of the NEON™ SIMD engine by many of our software partners in the open-source community. The engine, defined as part of the ARM® Architecture, Version 7 (ARMv7), has shown itself to be extremely flexible and able to accelerate everything from video codecs such as VP8 to elements of the emerging HTML5 standard, including <svg> and <canvas> filters. From an application developer’s viewpoint, all of this acceleration takes place behind the scenes in upstream open source projects that are harvested to build the latest and greatest open source operating systems and frameworks, such as Android™ and Qt. While it is good to know that the latest version of Android contains lots of goodness that makes it fast and power efficient on ARM, that sometimes isn’t enough when you want to write high-performance applications yourself, such as games or augmented reality apps.
So back in 2009 we set out to show developers how to use what we felt were some of the coolest features of ARMv7 through a series of blog posts and seminars. This activity is still ongoing. We received strong interest in our technical articles on programming NEON, but we still didn’t feel we were making things sufficiently straightforward. At the end of the day, while programming an algorithm to make use of NEON can be rewarding in terms of a significant performance improvement or power saving, it isn’t likely to be the primary purpose of your application. As one of my colleagues put it, “I want to write a video game, not a FIR filter!”
At the end of 2010, it became clear that what we should do was create a library of common “useful” functions accelerated by NEON that application developers could just pick up and use. We had already had success with the creation of OpenMAX DL, a library of low-level multimedia kernels (media processing building blocks) for accelerating media codecs, but with this new library we wanted to focus our efforts on a broader application domain. Our goal was to allow application developers to freely make use of some or all of the functions in the library; if it didn’t meet their specific needs, they could at least learn by example from the library and share that knowledge with their peers. We also wanted to give developers the opportunity to contribute their code back into the library. To attain these goals, it was clear we needed to release the project as source code under a suitable open source license, and so we chose Apache 2. The library’s design also needed to be modular, with a minimum of interdependencies, so that developers could pick out individual functions if they wanted rather than be required to use the entire library as part of their application. In addition, we decided to create versions of the functions that do not use NEON, to ensure API-level portability for the few remaining ARM SoCs that don’t have NEON today.
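To make the portability idea concrete, here is a minimal sketch of how one function could offer a NEON path and a plain C path behind a single entry point. This is an illustration only, not the Ne10 API: the names `vec_add_f32`, `vec_add_f32_c`, `vec_add_f32_neon` and the `HAVE_NEON` macro are hypothetical, and the sketch assumes a compiler that defines `__ARM_NEON` (or `__ARM_NEON__`) when NEON is enabled and that provides the standard `arm_neon.h` intrinsics.

```c
#include <stddef.h>

/* Assumption: the compiler defines one of these macros when NEON is enabled. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Portable version (hypothetical name): builds and runs on any target. */
void vec_add_f32_c(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

#if HAVE_NEON
/* NEON version (hypothetical name): adds four floats per iteration. */
void vec_add_f32_neon(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);      /* load 4 floats from b */
        vst1q_f32(dst + i, vaddq_f32(va, vb));  /* add and store 4 results */
    }
    for (; i < n; i++)                          /* leftover 0 to 3 elements */
        dst[i] = a[i] + b[i];
}
#endif

/* Single entry point: callers use the same API with or without NEON. */
void vec_add_f32(float *dst, const float *a, const float *b, size_t n)
{
#if HAVE_NEON
    vec_add_f32_neon(dst, a, b, n);
#else
    vec_add_f32_c(dst, a, b, n);
#endif
}
```

An application would call only `vec_add_f32(out, a, b, n)`, so the same source builds whether or not the target has NEON. This sketch selects the path at compile time; choosing it at run time would be another option.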
Early in 2011 we set up an internal project codenamed Snappy to develop the library and initially picked a small set of floating-point, vector arithmetic, and matrix manipulation functions. Our goal was to make the library available as soon as possible and then add more functions as time went by. The first set of functions for the Ne10 library (as it is now officially called) has now been completed, and we have made it publicly available on GitHub at https://github.com/projectNe10/Ne10 . The code has been developed natively on Linaro (Ubuntu on ARM), and we have also included a Makefile to build it as an Android OS library under the Android Open Source Project. We hope developers will not only make use of the Ne10 library as is, but also contribute to Project Ne10 by providing:
* New functions
* Patches that make use of Ne10 in other open source packages
* Ports of Ne10 to other OS environments
Ne10 is just starting out and we hope that you find it a useful grab bag of functionality to improve the performance and power consumption of your applications. We see Project Ne10 as the start of something that can become really useful across a broad application space. Over the next few months you’ll see more functions being added by ARM and also examples of where it can accelerate application libraries too. If you would like to contribute to this effort, please feel free to join the community at www.ProjectNe10.org