Dear colleagues.
I'm working on an academic project to help a colleague from another professional field (public administration), using an ARM Cortex-A53 board that FriendlyARM kindly sent me for testing.
For this I'm using an open source program that is used in our country for counting bicycles on bike paths and bike lanes.
The project still needs to mature, and the first thing I would like to improve is how it is compiled for the ARM architecture, because so far it has only been tested on Intel architecture.
I've done a first compilation on the NanoPi and everything worked, including the software itself.
But how can I get the maximum performance by improving the compilation?
Can anybody help me?
Perhaps with reading suggestions and tips on changing the code? Always focusing on the Samsung S5P6818 octa-core Cortex-A53 chip, 400 MHz to 1.4 GHz.
Anyone who wants to see the source can find it here; it is a basic design that is still maturing: GitHub - carlosdelfino/ContadorDeCiclistas
Thanks.
The ARM-specific options for GCC are described in this chapter of the official documentation.
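As a rough illustration of checking which target options actually took effect, here is a minimal sketch. The function name `target_summary` is hypothetical, and the build lines in the comment are assumptions about a typical GCC setup on this board, not flags taken from the project. It relies only on the predefined macros `__aarch64__`, `__arm__` and `__ARM_NEON`.

```cpp
#include <string>

// Possible build lines (assumptions; check the GCC manual for your version):
//   64-bit toolchain: g++ -O3 -mcpu=cortex-a53 file.cpp
//   32-bit toolchain: g++ -O3 -mcpu=cortex-a53 -mfpu=neon-fp-armv8 -mfloat-abi=hard file.cpp
// -mfloat-abi=hard only applies if your distribution uses the hard-float ABI.

// Hypothetical helper: reports what the compiler was targeting,
// based on macros that GCC predefines for ARM targets.
std::string target_summary() {
    std::string s;
#if defined(__aarch64__)
    s += "aarch64";   // 64-bit ARM (AArch64)
#elif defined(__arm__)
    s += "arm32";     // 32-bit ARM
#else
    s += "non-ARM";   // e.g. an Intel build machine
#endif
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    s += " +neon";    // NEON instructions are available to the compiler
#else
    s += " -neon";    // NEON not enabled for this build
#endif
    return s;
}
```

Printing the result once at program start would show whether a given set of flags really enabled NEON for the build.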
If automatic NEON optimisations work like automatic SIMD optimisations for x86 architectures, you might need to use aligned memory allocators such as aligned_alloc, _aligned_malloc (on MSVC) and, in C++17, the aligned form of operator new. This assures the compiler that everything is correctly aligned and therefore lets it use SIMD optimisations as much as it can.
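A minimal sketch of that idea, assuming GCC and C++17: the helper names `alloc_aligned_floats`, `scale_add` and `is_aligned` are hypothetical and do not come from the project, and the 16-byte alignment is an assumption matching the 128-bit NEON register width. `std::aligned_alloc` is standard C++17, and `__builtin_assume_aligned` is a GCC builtin.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: allocate `count` floats aligned to `alignment` bytes.
// std::aligned_alloc expects the size to be a multiple of the alignment,
// so the byte count is rounded up. Release the memory with std::free.
float* alloc_aligned_floats(std::size_t count, std::size_t alignment = 16) {
    std::size_t bytes = count * sizeof(float);
    std::size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    return static_cast<float*>(std::aligned_alloc(alignment, rounded));
}

// Hypothetical hot loop: dst[i] += factor * src[i].
// __builtin_assume_aligned promises GCC that both pointers are 16-byte
// aligned; passing a misaligned pointer here is undefined behaviour.
void scale_add(float* dst, const float* src, float factor, std::size_t n) {
    float* d = static_cast<float*>(__builtin_assume_aligned(dst, 16));
    const float* s = static_cast<const float*>(__builtin_assume_aligned(src, 16));
    for (std::size_t i = 0; i < n; ++i) {
        d[i] += factor * s[i];
    }
}

// Small check that a pointer really has the requested alignment.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Whether the loop is actually vectorised still depends on the optimisation level and target flags, so the generated assembly is the only reliable confirmation.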
Analysing the assembly produced for the hotspots is still highly recommended, though. In my opinion, it might help to:
This web service could help you with this task, as it gives a quick look at the assembly code that GCC produces from a given piece of C/C++ source code.