I am running code in ADStudio using the Fixed Virtual Platforms simulator, so no hardware board is connected.
I am trying to profile a subroutine, so I count cycles for a piece of code:
int64_t prev, curr, delta;
asm volatile("isb;mrs %0, pmccntr_el0" : "=r"(prev));
// function body
asm volatile("isb;mrs %0, pmccntr_el0" : "=r"(curr));
delta = curr - prev;
My compiler settings are --target=aarch64-arm-none-eabi -march=armv8-a -mcpu=cortex-a53.
I wanted to check whether the compiler uses NEON instructions:
#ifdef __aarch64__
printf("--- THIS IS ARCH64 \n");
#endif
#ifdef __ARM_NEON__
printf("--- THIS IS NEON \n");
#endif
But the NEON message is never printed, so it seems that NEON is not being used.
1) Is my define __ARM_NEON__ wrong?
2) What is the default -gfpu?
3) How do I force neon with -gfpu?
4) When I set -gfpu=none, my cycle count is THE SAME as with the default. I find this rather strange: shouldn't math-heavy code be much slower? Is there an explanation?
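For reference, here is a slightly broader version of the macro check from above that I could also try. It is only a sketch: it assumes the toolchain might define the ACLE-style spellings `__ARM_NEON` and `__ARM_FP` instead of `__ARM_NEON__`, which I have not confirmed for my setup. The helper name `report_simd_macros` is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: prints which NEON/FP feature macros the compiler
   defines and returns how many of the three checked macros were found. */
static int report_simd_macros(void)
{
    int found = 0;

#ifdef __ARM_NEON__            /* spelling used in my original check */
    printf("__ARM_NEON__ is defined\n");
    found++;
#endif

#ifdef __ARM_NEON              /* ACLE-style spelling (assumption) */
    printf("__ARM_NEON is defined\n");
    found++;
#endif

#ifdef __ARM_FP                /* ACLE-style FP support macro (assumption) */
    printf("__ARM_FP = 0x%x\n", (unsigned)__ARM_FP);
    found++;
#endif

    if (found == 0)
        printf("no NEON/FP feature macro is defined\n");

    return found;
}
```

Calling `report_simd_macros()` once at startup, for each set of compiler flags, should show whether the flags change what the compiler reports.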
Thanks.