
Why does FPU performance differ in AArch64 and AArch32 with Cortex-A53?

Hello experts,


I have a question.
VFP Benchmark is a benchmark application written by a Japanese developer to measure ARM VFP performance, particularly on ARMv7-A and ARMv8-A.
The software can be downloaded from the following link.
http://dench.flatlib.jp/app/vfpbench

Below are VFP Benchmark results for several SoCs, taken from a web site (http://wlog.flatlib.jp/item/1793).

SP: Single Precision  DP: Double Precision ST: Single Thread  MT: Multi-Thread

I am very surprised by these results because the Cortex-A53 FPU performance differs between AArch32 and AArch64.
I had believed that an FPU operation would be executed in the same way in AArch64 and AArch32.
From this viewpoint, the Cortex-A72 results are reasonable.
That is, its FPU performance is the same in AArch64 and AArch32.
For the Cortex-A53, however, the double-precision performance is the same in both AArch64 and AArch32, but the single-precision performance in AArch32 is half that in AArch64.
My question is why the Cortex-A53 SP performance differs between AArch64 and AArch32.
Could anyone answer this question, as far as doing so would not violate any NDA covering the hardware implementation?
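
To make the comparison concrete, here is a minimal Python sketch of the ratio I am describing. The function name `state_ratio` and the numbers in `example` are hypothetical placeholders chosen only to mirror the pattern above (SP halved, DP equal); they are not values taken from the benchmark page.

```python
def state_ratio(aarch32_gflops: float, aarch64_gflops: float) -> float:
    """Return AArch32 throughput as a fraction of AArch64 throughput."""
    if aarch64_gflops <= 0:
        raise ValueError("AArch64 throughput must be positive")
    return aarch32_gflops / aarch64_gflops


# Placeholder numbers (assumption), NOT measurements from VFP Benchmark.
# Each entry is (AArch32 GFLOPS, AArch64 GFLOPS) for one precision.
example = {
    "SP": (1.0, 2.0),  # single precision: AArch32 at half of AArch64
    "DP": (1.0, 1.0),  # double precision: identical in both states
}

for precision, (a32, a64) in example.items():
    print(f"{precision}: AArch32/AArch64 = {state_ratio(a32, a64):.2f}")
```

A ratio near 1.00 is what I expected in every case; a ratio near 0.50 for SP only is the behaviour I would like to understand.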


Best regards,
Yasuhiko Koumoto.