
Poor performance with GCC

I am porting a project from x86 to ARM64 and have been struggling with poor performance for some time. I recently tried switching from GCC to LLVM and, to my surprise, got a massive performance boost: in some cases the code runs several times faster. I have experimented with all sorts of optimization flags, but I can't get GCC to generate code that is fast enough. I suspect that vectorization isn't working. When I compile an arbitrary source file with the --verbose flag, LLVM reports +neon, while GCC reports no SIMD features. I have tried different ARM64 cores and operating systems, and the result is the same.

Any suggestions on how to enable vectorization with GCC on ARM64?

System:

  • GCC 12
  • LLVM 12
  • RHEL 7 and RHEL 8
  • ARMv8-a+neon
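
One way to check whether GCC is vectorizing at all is to compile a small kernel with the vectorizer report switched on. The sketch below is only an illustration: the file name kernel.c and the functions axpy and sum are made up for this test and are not taken from the project. As far as I know, GCC enables Advanced SIMD (NEON) by default for every AArch64 -march value, so the lack of a +neon line in the --verbose output does not by itself mean SIMD is off.

```c
/* kernel.c: hypothetical test kernels, not part of the original project.
 *
 * Compile without linking and ask GCC which loops it vectorized:
 *   gcc -O3 -fopt-info-vec -c kernel.c
 * and which loops it gave up on, with the reason:
 *   gcc -O3 -fopt-info-vec-missed -c kernel.c
 */
#include <stddef.h>

/* y[i] += a * x[i]. The restrict qualifiers promise that x and y do not
 * overlap, which removes one common obstacle to vectorization. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

/* Floating-point sum. Vectorizing this loop changes the order of the
 * additions, so compilers normally keep it scalar unless reassociation
 * is explicitly allowed (for example with -ffast-math), which can change
 * the rounding of the result. */
double sum(size_t n, const double *x)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}
```

If the report shows axpy vectorized but the hot loops of the real code are listed as missed, the reasons printed by -fopt-info-vec-missed are probably the place to start looking.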
  • Thanks for the answers! I tried the ARM compiler too. This is what I get performance-wise:

    Compiler                                        Relative performance   Compiler flags
                                                    (more is better)
    Vanilla GCC/Gfortran 12.1 (built from source)   1.00                   -O2 -ftree-vectorize -fopenmp
    ARM GCC/Gfortran 11.2                           1.08                   -O2 -ftree-vectorize -fopenmp
    ARM GCC/Gfortran 11.2                           1.09                   -O3 -ftree-vectorize -fopenmp
    LLVM/Clang/Flang 12.0.0                         2.86                   -O2 -fopenmp
    ARM C++/Fortran 22.0.2                          3.24                   -O2 -fopenmp -fsimdmath

    I am experiencing crashes when the code is built with LLVM/Flang and ARM Fortran. Is it safe to use -fsimdmath? Accuracy (and stability) are important in my case.
