Hi, I'm attempting to use ARMPL as both the FFTW library and the BLAS library in CP2K on my M1 Max MacBook Pro, but I'm having issues during the build. I keep getting the following error:
...
"___kmpc_reduce_nowait", referenced from:
__ZN5armplL13asum_parallelIfEENS_14remove_complexIT_E4typeExPKS2_xPFS4_xS6_xEi.omp_outlined in libarmpl_lp64_mp.a[204](41979031_asum_apple_m1_flang-new_mp.o)
__ZN5armplL13asum_parallelIfLb0EEENS_14remove_complexIT_E4typeEiPS2_ii.omp_outlined in libarmpl_lp64_mp.a[204](41979031_asum_apple_m1_flang-new_mp.o)
__ZN5armplL13asum_parallelIfLb0EEENS_14remove_complexIT_E4typeEiPS2_ii.omp_outlined.1 in libarmpl_lp64_mp.a[204](41979031_asum_apple_m1_flang-new_mp.o)
__ZN5armplL13asum_parallelIdEENS_14remove_complexIT_E4typeExPKS2_xPFS4_xS6_xEi.omp_outlined in libarmpl_lp64_mp.a[204](41979031_asum_apple_m1_flang-new_mp.o)
__ZN5armplL13asum_parallelIdLb0EEENS_14remove_complexIT_E4typeEiPS2_ii.omp_outlined in libarmpl_lp64_mp.a[204](41979031_asum_apple_m1_flang-new_mp.o)
__ZN5armplL13asum_parallelIdLb0EEENS_14remove_complexIT_E4typeEiPS2_ii.omp_outlined.2 in libarmpl_lp64_mp.a[204](41979031_asum_apple_m1_flang-new_mp.o)
__ZN5armplL13asum_parallelINSt3__17complexIfEEEENS_14remove_complexIT_E4typeExPKS5_xPFS7_xS9_xEi.omp_outlined in libarmpl_lp64_mp.a[204](41979031_asum_apple_m1_flang-new_mp.o)
...
ld: symbol(s) not found for architecture arm64
collect2: error: ld returned 1 exit status
I do not know what I am doing wrong. I have copied the ssmp file I am using below, in case anyone can spot the mistake. Any help would be greatly appreciated. Thanks.
#!/bin/bash
#
# CP2K Darwin arch file for a serial arm64 binary
# (https://www.cp2k.org/howto:compile_on_macos)
#
# Tested with: GNU 13.2.0, FFTW 3.3.10, LIBINT 2.6.0, LIBVORI 220621,
#              LIBXC 6.2.2, OpenBLAS 0.3.26, SPGLIB 2.3.1,
#              LIBGRPP 20231215
#              on an Apple M1 (macOS 14.2.1 Sonoma)
#
# Usage: Source this arch file and then run make as instructed.
#        Ensure the links in /usr/local/bin to the lastest gcc version.
#
# Last update: 19.03.2024
#
# \
   if [[ "${0}" == "${BASH_SOURCE}" ]]; then \
      echo "ERROR: Script ${0##*/} must be sourced"; \
      echo "Usage: source ${0##*/}"; \
      exit 1; \
   fi; \
   this_file=${BASH_SOURCE##*/}; \
   cd tools/toolchain; \
   [[ -z "${target_cpu}" ]] && target_cpu="native"; \
   if $(command -v brew >/dev/null 2>&1); then \
      brew install cmake; \
      brew install coreutils; \
      brew install libxc; \
      brew install pkg-config; \
      brew install wget; \
   else \
      echo "ERROR: Homebrew installation not found"; \
      echo ' Run: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"'; \
      cd ../..; \
      return 1; \
   fi; \
   ./install_cp2k_toolchain.sh -j${maxtasks} --mpi-mode=no --no-arch-files --target-cpu=${target_cpu} \
      --with-cmake=$(brew --prefix cmake) --with-fftw=no --with-gcc=system \
      --with-libxc=$(brew --prefix libxc) --with-libxsmm=no --with-openblas=no; \
   source ./install/setup; \
   cd ../..; \
   echo; \
   echo "Check the output above for error messages and consistency!"; \
   echo; \
   echo "If everything is OK, you can build a CP2K production binary with"; \
   echo " make -j ARCH=${this_file%.*} VERSION=${this_file##*.}"; \
   echo; \
   echo "Alternatively, you can add further checks, e.g. for regression testing, with"; \
   echo " make -j ARCH=${this_file%.*} VERSION=${this_file##*.} DO_CHECKS=yes"; \
   echo; \
   echo "Run always the following command before using the CP2K binary"; \
   echo " source ${PWD}/tools/toolchain/install/setup"; \
   echo; \
   return

# Set options
DO_CHECKS := no
TARGET_CPU := native
USE_LIBGRPP := 20231215
USE_LIBINT := 2.6.0
USE_LIBVORI := 220621
USE_LIBXC := 6.2.2
USE_SPGLIB := 2.3.1

LMAX := 5
MAX_CONTR := 4

CC := gcc
CXX := g++
FC := gfortran
LD := gfortran
AR := ar -r -s

CFLAGS := -O2 -fopenmp -fopenmp-simd -ftree-vectorize -funroll-loops -g -mtune=$(TARGET_CPU)

DFLAGS += -D__MAX_CONTR=$(strip $(MAX_CONTR))
DFLAGS += -D__NO_STATM_ACCESS

INSTALL_PATH := $(PWD)/tools/toolchain/install

# Settings for regression testing
ifeq ($(DO_CHECKS), yes)
   DFLAGS += -D__CHECK_DIAG
   FCFLAGS_DEBUG := -fcheck=bounds,do,recursion,pointer
   FCFLAGS_DEBUG += -fcheck=all,no-array-temps
#  FCFLAGS_DEBUG += -ffpe-trap=invalid,overflow,zero
   FCFLAGS_DEBUG += -fimplicit-none
   FCFLAGS_DEBUG += -finit-derived
   FCFLAGS_DEBUG += -finit-real=snan
   FCFLAGS_DEBUG += -finit-integer=-42
   FCFLAGS_DEBUG += -finline-matmul-limit=0
   WFLAGS := -Werror=aliasing
   WFLAGS += -Werror=ampersand
   WFLAGS += -Werror=c-binding-type
   WFLAGS += -Werror=conversion
   WFLAGS += -Werror=intrinsic-shadow
   WFLAGS += -Werror=intrinsics-std
   WFLAGS += -Werror=line-truncation
   WFLAGS += -Wrealloc-lhs
   WFLAGS += -Werror=tabs
   WFLAGS += -Werror=target-lifetime
   WFLAGS += -Werror=underflow
   WFLAGS += -Werror=unused-but-set-variable
   WFLAGS += -Werror=unused-dummy-argument
   WFLAGS += -Werror=unused-variable
endif

ifneq ($(USE_LIBVORI),)
   USE_LIBVORI := $(strip $(USE_LIBVORI))
   LIBVORI_LIB := $(INSTALL_PATH)/libvori-$(USE_LIBVORI)/lib
   DFLAGS += -D__LIBVORI
   LIBS += $(LIBVORI_LIB)/libvori.a
endif

LIBXC_HOME := $(shell brew --prefix libxc)
CFLAGS += -I$(LIBXC_HOME)/include
DFLAGS += -D__LIBXC
LIBS += -Wl,-rpath,$(LIBXC_HOME)/lib -L$(LIBXC_HOME)/lib -lxcf03 -lxc

ifneq ($(USE_LIBGRPP),)
   USE_LIBGRPP := $(strip $(USE_LIBGRPP))
   LIBGRPP_INC := $(INSTALL_PATH)/libgrpp-main-$(USE_LIBGRPP)/include
   LIBGRPP_LIB := $(INSTALL_PATH)/libgrpp-main-$(USE_LIBGRPP)/lib
   CFLAGS += -I$(LIBGRPP_INC)
   DFLAGS += -D__LIBGRPP
   LIBS += $(LIBGRPP_LIB)/liblibgrpp.a
endif

ifneq ($(USE_LIBINT),)
   USE_LIBINT := $(strip $(USE_LIBINT))
   LMAX := $(strip $(LMAX))
   LIBINT_INC := $(INSTALL_PATH)/libint-v$(USE_LIBINT)-cp2k-lmax-$(LMAX)/include
   LIBINT_LIB := $(INSTALL_PATH)/libint-v$(USE_LIBINT)-cp2k-lmax-$(LMAX)/lib
   CFLAGS += -I$(LIBINT_INC)
   DFLAGS += -D__LIBINT
   LIBS += $(LIBINT_LIB)/libint2.a
   LIBS += $(LIBINT_LIB)/libint2.a
endif

ifneq ($(USE_SPGLIB),)
   USE_SPGLIB := $(strip $(USE_SPGLIB))
   SPGLIB_INC := $(INSTALL_PATH)/spglib-$(USE_SPGLIB)/include
   SPGLIB_LIB := $(INSTALL_PATH)/spglib-$(USE_SPGLIB)/lib
   CFLAGS += -I$(SPGLIB_INC)
   DFLAGS += -D__SPGLIB
   LIBS += $(SPGLIB_LIB)/libsymspg.a
endif

FFTW_HOME := /opt/arm/armpl_23.10_flang-new_clang_17
CFLAGS += -I$(FFTW_HOME)/include_lp64_mp
DFLAGS += -D__FFTW3
LIBS += $(FFTW_HOME)/lib/libarmpl_lp64_mp.a
LIBS += $(FFTW_HOME)/lib/libarmpl_lp64.a

OPENBLAS_HOME := /opt/arm/armpl_23.10_flang-new_clang_17
CFLAGS += -I$(OPENBLAS_HOME)/include_lp64_mp
LIBS += $(OPENBLAS_HOME)/lib/libarmpl_lp64.a

CFLAGS += $(DFLAGS)

FCFLAGS := $(CFLAGS) $(FCFLAGS_DEBUG) $(WFLAGS)
FCFLAGS += -fallow-argument-mismatch
FCFLAGS += -fbacktrace
FCFLAGS += -ffree-form
FCFLAGS += -ffree-line-length-none
FCFLAGS += -fno-omit-frame-pointer
FCFLAGS += -std=f2008

LDFLAGS += $(FCFLAGS)

LIBS += -ldl -lstdc++

# End
Hi,
It looks like you're linking to the OpenMP parallel build of Arm PL, but the linker is not picking up the OpenMP runtime library. You can try adding `-fopenmp` to the LDFLAGS variable on line 169 and the undefined reference errors should disappear.
In fact there seems to be some confusion about whether you want the parallel Arm PL library (libarmpl_lp64_mp.a) or the serial one (libarmpl_lp64.a) on lines 152 and 153. There's no need to link to both - you should pick one. If you pick the serial one, then you won't need to add the `-fopenmp` flag.
Let me know if that helps!
Cheers,
Chris.
Thanks Chris.
I added the -fopenmp flag right after line 169 as you suggested, so it now looks like this:
LDFLAGS += $(FCFLAGS)
LDFLAGS += -fopenmp
I also removed the serial library references in line 153 and made the library for OpenBLAS in line 157 a parallel library. I still got the same error though, unfortunately.
I also noticed that line 74 already passes the -fopenmp flag. As for linking both the serial and parallel versions of ARMPL, the original arch file listed both for FFTW3 like this:
FFTW_HOME := $(shell brew --prefix fftw)
CFLAGS += -I$(FFTW_HOME)/include
DFLAGS += -D__FFTW3
LIBS += $(FFTW_HOME)/lib/libfftw3_omp.a
LIBS += $(FFTW_HOME)/lib/libfftw3.a
and while it seemed counterintuitive to me as well, I wanted to keep it as close to the original as possible.
I'm also having similar issues using ARMPL for GROMACS, because I keep getting an error saying that it could not find fftwf_plan_many_[r2c|c2r]. The only cause I can think of is that I'm using GCC as my compiler instead of Apple Clang. Could that be the issue, since the macOS version is optimized for Clang?
Hi.
gcc/gfortran is not compatible with the current version of ArmPL on macOS, because we built the library against the libc and libomp instances that come with clang. This is why you're getting missing symbols in both your CP2K and GROMACS builds. As such, I recommend you recompile using the LLVM Fortran compiler. Instructions on how to get it are on our Getting Started with ArmPL - macOS page.
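As a rough illustration of the point above, the prefix of an unresolved OpenMP symbol usually hints at which runtime the object file expects: `__kmpc_` entry points belong to the LLVM OpenMP runtime (libomp), while `GOMP_` entry points belong to GCC's libgomp. The helper below is only a sketch; the function name `classify_omp_symbol` is made up for this example and is not part of any toolchain.

```shell
#!/bin/sh
# Hypothetical helper: guess the OpenMP runtime an undefined symbol needs,
# judging only by its name prefix.
classify_omp_symbol() {
    sym=$1
    # Strip every leading underscore (macOS prepends one to C symbol names).
    while [ "${sym#_}" != "$sym" ]; do
        sym=${sym#_}
    done
    case "$sym" in
        kmpc_*) echo "libomp" ;;   # LLVM OpenMP runtime entry point
        GOMP_*) echo "libgomp" ;;  # GNU OpenMP runtime entry point
        *)      echo "unknown" ;;
    esac
}
```

For the symbol in the original error, `classify_omp_symbol ___kmpc_reduce_nowait` reports libomp, which gfortran does not link by default when given `-fopenmp`.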
As far as your build recipe for CP2K goes, the reason that FFTW has both libfftw3.a and libfftw3_omp.a is that they contain a non-overlapping set of functions; for example "fftw_init_threads" only appears in 'libfftw3_omp.a' whereas "fftw_plan_dft" appears only in 'libfftw3.a'. In Arm Performance Libraries we include all functionality in the same library. As such we recommend you do not link to two different versions of ArmPL as whether you get threading or not will be determined by the link order.
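A quick way to catch the mixed-variant situation described above is to scan the link arguments for both library names. The sketch below assumes the file names used in this thread (`libarmpl_lp64.a` and `libarmpl_lp64_mp.a`, or their `.dylib` equivalents); the function name `armpl_variants` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report which ArmPL variants appear in a list of
# link arguments. Prints "serial", "parallel", "both" or "none".
armpl_variants() {
    serial=no
    parallel=no
    for arg in "$@"; do
        case "$arg" in
            *libarmpl_lp64_mp.*) parallel=yes ;;  # OpenMP build
            *libarmpl_lp64.*)    serial=yes ;;    # serial build
        esac
    done
    if [ "$serial" = yes ] && [ "$parallel" = yes ]; then
        echo both
    elif [ "$parallel" = yes ]; then
        echo parallel
    elif [ "$serial" = yes ]; then
        echo serial
    else
        echo none
    fi
}
```

Passing the library entries from the arch file above would print "both", which is the case to avoid.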
Hope this helps.
Chris
Thanks for the insight, Chris. I switched to LLVM and I think the build of CP2K was successful as I did not have an error constructing the makefile, but whenever I attempt the test afterwards, I get the same error above concerning "___kmpc_reduce_nowait". As for GROMACS, I still get the same error (could not find fftwf_plan_many_[r2c|c2r]) even after switching to LLVM.
Still; progress is progress, and this is actually a step up from what I have been able to accomplish for the past month and a half, so thank you for the assistance so far!
Hi. It looks like the link step is not pulling in the OpenMP runtime library. If you could post the link line, that would be useful. I'm assuming you've made sure that libomp is installed.
As far as the missing FFTW symbols go, I think this again means something is missing from the link line. Checking the library shows that they exist:
$ nm /opt/arm/armpl_24.04_flang-new_clang_18/lib/libarmpl_lp64_mp.dylib | grep fftwf_plan_many_
00000000005b730c T _fftwf_plan_many_dft
00000000005ba3d0 T _fftwf_plan_many_dft_c2r
00000000005ba21c T _fftwf_plan_many_dft_r2c
00000000005c9f68 T _fftwf_plan_many_r2r
Note that these symbols are slightly differently named than in your message as they have "_dft_" in there. I'm presuming this was just your typo above.
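The same check can be scripted so that only symbols actually defined in the library are listed. This is a minimal sketch: `defined_symbols` is a made-up name, and it assumes the three-column `nm` output format shown above (address, type, name).

```shell
#!/bin/sh
# Hypothetical helper: read `nm` output on stdin and print the names of
# symbols defined in the text section (type "T") that start with a prefix.
defined_symbols() {
    awk -v prefix="$1" '$2 == "T" && index($3, prefix) == 1 { print $3 }'
}
```

It would be used as `nm /path/to/libarmpl_lp64_mp.dylib | defined_symbols _fftwf_plan_many_`. Undefined entries (type "U") are skipped, so an empty result means the library does not provide the symbol.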
Hope this helps!
Hi Chris; I may have determined an issue. It looks like CP2K doesn't support LLVM as a compiler, and since I cannot use GCC for ARMPL on MacOS, I may be stuck at an impasse. I will investigate whether using MiMiC would be more convenient for me on an Apple silicon machine, but for now it looks like CP2K isn't feasible. Thanks for the suggestions and help!
I know this doesn't help you answer the ArmPL side of the question, but if you want CP2K on your Mac the best method is probably to follow these instructions:
https://www.cp2k.org/howto:compile_on_macos/
I managed to easily get one of their test cases running, following the "brew install" method.
You're right that today CP2K is very much GCC-centric. To date we haven't had interest in a GCC version of ArmPL on macOS, but that doesn't mean it won't happen in future. After all, multiple compilers are supported on Linux (GCC, ACfL/LLVM and NVHPC) and Windows (MSVC and LLVM).
Thanks again for trying.
Hi again.
I've done a bit more investigation, and it looks like the "fix" to get ArmPL working with GCC/gfortran on macOS is relatively simple. You just need to add "-lc++" to the end of your link line.
I've followed the full build instructions for CP2K on the page mentioned above, and have ended up shoehorning ArmPL into the generated arch/Darwin-gnu-arm64.ssmp file to replace libfftw3 and libopenblas sections with the following:
ARMPL_HOME := /opt/arm/armpl_24.04_flang-new_clang_18/
LIBS += $(ARMPL_HOME)/lib/libarmpl_lp64.a $(ARMPL_HOME)/lib/libFortranDecimal.a $(ARMPL_HOME)/lib/libFortranRuntime.a -lc++
This seems to work for me. Obviously, this isn't a tried and tested solution, but it probably helps in many cases.
Doing this with our parallel version is a bad idea, as the build would then include both libomp and libgomp, but if you are happy not using OpenMP in the library then this solution should be OK.
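To confirm that a finished binary has not ended up with both OpenMP runtimes, its dynamic dependencies can be inspected, for example with `otool -L` on macOS. The filter below is a sketch under that assumption; `omp_runtimes` is a hypothetical name, and it only looks for the substrings `libomp` and `libgomp` in whatever text it is given.

```shell
#!/bin/sh
# Hypothetical helper: read a dependency listing on stdin and report which
# OpenMP runtimes it mentions: "libomp", "libgomp", "mixed" or "none".
omp_runtimes() {
    awk '
        /libgomp/ { gomp = 1 }
        /libomp/  { omp = 1 }
        END {
            if (gomp && omp) { print "mixed" }
            else if (gomp)   { print "libgomp" }
            else if (omp)    { print "libomp" }
            else             { print "none" }
        }
    '
}
```

A typical call would be `otool -L ./exe/<arch>/cp2k.ssmp | omp_runtimes`, with the binary path adjusted to your build. Note that statically linked runtimes do not show up in such a listing, so "none" is not proof that no runtime is present.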
Thanks for the suggestion, Chris! I will try this solution when I get a chance and let you know how it turns out.
I have a slight update. It turns out that flang-new wasn't properly compiling OpenMP code for me, which explains my error when trying to compile CP2K. There seems to be a bug in LLVM where installing Flang as a standalone compiler does not install the OpenMP files correctly. I'm currently working with the devs at LLVM to get it fixed. Once I'm able to properly compile OpenMP code with Flang, I'll let you know the results of the CP2K compile.