[Arm NN on Cortex-A9] Static build leads to "None of the preferred backends [CpuRef ] are supported" error at runtime

Hello ArmNN experts,
I'm currently facing an issue at runtime when using a statically built ArmNN lib.

I have been able to build a STATIC version of libarmnn with all its dependencies, and then build my own app for a Wandboard target (armv7).
At runtime I got this error: "ERROR: None of the preferred backends [CpuRef ] are supported. Current platform provides []"

When I compile SHARED libraries (*.so), the exact same app runs fine and the inference is done as expected.

ArmNN version: 21.02
Model/Target: TFLite on armv7
Build options: cmake .. -DCMAKE_LINKER=/usr/bin/arm-linux-gnueabihf-ld -DCMAKE_C_COMPILER=/usr/bin/arm-linux-gnueabihf-gcc -DCMAKE_CXX_COMPILER=/usr/bin/arm-linux-gnueabihf-g++ -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_STANDARD=14 -DCMAKE_CXX_FLAGS=-mfpu=neon -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary -DARMCOMPUTE_BUILD_DIR=$BASEDIR/ComputeLibrary/build -DBOOST_ROOT=$BASEDIR/boost_1_64_0 -DTF_GENERATED_SOURCES=$BASEDIR/tensorflow-protobuf -DBUILD_TF_PARSER=0 -DBUILD_ONNX_PARSER=0 -DONNX_GENERATED_SOURCES=$BASEDIR/onnx -DBUILD_TF_LITE_PARSER=1 -DTF_LITE_GENERATED_PATH=$BASEDIR/tflite -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-arm32 -DFLATC_DIR=$BASEDIR/flatbuffers/build -DPROTOBUF_ROOT=$BASEDIR/protobuf-arm -DARMCOMPUTENEON=1 -DARMNNREF=1

Does anyone have an idea for a fix, or for additional things to investigate?
Thanks!!

Regards,
Nicolas

  • Some update on my issue.

    I have been able to activate the CpuRef backend when using the static ArmNN lib.

    For this, the backend must be explicitly registered in the application:

    BackendRegistryInstance().Register(RefBackend::GetIdStatic(), []() { return IBackendInternalUniquePtr(new RefBackend); });

    This must be done before the runtime creation.

    I have not yet been able to register the Neon backend because of a lot of compilation issues, but I keep trying.

    Does anyone know why this registration is done implicitly with the SHARED version of the ArmNN lib?

    Regards,

    Nicolas

  • That's interesting, thanks for the update!

    I guess when you build ArmNN as a shared library, it's then loaded at runtime and this global variable is created, which leads to registering the backend:
    https://github.com/ARM-software/armnn/blob/branches/armnn_21_05/src/backends/reference/RefRegistryInitializer.cpp

    When you build it statically, then, probably, the linker strips this piece and only leaves the functions that are needed for your application (but I'm not sure about that).

  • That may be the reason, but I haven't found any information about this topic in the ARM documentation.

    I've also been able to compile an application using CpuAcc (aka Neon). The issues were coming from a missing lib in my build command.

    But when I run the app, I get a segmentation fault when I try to optimize the network. Still digging...

    For information, I'm trying this static approach instead of shared because I have been a little disappointed by the performance of ArmNN inference compared to TF-Lite, and I've read somewhere that using shared libs can have an impact on performance.

    From what I've seen, the static version of CpuRef is a little faster than the shared version for my use case: 5 to 10% faster for inference.

    Nicolas

  • Some update.

    By using the "-whole-archive" option in the compiler command line, the manual backend registration I mentioned earlier can be removed from the application, as in the SHARED tests.

    This is cool, but it does not fix the segmentation fault when trying to use the CpuAcc backend, and it generates a huge binary, which defeats the purpose of using a STATIC lib :-)
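
For readers following the thread, the two registration styles discussed above (implicit, through a global object, and explicit, from the application) can be illustrated with a minimal, self-contained C++ sketch. This is not the ArmNN code: `Registry`, `StaticRegistrar`, `RefLikeBackend`, `AccLikeBackend` and `RegisterExplicitly` are hypothetical stand-ins, and the strings "CpuRef" and "CpuAcc" are used only as illustrative ids. Because everything lives in one file here, the global registrar is always linked in, so the sketch shows the pattern rather than the stripping itself.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; these are NOT the ArmNN types.
struct Backend {
    virtual ~Backend() = default;
    virtual std::string Name() const = 0;
};
struct RefLikeBackend : Backend {
    std::string Name() const override { return "CpuRef"; }
};
struct AccLikeBackend : Backend {
    std::string Name() const override { return "CpuAcc"; }
};

using BackendFactory = std::function<std::unique_ptr<Backend>()>;

class Registry {
public:
    void Register(const std::string& id, BackendFactory factory) {
        assert(factory && "factory must be callable");
        m_Factories[id] = std::move(factory);
    }
    bool IsRegistered(const std::string& id) const {
        return m_Factories.count(id) != 0;
    }
    // Returns nullptr when no factory is registered under this id.
    std::unique_ptr<Backend> Create(const std::string& id) const {
        auto it = m_Factories.find(id);
        return it == m_Factories.end() ? nullptr : it->second();
    }
private:
    std::map<std::string, BackendFactory> m_Factories;
};

// Function-local static: the registry exists before any registrar uses it.
Registry& RegistryInstance() {
    static Registry instance;
    return instance;
}

// Implicit registration: constructing this object registers a factory.
struct StaticRegistrar {
    StaticRegistrar(const std::string& id, BackendFactory factory) {
        RegistryInstance().Register(id, std::move(factory));
    }
};

// In a real library this global would sit in its own .cpp file. Nothing
// else refers to it by name, which is why a static link can leave out the
// object file that contains it.
static StaticRegistrar g_RefRegistrar(
    "CpuRef", [] { return std::unique_ptr<Backend>(new RefLikeBackend); });

// Explicit registration: the application calls this itself, before it
// creates anything that queries the registry.
void RegisterExplicitly() {
    RegistryInstance().Register(
        "CpuAcc", [] { return std::unique_ptr<Backend>(new AccLikeBackend); });
}
```

In this sketch "CpuRef" is available as soon as the program starts, while "CpuAcc" only appears after the application calls `RegisterExplicitly()`. With a static archive, the linker generally pulls in only the archive members that resolve an undefined symbol, so an object file holding nothing but such a registrar may be dropped. That would be consistent with both the explicit-registration workaround and the whole-archive behaviour reported above, though the thread does not confirm it for ArmNN.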