How to make Ethos-U NPU work on an ARM Cortex-A + Cortex-M processor?

I have a question about how to make the Ethos-U NPU work on an ARM Cortex-A + Cortex-M processor. First, I found ethos-u-linux-driver-stack and ethos-u-core-software on https://git.mlplatform.org/.

1. I know ethos-u-linux-driver-stack is the Ethos-U kernel driver. Should it be integrated into the Linux OS running on Cortex-A or into the Linux OS running on Cortex-M? I am not clear about which core it needs to run on.

2. How do I run ethos-u-core-software? I didn't find detailed steps for running it. Does it run on the NPU or on one of the cores?

3. Apart from the above two repos, is any other repo necessary to make the Ethos-U NPU work on an ARM Cortex-A + Cortex-M processor?

Thanks in advance for your suggestions.

  • Kristofer, thanks for your reply, I got it. Another question: what would you suggest running on the Cortex-M, an OS or no OS? How about running FreeOS on Cortex-M?

  • Hi

    The code published on MLPlatform is sufficient for Linux to dispatch inferences to the Arm Cortex-M.

    The reason for moving towards virtio/rpmsg (OpenAMP on the Arm Cortex-M side) is that those are Linux native APIs. They provide standard, well designed and well tested communication channels. With that said, what we have published so far is fully functional.

    For reference you could follow the call chain from ethosu_inference_create() to see how the kernel driver creates an inference and dispatches it to the Arm Ethos-U subsystem. On the Core side the message is received and handled in MessageProcess::handleMessage().
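
    To make the receive side of that call chain more concrete, here is a minimal sketch in C of a handler that dispatches on the message type. This is not the code from MessageProcess::handleMessage(); every name prefixed with demo_ is hypothetical, and the message layout is an assumption made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message types, invented for this sketch. */
enum demo_msg_type {
    DEMO_MSG_NONE = 0,
    DEMO_MSG_PING,
    DEMO_MSG_PONG,
    DEMO_MSG_INFERENCE_REQ,
    DEMO_MSG_INFERENCE_RSP
};

/* Hypothetical header that precedes every payload. */
struct demo_msg {
    uint32_t type;   /* one of enum demo_msg_type */
    uint32_t length; /* payload size in bytes */
};

struct demo_inference_req {
    uint64_t user_arg; /* opaque handle chosen by the sender */
};

struct demo_inference_rsp {
    uint64_t user_arg; /* echoed back so the sender can match the reply */
    uint32_t status;   /* 0 means success */
};

/* Stand-in for running the network; a real core would invoke the
 * inference framework here. This stub always reports success. */
static uint32_t demo_run_inference(const struct demo_inference_req *req)
{
    (void)req;
    return 0;
}

/* Handle one incoming message. Returns the type of the reply to send,
 * or DEMO_MSG_NONE if the message is unknown or malformed. */
static enum demo_msg_type demo_handle_message(const struct demo_msg *msg,
                                              const void *payload,
                                              struct demo_inference_rsp *rsp)
{
    switch (msg->type) {
    case DEMO_MSG_PING:
        return DEMO_MSG_PONG;
    case DEMO_MSG_INFERENCE_REQ: {
        struct demo_inference_req req;

        /* Reject payloads whose size does not match the request struct. */
        if (msg->length != sizeof req)
            return DEMO_MSG_NONE;

        memcpy(&req, payload, sizeof req);
        rsp->user_arg = req.user_arg;
        rsp->status = demo_run_inference(&req);
        return DEMO_MSG_INFERENCE_RSP;
    }
    default:
        return DEMO_MSG_NONE;
    }
}
```

    The caller would read a header and payload from the incoming queue, pass them to demo_handle_message(), and write the reply to the outgoing queue.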

    Best regards
    Kristofer

  • Hi, Kristofer, I am still confused about the communication between Cortex-A and Cortex-M. As you mentioned, the current communication uses the Linux kernel mailbox APIs, and virtio/rpmsg are not used. I want to know whether the current code is sufficient to accomplish the communication between Cortex-A and Cortex-M. Do virtio and rpmsg need to be used?

  • 1. Yes, it is the Tensorflow framework from GitHub. The build system under tensorflow/lite/micro/tools/make/ is used to produce a static library, including CMSIS-NN and the Ethos-U driver. There is, to my knowledge, one small patch that has not yet reached upstream, which adjusts the build flags and a few paths to CMSIS-NN.

    2. OpenAMP is in the plan, but for the moment the communication is defined in linux_driver_stack/kernel/ethosu_core_interface.h and does not make use of virtio and rpmsg. It does however use the Linux kernel mailbox APIs, to abstract which hardware block is used to trigger IRQs on the remote CPU.
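
    As an illustration of how two CPUs can exchange messages through shared memory without virtio/rpmsg, here is a minimal sketch of a byte ring buffer in C. It is not the layout from ethosu_core_interface.h; the demo_queue names and the 64-byte size are assumptions. The sketch also leaves out the memory barriers, cache maintenance and mailbox doorbell that a real AMP system needs.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_QUEUE_SIZE 64u /* hypothetical payload storage in bytes */

/* Hypothetical queue placed in memory that both CPUs can access. */
struct demo_queue {
    volatile uint32_t read;  /* advanced only by the consumer */
    volatile uint32_t write; /* advanced only by the producer */
    uint8_t data[DEMO_QUEUE_SIZE];
};

/* Number of bytes currently stored in the queue. */
static uint32_t demo_queue_used(const struct demo_queue *q)
{
    return (q->write + DEMO_QUEUE_SIZE - q->read) % DEMO_QUEUE_SIZE;
}

/* Producer side: copy len bytes into the queue.
 * Returns false, writing nothing, if there is not enough room. */
static bool demo_queue_write(struct demo_queue *q, const void *src, uint32_t len)
{
    const uint8_t *bytes = src;
    uint32_t wpos = q->write;

    /* One byte stays free so that read == write always means "empty". */
    if (len > DEMO_QUEUE_SIZE - 1u - demo_queue_used(q))
        return false;

    for (uint32_t i = 0; i < len; i++) {
        q->data[wpos] = bytes[i];
        wpos = (wpos + 1u) % DEMO_QUEUE_SIZE;
    }

    /* Publish the new index last. On real hardware a barrier goes here,
     * followed by a doorbell IRQ to the remote CPU. */
    q->write = wpos;
    return true;
}

/* Consumer side: copy len bytes out of the queue.
 * Returns false, reading nothing, if fewer than len bytes are stored. */
static bool demo_queue_read(struct demo_queue *q, void *dst, uint32_t len)
{
    uint8_t *bytes = dst;
    uint32_t rpos = q->read;

    if (len > demo_queue_used(q))
        return false;

    for (uint32_t i = 0; i < len; i++) {
        bytes[i] = q->data[rpos];
        rpos = (rpos + 1u) % DEMO_QUEUE_SIZE;
    }

    q->read = rpos;
    return true;
}
```

    Because each index has a single writer, one such queue per direction is enough for this sketch: Linux writes to one queue and reads from the other, and the Cortex-M does the opposite.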

  • Kristofer, thank you very much for your detailed reply. It's very useful for me. I have two more questions.

    1. The TensorflowLite microcontroller framework you mentioned is the common source https://github.com/tensorflow/tensorflow, right? I checked the tensorflow directory in ethos-u/core_software, and it seems to be the common one, with no external patches.

    2. Is OpenAMP used for IPC between Cortex-A and Cortex-M? According to the AMP communication you mentioned, it seems the current code is sufficient.

  • 1.

    The Linux driver stack for Arm Ethos-U is provided as an example of how an Arm Cortex-A running Linux can dispatch inferences to an Arm Ethos-U subsystem (Arm Cortex-M, Arm Ethos-U, SRAM, …). This is a so-called Asymmetric MultiProcessing (AMP) system, which requires a small amount of shared memory and an external hardware block (e.g. the Arm MHU) to trigger IRQs on the remote CPU. 

    The Linux driver stack currently contains a user space application, a user space driver library, and a kernel driver. It is important to note that the kernel driver does not drive the Arm Ethos-U NPU directly. Instead, it sends a message to an Arm Cortex-M in the Arm Ethos-U subsystem, and that Cortex-M drives the NPU.

    The setup of the AMP communication is platform dependent and is done in the DTB file.

    2.

    All software running on the Arm Cortex-M is referred to as core. The code that runs on the NPU is referred to as command stream. 

    The Arm Ethos-U (sub)system requires an Arm Ethos-U, some SRAM and an Arm Cortex-M to drive the NPU. The (sub)system is highly customizable. A customer may choose which Arm Cortex-M to use, the amount of SRAM, which peripherals to attach, which software to run etc.

    Because of this high degree of flexibility, Arm can’t provide a ready-packaged software image to boot on the Arm Cortex-M. Core software only contains the software components needed to run an inference using the TensorflowLite microcontroller framework.

    You would need to write your own main() function to initialize the platform. You will also need to define a scatter file (Arm Clang) or a linker script (GCC) that describes the memory layout.
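
    As a rough idea of what such platform initialization could look like, here is a minimal sketch in C that runs a table of init steps in order and stops at the first failure. All names are hypothetical and the step functions are empty stubs; which steps your platform actually needs (clocks, UART, mailbox, NPU driver, ...) depends entirely on your hardware.

```c
#include <stddef.h>

/* Each init step returns 0 on success and non-zero on failure. */
typedef int (*demo_init_fn)(void);

/* Hypothetical stubs; a real platform would program hardware here. */
static int demo_init_clocks(void)  { return 0; }
static int demo_init_uart(void)    { return 0; }
static int demo_init_mailbox(void) { return 0; }
static int demo_init_npu(void)     { return 0; }

struct demo_init_step {
    const char *name; /* for logging once a UART is available */
    demo_init_fn fn;
};

/* Order matters: later steps may depend on earlier ones. */
static const struct demo_init_step demo_init_steps[] = {
    { "clocks",  demo_init_clocks  },
    { "uart",    demo_init_uart    },
    { "mailbox", demo_init_mailbox },
    { "npu",     demo_init_npu     },
};

/* Run all steps in order. Returns 0 on success, or the 1-based index
 * of the first step that failed. */
static int demo_platform_init(void)
{
    const size_t count = sizeof demo_init_steps / sizeof demo_init_steps[0];

    for (size_t i = 0; i < count; i++) {
        if (demo_init_steps[i].fn() != 0)
            return (int)(i + 1);
    }
    return 0;
}
```

    Your main() would call demo_platform_init(), stop or report an error on a non-zero result, and otherwise enter the loop that waits for messages from Linux.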

    3.

    All software publicly available for Arm Ethos-U can be downloaded with fetch_externals.py from the Arm Ethos-U repository. 

    https://git.mlplatform.org/ml/ethos-u/ethos-u.git 

    Another repo worth mentioning is Vela, which takes a tflite file as input and produces an optimized tflite file as output. The optimized tflite file contains custom operators that are executed on the Arm Ethos-U.

    The Arm Ethos-U IP has only recently been released, so it will still take some time before you can buy an Arm Ethos-U capable platform to test on. There are currently no virtual platforms available for download on the Arm website that include the Arm Ethos-U, but hopefully that will change in the not too distant future.