I am currently running inference workloads on a HiKey970. I am trying to split the layers of a network between the CPU and GPU, and run the workload that way to reduce inference latency. I am following the repo linked below to run models with both CPU and GPU utilization.
Could you help me understand how I can split the layers of the network and assign them individually to the CPU and GPU?
Is there an API in ARM-CL for targeting the CPU and GPU specifically?
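In case it helps, this is roughly what I have been trying with ARM-CL's graph frontend, based on my reading of the examples: my understanding (which may be wrong) is that streaming a `Target` hint mid-graph makes the subsequent layers run on that backend. The shapes, the mid-stream `Target::CL` switch, and the `nullptr` accessors below are my own guesses, not something I have verified:

```cpp
#include "arm_compute/graph.h"

using namespace arm_compute;
using namespace arm_compute::graph::frontend;

int main()
{
    Stream graph(0, "cpu_gpu_split");

    graph << Target::NEON  // hint: run the layers below on the CPU
          << InputLayer(TensorDescriptor(TensorShape(224U, 224U, 3U, 1U), DataType::F32),
                        nullptr /* placeholder: needs a real input accessor */)
          << ConvolutionLayer(3U, 3U, 16U,
                              nullptr /* weights accessor */, nullptr /* biases accessor */,
                              PadStrideInfo(1, 1, 1, 1))
          << Target::CL    // my guess: switching the hint here moves later layers to the GPU
          << ConvolutionLayer(3U, 3U, 32U,
                              nullptr /* weights accessor */, nullptr /* biases accessor */,
                              PadStrideInfo(1, 1, 1, 1))
          << PoolingLayer(PoolingLayerInfo(PoolingType::MAX))
          << OutputLayer(nullptr /* placeholder: needs a real output accessor */);

    GraphConfig config;
    config.num_threads = 4;               // CPU threads for the NEON portion
    graph.finalize(Target::NEON, config); // default target for nodes without a hint
    graph.run();
    return 0;
}
```

Is this per-layer `Target` hint the intended mechanism, or does ARM-CL expect the split to be done some other way (e.g. building separate NEON and CL graphs and copying tensors between them)?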