Executing inference workloads on the HiKey 970 with layer splitting


I am currently working on executing inference workloads on the HiKey 970. I am trying to split the layers of a network between the CPU and GPU, and run the partitioned workload so as to reduce inference latency. I am following the repo linked below to run the models with both CPU and GPU utilization.


Could you help me understand how I can split the layers of a network and assign some of them to the CPU and the rest to the GPU?

Also, does ARM-CL provide separate APIs for targeting the CPU and the GPU?
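For context, here is the kind of per-layer assignment I am hoping to achieve. This is an untested sketch based on my reading of the graph frontend examples shipped with ARM-CL; my assumption (from `arm_compute/graph/frontend/StreamOperators.h`) is that streaming a `Target` value into the graph changes the target hint for the layers added after it, and `DummyAccessor` is just a placeholder from `utils/GraphUtils.h` instead of real weight/input accessors:

```cpp
// Sketch: split a tiny two-convolution network between CPU (NEON) and GPU (CL)
// using the ARM-CL graph frontend. Assumption: the operator<<(IStream&, Target)
// overload updates the target hint for subsequent layers.
#include "arm_compute/graph.h"
#include "utils/GraphUtils.h"

#include <memory>

using namespace arm_compute;
using namespace arm_compute::graph::frontend;
using namespace arm_compute::graph_utils;

int main()
{
    Stream graph(0, "cpu_gpu_split");

    graph << Target::NEON // CPU backend for the layers that follow
          << InputLayer(graph::TensorDescriptor(TensorShape(224U, 224U, 3U), DataType::F32),
                        std::make_unique<DummyAccessor>())
          << ConvolutionLayer(3U, 3U, 16U,
                              std::make_unique<DummyAccessor>(), // weights (placeholder)
                              std::make_unique<DummyAccessor>(), // biases (placeholder)
                              PadStrideInfo(1, 1, 1, 1))
          << Target::CL // switch the hint: the remaining layers go to the GPU
          << ConvolutionLayer(3U, 3U, 32U,
                              std::make_unique<DummyAccessor>(),
                              std::make_unique<DummyAccessor>(),
                              PadStrideInfo(1, 1, 1, 1))
          << OutputLayer(std::make_unique<DummyAccessor>());

    graph::GraphConfig config;
    graph.finalize(Target::CL, config); // overall target; per-layer hints set above
    graph.run();
    return 0;
}
```

Is this roughly the intended way to do it, or is there a better mechanism (e.g. running separate NEON and CL sub-graphs and copying tensors between them)?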