Real-time YOLOv8 inference with the Arm Compute Library (ACL): how do I feed frames?

Hello, I am trying to run YOLOv8 inference on Arm devices using the Arm Compute Library (ACL). So far I can pass in a single image and get correct detections. My goal is real-time inference, but I am unsure how to feed a stream of frames into my implementation. From the ACL documentation and examples I have reviewed, the library seems to accept input only from files in npy, jpg, and ppm formats. I am new to this and would greatly appreciate any guidance on how to proceed.

Thank you in advance!
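
One possible direction, offered as a rough sketch rather than a confirmed ACL recipe: instead of going through a file, each captured frame could be converted in memory and written straight into the input tensor's buffer. The helper below is hypothetical (it is not part of ACL) and uses only the C++ standard library. It assumes frames arrive as interleaved 8-bit BGR (height x width x 3) and that the model expects planar float32 RGB scaled to [0, 1]. Both are assumptions to check against the actual capture source and the exported model. Resizing or letterboxing to the network input size is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an ACL function.
// Converts one interleaved 8-bit BGR frame (HWC layout) into planar
// float32 RGB (CHW layout) with values scaled to [0, 1].
// Assumptions: src holds 3 * width * height bytes with no row padding,
// and dst has room for 3 * width * height floats, tightly packed.
void bgr_hwc_to_rgb_chw(const std::uint8_t* src,
                        std::size_t width,
                        std::size_t height,
                        float* dst)
{
    const std::size_t plane = width * height; // elements per channel plane
    for (std::size_t i = 0; i < plane; ++i) {
        const std::uint8_t* px = src + 3 * i;  // px[0]=B, px[1]=G, px[2]=R
        dst[0 * plane + i] = px[2] / 255.0f;   // R plane
        dst[1 * plane + i] = px[1] / 255.0f;   // G plane
        dst[2 * plane + i] = px[0] / 255.0f;   // B plane
    }
}
```

In a capture loop this would be called once per frame, with `dst` pointing at the memory behind the input tensor (for instance the pointer returned by the tensor's `buffer()` accessor after allocation), followed by running the network. The sketch assumes that memory is tightly packed; if the tensor has padding or non-trivial strides, the writes would need to respect them.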