An NVIDIA GPU can execute two kernels concurrently when they are launched on separate streams. If I want to implement similar functionality on a Mali GPU, what should I do?