Hi,
My application uses Android Java with JNI to process camera video in real time. The C part uses OpenCL and multicore threading on a MediaTek Dimensity 9200+ with a Mali G715.
Something strange happens with the application. After a few seconds of processing (70-80 frames), I go from 60-70 ms per frame to 140-160 ms per frame.
What I am doing:
1) I run some kernels to convert YUV and extract data.
2) Then I process the extracted data on the CPU with multicore threading (4 threads at a time), 7 times.
3) Then I use kernels again to extract data and send it back to Java.
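Step 2 above can be sketched roughly as follows. This is only an illustration, not the actual code from the application: `process_slice` and `run_cpu_stage` are hypothetical names, the per-element work is a placeholder, and threads are spawned on every pass for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-element work; stands in for the real CPU processing.
static void process_slice(std::vector<int>& data, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        data[i] += 1;
    }
}

// Runs `passes` passes over `data`. Each pass splits the data into
// contiguous slices, one per thread, and joins all threads before the
// next pass starts (assumed defaults: 7 passes, 4 threads).
void run_cpu_stage(std::vector<int>& data, int passes = 7, unsigned num_threads = 4) {
    assert(num_threads > 0);
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (int pass = 0; pass < passes; ++pass) {
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < num_threads; ++t) {
            const std::size_t begin = std::min(data.size(), t * chunk);
            const std::size_t end = std::min(data.size(), begin + chunk);
            if (begin == end) continue;  // nothing left for this thread
            workers.emplace_back(process_slice, std::ref(data), begin, end);
        }
        for (auto& w : workers) w.join();
    }
}
```

Calling `run_cpu_stage(buffer)` once per frame on the data extracted by the first GPU step would correspond to the 7 passes of 4 threads described above.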
If I remove all the CPU work, the GPU time stays fairly stable: 20 ms at the beginning (the first 70-80 frames), then 40-45 ms. But with the CPU work, the time increases dramatically after 70-80 frames with the same amount of input data. The same problem happened with my old Mali G72, after 20 frames.
I tried to use Streamline, but as I run Windows 7 I cannot use the analyzer, which only runs on Windows 10. I can still get the graphs, though, and I noticed some strange activity on the GPU:
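To pin down the exact frame at which the slowdown starts, and which stage causes it, the per-stage wall-clock times could be recorded and scanned along these lines. This is a minimal sketch: `time_ms`, `find_slowdown_frame`, the 10-frame baseline and the 1.5x factor are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Measures the wall-clock time of `work` in milliseconds.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Returns the index of the first frame whose time exceeds `factor` times
// the mean of the first `baseline_frames` frames, or -1 if there is none.
long find_slowdown_frame(const std::vector<double>& frame_ms,
                         std::size_t baseline_frames = 10,
                         double factor = 1.5) {
    assert(baseline_frames > 0);
    if (frame_ms.size() <= baseline_frames) return -1;
    double sum = 0.0;
    for (std::size_t i = 0; i < baseline_frames; ++i) sum += frame_ms[i];
    const double threshold = factor * (sum / static_cast<double>(baseline_frames));
    for (std::size_t i = baseline_frames; i < frame_ms.size(); ++i) {
        if (frame_ms[i] > threshold) return static_cast<long>(i);
    }
    return -1;
}
```

Keeping one vector of timings per stage (GPU in, CPU, GPU out) would show which stage jumps first. When timing a GPU stage this way, the timed callable has to wait for the command queue to finish (for example with `clFinish`), because enqueue calls return before the work is done.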
1) The Mali Memory Read Latency starts to turn red after 70-80 frames (it shows 25 mega beats?).
2) The Mali geometry culling rate starts to oscillate (it shows 100%).
3) The Mali geometry efficiency starts to oscillate (it shows 3 threads).
4) The Mali Early ZS rate is red.
Many other counters start to oscillate after 70-80 frames as well, but the Device Thermal State stays 100% green.
Streamline is too complicated for me. There are too many things to know and understand, and I do not have the time.
So I am wondering whether an expert could analyse it.
As I said, I had the same problem on my old Mali G78, so at some point something goes wrong when alternating GPU-CPU-GPU. In one of my posts about SVM, someone told me that using the GPU and CPU together was not a good idea. But I cannot do on the GPU what I am doing on the CPU, or at least I do not know how to. At some point I need to process data with the CPU.
Thanks for the help.
Hi again,
I did some more testing.
1) I removed the data transfer from GPU to CPU (enqueueMapBuffer), so the multicore processing handles zero data and takes 1 to 2 ms. I also removed the transfer from CPU to GPU (enqueueWriteBuffer, or a cl::Buffer created with CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR; I tested both ways of transferring), and I removed the final GPU-to-JNI buffer read for display (enqueueReadBuffer). But the processing time still doubles after a few frames.
2) I also tried removing all the JNI calls, so no more OpenCL and no more multicore threading. In this case the speed is stable and the Streamline counters no longer oscillate.
3) I also tried removing only the CPU processing while keeping the OpenCL part: the YUV transform and all the reads and writes to the CPU. In this case I noticed that the time doubles after 20 frames, and that the Mali Memory Read Latency starts to get very red after 7 seconds, as if the CPU were processing data. So it goes from 7 ms for the first 20 frames to 20 ms afterwards. And if I also remove all the GPU/CPU transfers, there is still a little bit of red, and the frame time goes from 0 ms to 4 ms after 7 seconds.
So it looks like something goes wrong when transferring data from GPU to GPU and from GPU to CPU. And of course the CPU processing time increases with the amount of data processed. The best indicator seems to be the Mali Memory Read Latency, but it may be something else. I am not good enough to help more.