EGL Pixbuffer is slow
Ahmed Tolba
over 11 years ago
Note: This was originally posted on 25th February 2013 at
http://forums.arm.com
Hi All,
I have a 1k x 1k RGB texture that is rendered using a shader, and I want to copy the pixels to a buffer so that I can use OpenCV with it. I tried glReadPixels and it's very slow. I tried an EGL Pixbuffer and it has the same performance: it's very slow, 7 FPS. I'm using a Mali-400 on an Exynos 4412.
Here is the code
Please view it in pastebin, there is a problem with code posting here
http://pastebin.com/TwrtF0EG
Chris Varnsverry
over 11 years ago
Note: This was originally posted on 4th March 2013 at
http://forums.arm.com
Hi Ahmed,
Just checking I understand: it sounds like you are rendering something using the GPU, and then attempting to perform some CV on it. This sounds like a fairly unusual use case. Can you give us some more detail on what you're trying to achieve, so we can better advise?
Thanks,
Chris
Ahmed Tolba
over 11 years ago
Note: This was originally posted on 5th March 2013 at
http://forums.arm.com
Thanks for your reply.
I'm trying to do some GPU Processing like thresholding an image, then get the result from the GPU to the CPU and do some OpenCV operations.
Transferring the data from the GPU to the CPU using glReadPixels or an EGL pixel buffer is really slow.
Chris Varnsverry
over 11 years ago
Note: This was originally posted on 5th March 2013 at
http://forums.arm.com
Hi Ahmed,
Unfortunately, the slowdowns you are experiencing are to be expected with your usage, as you are "breaking" the pipeline model that Mali GPUs implement. As a deferred renderer, frames are ideally not submitted to the GPU until a call to eglSwapBuffers is made. Therefore, in a normal use case such as a game, the CPU will be working on frame N whilst the GPU is processing frame N-1. In your use case, you are attempting to synchronously read back pixels from frame N to the CPU side, which implies that the frame up to that point must be submitted to the GPU, be processed, and then be read back. The CPU therefore has to wait for the GPU, after which the GPU has a huge pipeline bubble whilst it waits for more work from the CPU. It's easy to see why this is sub-optimal on deferred renderers.
ReadPixels is a synchronous call, as you want to read back the state of rendering for the frame you are currently working on (which normally wouldn't have been submitted yet). CopyTexImage isn't necessarily synchronous, but in your case it is, as you're copying to a Pbuffer that is accessible from the CPU side. I'm asking around for an asynchronous method that would work for your use case, but my advice for now is that synchronous calls of any kind are bad for deferred renderers and come with a slowdown. This is described in the "Mali GPU Application Optimization Guide", available from Mali Developer Guides, which is well worth a read.
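To put rough numbers on the stall described above, here is a back-of-the-envelope model in C. The millisecond figures are hypothetical, not measurements from the Exynos 4412, and the function names are made up for illustration. The model ignores driver overhead and only shows why running CPU work, GPU work, and the read-back one after another costs more than letting them overlap.

```c
#include <stdio.h>

/* Simplified model: with a synchronous read-back, the CPU work, the GPU
 * work, and the copy all run back to back within every frame. */
double sync_frame_ms(double cpu_ms, double gpu_ms, double readback_ms)
{
    return cpu_ms + gpu_ms + readback_ms;
}

/* Simplified model: when the CPU prepares frame N while the GPU renders
 * frame N-1 (and any read-back overlaps on another thread), the slower
 * of the two stages sets the frame time. */
double pipelined_frame_ms(double cpu_ms, double gpu_ms)
{
    return cpu_ms > gpu_ms ? cpu_ms : gpu_ms;
}

/* Converts a frame time in milliseconds to frames per second. */
double fps(double frame_ms)
{
    return 1000.0 / frame_ms;
}

/* Prints both estimates for one set of hypothetical timings. */
void print_estimates(void)
{
    const double cpu_ms = 10.0, gpu_ms = 20.0, readback_ms = 10.0;

    printf("synchronous: %.1f FPS\n",
           fps(sync_frame_ms(cpu_ms, gpu_ms, readback_ms)));
    printf("pipelined:   %.1f FPS\n",
           fps(pipelined_frame_ms(cpu_ms, gpu_ms)));
}
```

With these made-up timings, print_estimates() reports 25.0 FPS for the synchronous case and 50.0 FPS for the pipelined one.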
Hope this helps,
Chris
Chris Varnsverry
over 11 years ago
Note: This was originally posted on 5th March 2013 at
http://forums.arm.com
Hi Ahmed,
It may be possible to render to a pixmap in one thread, and have another thread wait on an EGL fence for the render operation to complete, at which time it can grab the data and pass it along to the CV processing. By reading back from pixmaps you are not causing a flush, and by waiting on the fence on another thread you are not blocking your render thread. This removes the synchronous read-back, so your throughput and FPS should increase. Details on the fence mechanism can be found here: http://www.khronos.org/registry/gles/extensions/OES/EGL_KHR_fence_sync.txt. An optimal implementation would use a ring buffer of pixmaps, so that you do not have the continuous overhead of creating and destroying pixmaps every frame. This is a general producer-consumer optimization.
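The ring of pixmaps could be tracked with a small state machine like the one below. This is only a sketch in plain C with no EGL calls. The names (ReadbackRing, ring_begin_render, and so on) and the depth of three slots are hypothetical. In a real application each slot would own a pixmap surface and a fence from EGL_KHR_fence_sync, ring_mark_ready would be called once that fence has signalled, and access from two threads would need a mutex around these calls.

```c
#include <stddef.h>

#define RING_SLOTS 3  /* hypothetical depth; tune for the application */

typedef enum { SLOT_FREE, SLOT_RENDERING, SLOT_READY } SlotState;

typedef struct {
    SlotState state[RING_SLOTS];
    size_t next_render;  /* next slot the render thread will try to use */
    size_t next_read;    /* oldest slot the reader has not consumed yet */
} ReadbackRing;

void ring_init(ReadbackRing *r)
{
    for (size_t i = 0; i < RING_SLOTS; ++i)
        r->state[i] = SLOT_FREE;
    r->next_render = 0;
    r->next_read = 0;
}

/* Render side: claim the next slot, or return -1 if the reader has
 * fallen behind and that slot is still in use. */
int ring_begin_render(ReadbackRing *r)
{
    size_t i = r->next_render;

    if (r->state[i] != SLOT_FREE)
        return -1;
    r->state[i] = SLOT_RENDERING;
    r->next_render = (i + 1) % RING_SLOTS;
    return (int)i;
}

/* Called once the fence for this slot has signalled. */
void ring_mark_ready(ReadbackRing *r, int slot)
{
    if (slot >= 0 && slot < RING_SLOTS && r->state[slot] == SLOT_RENDERING)
        r->state[slot] = SLOT_READY;
}

/* Reader side: return the oldest finished slot, or -1 if it is not
 * ready yet. Slots are consumed in the order they were rendered. */
int ring_begin_read(const ReadbackRing *r)
{
    size_t i = r->next_read;

    return r->state[i] == SLOT_READY ? (int)i : -1;
}

/* Reader side: hand the slot back for reuse after copying its pixels. */
void ring_end_read(ReadbackRing *r, int slot)
{
    if (slot >= 0 && (size_t)slot == r->next_read &&
        r->state[slot] == SLOT_READY) {
        r->state[slot] = SLOT_FREE;
        r->next_read = (r->next_read + 1) % RING_SLOTS;
    }
}
```

The render thread never creates or destroys surfaces in the steady state; it only rotates through the slots, and it skips the read-back for a frame when ring_begin_render returns -1.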
Hope this helps,
Chris
Chris Varnsverry
over 11 years ago
Note: This was originally posted on 5th March 2013 at
http://forums.arm.com
Hi Ahmed,
I should have asked initially, are you using Linux or Android?
Thanks,
Chris
Ahmed Tolba
over 11 years ago
Note: This was originally posted on 5th March 2013 at
http://forums.arm.com
Thanks for your reply.
I'm using Linux.
Chris Varnsverry
over 11 years ago
Note: This was originally posted on 5th March 2013 at
http://forums.arm.com
Hi Ahmed,
In that case, what I described above should work. Android would have made things slightly more complicated.
Thanks,
Chris
Ahmed Tolba
over 11 years ago
Note: This was originally posted on 6th March 2013 at
http://forums.arm.com
Do you mean a pixel buffer object or a pixmap?
Could you please show some pseudocode?
Chris Varnsverry
over 11 years ago
Note: This was originally posted on 12th July 2013 at
http://forums.arm.com
Hi Ahmed,
On GLES3, this can be done with PBOs bound to the GL_PIXEL_PACK_BUFFER target, which causes glReadPixels to write to that PBO instead of returning pixel data to the application. This avoids the pipeline stall and flush. The fence is used to signal to the application when this operation has completed and the buffer can be mapped to retrieve the pixel data. There is some sample code in the works, but unfortunately I can't give a date for when it will be available. For GLES2, I think the only option for now is pixmaps, but the fence is still supported.
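Until that sample code is available, the GLES3 sequence described above might look roughly like the sketch below. It is not the official sample. The function names start_readback and finish_readback are hypothetical, and the code assumes a current OpenGL ES 3.0 context, an RGBA8 framebuffer, and a PBO already created with glGenBuffers. Note that the Mali-400 supports OpenGL ES 2.0 only, so this applies to ES 3.0 capable GPUs.

```c
#include <GLES3/gl3.h>
#include <stddef.h>
#include <string.h>

/* Step 1 (render thread, after drawing): queue an asynchronous read-back
 * into the PBO. Returns a fence that signals once the copy has finished. */
GLsync start_readback(GLuint pbo, GLsizei width, GLsizei height)
{
    const GLsizeiptr size = (GLsizeiptr)width * height * 4;  /* RGBA8 */

    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, size, NULL, GL_STREAM_READ);

    /* With a pack buffer bound, the last argument is a byte offset into
     * the PBO rather than a pointer to client memory. */
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, (void *)0);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

    return glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
}

/* Step 2 (later, for example a frame or two afterwards): wait up to
 * timeout_ns for the fence, then map the PBO and copy the pixels to dst,
 * which must hold width * height * 4 bytes. Returns 1 on success and 0
 * if the data is not ready yet or the buffer could not be mapped. */
int finish_readback(GLuint pbo, GLsync fence, GLsizei width, GLsizei height,
                    void *dst, GLuint64 timeout_ns)
{
    const GLsizeiptr size = (GLsizeiptr)width * height * 4;
    int ok = 0;

    GLenum status = glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                                     timeout_ns);
    if (status != GL_ALREADY_SIGNALED && status != GL_CONDITION_SATISFIED)
        return 0;  /* timed out or failed; the caller still owns the fence */
    glDeleteSync(fence);

    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    void *src = glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, size,
                                 GL_MAP_READ_BIT);
    if (src != NULL) {
        memcpy(dst, src, (size_t)size);
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
        ok = 1;
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    return ok;
}
```

Calling finish_readback with a timeout of 0 turns it into a poll, so the render thread can check once per frame and carry on if the pixels are not ready. Using two or more PBOs in rotation lets frame N be read back while frame N+1 is being rendered.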
Hope this helps,
Chris