
Screen dark when working with FBOs

Note: This was originally posted on 7th November 2010 at http://forums.arm.com

Hi Guys,

While porting an already well-running application from a Fedora PC to the Mali-400 on an ST7108 chip, using OpenGL ES 2.0, I ran into a problem working with framebuffer objects (FBOs).
Everything looks OK: debugging is fine, the framebuffer object reports complete, there are no errors, but the screen remains completely dark.
My Mali/ST7108 environment itself is fine; other applications run nicely.

I suspect I have not completely understood (or rather: not understood at all) how to set up the display and surfaces while working with FBOs, although the problem could also lie somewhere else.

The basic setup is more or less as below; I have included only the settings and calls relevant to the display and surfaces.

It does seem that eglSwapBuffers doesn't work at all while working with FBOs
(even a single-colour screen cannot be seen...).
In contrast, when not working with FBOs, everything seems to be OK.

After binding with glBindFramebuffer(GL_FRAMEBUFFER, 0), everything works and can be displayed.

In contrast, after glBindFramebuffer(GL_FRAMEBUFFER, framebuffer), I cannot see anything after the eglSwapBuffers, even though debugging seems to be OK.

Could someone please give me suggestions on how to properly set up the display and surfaces?
Is there a running example that uses FBOs on the ST7108 chip?
Thanks a lot in advance.
(Below I have put a rough skeleton of the code in use.)

"
EGLDisplay eglDisplay = 0;
EGLSurface eglSurface = 0;
EGLint major, minor, num_configs;

..........
eglDisplay = eglGetDisplay(EGL_DEFAULT_DISPLAY);

eglInitialize(eglDisplay, &major, &minor);
eglBindAPI(EGL_OPENGL_ES_API);

eglChooseConfig(eglDisplay, configAttribs, &eglConfig, 1, &num_configs);

eglSurface = eglCreateWindowSurface(eglDisplay, eglConfig, NULL, NULL);
eglContext = eglCreateContext(eglDisplay, eglConfig, EGL_NO_CONTEXT, contextAttribs);
eglMakeCurrent(eglDisplay, eglSurface, eglSurface, eglContext);
...........
...........


egl_setup(&eglDisplay,&eglSurface);


GLuint framebuffer;
GLuint depthRenderbuffer;
GLuint texture;

.....

glGenFramebuffers(1, &framebuffer);
glGenRenderbuffers(1, &depthRenderbuffer);
glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, texWidth, texHeight, 0, GL_RGB, GL_UNSIGNED_SHORT_5_6_5, NULL);


glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);

glBindRenderbuffer(GL_RENDERBUFFER, depthRenderbuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, texWidth, texHeight);

glBindFramebuffer(GL_FRAMEBUFFER, framebuffer);

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texture, 0);

glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depthRenderbuffer);

switch (glCheckFramebufferStatus(GL_FRAMEBUFFER))
{
case GL_FRAMEBUFFER_COMPLETE:
fprintf(stdout, "\n o Framebuffer Complete");
break;
......
}


GLCheckError(glClearColor(1.0f, 0.0f, 0.0f, 1.0f));
.....
eglSwapBuffers(eglDisplay, eglSurface);


// glBindFramebuffer(GL_FRAMEBUFFER,0);
.........

eglSwapBuffers(eglDisplay, eglSurface);
""


Many Thanks
  • Note: This was originally posted on 22nd August 2011 at http://forums.arm.com


    Stacy - I'm fairly new to OpenGL and have been trying to learn exactly this kind of overall usage model for when to use FBOs and how they can improve either graphics or memory performance (or both).  My application will essentially be operating on textures - with a background texture surface filling the "window" and a number of smaller textures being "animated" on top of that - potentially with changes in alpha as well as of course position, rotation, size, etc.  This functionality represents only a small portion of the OpenGL|ES functionality, and yet I still haven't found a good resource to learn from on how to optimize this kind of drawing.


    Sounds like you're writing a sprite engine there.

    FBOs aren't really used much for that kind of thing, unless you're intending to render a 3D animated model to an FBO and then use that FBO's texture to draw it as a sprite.

    The usual uses for FBOs are reflections modulated by surface ripples, an alternate view mapped onto a surface (like a security monitor in a scene showing another location), or some kind of lighting/shadow effect (using depth values for framebuffer shadows, or rendering a smaller version of a scene with altered lighting to act as an interpolated additive glow).

    The overhead involved in using FBOs for such purposes is mostly the fragment shader cycles required to render the scene onto the FBO as well as the scene going to the main framebuffer. In terms of memory footprint it's pretty much the same as a texture.


    The kinds of questions I have range from some basics such as how to account for the number of texture units when optimizing (i.e. is it better to use only one?), to understanding when the underlying memory transfers between application and GPU occur (in terms of APIs or other evolutions) and what the performance implications of these are, etc.  I have also heard that some kinds of ops (e.g. DEPTH testing) will slow down texture processing, so I'd like to understand if there are things like this that I should stay away from - given more than one way to do what I need to do.  Any general advice for a newbie trying to get up to speed on these sorts of topics?


    None of these things has anything to do with FBOs, so I may see if I can have this bit of the thread broken out onto its own topic by the moderators. Until then, I can answer it here.

    The question on texture units has the most in-depth answer, so I'll leave that one until last.

    Regarding the underlying transfers: unless you plan on using the UMP (Universal Memory Provider) drivers, the point at which the big transfer happens is when you call glTexImage2D. The GL specification states that as soon as this call has completed, the CPU can drop its copy of the texture data. Depending on the memory mapping implementation of the GPU, in some cases it may be copied into the driver, but the application itself no longer needs to store the data. Textures can be loaded like this at startup time, and you don't need to transfer them on a per-frame basis.
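
    A minimal sketch of that pattern (loadImageRGB565() is a hypothetical loader, not a real API):

    /* Upload once at startup; free the CPU copy straight afterwards. */
    int   w, h;
    void *pixels = loadImageRGB565("sprite.raw", &w, &h); /* hypothetical */

    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, w, h, 0,
                 GL_RGB, GL_UNSIGNED_SHORT_5_6_5, pixels);

    /* Once glTexImage2D has returned, the GL owns the data; the
       application copy is never needed again, per frame or otherwise. */
    free(pixels);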

    Depth testing slowing down texture processing may have been a problem back in the days of software renderers, but on modern graphics hardware it has been optimised to the point where the cost is negligible compared to the benefits. For example, if you use depth testing to render all your animated objects first and then do the background pass at a farther depth value, it is measurably faster, because the GPU doesn't have to look up texels for the parts of the background covered by the animated sprites.
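
    A sketch of that draw order, assuming hypothetical drawSprites() and drawBackground() helpers that place their geometry at the given depth:

    glEnable(GL_DEPTH_TEST);   /* default GL_LESS depth function */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    drawSprites(0.0f);      /* near: opaque animated objects first       */
    drawBackground(0.9f);   /* far: fragments hidden by sprites fail the
                               depth test, so their texel lookups are
                               skipped entirely                          */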

    The alpha channel is the real performance drain if you're not careful with it. As described above, by rendering things from front to back you can reduce texture lookups and improve performance, but this won't work if you have semi-transparent sprites. A semi-transparent sprite in the foreground will still write to the depth buffer, and the background then can't be rendered behind it.
    If your sprites have only fully transparent and fully opaque pixels, you can still render front to back by having transparent fragments call discard in the fragment shader, which prevents them from writing to the depth buffer.
    If, however, you have edges antialiased towards transparency, or semi-transparent pixels, you'll have to render whatever is behind them first, so it can show through.
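
    A minimal sketch of such a fragment shader (GLSL ES embedded as a C string; the uniform and varying names are made up for the example):

    static const char *spriteFragSrc =
        "precision mediump float;                         \n"
        "uniform sampler2D u_sprite;                      \n"
        "varying vec2 v_uv;                               \n"
        "void main()                                      \n"
        "{                                                \n"
        "    vec4 c = texture2D(u_sprite, v_uv);          \n"
        "    if (c.a < 0.5)                               \n"
        "        discard;  /* no colour or depth write */ \n"
        "    gl_FragColor = c;                            \n"
        "}                                                \n";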

    The benefits of using single or multiple texture units are more complicated.
    When textures are uploaded to the Mali GPU by the driver, they are stored in a way which makes them cache nicely regardless of the angle you sample them at. As such, when you rotate a texture, the scanlines drawn by the fragment shader will still be able to use cached texture blocks most of the time, which is good for performance. If, on the other hand, you need to sample two completely different places for the same fragment — say, a colour map and a normal map packed into the same texture — the cache will miss, because it keeps swapping between distant places in the texture. In that case the maps are best arranged as separate textures on separate texture units.
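
    A sketch of the two-unit arrangement (program, colourTex and normalTex are hypothetical names):

    glUseProgram(program);

    /* Colour map on unit 0, normal map on unit 1. */
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, colourTex);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, normalTex);

    /* Point each sampler uniform at its texture unit. */
    glUniform1i(glGetUniformLocation(program, "u_colour"), 0);
    glUniform1i(glGetUniformLocation(program, "u_normal"), 1);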

    However, for the purposes you're describing, it sounds like you'd want to arrange your textures like a sprite sheet, usually referred to as a texture atlas. A texture atlas is efficient because every texture state change carries a small overhead: if you wanted to draw 20 different animated sprites over the top of your background, switching textures for every one would be time consuming. Additionally, using a separate draw call for every sprite takes its toll.
    If you batch your sprites together by arranging them onto one texture atlas, they can be rendered in a single draw call by generating the correct vertices every frame, as in the sketch below.
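
    A rough sketch of that batching, assuming a hypothetical fillSpriteQuad() that writes six vertices (two triangles) with atlas UVs and returns the count written; atlasTex, sprites, numSprites, a_pos and a_uv are likewise made-up names:

    typedef struct { GLfloat x, y, u, v; } Vertex;

    Vertex verts[MAX_SPRITES * 6];
    int    count = 0;

    glBindTexture(GL_TEXTURE_2D, atlasTex);   /* one texture for all sprites */

    for (int i = 0; i < numSprites; ++i)
        count += fillSpriteQuad(&verts[count], &sprites[i]);

    glEnableVertexAttribArray(a_pos);
    glEnableVertexAttribArray(a_uv);
    glVertexAttribPointer(a_pos, 2, GL_FLOAT, GL_FALSE, sizeof(Vertex), &verts[0].x);
    glVertexAttribPointer(a_uv,  2, GL_FLOAT, GL_FALSE, sizeof(Vertex), &verts[0].u);

    glDrawArrays(GL_TRIANGLES, 0, count);     /* all sprites in one draw call */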

    That's about as much detail as I can go into without diagrams. If this is something you'd be interested in a deeper description of, let me know and I'll see if I can get authorisation for sample code based on these techniques. Some of the points covered here are already touched upon in the other sample code on malideveloper.com.


    Thanks very much.


    You're welcome!

    -Stacy