glGenerateMipmap() very slow on Samsung Galaxy SII

Note: This was originally posted on 15th February 2012 at http://forums.arm.com

Hi,
I'm currently doing some GPGPU work on the Samsung Galaxy SII (Mali-400 MP). For that I need to generate a mipmap from a texture that has been rendered to via an FBO. Unfortunately, glGenerateMipmap() appears to be very slow on the device: it takes about 90 milliseconds to generate a mipmap for a 512x512 RGBA8888 texture. I tried the same code on other Android devices, where this function runs much faster (about 2 milliseconds), so this slowdown really puzzles me. Am I doing something wrong or missing something here? Can anyone provide example code for this case working on a Mali device?

Here are the relevant parts of my code:

glGenTextures(1, &texId);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, texId);

// Allocate graphics memory.
glTexImage2D(GL_TEXTURE_2D, 0, format, cols, rows, 0, format, type, NULL);
// Allocate memory for mipmap.
glGenerateMipmap(GL_TEXTURE_2D);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

// Create off-screen framebuffer object and attach the texture to it.
glGenFramebuffers(1, &fboId);
glBindFramebuffer(GL_FRAMEBUFFER, fboId);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texId, 0);

// Now render to that texture
...

// Generate MIP map.
glBindTexture(GL_TEXTURE_2D, texId);
glGenerateMipmap(GL_TEXTURE_2D);
  • Note: This was originally posted on 21st February 2012 at http://forums.arm.com

    Hi Bert,

    Thanks for the report. I've done some quick calculations and I agree it looks like something's not right.

    If we assume each mipmap level from 256x256 down to 1x1 must be generated, that's:

    256^2 + 128^2 + ... + 1^2 ~= 90,000 texels to generate

    If we assume a CPU-based implementation which reads 4 texels from the previous level, splits out the colour channels, sums them, averages them, recombines them and writes out the result, that might be roughly 55 cycles per generated texel.

    Assuming perfect memory, so no delays (it won't be, but...), and a 1.2GHz CPU clock, that'd give a very rough estimate of:

    90,000 * 55 / 1.2e9 ~= 4ms

    So I agree, 90ms sounds too high. That duration on a 1.2GHz CPU should allow in excess of 100M cycles of work.
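
For anyone who wants to check the arithmetic above, here is a minimal C sketch of the same back-of-envelope estimate. The function names are made up for illustration, and the 55 cycles per texel and 1.2GHz clock are simply the assumptions from the post, not measured values.

```c
#include <stdint.h>

/* Texels that must be generated for a square base x base texture:
 * every level below the base level, from base/2 down to 1x1. */
uint64_t mip_texels_to_generate(uint32_t base)
{
    uint64_t total = 0;
    for (uint32_t size = base / 2; size >= 1; size /= 2)
        total += (uint64_t)size * size;
    return total;
}

/* Estimated time in milliseconds, assuming a fixed cycle cost per
 * generated texel and no memory stalls (both are assumptions). */
double estimate_ms(uint64_t texels, double cycles_per_texel, double clock_hz)
{
    return (double)texels * cycles_per_texel / clock_hz * 1e3;
}

/* Number of CPU cycles available in a given duration. */
double cycles_in(double ms, double clock_hz)
{
    return ms * 1e-3 * clock_hz;
}
```

For a 512x512 base level, mip_texels_to_generate(512) returns 87,381, which the post rounds to 90,000. With 55 cycles per texel at 1.2e9 Hz, estimate_ms gives about 4.0 ms, and cycles_in(90.0, 1.2e9) gives 108 million cycles, matching the "in excess of 100M" figure.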

    Out of interest, how are you timing the operation? Are you trying to do this operation every frame? Are you trying to render the newly mipmapped texture later in the same frame?
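
The timing question matters because GL commands may be queued and executed later, so a timer wrapped around a single call can end up including earlier pending work (such as the render-to-texture pass) or missing work that has not finished yet. Below is a minimal sketch of a wall-clock timing helper in C. The names elapsed_ms, work_fn and busy_loop are hypothetical, and the GL usage is shown only as a comment because it needs a current context.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Callback type for the work to be timed. */
typedef void (*work_fn)(void *user);

/* Runs fn(user) once and returns the elapsed wall-clock time in ms. */
double elapsed_ms(work_fn fn, void *user)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn(user);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec) * 1e3
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
}

/* Stand-in workload so the helper can be tried without a GL context. */
void busy_loop(void *user)
{
    volatile unsigned long sink = 0;
    unsigned long n = *(unsigned long *)user;
    for (unsigned long i = 0; i < n; ++i)
        sink += i;
}

/* Hypothetical usage in the GL case (requires a current context):
 *
 *   static void gen_mipmap(void *user) {
 *       glBindTexture(GL_TEXTURE_2D, *(GLuint *)user);
 *       glGenerateMipmap(GL_TEXTURE_2D);
 *       glFinish();   // block until the queued GL work has completed
 *   }
 *
 *   glFinish();       // drain the earlier render pass before timing
 *   double ms = elapsed_ms(gen_mipmap, &texId);
 */
```

Calling glFinish() before starting the timer and again inside the timed function is one way to separate the cost of mipmap generation from the cost of the preceding rendering. It stalls the pipeline, so it is only suitable for measurement, not for production code.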

    Do you happen to know which GL_GENERATE_MIPMAP_HINT value the other devices use when generating the mipmaps, i.e. GL_FASTEST, GL_NICEST or GL_DONT_CARE? Have you set this hint in your application?
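
For reference, the hint mentioned above is set with glHint() and can be read back with glGetIntegerv(). The fragment below is only a sketch: it assumes a current OpenGL ES 2.0 context, the function name generate_mipmap_fast is hypothetical, and texId is the texture from the original post. The hint is advisory, so a driver is free to ignore it, and whether it changes the timing on any given device is not something this thread establishes.

```c
#include <GLES2/gl2.h>

/* Assumes a current OpenGL ES 2.0 context. */
void generate_mipmap_fast(GLuint texId)
{
    /* Ask the driver to favour speed over quality when generating
     * mipmaps. The default value is GL_DONT_CARE. */
    glHint(GL_GENERATE_MIPMAP_HINT, GL_FASTEST);

    /* Read the current hint back, e.g. to log it when comparing devices. */
    GLint hint = 0;
    glGetIntegerv(GL_GENERATE_MIPMAP_HINT, &hint);
    (void)hint;

    glBindTexture(GL_TEXTURE_2D, texId);
    glGenerateMipmap(GL_TEXTURE_2D);
}
```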

    Cheers, Pete