Bad interaction with DRI2 for vsync

Hi,

We are working with Mali-400 driver r3p2-01rel0 on Exynos4412, under Linux/X11.

base: BUILD=RELEASE ARCH=arch_011_udd PLATFORM=default_7a TRACE=0 THREAD= GEOM= CORES=MALI400 USING_MALI400=1 TARGET_CORE_REVISION=0x0101 TOPLEVEL_REPO_URL=Linux-r3p2-01rel0 REVISION=Linux-r3p2-01rel0 CHANGED_REVISION=Linux-r3p2-01rel0 REPO_URL=Linux-r3p2-01rel0 BUILD_DATE=Fri Jan 11 14:58:31 UTC 2013 CHANGE_DATE=Linux-r3p2-01rel0 TARGET_TOOLCHAIN=gcc HOST_TOOLCHAIN=gcc TARGET_TOOLCHAIN_VERSION=gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)  HOST_TOOLCHAIN_VERSION=gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)  TARGET_SYSTEM=gcc-arm-linux-gnueabihf HOST_SYSTEM=gcc-arm-linux-gnueabihf CPPFLAGS= CUSTOMER=internal VARIANT=mali400-r3p2-gles11-gles20-linux-ump-x11 HOSTLIB=direct INSTRUMENTED=FALSE USING_MRI=FALSE MALI_TEST_API= UDD_OS=linux

The Mali README explains that Mali must be integrated with the display controller driver of the host system. We're trying to do just that. In this case, the display driver is exynos-drm, which uses DRI2. We require this over fbdev for the ability to change resolutions dynamically (via KMS), for perfect vblank synchronization, and to reduce the amount of CPU copying in order to get GPU rendering results on the screen.

I am starting with the xf86-video-armsoc driver (which is authored by ARM) and I integrate it with Mali as follows: for each new GEM buffer created, I obtain a UMP secure ID for that memory and store it in the DRI2 buffer name for that allocation. This should be all that is needed, but unfortunately Mali does not seem to adhere to basic DRI2 conventions, so it doesn't work. The 2 main problems are:

  1. The current half-drawn back buffer is often the one that ends up displayed on the screen.
  2. Right after scheduling a page flip, the "old" front buffer (which is still displayed on-screen until the next vblank) seems to be modified by Mali even before the driver has posted a DRI2BufferSwapComplete event (I confirmed this by checksumming the buffers at different points in the pipeline).

For the first problem, a double-buffered DRI2 rendering client should always call DRI2GetBuffers in order to get the back buffer before starting to draw. While I can see that the command sequence is often GetBuffers SwapBuffers GetBuffers SwapBuffers... I also often see cases where it does GetBuffers SwapBuffers SwapBuffers SwapBuffers... This confuses the buffer reuse logic in the DRI2 implementation in the X server and results in the client and server disagreeing about which buffers are front and back at a given time.
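To make the failure mode concrete, here is a minimal toy model in C. All types and function names are hypothetical and this is not the DRI2 API; it only assumes that a page flip exchanges the front and back buffer names on the server side, and that the client learns the new back buffer only through GetBuffers.

```c
#include <stdint.h>

/* Toy model only: these types and functions are hypothetical,
 * they are not the DRI2 protocol or the X server's structures. */
struct toy_server {
    uint32_t front_name;  /* buffer currently scanned out */
    uint32_t back_name;   /* buffer the client is expected to draw into */
};

struct toy_client {
    uint32_t cached_back; /* back buffer name the client last learned */
};

/* Models GetBuffers: the client refreshes its idea of the back buffer. */
static void toy_get_buffers(const struct toy_server *s, struct toy_client *c)
{
    c->cached_back = s->back_name;
}

/* Models SwapBuffers done as a flip: the server exchanges front and back. */
static void toy_swap_buffers(struct toy_server *s)
{
    uint32_t tmp = s->front_name;
    s->front_name = s->back_name;
    s->back_name = tmp;
}

/* Returns 1 if the client's next frame would land in the on-screen buffer. */
static int toy_client_draws_to_front(const struct toy_server *s,
                                     const struct toy_client *c)
{
    return c->cached_back == s->front_name;
}
```

In this model, a client that swaps and then draws again without calling toy_get_buffers() is still holding the name of the buffer that has just become the front buffer, which would match the half-drawn frames showing up on screen.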

For the second issue, I confirmed the problem by checksumming the buffers at different points. The old front buffer is reused as soon as the X driver's ScheduleSwap() function returns, which does not indicate that the swap has completed. The buffer is still on the screen for a while longer, but Mali draws to it right away, resulting in a nasty visual glitch that is corrected a moment later when the swap completes.
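For anyone who wants to repeat the check, the kind of checksumming described above could look roughly like this. It is only a sketch: FNV-1a is my choice here for brevity (any stable checksum works), the helper names are hypothetical, and it assumes the buffer is already mapped into the driver's address space.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a mapped buffer. Any stable checksum would do;
 * FNV-1a is used here only because it is short. */
static uint32_t buffer_checksum(const uint8_t *data, size_t len)
{
    uint32_t hash = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 16777619u;            /* FNV prime */
    }
    return hash;
}

/* Hypothetical helper: returns 1 if the buffer contents no longer match
 * the checksum `before` that was recorded earlier. */
static int buffer_modified_since(uint32_t before,
                                 const uint8_t *data, size_t len)
{
    return buffer_checksum(data, len) != before;
}
```

The idea is to record the checksum of the old front buffer when ScheduleSwap() returns and compare it again just before the swap-complete event is posted. A mismatch means something wrote to a buffer that was still being scanned out.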

Mali seems to have a fundamental misunderstanding of SwapBuffers. I already saw in "Re: Mali deadlock with X server grab" that Mali appears to create a dedicated thread in order to call SwapBuffers, which seems bizarre as SwapBuffers is an asynchronous operation (with completion later notified by the BufferSwapComplete event). But from this and the behaviour observed above, I guess the Mali developers have misunderstood it and implemented it as purely synchronous, i.e. as a blocking function after whose return the old front buffer is available for immediate reuse. I also see this comment in xf86-video-mali:

/*
 * MaliDRI2ScheduleSwap is the implementation of DRI2SwapBuffers, this function
 * should wait for vblank event which will trigger registered event handler.
 * Event handler will do FLIP/SWAP/BLIT according to event type.
 *
 * Current DRM doesn't support vblank well, so this function just do FLIP/
 * SWAP/BLIT directly, according to drawable information.
 */

Perhaps I could help here in regard to the comment "Current DRM doesn't support vblank well" - it certainly has not been a problem for other drivers, I'm sure we could find a solution for Mali too.

Making ScheduleSwap purely synchronous (as suggested in the xf86-video-mali implementation) kills performance, as it causes the X server to block without processing any requests until a vblank occurs. Also on the calling side, I observed in "Mali deadlock with X server grab" that the rendering client also blocks waiting for the response, while holding a global Mali lock.
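For contrast, the asynchronous behaviour I would expect can be sketched as a tiny state machine. Again this is a toy with hypothetical names, not the X server's DRI2 structures or callbacks: scheduling the swap returns immediately, and the old front buffer only becomes reusable once the vblank handler has run and the completion event has gone out.

```c
/* Hypothetical toy types; not the X server's DRI2 structures. */
enum toy_buf_state { TOY_IDLE, TOY_ON_SCREEN, TOY_FLIP_PENDING };

struct toy_swap {
    enum toy_buf_state old_front;  /* buffer shown before the flip */
    enum toy_buf_state new_front;  /* buffer queued to be shown */
    int complete_event_sent;       /* models the swap-complete event */
};

/* Models an asynchronous ScheduleSwap: queue the flip and return at once,
 * without waiting for vblank. */
static void toy_schedule_swap(struct toy_swap *sw)
{
    sw->old_front = TOY_ON_SCREEN;     /* still scanned out until vblank */
    sw->new_front = TOY_FLIP_PENDING;
    sw->complete_event_sent = 0;
}

/* Models the vblank handler: the flip has happened, so the old front
 * buffer is released and the completion event is sent to the client. */
static void toy_vblank_handler(struct toy_swap *sw)
{
    sw->new_front = TOY_ON_SCREEN;
    sw->old_front = TOY_IDLE;
    sw->complete_event_sent = 1;
}

/* The client may draw into the old front buffer only after completion. */
static int toy_can_render_to_old_front(const struct toy_swap *sw)
{
    return sw->complete_event_sent && sw->old_front == TOY_IDLE;
}
```

With this shape the server never blocks between toy_schedule_swap() and the vblank, and a client that waits for the completion event never touches a buffer that is still on screen.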

Could this be improved in future Mali-400 versions?

Has the situation changed at all in r4p0?

Thanks.

Daniel

Parents
  • Thanks! r4p0 does seem to fix the problem where the old front buffer was written to immediately after SwapBuffers returns; it now waits for the event. Great!

    I haven't had a chance to confirm if it fixes the other problem (where it sometimes called SwapBuffers without calling GetBuffers first) but I will check that soon. I am also worried that the dedicated SwapBuffers thread will continue to cause problems, but maybe we can regard it as working for today...

Children
  • Hi dsd,

    Thanks for the feedback, glad that it's solved some of your issues. Please let us know if you confirm any outstanding issues and I can push these back to the driver team. It sounds like they might be aware of them, but it helps to get them prioritised if we can show that people are asking for it.

    Thanks,

    Chris