Hi,
We are working with Mali-400 driver r3p2-01rel0 on Exynos4412, under Linux/X11.
base: BUILD=RELEASE ARCH=arch_011_udd PLATFORM=default_7a TRACE=0 THREAD= GEOM= CORES=MALI400 USING_MALI400=1 TARGET_CORE_REVISION=0x0101 TOPLEVEL_REPO_URL=Linux-r3p2-01rel0 REVISION=Linux-r3p2-01rel0 CHANGED_REVISION=Linux-r3p2-01rel0 REPO_URL=Linux-r3p2-01rel0 BUILD_DATE=Fri Jan 11 14:58:31 UTC 2013 CHANGE_DATE=Linux-r3p2-01rel0 TARGET_TOOLCHAIN=gcc HOST_TOOLCHAIN=gcc TARGET_TOOLCHAIN_VERSION=gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) HOST_TOOLCHAIN_VERSION=gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) TARGET_SYSTEM=gcc-arm-linux-gnueabihf HOST_SYSTEM=gcc-arm-linux-gnueabihf CPPFLAGS= CUSTOMER=internal VARIANT=mali400-r3p2-gles11-gles20-linux-ump-x11 HOSTLIB=direct INSTRUMENTED=FALSE USING_MRI=FALSE MALI_TEST_API= UDD_OS=linux
The Mali README explains that Mali must be integrated with the display controller driver of the host system. We're trying to do just that. In this case, the display driver is exynos-drm, which uses DRI2. We require this over fbdev for the ability to change resolutions dynamically (via KMS), for perfect vblank synchronization, and to reduce the amount of CPU copying in order to get GPU rendering results on the screen.
I started with the xf86-video-armsoc driver (which is authored by ARM) and integrated it with Mali as follows: for each new GEM buffer created, I obtain a UMP secure ID for that memory and store it in the DRI2 buffer name for that allocation. This should be all that is needed, but unfortunately Mali does not seem to adhere to the basic DRI2 standards, which means that this doesn't work. The 2 main problems are:
1. Mali does not always call DRI2GetBuffers before drawing a new frame.
2. Mali reuses the old front buffer before the swap has completed.
For the first problem, a double-buffered DRI2 rendering client should always call DRI2GetBuffers in order to get the back buffer before starting to draw. While I can see that the command sequence is often GetBuffers SwapBuffers GetBuffers SwapBuffers... I also often see cases where it does GetBuffers SwapBuffers SwapBuffers SwapBuffers... This confuses the buffer reuse logic in the DRI2 implementation in the X server and results in the client and server disagreeing about which buffers are front and back at a given time.
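To illustrate how skipping GetBuffers can leave client and server out of step, here is a minimal sketch in C. It is a toy model only, assuming a two-buffer chain and an exchange-style swap; `ServerState`, `get_back_buffer`, `swap_buffers` and `count_front_buffer_draws` are hypothetical names, not part of the DRI2 API or of the Mali driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a two-buffer swap chain; not the real DRI2 API. */
typedef struct {
    int front; /* buffer name currently on screen */
    int back;  /* buffer name the server expects the client to draw into */
} ServerState;

/* Models DRI2GetBuffers: the client learns the current back buffer name. */
static int get_back_buffer(const ServerState *s)
{
    return s->back;
}

/* Models an exchange-style swap: front and back trade places. */
static void swap_buffers(ServerState *s)
{
    int tmp = s->front;
    s->front = s->back;
    s->back = tmp;
}

/* Renders n frames and returns how many were drawn into the on-screen buffer. */
int count_front_buffer_draws(int n, bool refetch_each_frame)
{
    assert(n >= 0);
    ServerState server = { .front = 1, .back = 2 };
    int target = get_back_buffer(&server); /* initial GetBuffers */
    int bad = 0;

    for (int i = 0; i < n; i++) {
        if (refetch_each_frame)
            target = get_back_buffer(&server); /* GetBuffers before drawing */
        if (target == server.front)
            bad++; /* client is drawing into what the server shows */
        swap_buffers(&server);
    }
    return bad;
}
```

In this model, a client that refetches before every frame never touches the front buffer, while a client that keeps its first answer draws into the front buffer on every second frame.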
For the second issue, I confirmed the problem by checksumming the buffers at different points. The old front buffer is reused as soon as the X driver's ScheduleSwap() function returns, but that return does not indicate that the swap has completed. The buffer is still on the screen for a while longer. Mali nevertheless draws to it right away, resulting in a nasty visual glitch that is corrected moments later when the swap completes.
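The rule I would expect a client to follow can be sketched as a small state machine in C. This is an illustrative model under my own assumptions (two buffers, one swap in flight); `SwapChain`, `schedule_swap`, `swap_complete` and `safe_to_render` are hypothetical names and do not correspond to real DRI2 or Mali functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical buffer states for a two-buffer chain. */
typedef enum {
    BUF_IDLE,         /* not displayed, free for rendering */
    BUF_SCANOUT,      /* currently on screen */
    BUF_PENDING_FLIP  /* swap scheduled, not yet displayed */
} BufState;

typedef struct {
    BufState state[2];
} SwapChain;

void chain_init(SwapChain *c)
{
    c->state[0] = BUF_SCANOUT; /* buffer 0 starts as the front buffer */
    c->state[1] = BUF_IDLE;    /* buffer 1 starts as the back buffer */
}

/* Models ScheduleSwap returning: the swap is queued, nothing has flipped yet. */
void schedule_swap(SwapChain *c, int new_front)
{
    assert(new_front == 0 || new_front == 1);
    c->state[new_front] = BUF_PENDING_FLIP;
    /* The old front buffer stays BUF_SCANOUT: it is still on screen. */
}

/* Models the swap-complete notification arriving later. */
void swap_complete(SwapChain *c, int new_front)
{
    assert(new_front == 0 || new_front == 1);
    c->state[new_front] = BUF_SCANOUT;
    c->state[1 - new_front] = BUF_IDLE; /* old front is released only now */
}

/* A buffer may be drawn to only once it is neither displayed nor queued. */
bool safe_to_render(const SwapChain *c, int buf)
{
    assert(buf == 0 || buf == 1);
    return c->state[buf] == BUF_IDLE;
}
```

In this model the old front buffer becomes renderable only after `swap_complete`, not after `schedule_swap` returns, which is the distinction Mali appears to miss.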
Mali seems to have a fundamental misunderstanding of SwapBuffers. I already saw in the thread "Mali deadlock with X server grab" that Mali appears to create a dedicated thread in order to call SwapBuffers, which seems bizarre as SwapBuffers is an asynchronous operation (with completion later notified by the BufferSwapComplete event). But from this and the behaviour observed above, I guess the Mali developers have misunderstood it and implemented it as purely synchronous, i.e. as a blocking function after whose return the old front buffer is available for immediate reuse. I also see this comment in xf86-video-mali:
/*
* MaliDRI2ScheduleSwap is the implementation of DRI2SwapBuffers, this function
* should wait for vblank event which will trigger registered event handler.
* Event handler will do FLIP/SWAP/BLIT according to event type.
*
* Current DRM doesn't support vblank well, so this function just do FLIP/
* SWAP/BLIT directly, according to drawable information.
*/
Perhaps I could help here with regard to the comment "Current DRM doesn't support vblank well". It certainly has not been a problem for other drivers, so I'm sure we could find a solution for Mali too.
Making ScheduleSwap purely synchronous (as the xf86-video-mali implementation suggests) kills performance, as it causes the X server to block without processing any requests until a vblank occurs. On the calling side, I also observed in the "Mali deadlock with X server grab" thread that the rendering client blocks waiting for the response while holding a global Mali lock.
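For comparison, the asynchronous shape I have in mind looks roughly like the following C sketch. It is a simplified simulation, not driver code: `ServerModel`, `schedule_swap_async`, `handle_vblank` and `serve_request` are hypothetical names, and the "next vblank" target is an assumption made for illustration. In a real driver the deferred work would be driven by a vblank or page-flip event and would end by sending the BufferSwapComplete event.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 8

/* Hypothetical record of a swap that has been queued but not yet performed. */
typedef struct {
    int drawable;
    int target_vblank;
} PendingSwap;

typedef struct {
    PendingSwap pending[MAX_PENDING];
    int n_pending;
    int completed;       /* swap-complete notifications sent so far */
    int requests_served; /* other client requests handled so far */
} ServerModel;

/* Queues the swap for the next vblank and returns at once, without blocking. */
bool schedule_swap_async(ServerModel *s, int drawable, int current_vblank)
{
    assert(s->n_pending >= 0 && s->n_pending <= MAX_PENDING);
    if (s->n_pending == MAX_PENDING)
        return false; /* queue full; caller must retry or fall back */
    s->pending[s->n_pending++] = (PendingSwap){ drawable, current_vblank + 1 };
    return true;
}

/* Vblank event handler: completes every swap whose target has been reached. */
void handle_vblank(ServerModel *s, int vblank)
{
    int kept = 0;
    for (int i = 0; i < s->n_pending; i++) {
        if (s->pending[i].target_vblank <= vblank)
            s->completed++; /* flip would happen here, then notify the client */
        else
            s->pending[kept++] = s->pending[i];
    }
    s->n_pending = kept;
}

/* Stand-in for the server handling any unrelated request. */
void serve_request(ServerModel *s)
{
    s->requests_served++;
}
```

The point of the design is that the server keeps serving other requests between `schedule_swap_async` and the vblank that completes the swap, instead of stalling inside ScheduleSwap.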
Could this be improved in future Mali-400 versions?
Has the situation changed at all in r4p0?
Thanks.
Daniel
No clarification needed, that's essentially how it works.
ARM are aware of the issues faced by developers such as yourself, and we are actively working to improve the situation. For example, we have recently released a guide and driver set on malideveloper.arm.com for the Chromebook in an effort to get developers up and running with an FBDEV/X11 Linux environment on T604.
For what it's worth, HardKernel are ostensibly one of the best dev-board companies out there when it comes to software enablement/support, so I would recommend you also hit their forums with any issues you are facing.
If you were to move to developing your own board with a pre-existing SoC, then there are likely many options available to you, for example licensing the DDK from ARM, or sub-licensing from your vendor. In the latter case, they would ideally provide you with a pre-integrated DDK. This is a conversation we can revisit if you ever move that way.
Hope this helps,
Chris