[/size]Thanks for the reply.

I have tried both a Debian Wheezy image (http://forum.odroid.....php?f=9&t=1608) with LXDE and an Ubuntu 13.04 image with XFCE (not Linaro).

The kernel is compiled straight from the hardkernel repository (https://github.com/h...ee/odroid-3.8.y). The drivers are in drivers/gpu/arm/ (https://github.com/h...drivers/gpu/arm) and have been integrated by the maintainer; they work, but performance is bad. I think the framebuffer driver is at https://github.com/h.../videobuf2-fb.c

I have been looking for the cause of the performance drop and debugged the drivers with ftrace, and found the issue described above: only one GP job is started per ioctl, whereas in the 3.0 kernel two GP jobs are started per ioctl.

My understanding is that in 3.0 you have:

ioctl -> GP start job from frame register 1 -> schedule job -> submit job -> (...libMali.so binary blob...) -> send job to user -> GP start job from frame register 2 -> schedule job -> submit job -> (...libMali.so binary blob...) -> send job to user -> end ioctl -> new ioctl -> repeat

while in 3.8 the behaviour is:

ioctl -> GP start job from frame register 1 -> schedule job -> submit job -> (...libMali.so binary blob...) -> send job to user -> end ioctl -> new ioctl -> repeat

... (in the meantime) ioctl -> GP start job from frame register 2 -> schedule job -> slot busy -> end ioctl -> new ioctl -> repeat

This gives 2x the ioctls and 2x the locks, and scheduling for frame register 2 always results in "slot busy", so one ioctl is wasted just putting a job in the queue. The Mali code is exactly the same as before, so that's not the issue. The maintainer also thinks the UMP integration is at fault, but can't find a root cause. I was hoping somebody here would have a better idea of what exactly is causing this.

As a side note, es2gears gives ~300 fps on Ubuntu+XFCE (even worse with the compositor enabled) and ~600 fps on Debian+LXDE. I did not look at the Xorg server version; I just went on with the Debian image.
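
To make the ioctl accounting in the two flows above easier to follow, here is a toy model in Python. It is only a sketch of my understanding, not driver code: the names (Stats, enter_ioctl, simulate_3_0, simulate_3_8) are made up for illustration, and it assumes one scheduler lock per ioctl and exactly two GP jobs per frame, neither of which I have confirmed in the source.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    ioctls: int = 0        # ioctl calls entering the driver
    lock_taken: int = 0    # assumption: one scheduler lock per ioctl
    slot_busy: int = 0     # ioctls that could only queue their job
    jobs_started: int = 0  # GP jobs that actually reached the GP slot


def enter_ioctl(stats):
    """Count one ioctl and the lock it is assumed to take."""
    stats.ioctls += 1
    stats.lock_taken += 1


def simulate_3_0(frames):
    """3.0 flow: both GP jobs of a frame start inside a single ioctl."""
    stats = Stats()
    for _ in range(frames):
        enter_ioctl(stats)
        for _frame_register in (1, 2):
            # The previous job has already been sent back to user space,
            # so the slot is free every time.
            stats.jobs_started += 1
    return stats


def simulate_3_8(frames):
    """3.8 flow: one GP job per ioctl, and the second ioctl hits a busy slot."""
    stats = Stats()
    for _ in range(frames):
        # First ioctl: the job from frame register 1 takes the slot.
        enter_ioctl(stats)
        slot_in_use = True
        stats.jobs_started += 1

        # Second ioctl arrives in the meantime: the job from frame
        # register 2 finds the slot busy and is only queued.
        enter_ioctl(stats)
        if slot_in_use:
            stats.slot_busy += 1

        # The first job completes and the queued job is started from the
        # queue, without a further ioctl.
        slot_in_use = False
        stats.jobs_started += 1
    return stats


if __name__ == "__main__":
    old, new = simulate_3_0(1000), simulate_3_8(1000)
    print("3.0:", old)
    print("3.8:", new)
    print("ioctl ratio 3.8/3.0:", new.ioctls / old.ioctls)
```

Both functions start the same number of GP jobs; the only difference in this model is how many ioctls (and lock acquisitions) it takes to get there, which matches the "2x ioctls, one wasted on slot busy" pattern I see in the ftrace output.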