
bad performance on 3.8 kernel

Note: This was originally posted on 18th June 2013 at http://forums.arm.com

Mali 400 on an Exynos-based board:

With the 3.0 kernel, EGL works fine, with up to 600 fps in es2gears.

I ported the drivers to the 3.8 kernel and Mali acceleration works; however, performance drops to roughly half.

I have tracked the issue down to the GP job start wrapper, _mali_ukk_gp_start_job, which is now called about 50% more often than on the 3.0 kernel.

Here is a comparison between the 2 kernels:

1) With SKIP_GP_JOBS and returning the job straight away from _mali_ukk_gp_start_job, both the 3.0 and 3.8 kernels result in the same number of mali_ioctl calls and the same performance: 650 fps in es2gears.
2) I modified es2gears to stop after 600 frames, and here are my results (from bottom to top):

      GP jobs actually done - calls to "mali_gp_job_start": 299 on 3.0 kernel, 302 on 3.8 kernel
      calls to mali_group_start_gp_job (which calls mali_gp_job_start): 299 on 3.0, 302 on 3.8 kernel
      executions of mali_gp_scheduler_schedule (which calls mali_group_start_gp_job): 299 on 3.0, 302 on 3.8 kernel -- appears as "mali_gp_scheduler_schedule() {" in ftrace
      calls to mali_gp_scheduler_schedule: 0 on 3.0, 299 on 3.8 kernel -- appears as "mali_gp_scheduler_schedule();" in ftrace
     
      system calls served (mali_ioctl) : 960 on 3.0 kernel, 1373 on 3.8 kernel

Results: ~600 fps on the 3.0 kernel, ~380 fps on the 3.8 kernel.
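For anyone repeating the ftrace counting above: the function_graph tracer prints a call that itself makes traced calls as `name() {` and a leaf call (one that returns without calling any traced function) as `name();`. The two counts can be taken from a saved trace with a few lines of Python. This is only a minimal sketch; the helper name count_call_forms is hypothetical, and the sample trace lines and timings are made up for illustration, not taken from the real capture.

```python
import re

def count_call_forms(trace_text, func):
    """Return (nested, leaf) counts for func in function_graph output.

    nested: lines of the form 'func() {'  (the call made further traced calls)
    leaf:   lines of the form 'func();'   (the call made no traced calls)
    """
    nested_re = re.compile(r"\b" + re.escape(func) + r"\(\)\s*\{")
    leaf_re = re.compile(r"\b" + re.escape(func) + r"\(\);")
    nested = 0
    leaf = 0
    for line in trace_text.splitlines():
        if nested_re.search(line):
            nested += 1
        elif leaf_re.search(line):
            leaf += 1
    return nested, leaf

# Made-up sample in function_graph layout (timings are illustrative only).
sample = """\
 0)               |  mali_gp_scheduler_schedule() {
 0)   1.250 us    |    mali_group_start_gp_job();
 0)   3.500 us    |  }
 0)   0.417 us    |  mali_gp_scheduler_schedule();
"""

nested, leaf = count_call_forms(sample, "mali_gp_scheduler_schedule")
print("nested:", nested, "leaf:", leaf)
```

On a real capture, trace_text would be the contents of the saved trace file. Read this way, the 299 extra `mali_gp_scheduler_schedule();` lines on 3.8 would be scheduler invocations that returned without starting a job.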

So my conclusion is that the slowdown is due to a much larger number (almost double) of mali_ioctl calls for MALI_IOC_GP2_START_JOB.
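As a quick sanity check on the numbers above, the ratios can be worked out directly. This sketch uses only the figures quoted in this post (total mali_ioctl calls, GP jobs started, and approximate fps); the variable names are just for illustration.

```python
# Figures quoted in the post for the 600-frame es2gears run.
ioctl_calls = {"3.0": 960, "3.8": 1373}   # total mali_ioctl calls served
gp_jobs = {"3.0": 299, "3.8": 302}        # calls to mali_gp_job_start
fps = {"3.0": 600, "3.8": 380}            # approximate frame rates

ioctl_ratio = ioctl_calls["3.8"] / ioctl_calls["3.0"]
fps_ratio = fps["3.8"] / fps["3.0"]
extra_ioctls = ioctl_calls["3.8"] - ioctl_calls["3.0"]
ioctls_per_job = {k: ioctl_calls[k] / gp_jobs[k] for k in ioctl_calls}

print(f"ioctl ratio 3.8/3.0: {ioctl_ratio:.2f}")        # 1.43
print(f"fps ratio 3.8/3.0:   {fps_ratio:.2f}")          # 0.63
print(f"extra ioctls on 3.8: {extra_ioctls}")           # 413
print(f"ioctls per GP job:   3.0={ioctls_per_job['3.0']:.2f}, "
      f"3.8={ioctls_per_job['3.8']:.2f}")               # 3.21 and 4.55
```

Across all ioctl types the 3.8 kernel serves about 43% more calls, while the number of GP jobs actually started is nearly identical. The per-command split is not listed here, so the "almost double" figure for MALI_IOC_GP2_START_JOB alone cannot be checked from these totals.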

Since I don't have the code for libMali to debug why exactly it's making so many syscalls, I hope somebody here can help me and give me an idea where to look.

Another strange thing is the job numbers assigned.
In the 3.0 kernel, they always increment by 4, like: Mali GP scheduler: Job 2405 (0xE6581B80) queued; 2409, 2413, 2417, 2421, 2425, ...
In the 3.8 kernel, they increment by 2, 4 or 6: 8825, 8829, 8833, 8835, 8841, 8843, 8849, 8853, ...
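The step pattern is easier to see as a list of differences between consecutive job numbers. A minimal sketch using only the IDs quoted above (the helper name increments is hypothetical):

```python
def increments(ids):
    """Return the differences between consecutive job numbers."""
    return [b - a for a, b in zip(ids, ids[1:])]

# Job numbers as quoted from the kernel log on each kernel.
ids_30 = [2405, 2409, 2413, 2417, 2421, 2425]
ids_38 = [8825, 8829, 8833, 8835, 8841, 8843, 8849, 8853]

print("3.0 steps:", increments(ids_30))  # [4, 4, 4, 4, 4]
print("3.8 steps:", increments(ids_38))  # [4, 4, 2, 6, 2, 6, 4]
```

If the job counter is shared with other job types, which is an assumption and not something confirmed here, the irregular 2 and 6 steps on 3.8 would mean the other jobs are interleaved differently with the GP jobs than on 3.0.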