Summary
Is OpenCL support for the Mali-T628 (for example as found in the Exynos 5420 SoC on the Arndale Octa board) available? If so, how to set it up?
More details
According to the vendor, OpenCL should be supported, but the Arndale Octa Wiki does not state how this can be achieved.
I am using the latest Linaro developer build and installed Mali drivers that contain OpenCL libraries for Mali T604. According to this guide, the driver actually contains references to the Mali T628. So I tried to create the udev rule as specified, which is supposed to solve a permission problem with /dev/mali0, but I found that there is no /dev/mali0 on my installation at all. So my conclusion is that the driver indeed does not support T628.
When I execute a clinfo utility, clGetDeviceInfo returns CL_OUT_OF_HOST_MEMORY for some device properties. Why can I query the GPU for some characteristics, but does this fail for some others? When running a normal application, the same error appears when trying to create an OpenCL Context.
I was surprised to find this topic, where yoshi seems to have OpenCL working and can run benchmarks on his Arndale Octa board. How is this possible if there is no driver available? Or am I just missing something? I hope that you can help me to also establish a working OpenCL development environment.
Hi Chris,
Where can we find benchmark numbers for examples kernels provided with SDK?. We ran "sobel" filter example for 1024x1024 image, profile number measured across kernel trigger and wait measures as 5msec. And Laplace transform for 4096x2048 image, number measures as 10msec. Is these numbers expected for for Mali T628 @ 533Mhz GPU? We measured the numbered across clEnqueueNDRange() and immediate clFinish().
Thanks,
Veeranna
Hi Veerannah,
We don't have stock performance numbers for the SDK samples to hand. However what I will do later this week is run Sobel and Laplace on a platform here and get some numbers back to you.
HTH, Tim
Hi Tim,
We really appreciate your quick reply. Looking forward to see the performance numbers.
Hi Veeranna,
I ran the Sobel filter as provided by the Mali OpenCL SDK, modifying it to run a 1024x1024 image and I see the following...
Queued time: 0.11ms
Wait time: 0.222292ms
Run time: 0.680683ms
The Mali OpenCL SDK doesn't include a Laplace transform... which one are you using?
Best regards, Tim
I realise I didn't include the platform details. I'm running on a Mali-T604 platform with the GPU clocked @ 533MHz. I'm using a later version of the drivers than you, I believe, and that would likely mean the performance you would expect on your platform is slightly worse. On balance though I would expect the performance on your Mali-T628 system to be roughly equivalent to mine.
Thank you for providing information on performance number. With "cl_event" profiling we are getting numbers as
Wait time: 0.2725ms
Run time: 0.7586ms
But if we measure numbers across clEnqueueNDRangeKernel() and clfinish() using gettimeofday we are seeing number is 1.5msec. Is this API/driver overhead?
Is this API/driver overhead?
Probably, yes. When I try this I'm seeing the same sort of additional time using gettimeofday vs the profiling values output. With the Sobel sample I looked at the difference in this additional time when I increased the size of the image being processed. It appeared to stay roughly the same, which would back up the theory that this is due to additional overhead in the API.
Hope that helps,
Tim
Thank you for the information.
I cannot stress enough that if you are interested in real world performance, you should move away from these synthetic benchmarks and look at actual data for proper use cases. There are some real-world oriented benchmarks out there already, which already have Mali powered devices in their results.
I totally agree that synthetic benchmarks are no proper measure for real world performance, but these benchmarks do help to determine whether the environment is set-up properly, which is clearly not (yet) the case.
Have spoken to someone from the driver team, if you're only seeing one device then it must be an old driver, it's worth asking Insignal what their roadmap is for providing updates. On current drivers you will see 2 devices, one with 4 cores and one with 2.
By now the r4p0-02rel0 drivers are available. I tried running the benchmarks using them instead of the r3p0-02rel0 drivers that I used before, however the runtime fails to create an OpenCL context.
Someone opened a topic on the Insignal forum about the latest drivers, it seems that I have to compile my own kernel using the latest kernel driver to get it working.
Hi Bram,
Yes thats correct the old kernel driver will not work with the new userspace driver, the kernel needs recompiling with the new kernel driver.
Hth,
Chris
Hi Bramv,
Are you able use all 6 cores T628?. If you are succeeded can you give complete steps to builds required binaries.
No unfortunately not. I tried to run this image to get it up and running but the board wouldn't boot.
I am now running a linaro image using kernel 3.15.0-1-linaro-arndale-octa. But so far I haven't been able to build a mali kernel driver module myself either: Update: see end of post
root@arndale-octa:~/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm/midgard# make -C /lib/modules/`uname -r`/build M=$PWD modules make: Entering directory `/usr/src/linux-headers-3.15.0-1-linaro-arndale-octa' Building modules, stage 2. MODPOST 0 modules make: Leaving directory `/usr/src/linux-headers-3.15.0-1-linaro-arndale-octa'
root@arndale-octa:~/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm/midgard# make -C /lib/modules/`uname -r`/build M=$PWD modules
make: Entering directory `/usr/src/linux-headers-3.15.0-1-linaro-arndale-octa'
Building modules, stage 2.
MODPOST 0 modules
make: Leaving directory `/usr/src/linux-headers-3.15.0-1-linaro-arndale-octa'
No ko file is being generated.
The driver includes some scons files, but they do not work for me. First of all, a Sconstruct file is not included. When adding one myself and running scons, the environment is not recognized:
root@arndale-octa:~/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm# scons scons: Reading SConscript files ... scons: *** Import of non-existent variable ''env'' File "/root/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm/midgard/sconscript", line 20, in <module>
root@arndale-octa:~/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm# scons
scons: Reading SConscript files ...
scons: *** Import of non-existent variable ''env''
File "/root/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm/midgard/sconscript", line 20, in <module>
So I figured, let's fix this by replacing Import('env') by env = Environment(ENV = os.environ) but the dictionary class that scons uses does not handle non-existent keys the way the sconscript expects it:
scons: Reading SConscript files ... KeyError: 'v': File "/root/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm/Sconstruct", line 1: SConscript('midgard/sconscript') File "/usr/lib/scons/SCons/Script/SConscript.py", line 609: return method(*args, **kw) File "/usr/lib/scons/SCons/Script/SConscript.py", line 546: return _SConscript(self.fs, *files, **subst_kw) File "/usr/lib/scons/SCons/Script/SConscript.py", line 260: exec _file_ in call_stack[-1].globals File "/root/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm/midgard/sconscript", line 28: if env['v'] != '1': File "/usr/lib/scons/SCons/Environment.py", line 412: return self._dict[key]
KeyError: 'v':
File "/root/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm/Sconstruct", line 1:
SConscript('midgard/sconscript')
File "/usr/lib/scons/SCons/Script/SConscript.py", line 609:
return method(*args, **kw)
File "/usr/lib/scons/SCons/Script/SConscript.py", line 546:
return _SConscript(self.fs, *files, **subst_kw)
File "/usr/lib/scons/SCons/Script/SConscript.py", line 260:
exec _file_ in call_stack[-1].globals
File "/root/TX011-SW-99002-r4p1-00rel0/driver/product/kernel/drivers/gpu/arm/midgard/sconscript", line 28:
if env['v'] != '1':
File "/usr/lib/scons/SCons/Environment.py", line 412:
return self._dict[key]
Any help on compiling this kernel module is highly appreciated!
Update:
I managed to compile the kernel module. It turns out that you don't need the scons files at all and can just use the included Kbuild and Makefile. Compiling mali_kbase.ko is a matter of running:
CONFIG_MALI_MIDGARD=m make
But inserting this module into the kernel does not provide a /dev/mali, neither does lshw detects a GPU. This means that we still cannot run OpenCL on the T628. How can we make the /dev/mali device show up?
/dev/mali should show as soon as you insmod a correctly built Mali kernel module. If /dev/mali is not appearing then the kernel module is not correctly built and configured for the platform. Even if you successfully build and insert a correct kernel module for the platform userspace functionality, such as OpenCL, will not be available without Mali userspace binaries matching the version of the Mali kernel module.
Can you confirm the kernel you are trying to compile with/against and the versions of Mali kernel and userspace you have available to you, as ARM have only currently released the r4p0 userspace binaries, though r4p1 should be available through the malideveloper site in the not too distant future.
Could you also confirm the steps you have taken to configure the Mali kernel source for use with the Arndale Octa board, especially as I can see you are trying to build it out of tree? You need to configure the kernel device tree and add platform specific configuration to the kernel to set up interrupts and memory addresses etc. The easiest way to integrate the latest version of the kernel would be to download the Linaro kernel for Arndale Octa and extract the r4p1 kernel source INTO the kernel source, overwriting the included r4p0 kernel source as this will keep the platform integration files intact. Looking at the kernel source you mentioned in your earlier post "II-arndale-octa" I can see that the platform integration files are present, for example in /drivers/gpu/arm/midgard/platform/5420. You could try extracting the r4p1 kernel source into this kernel tree and building that way.
Hope this helps,
Rich
Hi Rich,
In fact, I didn't configure anything at all but just ran:
And ended up with this module for the r4p1 kernel driver:
filename: /lib/modules/3.15.0-1-linaro-arndale-octa/kernel/drivers/gpu/arm/midgard/mali_kbase.ko version: r4p1-00rel0 license: GPL srcversion: 29BCA495EB0E26992A9C01E alias: of:N*T*Carm,mali-midgard* alias: of:N*T*Carm,malit6xx* depends: vermagic: 3.15.0-1-linaro-arndale-octa SMP mod_unload ARMv7 p2v8
filename: /lib/modules/3.15.0-1-linaro-arndale-octa/kernel/drivers/gpu/arm/midgard/mali_kbase.ko
version: r4p1-00rel0
license: GPL
srcversion: 29BCA495EB0E26992A9C01E
alias: of:N*T*Carm,mali-midgard*
alias: of:N*T*Carm,malit6xx*
depends:
vermagic: 3.15.0-1-linaro-arndale-octa SMP mod_unload ARMv7 p2v8
And this one for r4p0:
Next I downloaded the following kernel: linux-linaro-3.15-2014.06, because it seems to be the closest match to the running kernel: 3.15.0-1-linaro-arndale-octa. I enabled the Mali options in the kernel configuration:
Enable Mali GPU support in Gator -Mali-400MP or Mali-450MP +Mali-T604 or Mali-T658 Path to Mali driver: drivers/gpu/arm/midgard
Enable Mali GPU support in Gator
-Mali-400MP or Mali-450MP
+Mali-T604 or Mali-T658
Path to Mali driver: drivers/gpu/arm/midgard
To do so I had to enable some timers and performance events options as well. The path points to the r4p0 driver that I copied into the kernel tree. Why does the configuration option only mention T604 and T658, and not T628?
After running "make modules" I had once again a mali_kbase.ko, but the result is the same as for the out-of-tree build: no /dev/mail.
The next step was to build and flash the full kernel. Apparently the kernel I used is not fully compatible since the -arndale-octa postfix is missing. As a result I had to modify some filenames to get the new kernel installed. Unfortunately this didn't work, the board does not boot any more. I have been looking for the proper linaro arndale-octa kernel source so that I can try it again, but haven't found it yet since it appears that only the binary hwpack and rootfs are available. Will be continued.
There is a newer Linaro kernel and full Ubuntu binary images with Mali r4p0-02rel0 driver already integrated for the Arndale board. If you want to run some applications using the Mali-T604 GPU and you just need a working Arndale system, you can download a full Ubuntu binary image from the 14.08 Linaro Releases.
The r4p1 user-side binary drivers will soon be released on our public download page mentioned previously, but in the meantime the latest version you can use is r4p0. If you're interested in rebuilding the Linux kernel for other reasons than upgrading the Mali driver, you can get the Linaro source code from Linaro Git Hosting - gwg/linaro-lsk.git/shortlog. The commit used in the binary release is 14c58eb6 and you'll need to generate the kernel configuration file using a script and the following fragments (see Linaro documentation for more details):
linaro/configs/linaro-base.conf linaro/configs/distribution.conf
linaro/configs/arndale_octa.conf linaro/configs/lt-arndale_octa.conf
linaro/configs/mali-arndale-octa.conf
Hope this helps!
Best wishes,
Guillaume