Chromebook OpenCL Development issues

Hi all,

I have a Samsung Chromebook (model number XE303C12) and I am trying to get some OpenCL code running on it.

I have followed the "Graphics and Compute Development on Samsung Chromebook" guide (including disabling CONFIG_SECURITY_CHROMIUMOS) and have been able to boot Linux on the Chromebook. My issue is that I see no network at all, either via the Apple USB dongle or over wireless. As far as Linux is concerned, the devices don't even seem to exist. Trying to load the USB network driver via modprobe gives an "Operation not permitted" error on usbnet.ko.

Has anyone gotten networking to work? If so, how?

If anyone is willing to share a working SD card image, that would be fantastic. To be honest, I got sick of building the Linux kernel sometime in the 90s.

Thanks.

--Mike

P.S. I don't see any OpenCL headers. Can I just copy them from a working system?

  • Hi Mike,


    > Everything is just stored in global memory, so this may explain things


    GPUs are designed to be latency tolerant, so where data is stored is a little less critical than on CPUs. Lower-latency memory will always help, but it isn't usually necessary.


    > For performance, do I need to use the OpenCL vector load and math operations


    Ideally yes - the Mali-T600 is a vector architecture with SIMD maths units, which is different to many other GPU architectures.


    Where Mali excels is that our SIMD units are very flexible and very wide - if you only need int8 or int16 data for your kernel we can process 16 or 8 elements, respectively, per SIMD unit per clock cycle (i.e. we have a 128-bit data path and you can carve that up into 8-, 16-, or 32-bit lanes). If you need floating point, I believe the current drivers we have available off the website are only exposing fp32, but we are adding the half-float extension support in our next driver release.
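
    As a rough illustration of the lane arithmetic described above, the following C sketch divides the 128-bit data path by the element width. The helper name `lanes_per_simd` is hypothetical and is not part of any Mali or OpenCL API; it only restates the numbers given in the reply.

```c
/* Width of one SIMD data path in bits, as stated in the reply above. */
#define SIMD_WIDTH_BITS 128u

/* Hypothetical helper: number of elements of a given bit width that fit
 * in one 128-bit SIMD operation. Returns 0 for a zero width or a width
 * that does not divide 128 evenly. */
unsigned lanes_per_simd(unsigned element_bits)
{
    if (element_bits == 0u || SIMD_WIDTH_BITS % element_bits != 0u) {
        return 0u;
    }
    return SIMD_WIDTH_BITS / element_bits;
}
```

    By this arithmetic, 8-bit data gives 16 lanes, 16-bit data gives 8 lanes, and 32-bit data gives 4 lanes. The OpenCL types char16, short8, and float4 are each 128 bits wide, so they are natural candidates for filling one such operation, although how the compiler actually maps them to hardware is not specified here.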


    While the compiler can auto-vectorize (and it is getting better at doing so, so we hope to improve here), there needs to be enough work in a work-item to fill the SIMD lanes. Auto-vectorization is relatively fiddly in any compiler, so it is always more reliable if you use the built-in functions.
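
    To make the "vector load and math operations" concrete, here is a minimal sketch, not taken from this thread, of a kernel that uses the OpenCL built-ins vload4 and vstore4 so that each work-item handles four floats. The kernel name `saxpy_vec4`, its arguments, and the host-side reference function are all hypothetical, and the sketch assumes the element count is a multiple of 4 and that the kernel is launched with a global size of n / 4.

```c
#include <stddef.h>

/* OpenCL C kernel source, kept as a string so it could be handed to
 * clCreateProgramWithSource(). Names are illustrative assumptions. */
static const char *saxpy_vec4_src =
    "__kernel void saxpy_vec4(const float a,\n"
    "                         __global const float *x,\n"
    "                         __global float *y)\n"
    "{\n"
    "    size_t i = get_global_id(0);  /* one work-item per 4 floats */\n"
    "    float4 xv = vload4(i, x);     /* reads x[4*i] .. x[4*i+3]   */\n"
    "    float4 yv = vload4(i, y);     /* reads y[4*i] .. y[4*i+3]   */\n"
    "    vstore4(a * xv + yv, i, y);   /* writes y[4*i] .. y[4*i+3]  */\n"
    "}\n";

/* Scalar host-side reference computing the same result, y = a*x + y,
 * useful for checking the kernel's output on small inputs. */
void saxpy_reference(float a, const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

    Writing the loads, the multiply-add, and the store on float4 values means the kernel does not rely on the compiler's auto-vectorizer to fill the 32-bit lanes. Whether this is faster on a given device and driver is something to confirm by profiling.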


    You may want to try downloading the ARM DS-5 Community Edition - this supports the Streamline profiling tool, which includes support for capturing and displaying the GPU hardware performance counters. This should help you identify where your GPU cycles are being spent, including some measure of the efficiency of the GPU's interaction with main memory. We have an optimization guide which includes some hints on which counters to look at for different types of problem, but if you have any questions please shout:


    Mali GPU Application Optimization Guide v3.0 « Mali Developer Center


    Kind regards,
    Pete
