Introducing PyOpenCL

Anton Lokhmotov
March 27, 2014
2 minute read time.

I will start my blog series with a subseries on the technology that I know best and have come to love: OpenCL from the Khronos Group. OpenCL aims to provide functional portability across computer systems composed of general-purpose processors (e.g. CPUs) and special-purpose accelerators (e.g. GPUs), which I will now and then refer to as accelerated systems.

OpenCL is often criticised for presenting a steep learning curve for software engineers but I don't think that's fair: the verbosity of the OpenCL API is really a consequence of its generality. Once you have written a couple of OpenCL programs, you realise that most of the code running on the host processor is actually boilerplate.

I will sidestep this verbosity of OpenCL altogether by using PyOpenCL - a neat Python module written by Andreas Klöckner. (If you are reading this Andreas, keep up the good work!)

Just install PyOpenCL and NumPy - and you are ready to roll!

In [1]:

import pyopencl as cl
import numpy as np

Trivial example

Suppose we want to create an array containing integers from 0 to 15.

In [2]:
N = 16

That's trivial using NumPy:

In [3]:

np_range = np.arange(N, dtype=np.int32)
np_range

Out[3]:

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15], dtype=int32)

but our PyOpenCL example will fill in a similar array using OpenCL:

In [4]:

cl_range = np.zeros(N, dtype=np.int32)
cl_range

Out[4]:

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
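The `dtype=np.int32` is not incidental: OpenCL C defines `int` as a 32-bit type, so an `int32` host array matches it element for element. A quick NumPy-only check of the sizes involved, using a stand-in array called `host_array` (a name chosen for this sketch, not one from the post):

```python
import numpy as np

N = 16
host_array = np.zeros(N, dtype=np.int32)

# nbytes = number of elements * bytes per element
print(host_array.itemsize)  # 4 bytes per int32
print(host_array.nbytes)    # 16 * 4 = 64 bytes
```

A device buffer meant to hold the same data is sized in bytes, so `nbytes` (not `N`) is the figure to pass along.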

Gimme some context!

Creating a context could hardly be easier:

In [5]:
context = cl.create_some_context()

Ditto creating a command queue:

In [6]:
queue = cl.CommandQueue(context)
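`create_some_context()` chooses a device for you, asking interactively when there are several. If you would rather see what is available first, the sketch below enumerates platforms and devices with PyOpenCL's `get_platforms()` and `get_devices()`. The helper name `list_opencl_devices` is hypothetical (it is not part of PyOpenCL), and it deliberately returns an empty list when PyOpenCL or an OpenCL runtime is missing.

```python
def list_opencl_devices():
    """Return (platform name, device name) pairs, or [] if OpenCL is unavailable.

    Hypothetical helper for illustration; not part of PyOpenCL.
    """
    found = []
    try:
        import pyopencl as cl  # imported lazily so the helper degrades gracefully
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                found.append((platform.name, device.name))
    except Exception:
        # PyOpenCL is not installed, or no OpenCL platform could be queried.
        pass
    return found


for platform_name, device_name in list_opencl_devices():
    print(f"{platform_name}: {device_name}")
```

Once you know which device you want, setting the `PYOPENCL_CTX` environment variable lets `create_some_context()` make the same choice without prompting.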

Building and running

An OpenCL C program equivalent to np.arange(N) follows:

In [7]:

source = '''
kernel void arange(global int * buffer)
{
    const size_t gid = get_global_id(0);
    buffer[gid] = convert_int(gid);
}
'''

The kernel will be launched as N work-items over a one-dimensional range [0, N-1]. Each work-item will get its unique index gid in the range (that is, an integer between 0 and N-1 inclusive) and write it into argument buffer at offset gid.
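To make the execution model concrete, here is a plain-Python sketch of what the launch computes. It is only a sequential emulation, under the assumption that each loop iteration stands in for one work-item; on a real device the work-items may run in any order or in parallel. The names `arange_work_item` and `launch_1d` are made up for this illustration.

```python
import numpy as np


def arange_work_item(gid, buffer):
    """Body of one work-item: mirrors `buffer[gid] = convert_int(gid)`."""
    buffer[gid] = np.int32(gid)


def launch_1d(work_item, global_size, *args):
    """Sequentially emulate a one-dimensional launch of `global_size` work-items."""
    for gid in range(global_size):  # gid plays the role of get_global_id(0)
        work_item(gid, *args)


N = 16
emulated = np.zeros(N, dtype=np.int32)
launch_1d(arange_work_item, N, emulated)
print(emulated)
```

Because every work-item writes to its own offset, no two of them touch the same element, which is why the order of execution does not matter here.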

Let's build the program:

In [8]:
program = cl.Program(context, source).build()

allocate a memory buffer:

In [9]:
memory_flags = cl.mem_flags.WRITE_ONLY | cl.mem_flags.ALLOC_HOST_PTR
memory = cl.Buffer(context, flags=memory_flags, size=cl_range.nbytes)

launch the kernel:

In [10]:
# Calling program.arange(...) enqueues the kernel and returns an Event.
event = program.arange(queue, [N], None, memory)

and copy the results from the buffer to cl_range:

In [11]:

cl.enqueue_copy(queue, cl_range, memory, wait_for=[event])
cl_range

Out[11]:

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15], dtype=int32)
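The steps above (context, queue, build, buffer, launch, copy) can be collected into one function. This is a sketch rather than a recommended structure: `cl_arange` and `KERNEL_SOURCE` are names invented here, and `interactive=False` is an assumption made so that the function never prompts for a device.

```python
import numpy as np

KERNEL_SOURCE = """
kernel void arange(global int *buffer)
{
    const size_t gid = get_global_id(0);
    buffer[gid] = convert_int(gid);
}
"""


def cl_arange(n):
    """Fill an int32 array with 0..n-1 on an OpenCL device (hypothetical helper)."""
    import pyopencl as cl  # imported here so this module loads without PyOpenCL

    context = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(context)
    program = cl.Program(context, KERNEL_SOURCE).build()

    result = np.zeros(n, dtype=np.int32)
    flags = cl.mem_flags.WRITE_ONLY | cl.mem_flags.ALLOC_HOST_PTR
    # The buffer size is given in bytes, hence nbytes rather than n.
    device_buffer = cl.Buffer(context, flags=flags, size=result.nbytes)

    # The launch returns an Event; the copy waits on it before reading.
    event = program.arange(queue, (n,), None, device_buffer)
    cl.enqueue_copy(queue, result, device_buffer, wait_for=[event])
    return result
```

On a machine with a working OpenCL runtime, `cl_arange(16)` should return the same array as the cell above.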

Finally, let's confirm that arrays np_range and cl_range match element-wise:

In [12]:
np.all(np_range == cl_range)
Out[12]:

True
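One caveat with `np.all(a == b)`: the `==` comparison broadcasts, so arrays of different shapes can still compare as all-equal. As a small aside, `np.array_equal` also checks that the shapes match. The arrays below are stand-ins created with NumPy, not results read back from an OpenCL device.

```python
import numpy as np

expected = np.arange(16, dtype=np.int32)
stand_in = np.arange(16, dtype=np.int32)  # stands in for cl_range after the copy
zeros = np.zeros(16, dtype=np.int32)
wrong_shape = np.zeros(1, dtype=np.int32)

same = np.array_equal(expected, stand_in)            # shapes and values both match
broadcast_all = bool(np.all(zeros == wrong_shape))   # (16,) vs (1,) broadcasts
strict = np.array_equal(zeros, wrong_shape)          # shape mismatch is detected
print(same, broadcast_all, strict)
```

For the arrays in this post both checks agree, since `np_range` and `cl_range` have the same shape and dtype.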

Concluding remarks

I am very interested in getting feedback. I am aiming primarily at software engineers already familiar with OpenCL, so I can jump straight to more advanced topics. But if you feel you would like a little more explanation, just ask me: I will be happy to explain, or point you to some excellent learning resources out there, or indeed consider filling in details in my later posts...

Also, I prepared this post using the wonderful IPython Notebook environment ("Introducing PyOpenCL" as IPython notebook). I haven't seen much in the way of using PyOpenCL and IPython together, so I would be very grateful for any links.

In my next post, I will start discussing how to optimise OpenCL kernels for the ARM Mali-T600 and Mali-T700 GPU series.

  • Chris Varnsverry, over 11 years ago

    FYI there are also Mali based odroid boards available

  • Tom Gall, over 11 years ago

    The ARM based Samsung Chromebook with a Mali T604 is a good choice.  I suspect the upcoming Chromebook 2 will also be a great choice but given the hardware isn't available yet it's too early. The Arndale board is also a good choice.

    I like the odroid boards, but the odroid-xu, for instance, doesn't include OpenCL support for Linux unless you sign an NDA with Imagination.

  • Tarik Saidani, over 11 years ago

    Is there any affordable ARM Mali GPU / OpenCL-enabled board that can be used for development?
