Mali SDK supporting compilation of Kernels written in C++ for OpenCL

Anastasia Stulova
December 14, 2020

The Mali Software Development Kit (SDK) facilitates the development of OpenGL ES, Vulkan, or OpenCL applications for Mali-based platforms. While OpenGL ES and Vulkan are predominantly used for graphics, OpenCL provides the ability to execute applications from various domains on Mali GPUs, such as image processing, machine learning (ML), computer vision (CV), and many others. The main concept behind OpenCL is to accelerate data-parallel computation on a device with high compute power. If an application is suitable for OpenCL acceleration, the speed-up can be orders of magnitude compared to conventional execution. Any OpenCL program has a host side, where regular computations are executed, typically on a CPU, and a device side, to which the host sends compute-intensive kernels for acceleration. On the host side, developers have been able to use C, C++, and even Python since very early versions of the standard. However, the kernel code accelerated on the device had to be written in OpenCL C, which is derived from C99.

Last year, a new kernel language, C++ for OpenCL, was developed in upstream LLVM. It allows the use of most C++17 features in OpenCL kernel code; more details are in the previous blog. Kernels written in C++ for OpenCL could be compiled offline using open-source tools, and the resulting binaries in the SPIR-V intermediate format could be imported by OpenCL applications running with existing OpenCL drivers. Now, we are pleased to announce that with the latest release of the Mali SDK, developers can benefit from both online and offline compilation of kernels written in C++ for OpenCL using the cl_ext_cxx_for_opencl extension, which has recently been published on the Khronos website.
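As a rough sketch of how a host application might opt into online compilation, it could first check that the device reports the cl_ext_cxx_for_opencl extension and only then request C++ for OpenCL when building the program. The helper names below (has_extension, build_options_for) are hypothetical and not part of the Mali SDK. The extension list would normally come from clGetDeviceInfo with CL_DEVICE_EXTENSIONS; here it is passed in as a plain string so the sketch does not depend on an OpenCL installation. The -cl-std=CLC++ build option is the one we understand the extension to define; check the extension specification before relying on it.

```cpp
#include <sstream>
#include <string>

// Returns true if `name` appears as a whole token in the space-separated
// extension list (the format OpenCL uses for CL_DEVICE_EXTENSIONS).
bool has_extension(const std::string& extensions, const std::string& name) {
    std::istringstream stream(extensions);
    std::string token;
    while (stream >> token) {
        if (token == name) return true;
    }
    return false;
}

// Hypothetical helper: chooses the options string to pass to clBuildProgram.
// An empty string means "build as OpenCL C with the driver defaults".
std::string build_options_for(const std::string& extensions) {
    return has_extension(extensions, "cl_ext_cxx_for_opencl") ? "-cl-std=CLC++" : "";
}
```

The resulting string would be passed as the options argument of clBuildProgram for a program created with clCreateProgramWithSource. Matching whole tokens, rather than searching for a substring, avoids a false positive on an extension whose name merely starts with the same text.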

Arm is the first vendor to support this new extension in its SDK. While this is great news for developers, the support is still experimental. Some features, such as program scope objects with non-trivial constructors or destructors, were not yet supported at the time this blog was published.

About C++ for OpenCL

C++ for OpenCL provides the ability to use most of the modern C++ features from the C++17 standard in OpenCL kernels. This improves programmer productivity for GPU applications of ever-growing complexity. Another great benefit is that C++ for OpenCL is backward compatible with OpenCL C. This means existing applications can be migrated smoothly to C++ features, and developers can continue to use familiar OpenCL programming concepts and tools. Overall, kernel code written in C++ for OpenCL looks just like code written in OpenCL C, with some extra C++ features available for convenience.

C++ for OpenCL kernels can leverage many C++-specific compiler optimizations, leading to competitive performance on OpenCL devices. The community-defined language documentation is hosted in the OpenCL-Docs repository on GitHub, along with other Khronos specifications for OpenCL. The latest published revision can be found here.

You can find out more about the C++ for OpenCL programming language from the slide deck PDF and video of the talk presented at IWOCL this year.

Implementing kernels with C++ features

The following code is a snippet illustrating how to implement kernels with complex number arithmetic using C++ features. The full example can be found in Code Explorer.

// Define a class, complex_t, that can perform complex number arithmetic
// with various precision when different types for 'T' are used - double, float, half...
template<typename T>
class complex_t {
    T m_re; // Real component.
    T m_im; // Imaginary component.

public:
    complex_t(T re, T im): m_re{re}, m_im{im} {}

    // Take the operand by const reference so that const objects can be multiplied too.
    complex_t operator*(const complex_t &other) const
    {
        return {m_re * other.m_re - m_im * other.m_im,
                m_re * other.m_im + m_im * other.m_re};
    }

    // Return the components as T rather than int, so that they are not truncated.
    T get_re() const { return m_re; }
    T get_im() const { return m_im; }
};

// A kernel function to compute multiplication over complex numbers read from
// the input buffer and to store the result into the output buffer.
kernel void compute_helper(global float *in, global float *out) {
    auto idx = get_global_id(0);
    // Every work-item uses 4 consecutive items from the input buffer -
    // two for each complex number.
    auto offset = idx * 4;
    complex_t num1{in[offset], in[offset + 1]};
    complex_t num2{in[offset + 2], in[offset + 3]};
    // Perform complex number multiplication.
    complex_t res = num1 * num2;
    // Every work-item writes 2 consecutive items to the output buffer.
    out[idx * 2] = res.get_re();
    out[idx * 2 + 1] = res.get_im();
}
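When trying a kernel like this, it can help to have a plain CPU reference to compare the device output against. The following is a minimal host-side sketch in standard C++, assuming the same buffer layout as the kernel above (four input floats and two output floats per work-item). The function name reference_complex_multiply is hypothetical and is not part of the SDK example.

```cpp
#include <cstddef>
#include <vector>

// CPU reference: multiplies pairs of complex numbers stored as
// (re1, im1, re2, im2) and returns the products as (re, im) pairs.
// Trailing values that do not form a complete group of four are ignored.
std::vector<float> reference_complex_multiply(const std::vector<float>& in) {
    const std::size_t count = in.size() / 4;  // one "work-item" per group of four
    std::vector<float> out(count * 2);
    for (std::size_t idx = 0; idx < count; ++idx) {
        const std::size_t offset = idx * 4;
        const float re1 = in[offset];
        const float im1 = in[offset + 1];
        const float re2 = in[offset + 2];
        const float im2 = in[offset + 3];
        out[idx * 2]     = re1 * re2 - im1 * im2;  // real part
        out[idx * 2 + 1] = re1 * im2 + im1 * re2;  // imaginary part
    }
    return out;
}
```

For example, the input (1, 2, 3, 4) represents (1 + 2i)(3 + 4i), and the reference returns (-5, 10). Comparing against the device buffer with a small tolerance, rather than exact equality, is usually the safer choice for floating-point data.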

About offline compilation

Offline compilation of kernels written in C++ for OpenCL has been available through open-source tooling since September 2019, when Clang 9.0 was released with experimental support for C++ for OpenCL in the upstream llvm-project. Kernels can be compiled offline to the SPIR-V format by following the flow explained in this blog on the Khronos website. The SPIR-V binary can then be loaded by OpenCL applications running on drivers that support OpenCL 2.0 or later, using the regular clCreateProgramWithIL API call. Note that clCreateProgramWithIL became a core API in OpenCL 2.1; OpenCL 2.0 drivers expose the equivalent clCreateProgramWithILKHR through the cl_khr_il_program extension.
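On the host side, loading an offline-compiled kernel mostly comes down to reading the SPIR-V file into memory and handing the bytes to the driver. The sketch below covers only the file-handling part in standard C++, so it does not require an OpenCL installation. The helper names read_binary_file and looks_like_spirv are hypothetical, and the sanity check only inspects the SPIR-V magic number (0x07230203); it does not validate the module.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads a whole file into memory; returns an empty vector if it cannot be opened.
std::vector<unsigned char> read_binary_file(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) return {};
    return std::vector<unsigned char>(std::istreambuf_iterator<char>{file},
                                      std::istreambuf_iterator<char>{});
}

// Cheap sanity check: a SPIR-V module is a stream of 32-bit words whose first
// word is the magic number 0x07230203. Either byte order is accepted.
bool looks_like_spirv(const std::vector<unsigned char>& bytes) {
    if (bytes.size() < 4 || bytes.size() % 4 != 0) return false;
    const unsigned char little_endian[4] = {0x03, 0x02, 0x23, 0x07};
    const unsigned char big_endian[4]    = {0x07, 0x23, 0x02, 0x03};
    return std::equal(little_endian, little_endian + 4, bytes.begin()) ||
           std::equal(big_endian, big_endian + 4, bytes.begin());
}

// With a valid cl_context available, the bytes would then be passed on, e.g.:
//   cl_int err;
//   cl_program program =
//       clCreateProgramWithIL(context, bytes.data(), bytes.size(), &err);
```

Checking the magic number before calling into the driver gives a clearer error when the wrong file is picked up, for instance an LLVM bitcode file left over from an intermediate compilation step.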

Developers, try for yourself

We encourage developers to experiment with the C++ for OpenCL kernel language in their applications. If you have any feedback on the new kernel language, its tooling support, or the Mali SDK in general, we would like to hear from you.

Learn more about Mali SDK
