OpenCL with ARM Mali: GPU Computing... with no compromises

Roberto Mijat
September 11, 2013

ARM has just announced the submission of its OpenCL 1.1 Full Profile conformance test results to Khronos for the Mali™-T604 GPU[1]. This makes ARM the first GPU IP vendor to submit results for Full Profile. Yes, Full Profile, not Embedded Profile. We did not see the need to compromise, and there are many good reasons why, which I'll try to explain in this blog.

Full Profile is the ubiquitous implementation today: all desktops, laptops and servers that support OpenCL implement Full Profile. This means that all OpenCL developers produce code targeting Full Profile. This is what they are accustomed to, ever since OpenCL was first introduced over 4 years ago.

Existing OpenCL software has been written for Full Profile platforms and therefore assumes the presence of a large number of features that are optional in Embedded Profile. The list of optional features is detailed in the Khronos specification. The most important ones include:

   

  • Native support for 64-bit integer maths (including vector data types and operations). Implementing 64-bit arithmetic directly in hardware on the Mali-T600 Series of GPUs makes it radically faster and more efficient than software emulation. This helps in areas such as pointer arithmetic for large address spaces (making you future-proof for the post-4 GByte world), and can benefit many applications, including multimedia encoders, decoders and encryption software.
  • Hardware accelerated support for 3D images. Great for volumetric modelling.
  • Compliance with IEEE 754-2008 precision requirements. That means your code will get the same accuracy on a Mali-T600 Series GPU as on any other Full Profile conformant platform.
  • Built-in atomic functions accelerated in hardware. OpenCL is about parallel computation, a world where atomic operations (and barriers/fences, which are also implemented in hardware on Mali-T600) are commonplace. Resolving this in hardware is far more efficient than any sort of emulation or expensive external memory synchronization.
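
To make the first bullet more concrete, here is a minimal sketch of what emulating a single 64-bit addition with 32-bit operations involves. It is illustrative Python only, not ARM's driver or hardware behaviour, and the names `add64_emulated`, `split64` and `join64` are hypothetical helpers introduced for this example.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within 32 bits


def split64(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Rebuild a 64-bit value from its (low, high) 32-bit halves."""
    return (hi << 32) | lo


def add64_emulated(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values using only 32-bit-wide steps.

    Each operand is passed as two 32-bit halves. The result wraps
    modulo 2**64, as a native 64-bit unsigned add would.
    """
    lo = (a_lo + b_lo) & MASK32
    # The low-half add overflowed if the truncated sum is below an operand.
    carry = 1 if lo < a_lo else 0
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi


# An offset that crosses the 4 GByte boundary needs the carry step.
base = 0xFFFFFFF0
offset = 0x20
result = join64(*add64_emulated(*split64(base), *split64(offset)))
```

Here one logical addition turns into two additions, a comparison and some masking, which is the kind of overhead that a native 64-bit add avoids.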

  
The ARM OpenCL compiler provides support for both online and offline compilation, and is suitable for both resource-limited mobile devices and larger plugged-in compute systems. Offline compilation can also be used as a development tool to help programmers optimize their code.

An OpenCL Full Profile driver is more attractive to developers. It makes life easier by providing a functional baseline: new developers do not need to worry about which features are supported, what performance they may get, or what precision is available. Full Profile lowers the barrier to entry and reduces cost for developers.
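
As a rough illustration of the feature checks that portable code needs on Embedded Profile and can skip on Full Profile, here is a small Python sketch. The strings `FULL_PROFILE` and `EMBEDDED_PROFILE` are the values OpenCL reports for `CL_DEVICE_PROFILE`, and `cles_khr_int64` is the extension name the specification uses for 64-bit integer support in Embedded Profile. The function `supports_int64` and the sample inputs are hypothetical; a real application would obtain both strings from the OpenCL runtime.

```python
def supports_int64(profile: str, extensions: str) -> bool:
    """Return True if 64-bit integers can be assumed on this device.

    profile    -- profile string, as reported for CL_DEVICE_PROFILE
    extensions -- space-separated list, as reported for CL_DEVICE_EXTENSIONS
    """
    if profile == "FULL_PROFILE":
        # Full Profile mandates 64-bit integers: no extension check needed.
        return True
    # Embedded Profile: 64-bit integers are optional and are advertised
    # through the cles_khr_int64 extension.
    return "cles_khr_int64" in extensions.split()


# Hypothetical sample inputs, not taken from any real device.
full = supports_int64("FULL_PROFILE", "")
embedded_without = supports_int64("EMBEDDED_PROFILE", "cl_khr_byte_addressable_store")
embedded_with = supports_int64("EMBEDDED_PROFILE", "cl_khr_byte_addressable_store cles_khr_int64")
```

On a Full Profile device the first branch always applies, so the application does not need a fallback path for missing 64-bit support.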

And if you have Embedded Profile code, that is not a problem: it will of course run seamlessly on all GPUs in the Mali-T600 family.

We are very excited about it, and today we are proud to announce this landmark achievement: the Mali-T600 series of processors brings mainstream GPU Computing to the embedded and mobile markets... with no compromises.

[1] Product is based on a published Khronos Specification, and is expected to pass the Khronos Conformance Testing Process. Current conformance status can be found at www.khronos.org/conformance.

  • Guillaume FORTAINE, over 11 years ago
    Hello,

    For your information :

    http://www.arndaleboard.org/wiki/index.php/Introduction

    "Arndale Board offer World 1st ARM Cortex-A15 dual core CPU speed, remarkable ARM Mali-T604 3D GPU performance (72GFLOPS) with OpenGL® ES and OpenCL™ and World’s highest resolution support (24bit 2560X1600@60fps)."

    Best Regards,

    Guillaume FORTAINE
  • rahul garg, over 11 years ago
    Great news. Looking forward to the availability of the OpenCL SDK for Mali T600 series (and of course devices using the GPU). Also hoping for low-cost dev boards.