  • Optimizing a NVIDIA CUDA ML Inference Application with Arm Forge

    David Lecomber

    With NVIDIA’s recent announcement of upcoming support for standalone NVIDIA GPUs for Arm servers, the Arm Forge team is excited to be bringing its leading developer tools to support this platform too.

    In advance of the full release, we preview an…

    • 7 months ago
    • High Performance Computing
    • HPC blog
  • Optimised OpenCL SGEMM implementation for ARM Mali Midgard GPUs.

    abhi.verma

    I wish to implement an optimised SGEMM for the Mali Midgard GPU, which as of now only supports OpenCL 1.2. As far as I know, OpenCL 1.2 doesn't support subgroup extensions, and Mali GPUs don't benefit from local memory tiling. So what should be the best…

    • 8 months ago
    • Graphics and Gaming
    • Graphics and Gaming forum
  • Characterization of Multi-threaded HPC Codes

    Josh Randall

    Core counts continue to increase for High-Performance Computing (HPC) systems, but multiple factors may prevent current software from fully utilizing the increased available thread count. Inter-thread communication and serialized execution may hamper…

    • 9 months ago
    • Arm Research
    • Research Articles