Smart Atomic Memory Operations and Advanced Sparse Data Structures at BSC

December 14, 2021

Arm supports academic research and recognizes the importance of maintaining strong links between academia and industry for continued research innovation. Our Centers of Excellence (CoEs) have been established by Arm Research Collaborations to broaden research opportunities and strengthen links with the academic communities. Part of this includes sponsoring students completing their PhDs. In this blog series, we look at the inspirational work taking place at these institutions, fuelling research success and collaboration between academia and industry.

The Barcelona Supercomputing Center (BSC) – Arm Research CoE was formed in 2019. It recognizes BSC’s leadership in pioneering Arm in High Performance Compute (HPC), an essential tool for international competitiveness in science and engineering. BSC has been fostering HPC in Spain and across Europe since its establishment in 2005, and its work focuses on four application areas: Computer Sciences, Life Sciences, Earth Sciences, and Computer Applications in Science and Engineering. Here, we explore the work of two PhD students at BSC whom Arm is supporting and supervising.

Víctor Soria Pardos | Enabling Smart Atomic Memory Operations

Arm supervisor: Alex Rico

Atomic Memory Operations (AMOs) are critical instructions used for fine-grained synchronization in multithreaded applications. As the core count of processors increases, the weight of these instructions grows significantly. That is why Arm has recently incorporated the concept of remote or "far" AMOs, which are executed at the Last Level Cache instead of at the traditional First Level Cache ("near" AMOs). Our work aims to optimize how near and far AMOs are executed, accelerating multithreaded applications.

Víctor gained a B.Sc. from the University of Zaragoza and completed his M.Sc. at Universitat Politècnica de Catalunya. He worked as a Research Engineer at BSC before starting a PhD in Computer Architecture.

“Arm HPC and data center processors have an increasing number of processing elements that expose hundreds of threads of parallelism. One of the challenges in parallel compute is achieving effective synchronization through mechanisms such as locks, barriers, and collectives. Creating smart atomic memory operations and effective heuristics for these operations will enable faster inter-thread communication and better scaling of parallel codes, which is essential for HPC and data center systems.”

Alex Rico, Principal Research Engineer, Arm

Marco Siracusa | Hardware Support for Advanced Sparse Data Structures

Arm supervisor: Mark Nutter

Sparse computation is now at the core of many fundamental domains, ranging from Machine Learning to Genome Analysis. As most of these workloads are bound by off-chip memory bandwidth, it is essential to reduce the application's data movement in order to improve performance and energy consumption. Therefore, we are devising storage-efficient sparse formats that can cut the memory footprint and bandwidth requirements of sparse applications to a fraction of their current size. In addition, since the extra work needed to manipulate these data layouts may overload the pipeline of today's CPUs and limit overall performance, we are designing specific engines to which we can offload the (de)compression phase. The end goal is to make these engines programmable and allow users to define custom formats and operations specifically tailored for their domains of interest.
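To make the idea of a storage-efficient sparse format concrete, here is a minimal sketch using the well-known Compressed Sparse Row (CSR) layout. CSR is only a standard baseline chosen for illustration; it is not the format Marco's project is devising, and the function names `dense_to_csr` and `csr_matvec` are hypothetical. The sketch shows both sides of the trade-off described above: fewer stored elements, at the price of extra index handling on every access.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(
    dense: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int], List[int]]:
    """Compress a dense matrix into CSR arrays (illustrative helper).

    Returns (values, col_idx, row_ptr): the nonzero values in row order,
    the column of each nonzero, and the offset where each row starts.
    """
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                col_idx.append(j)
        # Each row ends where the next one begins.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(
    values: Sequence[float],
    col_idx: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Multiply a CSR matrix by a dense vector, touching only nonzeros."""
    y: List[float] = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect access x[col_idx[k]] is the "decompression" work
            # that a plain dense loop would not need.
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


if __name__ == "__main__":
    dense = [
        [5, 0, 0, 0],
        [0, 0, 3, 0],
        [0, 0, 0, 0],
        [1, 0, 0, 2],
    ]
    values, col_idx, row_ptr = dense_to_csr(dense)
    print(values, col_idx, row_ptr)
    print(csr_matvec(values, col_idx, row_ptr, [1, 2, 3, 4]))
```

For a matrix with `nnz` nonzeros and `rows` rows, CSR stores `nnz` values, `nnz` column indices, and `rows + 1` row offsets, so the saving over a dense layout grows as the matrix gets sparser. The indirect lookups in the inner loop are the kind of bookkeeping that the paragraph above proposes offloading to dedicated engines.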

Marco Siracusa received an M.Sc. in 2020 from Politecnico di Milano and is now pursuing a PhD in Computer Architecture with Universitat Politècnica de Catalunya and BSC. During his career, Marco has joined and led several research projects mainly focused on optimizing applications, frameworks, compilers, and architectures for scientific computing. Marco is now interested in devising next-generation architectures and programming methodologies for sparse computation and other HPC workloads.

“Arm HPC and data center processors play an increasing role in processing large, sparse datasets found in fundamental domains ranging from Graph Analytics to Machine Learning. Among the challenges in processing sparse data is that it has a very low ratio of compute to bytes transferred and exhibits poor spatio-temporal locality. Creating storage-efficient sparse formats and devising engines to offload the (de)compression is essential to improve performance and energy efficiency for HPC and data center systems.”

Mark Nutter, Principal Research Engineer, Arm

Arm Research Collaborations

Our Centers of Excellence are one way in which we collaborate with academia and industry. In a previous blog post, we explored the projects of two other BSC students who were investigating genome sequencing and smart memory controllers. Find out more about our Collaborations team and how we are helping shape the industry. We also support academic research through our Research Enablement program, which provides free, easy access to IP, tools, and support to help drive successful research projects.
