GenASM: A Low-Power, Memory-Efficient Approximate String Matching Acceleration Framework for Genome - Damla Senol Cali, Carnegie Mellon University

This talk was presented at the virtual Arm Research Summit, September 9-11, 2020. This year's event explored global technology challenges across sustainability, security, and society, and attracted delegates from around the world for three days of innovative content.

Genome sequence analysis has the potential to enable significant advancements in areas such as personalized medicine, outbreak tracing, and evolution. To perform genome sequencing, devices extract small fragments (known as reads) of an organism’s DNA, and a computational process known as genome assembly must be used to reassemble the fragments into a complete human-readable DNA sequence. Unfortunately, rapid genome sequencing is currently bottlenecked by the computational power and memory bandwidth limitations of existing systems, as many of the steps in genome assembly must process a large amount of data. The largest contributor to this bottleneck is approximate string matching (ASM), which is used at multiple points during the assembly process. While ASM enables genome assembly to account for sequencing errors and mutations in the reads, many ASM algorithms scale poorly for larger sequences. Damla proposes GenASM, an ASM acceleration framework for genome sequence analysis. GenASM performs bitvector-based ASM, which can efficiently accelerate multiple steps of genome sequence analysis. Damla modified the underlying ASM algorithm (Bitap) to significantly increase its parallelism and reduce its memory footprint. Using this modified algorithm, she designed the first hardware accelerator for Bitap.
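To make the idea of bitvector-based ASM concrete, here is a minimal Python sketch of the classic baseline Bitap algorithm (the Wu-Manber formulation with up to k edits). It is not GenASM's modified algorithm or its hardware design, which the abstract does not detail. The function name `bitap_search` and its return convention (0-based end positions in the text) are assumptions made for illustration.

```python
def bitap_search(text, pattern, k):
    """Return the 0-based end positions in `text` where `pattern`
    matches with at most `k` edits (substitution, insertion, deletion).

    Baseline Bitap sketch; illustrative only, not GenASM's variant.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # One bitmask per character: bit i is set if pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    full = (1 << m) - 1      # keep only the m low bits
    accept = 1 << (m - 1)    # bit set when the whole pattern has matched

    # R[d] bit i set: pattern[0..i] matches a text suffix with <= d edits.
    # Initially, a prefix of length d can be "matched" using d deletions.
    R = [((1 << d) - 1) & full for d in range(k + 1)]

    ends = []
    for j, ch in enumerate(text):
        mask = masks.get(ch, 0)
        prev_old = R[0]                      # R[d-1] before this character
        R[0] = ((R[0] << 1) | 1) & mask      # exact-match transition
        for d in range(1, k + 1):
            cur_old = R[d]
            R[d] = (
                (((cur_old << 1) | 1) & mask)        # match
                | prev_old                           # extra text character
                | (((prev_old | R[d - 1]) << 1) | 1) # substitution / skipped pattern character
            ) & full
            prev_old = cur_old
        if R[k] & accept:
            ends.append(j)
    return ends
```

Each text character updates k + 1 bitvectors using only shifts, ANDs, and ORs, which is what makes this family of algorithms a natural fit for hardware. In this baseline form, however, the update for edit level d depends on the result for level d - 1, so the levels are computed in sequence.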

We hope to see you at the Arm Research Summit 2021, set to take place in the UK. Visit arm.com/summit to stay up to date and register your interest to attend or submit your work!
