Co-Designing Memory Technology and ML Models for the Edge | Paul Whatmough

Abstract

Deep neural network (DNN) inference is a prevalent workload on edge devices and is currently driving up SoC performance specifications. One of the biggest challenges on resource-constrained edge platforms is memory. In this talk, we will survey both conventional and emerging memory technologies relevant to chip design for edge ML, discussing power, performance, and area trade-offs. We will also argue that it is essential to consider memory constraints not only during SoC design, but throughout the co-design of the hardware and the ML model itself.

Bio

Paul Whatmough received the B.Eng. degree in electronic communications engineering from the University of Lancaster in 2003, the M.Sc. degree in communications systems and signal processing from the University of Bristol in 2004, and the Doctorate degree in electronic engineering from University College London in 2012, all in the UK. From 2005 to 2008, he was a Research Scientist at Philips/NXP Research Labs, UK, working on hardware architecture and signal processing for software-defined radio. From 2008 to 2015, he was with the Silicon R&D department at ARM Ltd., UK, working on hardware accelerators, digital signal processing (DSP), variation tolerance, supply voltage noise, and circuits and systems for emerging IoT applications. From 2015 to 2017, he was a Research Associate at Harvard University, MA, leading interdisciplinary research on machine learning. He currently leads research on hardware for machine learning at the Arm ML Research group, Boston, MA, and is a part-time Associate at Harvard University.

Find out more

Visit https://www.esscirc-essderc2020.org/edge-ai-in-memory-computing to register