University of Manchester professor Steve Furber has always been interested in modeling the connectivity of the brain. At the end of the last millennium, he had the bright idea to build a computer that could support real-time models of brain subsystems. This would have the potential to significantly impact both brain and computer science.
The research that followed eventually spawned SpiNNaker1 (Spiking Neural Network Architecture), a massively parallel, manycore supercomputer architecture that mimicked the interactions of biological neurons. In SpiNNaker1, Professor Furber’s team had successfully connected a million mobile phone processors together in one machine – a computer that operated in some ways like a brain.
The team has since joined forces with Technische Universität (TU) Dresden to create a sequel: SpiNNaker2, which promises to drive the next generation of artificial intelligence (AI). AI is already making inroads into our daily lives, yet even the best AI hardware is still far from the energy efficiency, low latency and large-scale, high-throughput processing we have inside our heads.
Developed within the EU’s Human Brain Project, and now a commercial reality, SpiNNaker2 is set to change that. It is a hybrid system that combines statistical AI and neuromorphic computing, and is expected to disrupt everything from the management of smart cities and autonomous vehicles, to 5G, the tactile internet, and biomedicine.
Here, Professor Furber and TU Dresden’s Christian Mayr share the thinking behind the SpiNNaker project – and why Arm IP was the intelligent choice.
Steve Furber, University of Manchester
“By connecting a million mobile phone processors, we get a computer with roughly the same processing scale as a mouse’s brain – around 0.1% the size of a human’s.”
A typical cortical neuron has many thousands of inputs, and connects to many thousands of other neurons. A sea slug has around 20,000 neurons; a fruit fly has 100,000. By connecting a million mobile phone processors, we get a computer with roughly the same processing scale as a mouse’s brain – around 0.1% the size of a human’s (100 million neurons to a human’s 100 billion).
This idea was originally a research proposal: SpiNNaker1’s spiking neural networks could support the modeling of brain subsystems in real time and so contribute to brain science. After five years of thinking about the concept, we submitted it for funding in around 2005.
Modeling large-scale neural networks is a massively parallel task. This is something most high-performance computers, built on high-end, complex, number-crunching processors, are very inefficient at. SpiNNaker works on the principle that this can be done by using a very large array of relatively small processors, if they communicate in a way that is optimized for spiking neural networks. That means sending tiny packets that convey just one neuron spike to a very large number of destinations.
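To make that communication model concrete, here is a minimal sketch in Python of key-based multicast spike routing. The class names, table layout and spike keys are illustrative assumptions for this sketch, not SpiNNaker’s actual API; the point is simply that each packet carries only the identifier of the neuron that fired, and the network fans it out to many destinations.

```python
# Illustrative sketch (not SpiNNaker's real API): each spike is a tiny packet
# carrying only the key of the source neuron. A multicast router fans the
# packet out to every core that hosts a neuron listening to that key.

class Core:
    def __init__(self, name):
        self.name = name
        self.spikes_in = []

    def receive_spike(self, neuron_key):
        self.spikes_in.append(neuron_key)

class MulticastRouter:
    def __init__(self):
        # key -> list of destination cores (the multicast routing table)
        self.routes = {}

    def add_route(self, neuron_key, destination_cores):
        self.routes.setdefault(neuron_key, []).extend(destination_cores)

    def route(self, neuron_key):
        """Deliver a one-word spike packet to every subscribed core."""
        for core in self.routes.get(neuron_key, []):
            core.receive_spike(neuron_key)

# A single spike from neuron 42 reaches many destinations,
# yet only the key travels over the network.
router = MulticastRouter()
targets = [Core(f"core-{i}") for i in range(8)]
router.add_route(42, targets)
router.route(42)
print(sum(len(c.spikes_in) for c in targets))  # 8 deliveries from one packet
```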
As to the choice of processor, energy efficiency is key. With massively parallel compute, you can break down the problem into as many small parts as you like. So, you want a simple processor that executes whichever part it is allocated with the highest efficiency.
We chose the Arm968E-S. Processor performance tends to come at the expense of power efficiency, so it was logical for this project to choose a processor that was fairly near the bottom of the performance stack. Arm also seemed the natural place to go. I had led the development of the first Arm processors at Acorn in the 1980s, and was a principal designer of the Arm 32-bit RISC microprocessor.
The chosen processor met our efficiency needs very nicely. While we did explore one or two other options, they did not offer anything like the same degree of support in terms of sophisticated software stacks.
There are 18 of these processors on each SpiNNaker chip, and with each board containing 48 chips, there are a total of 864 processors per board. There are now about 100 SpiNNaker boards out there with research groups scattered around the globe, from the USA to New Zealand, exploring applications in such areas as robotics. We have had to do an awful lot of learning on this journey, not least in what it takes to make a big machine with a million cores operate reliably for its users. That is a non-trivial undertaking.
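For a sense of scale, a quick back-of-the-envelope calculation follows from the figures above; the board count is an estimate derived from those numbers, not an official machine specification.

```python
# Back-of-the-envelope scale of a million-core SpiNNaker1 machine,
# derived only from the per-chip and per-board figures quoted above.
cores_per_chip = 18
chips_per_board = 48
cores_per_board = cores_per_chip * chips_per_board     # 864

target_cores = 1_000_000
boards_needed = -(-target_cores // cores_per_board)     # ceiling division -> 1158

print(cores_per_board, boards_needed)
```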
“Modeling large-scale neural networks is a massively parallel task, which can be done using a very large array of relatively small processors.”
While SpiNNaker’s initial focus was very much on brain research, its potential for commercial applications has now become apparent. In the 20 years since the original idea for SpiNNaker, there has been an explosion in mainstream AI. SpiNNaker1 was built on relatively old processor technology, and the number of processors we can get on the chip really is not state-of-the-art today. Hence the decision to develop SpiNNaker2.
SpiNNaker2 started with the brain modeling fundamentals of its predecessor. It builds on this with the ability to support conventional neural networks for AI applications, as well as hybrid systems that combine neural networks of different styles, including spiking neural networks (SNN) and deep neural networks (DNN). With 10 million cores, fitting into 16 server racks, it’s a key enabler for real-time AI at massive data rates.
As it is being developed in partnership with TU Dresden, SpiNNaker2 benefits from my team’s expertise in system architecture, and TU Dresden’s many years of experience in tapeouts of multiprocessor chips.
Christian Mayr, TU Dresden
The brain as a compute machine has always fascinated me, and I have been working on neuromorphic circuits for 20 years. We are now witnessing what the US Defense Advanced Research Projects Agency (DARPA) is calling the ‘third wave of AI’, this convergence of DNN and SNN.
What started out as a very small research community is now becoming mainstream. I like that. And SpiNNaker is at least on a level with the big guys in the field, and demonstrating real commercial opportunities.
In the early 2000s, as Steve was getting started with SpiNNaker in the UK, my team began working with Heidelberg on a wafer-scale neuromorphic system. Around 2010, Steve joined one of the Heidelberg projects, BrainScaleS, and we began to connect. Like SpiNNaker, BrainScaleS was one of the precursors to the EU’s Human Brain Project, which launched in 2013. Near the start of the Human Brain Project, some of our team traveled to Manchester to show Steve what we had been doing with multiprocessor systems on chip. That’s when he said we should do SpiNNaker2 together.
We decided to use the Arm Cortex-M4 processor. We had plenty of other silicon lying around, because we have always liked to try things out. But in the end, Arm just won out in terms of efficiency.
Steve’s team handled the system design, while our newly designed network-on-chip optimally supported the small packets of data that are key to SpiNNaker. We also introduced numerical accelerators to deliver the power efficiency required by AI.
“If you want to enable pixel-level object tracking, SpiNNaker2 gives you a response inside a millisecond. There is not a single AI system on the planet currently that can do that.”
We have found that SpiNNaker2 performs very well on sparse data. A typical camera stream, for example, has huge redundancy: you can afford to get rid of 90-95% of the data, where nothing is happening. Plus, an AI learning algorithm is usually only interested in specific features. SpiNNaker2 offers a lot of pre-processing that is trainable by the AI, and it only transmits the information that is really relevant. So, if you want to enable something like pixel-level object tracking, SpiNNaker2 gives you a response inside a millisecond. There is not a single other AI system on the planet that can do that at this scale, with the massive data rate involved in running multiple cameras.
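As a rough illustration of the principle, the sketch below keeps only the pixels that changed between two frames, the way an event-style filter would. This is a minimal sketch, not SpiNNaker2’s actual pre-processing pipeline; the threshold, frame size and synthetic frames are made-up values.

```python
import numpy as np

# Illustrative only: keep just the pixels that changed noticeably between two
# frames, as an event-style filter would. Threshold and frame size are made up.
rng = np.random.default_rng(0)
prev_frame = rng.integers(0, 256, size=(480, 640), dtype=np.int16)
next_frame = prev_frame.copy()
next_frame[200:220, 300:340] += 50   # a small moving object; the rest is static

changed = np.abs(next_frame - prev_frame) > 15            # "events"
events = np.argwhere(changed)                              # (row, col) per event

kept_fraction = events.shape[0] / changed.size
print(f"transmitting {kept_fraction:.1%} of the pixels")   # well under 1% here
```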
There are a couple of reasons why SpiNNaker2 works so well. Its memory does not sit inside the single neurons, as it would in a spiking neural network. We try to go event-driven in our processing paradigm, but we use something like deep neural network and recurrent neural network structures there too. In fact, SpiNNaker2 can support a wide range of neuron types, as the neuron models are essentially software-defined. As well as having native hardware support for machine learning-type neurons, used when employing the machine for streaming AI, it could, for example, provide multi-compartment spiking neurons for detailed brain simulations.
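To illustrate what a software-defined neuron means in practice, here is a minimal leaky integrate-and-fire update loop of the kind a general-purpose core could run each timestep. It is a sketch under assumed parameters, not SpiNNaker2’s actual neuron kernel; swapping in a different update function is what changes the neuron type in software.

```python
# Minimal sketch of a software-defined spiking neuron: a leaky
# integrate-and-fire update that an ordinary core can run each timestep.
# Parameters are illustrative, not SpiNNaker2's real kernels.

def lif_step(v, input_current, leak=0.9, threshold=1.0, reset=0.0):
    """One timestep of a leaky integrate-and-fire neuron.

    Returns the new membrane potential and whether the neuron spiked.
    Replacing this function (e.g. with a multi-compartment model, or a
    ReLU-style machine-learning "neuron") changes the neuron type in software.
    """
    v = leak * v + input_current
    if v >= threshold:
        return reset, True   # spike and reset
    return v, False

# Drive one neuron with a constant input and collect its spike times.
v, spikes = 0.0, []
for t in range(50):
    v, fired = lif_step(v, input_current=0.15)
    if fired:
        spikes.append(t)
print(spikes)
```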
In a recent AI competition held by the German government, we used SpiNNaker2-derived hardware and beat systems running optimized DNN and SNN by at least a factor of 10.
SpiNNaker is also strong because of its efficiency. Large models with trillions of parameters, such as those used in language processing, should be able to fit in half a server on SpiNNaker2 in terms of memory and compute.
The last four or five years of the SpiNNaker journey have been especially interesting. The full SpiNNaker2 machine, called SpiNNcloud, is due to be deployed at TU Dresden for research purposes in Autumn 2022. We have now got real bio-derived algorithms that are a lot more efficient than conventional AI. And while a lot of AI machines are restricted in terms of the processing they can do, SpiNNaker2 is a lot more flexible. This means that we can take our cutting-edge computational neuroscience research and transfer it directly over to commercial applications. We have come to realize that SpiNNaker is the optimal solution for the problem of running a smart city in real time, for example.
In May, we legally incorporated a startup, SpiNNcloud Systems, to make SpiNNaker2 commercially available. Since then, we have been hiring and running concept studies, and we have been in discussions this year with a smart city customer. I expect the first large-scale commercial orders to be in before the end of the year.
We are calling this machine the ‘brain in a box’ – another step further along the route of combining SNN and DNN. We have had lots of new ideas about human-computer interaction, and all kinds of new accelerators, and have designed our own mini processor to tie the AI accelerators together for optimal streaming. We may go for five nanometer technology, if we can finance it, and potentially even 3D stacking.
That is to say, we will be scaling capability upwards, but we will also be sizing the machine downwards. So we may well soon have a computer with the processing power of the human brain – inside a desktop machine.