What would you do with a Cortex-M4, a motor or two, some lego and a few cable ties? Well, if you’re Sebastian Förster, an embedded systems developer based in Germany, the answer is a small, four-legged robot that you’d teach to walk using neural networks.
The robot – otherwise known as Scratchy – has four servomotors for legs, an ultrasonic capsule for distance measurement and is controlled by an STM32F407 Discovery board.
So, how did Scratchy do? See for yourself, in Sebastian’s video…
Mark Connor, Arm’s Director of Deep Learning, interviewed Sebastian to find out what led him to create Scratchy, how he went about it and what he learned from the experience.
“The thesis I did for my master’s degree was about the suitability of applying machine learning to smaller Cortex-M processors – looking at concrete examples of neural networks and running performance tests on them. As part of that, I ported the FANN neural network library to the Cortex-M4 and was looking for something less academic and more fun to try it out on. I wanted to work on something tangible, so I decided on a robot. As you can see, I didn’t put a lot of work into it – just connected a few Lego bricks to some motors and an STM32F4 development board with enough flash and SRAM.
“The conclusion of my thesis – which is also clearly illustrated by my success with Scratchy – was that it is absolutely possible to run heavy machine learning algorithms on small, Cortex-M-based devices.”
“No, as far as I’m concerned, that’s what AI is for! Scratchy was constructed in a way that allowed me to train the forward and backward gaits independently, but I must admit that I was surprised that it worked without knee joints. I decided to use Q-learning because DeepMind had had such success with their Atari Q-learner, and I was able to build on their work to write the Q-learning agent. The FANN library is developed by others under the LGPL; I simply ported it to the Cortex-M4.”
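As a rough illustration of the approach Sebastian describes, the sketch below shows a single Q-learning update with a FANN feed-forward network standing in as the Q-function. The state encoding, number of actions, reward signal and learning constants are placeholder assumptions for illustration, not Scratchy’s actual values.

```c
#include "fann.h"

#define NUM_INPUTS   4      /* assumed state features (e.g. servo positions) */
#define NUM_ACTIONS  4      /* assumed discrete servo commands */
#define ALPHA        0.5f   /* learning rate (placeholder) */
#define GAMMA        0.9f   /* discount factor (placeholder) */

/* One Q-learning update with the Q-function approximated by a FANN net:
 * Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). */
static void q_learning_step(struct fann *ann,
                            fann_type state[NUM_INPUTS], int action, float reward,
                            fann_type next_state[NUM_INPUTS])
{
    fann_type target[NUM_ACTIONS];
    fann_type *q;
    float best_next = -1e9f;
    int i;

    /* Max Q over actions in the next state (fann_run reuses one output
     * buffer, so extract what we need before running the net again). */
    q = fann_run(ann, next_state);
    for (i = 0; i < NUM_ACTIONS; i++)
        if (q[i] > best_next) best_next = q[i];

    /* Current Q-values become the training target, except for the action taken. */
    q = fann_run(ann, state);
    for (i = 0; i < NUM_ACTIONS; i++)
        target[i] = q[i];
    target[action] = q[action] + ALPHA * (reward + GAMMA * best_next - q[action]);

    fann_train(ann, state, target);   /* one incremental backprop step */
}
```

A plausible reward signal is the change in the ultrasonic distance reading between steps, though the interview does not spell that mapping out.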
“The topology was defined by the limits of the SRAM. The neural network could have been a lot bigger, but I wanted to train it directly on the Cortex-M4 and the additional variables use quite a bit of memory. In my opinion, it doesn’t make much difference whether you use two or three feed-forward layers, although I didn’t compare them directly – I wanted to stress the processor a bit!”
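For context, building a small feed-forward network with FANN’s standard C API looks roughly like the sketch below. The layer sizes are placeholders chosen to sit comfortably inside the SRAM budget, since the exact topology isn’t stated.

```c
#include "fann.h"

/* Hypothetical topology: layer counts and sizes are illustrative only. */
struct fann *create_gait_net(void)
{
    /* Four layers in total: inputs -> hidden -> hidden -> one output per action. */
    struct fann *ann = fann_create_standard(4, 4, 12, 12, 4);
    if (ann == NULL)
        return NULL;

    fann_set_activation_function_hidden(ann, FANN_SIGMOID_SYMMETRIC);
    fann_set_activation_function_output(ann, FANN_LINEAR);

    /* Incremental (online) training suits step-by-step on-device RL updates. */
    fann_set_training_algorithm(ann, FANN_TRAIN_INCREMENTAL);
    fann_set_learning_rate(ann, 0.5f);
    return ann;
}
```

The memory pressure Sebastian mentions comes not just from the weights but from the extra state the library keeps while training, which is why training on-device constrains the topology more than inference alone would.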
“There are very few bare-metal C (not CUDA) frameworks that will even fit into 512 KB of flash and 256 KB of SRAM. I was lucky to find FANN, but I did have to write a small file system so the library could load the saved network weights directly from flash.”
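On the library side, that persistence goes through FANN’s standard save and load calls; the minimal sketch below uses a hypothetical file name and assumes Sebastian’s small flash-backed file layer sits underneath these calls (the details of that layer are not described in the interview).

```c
#include "fann.h"

/* FANN stores a trained network as a ".net" configuration file; the name
 * used here is a placeholder. */
int save_weights(struct fann *ann)
{
    return fann_save(ann, "scratchy.net");        /* 0 on success */
}

struct fann *load_weights(void)
{
    return fann_create_from_file("scratchy.net"); /* NULL on failure */
}
```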
“Whatever you do, don’t give up in your search for features and hyperparameters. If you can imagine something, you can create it!”
German speakers can read Sebastian’s original blog post on WordPress.