Arm Mbed at Data Science Africa 2017: Part 2

Part II - Karibu Sana, Asante sana

by Damon Civin, Principal Data Scientist at Arm

Conversation in Tanzania often starts with the phrase "Karibu Sana," which translates to "you are welcome." In many other places, you'd have to say "thank you" ("Asante sana") before being told you're welcome. The Data Science Africa 2017 conference (DSA2017) was full of welcome surprises and unique features.

The first welcome surprise was being able to retrieve Jan's lost box of Mbed boards and sensors. No government officials necessary, just one Jan on a mission, making his way into the arrivals part of the airport and grabbing his box.

In many more important ways, the Data Science Africa conference is a welcome change to the conferences I am used to. Here's why.

(Slides and notebooks from the talks can be found here, videos here)

Reason 1: Commitment

Day 1 kicked off with introductory sessions on data science. Neil Lawrence (Amazon) gave a characteristically inspiring overview of the field, followed by introductions to Jupyter Notebook and the art and science of classification by Ernest Mwebaze (Makerere University) and Martin Mubangizi (UN Global Pulse). The students (many of whom had spent upward of 10 hours on an overnight bus to get there) accelerated through the practical sessions, and many were coding and experimenting well into the evening.

Reason 2: High engagement

Day 2 was Arm's time to shine with the IoT. Jan did an amazing job of getting the tutorial ready (in the pub) and building a dance routine into it, and the students loved it!

Here are some of the students explaining their work themselves:

The willingness to collaborate, learn and experiment was infectious and led to more coding into the night.

Reason 3: Focus on problem solving

John Quinn (UN Global Pulse) started the third day with a fascinating introduction to spatial data analysis using QGIS and GPy. This isn't something usually taught in data science courses, though it is powerful and useful! His tutorial explained the concepts through relevant, practical applications, such as regional vegetation health, disease outbreaks, air quality and commute times.

His quote above was also my favourite of the conference - all data science practitioners should take it to heart. Even Ralf Herbrich, who runs machine learning at Amazon, remarked: 

"One of the best things about  is customer focus in the form of putting the problems and use cases first and tech as a tool."

Ralf went on to give an insightful talk on uses of machine learning at Amazon. A few highlights for this community:

He explained further that power-efficient GPUs will become necessary to sustain this business model as the industry tackles more predictions and more complex problems. One reason this is interesting is that neural networks are used for translation, and the quality of product descriptions contributes directly to profit for online retailers. Also, AI is better than humans at predicting strawberry freshness now.

Moustapha Cisse of Facebook presented new work on the frontiers of machine learning, especially learning from less data, and how to fool neural networks. Also, after his talk, I may well be a PyTorch convert. It's so similar to NumPy that moving to large scale on GPU is much easier if you are used to small data situations.
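
To give a flavour of that similarity, here is a minimal sketch of my own (not taken from Moustapha's talk) that runs the same small computation in NumPy and in PyTorch. The array shapes, the variable names and the choice of `tanh` are arbitrary assumptions for illustration, and the code falls back to the CPU when no GPU is available.

```python
import numpy as np
import torch

# Small random inputs; shapes and seed are arbitrary, for illustration only.
rng = np.random.default_rng(0)
x_np = rng.standard_normal((4, 3))
w_np = rng.standard_normal((3, 2))

# NumPy version: matrix product, elementwise tanh, column sums (CPU only).
y_np = np.tanh(x_np @ w_np).sum(axis=0)

# PyTorch version: almost the same syntax, but the tensors can live on a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x_t = torch.from_numpy(x_np).to(device)
w_t = torch.from_numpy(w_np).to(device)
y_t = torch.tanh(x_t @ w_t).sum(dim=0)

# Move the result back to the CPU and check that both versions agree.
assert np.allclose(y_np, y_t.cpu().numpy())
```

The only real differences are the explicit `device` and `dim` in place of `axis`, which is why the jump from small NumPy experiments to GPU-scale work feels so short.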

The talks at the workshop focused on solving real-world problems through data science.

Morris Agaba (NMAIST) presented work on biodiversity and isolating genetic causes for mutations.

Daniel Mutembesa and Ernest Mwebaze (Makerere University) spoke about using mobile apps to gather agricultural data to monitor and classify plant pests and diseases. They are finishing the pilot program and looking to scale it. Though the mobile data networks in East Africa are strong, the phones available to most people are feature phones, not smartphones, which makes app development significantly more difficult. The rate of adoption of smartphones will be a major factor in the pace of field-based data science innovation.

Dina Machuve (NMAIST, organiser of DSA2017 and all-round hero) presented some upcoming work on predicting banana diseases from weather data and providing the insights to farmers through mobile apps. Check out this interview with her!

Here is the talk I gave about using Arm Mbed for data science. The rest of the talks mentioned above (and more) are on the playlist, too.