Machine vision is perhaps one of the few remaining areas in technology that can still make you say "I didn't know computers could do that". The recent pace of development has been relentless. On the one hand you have the industry giants competing to outdo each other on the ImageNet challenge, surpassing human visual recognition capabilities along the way; on the other, there is significant, sustained progress in bringing this technology to smart, mobile devices. On May 11 and 12, industry experts and leaders gathered in Santa Clara, California for the annual Embedded Vision Summit to discuss the latest developments. Additionally, this year ARM was proud to host a special seminar linked to the main event, covering developments in computer vision on ARM processor technologies. In this blog I'm going to share my perspective on some of the highlights from both events.
The Santa Clara Convention Centre, California. Host to both the ARM workshop and the EVA Summit
It was my great pleasure to host this event, and for those of you who were there I hope you enjoyed the afternoon's presentations and panel discussion. The proceedings from the seven partner presentations can all be downloaded from here. This event – the first of its kind ARM has held on computer vision – was intended to bring together leaders and experts in computer vision from across the ARM ecosystem. The brief was to explore processor selection, optimisation, balancing workloads across processors, debugging and more, all in the context of developing computer vision applications. This covered both CPU and NEON™ optimisations, as well as working with Mali™ GPUs.
With a certain degree of cross-over, the seminar program was divided into three broad themes:
Jeff Bier from BDTI gets things going with his presentation about processor selection and benchmark criteria
In addition to the above presentations, Roberto Mijat hosted a panel discussion looking at current and future trends in computer vision on mobile and embedded platforms. The panel included the following industry experts:
It was a fascinating and wide-ranging discussion with some great audience questions. Roberto asked the panellists what had stood out for them in computer vision developments to date. Laszlo Kishonti talked about the increasing importance of intelligence embedded in small chips within cameras themselves. Michael Tusch echoed this, highlighting the problem of high-quality video from IP cameras saturating networks. Embedding analysis within the cameras and then uploading only selected portions, or even metadata describing the scene, would mitigate this significantly. Tim Droz stressed the importance of the industry moving away from the pixel-count race and concentrating instead on sensor quality.
Roberto then asked for the panellists' views on the most compelling future trends in the industry. Michael Tusch discussed how important it will be, in the smart homes and businesses of the future, to distinguish and identify multiple people within a scene, in different poses and at different sizes, and to determine the trajectories of objects. This will need flexible vision processing abstractions built around understanding the target you are trying to identify: you cannot assume one size or algorithm will fit all cases. Michael foresees the advent of engines capable of enabling this flexible level of abstraction for computer vision applications, just as GPUs do for graphics.
Laszlo Kishonti talked about future health care automation, including sensor integration in hospitals and the home, how artificial intelligence in computer vision for security is going to become more important, and how vision is going to enable the future of autonomous driving. Laszlo also described the need for what he sees as the third generation of computer vision algorithms. These will require levels of sophistication that can, for example, differentiate between a small child walking safely hand-in-hand with an adult and one at risk of running out into the road. This kind of complex mix of recognition and semantic scene analysis was, said Laszlo, vital before fully autonomous vehicles can be realized. It brought home to me both the importance of ongoing research in this area and perhaps how much further computer vision technology still has to develop.
Tim Droz talked about the development of new vector processors flexible enough for a variety of inputs, about HDR (high dynamic range, combining multiple images taken at different exposures) becoming ubiquitous, and about low-level OpenCL implementations in RTL. He also talked about plenoptic (light-field) cameras, which allow re-focusing after an image is taken, becoming much smaller and more efficient in the future.
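To give a feel for the HDR idea mentioned above – combining differently exposed images of the same scene – here is a minimal, illustrative sketch of naive exposure fusion. It is not any specific vendor's algorithm; the `sigma` parameter and the well-exposedness weighting are assumptions chosen for simplicity (real pipelines add alignment, multi-scale blending and tone mapping):

```python
import numpy as np

def fuse_exposures(images, sigma=0.2):
    """Naive exposure fusion: blend differently exposed frames of the same
    scene, weighting each pixel by how close its value is to mid-grey,
    so blown-out and underexposed pixels contribute less.
    `images` is a list of equally shaped float arrays with values in [0, 1]."""
    stack = np.stack([np.asarray(im, dtype=np.float64) for im in images])
    # Gaussian "well-exposedness" weight centred on 0.5
    weights = np.exp(-((stack - 0.5) ** 2) / (2.0 * sigma ** 2))
    weights /= weights.sum(axis=0, keepdims=True)  # normalise per pixel
    return (weights * stack).sum(axis=0)

# Two synthetic "exposures" of a flat scene: one dark, one bright
dark = np.full((4, 4), 0.1)
bright = np.full((4, 4), 0.9)
fused = fuse_exposures([dark, bright])
```

Because the two synthetic frames sit symmetrically around mid-grey, they receive equal weight and the fused result lands between them; with real photographs, each region is dominated by whichever exposure captured it best.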
The panel ended with a lively set of questions from the audience, wrapping up a fascinating discussion.
Gian Marco Iodice talks about accelerating a real-time dense passive stereo vision algorithm
Overall it was a real pleasure to see so many attendees so engaged with the afternoon and we are grateful to all of you who joined us on the day. Thanks also to all our partners and panellists whose efforts led to a fascinating set of presentations and discussions.
The presentations from the seminar can be downloaded here: EVA Summit 2015 and ARM’s Computer Vision Seminar - Mali Developer Center
The annual Embedded Vision Summit is the industry event hosted by the Embedded Vision Alliance, a collection of around 50 companies working in the computer vision field. Compared to the 2014 event, this year saw the Summit grow by over 70%, a real reflection of the growing momentum and importance of embedded vision across all industries. Over 700 attendees had access to 26 presentations on a wide range of computer vision subjects arranged into six conference tracks. The exhibition area showcased the latest work from 34 companies.
See below for links to more information about the proceedings and for downloading the presentations.
Dr. Ren Wu, Distinguished Scientist from Baidu delivered the first of two keynotes, exploring what is probably the hottest topic of the hour: visual intelligence through deep learning. Dr. Wu has pioneered work in this area, from training supercomputers through to deployment on mobile and Internet of Things devices. And for robot vacuum cleaner fans – and that’s all of you surely – the afternoon keynote was from Dr. Mike Aldred from Dyson who talked about the development of their 360° vision (and ARM!) enabled device which had earlier entertained everyone as it trundled around the exhibition area, clearing crumbs thrown at it by grown men and women during lunch.
ARM showcased two new partner demos at the Summit exhibition: SLAMBench acceleration on Mali GPU by the PAMELA consortium and video image stabilization in software with Mali acceleration by FotoNation
The six conference tracks covered a wide range of subject areas. Following on from Ren Wu's keynote, deep learning and CNNs (convolutional neural networks) made a notable mark with a track of their own this year. There were also tracks covering vision libraries, vision algorithm development, 3D vision, business and markets, and processor selection. In this final track, Roberto Mijat followed on from ARM's previous day's seminar with an examination of the role of GPUs in accelerating vision applications.
Roberto Mijat discusses the role of the integrated GPU in mobile computer vision applications
A list of all the speakers at this year's Summit can be found here: 2015 Embedded Vision Summit Speakers
All the papers from the event can be downloaded here (registration required): 2015 Embedded Vision Summit Replay