The Apache TVM community continues to grow. Last year the conference hosted roughly 400 engineers and researchers. This year the conference is going virtual.
What is TVM? I thought I would start with this because this question has come to me a lot over the past year from across the industry – from both investors and developers. However, instead of getting into what it is, let us talk about the who.
TVM is a project built by an engineering-first community. TVM focuses on bridging the gap between academia and industry in a way that benefits both. It also focuses on bridging the gap between the world of many different frameworks and the world of many different hardware targets. The community consists of many engineers and researchers from different institutions. Although these organizations vary widely, they all build tools focused on ML, so they share many of the same problems. The community provides common ground for creating solutions to those shared problems.
Arm is a significant contributor to TVM. Arm is present across the breadth of the ML space – we have the Cortex-A line of processors in the full operating system space and the Cortex-M line, focused on embedded systems. The two are extremely different when it comes to ML performance and development flows. That's not even mentioning other types of processors, including GPUs and NPUs. TVM works across all these systems with their varied requirements.
To learn more about TVM and how Arm is contributing to the TVM project, be sure to attend the Apache TVM and Deep Learning Compilation Conference from December 2nd-4th, 2020. Ramana Radhakrishnan (Senior Principal Software Engineer at Arm) and Jem Davies (VP, Fellow, and GM of ML at Arm) are presenting at the conference. Arm AI ecosystem partner OctoML is also a significant contributor and will be presenting. This is an excellent opportunity to learn not only more about TVM in general, but also how companies like Arm, OctoML, AWS, Microsoft, Alibaba, and more are using TVM in real-world AI solutions.
To summarize as best I can as someone who loathes getting close to compilers, TVM is a compiler for ML workloads. This makes it an alternative to an interpreted framework, like TensorFlow or PyTorch. Even on embedded devices, TensorFlow Lite for Microcontrollers is interpreted, which means it needs a runtime to translate trained models into the operations the device executes. TVM is built as a compiled framework first. The highlight is AutoTVM, which is the auto-magical way to compile and tune code for new models. Right now, TVM is not always hitting the performance of human-written kernels (you can still keep your jobs, kernel engineers), and those kernels are still used in its stack. For more details, see this blog from OctoML.
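To make the compiled-versus-interpreted distinction concrete, here is a minimal sketch (not from the conference material) of compiling a trained model ahead of time with TVM's Python API, roughly as it looked in the 0.7-era releases. The model file, input name, and input shape are made-up placeholders.

```python
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_runtime  # renamed graph_executor in later TVM releases

# Load a trained model (hypothetical file and input shape) and convert it to Relay IR.
onnx_model = onnx.load("model.onnx")
mod, params = relay.frontend.from_onnx(onnx_model, shape={"input": (1, 3, 224, 224)})

# Compile ahead of time for a concrete target. "llvm" means the local CPU;
# a cross target such as "llvm -mtriple=aarch64-linux-gnu" would aim at an Arm board.
target = "llvm"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

# The result is a deployable module: load it and run inference,
# no framework interpreter required.
dev = tvm.cpu(0)
module = graph_runtime.GraphModule(lib["default"](dev))
# module.set_input("input", data); module.run(); out = module.get_output(0)
```

The point of the sketch is the shape of the workflow: the model is lowered and compiled once for a specific device, and only the small compiled artifact ships to that device.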
However, it is performant without human-written kernels. As someone who has never written a kernel and never wants to, this is great news. What if I want to build a custom model with a custom layer on a piece of hardware that has no kernel support but has TVM support? Cool. I can. It is no longer an "it is not supported" answer. Functional but not optimal is better than "in progress" in many situations, including a marathon. And as we all know in the world of ML, as-good-as-human is not far away.
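As a rough illustration of how that works in practice, the sketch below follows the standard AutoTVM tuning flow, continuing from the compile sketch above (mod, params, and target come from there; the log file name and trial count are made up): extract the tunable operator tasks from the model, let a tuner search for good schedules by measuring candidates on the device, then rebuild using the best results instead of hand-written kernels.

```python
import tvm
from tvm import relay, autotvm

# Extract the tunable operator tasks (e.g. the custom layer's convolutions) from the model.
tasks = autotvm.task.extract_from_program(mod["main"], target=target, params=params)

# Measure candidate schedules on the local machine; an RPC runner would be used
# for a remote Arm board.
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(number=10),
)

# Let a machine-learning-based tuner search the schedule space for each task.
for task in tasks:
    tuner = autotvm.tuner.XGBTuner(task)
    tuner.tune(
        n_trial=100,  # made-up budget; real tuning runs use far more trials
        measure_option=measure_option,
        callbacks=[autotvm.callback.log_to_file("tuning.log")],
    )

# Rebuild the model with the best schedules found during tuning.
with autotvm.apply_history_best("tuning.log"):
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)
```

This is what "functional but not optimal" looks like in code: no kernel library is required up front, and the tuner buys back performance by searching rather than by someone hand-writing assembly.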
We shall see. Probably nothing. They have been around a long time and I do not see them going anywhere. Also, in terms of fast iteration, interpreted is great. If I want to train and test a model's accuracy, I am going to do it interpreted before I compile to a specific device.
Whether it is a monumental shift in the way ML inference is done or just the new thing that a significant portion of the industry is working on, it is worth checking out. How? Go to the conference. It started yesterday, it is free, and it is virtual, as all things in life should be. I will see you there (not really, but as much as I see anyone nowadays).
[CTAToken URL="https://www.eventbrite.com/e/tvm-conf-2020-tickets-127421618491" target="_blank" text="Register for TVM 2020" class="green"]