Modern compute systems increasingly lean on accelerators to deliver better performance in the post-Moore era, while patch sensors and other wearables may soon communicate and draw power in new ways. We spoke to Tulika Mitra, Provost’s Chair Professor, and Jerald Yoo, Associate Professor, both at the National University of Singapore, to find out more.
With the end of Moore's law and Dennard scaling, many people are looking to domain-specific accelerators to deliver not only higher performance but also better energy efficiency.
Instead, we are focusing on domain-agnostic accelerators. We are developing a “universal” or “software-defined” accelerator. Right now, a commercial SoC might have dozens of separate accelerators. We want to reduce this as much as possible, so you end up with a few universal accelerators that can accelerate all kinds of workloads.
We are developing against linear algebra, navigation, and sparse-data workloads. AI is of course an important workload, but we are also interested in other workloads that will run on IoT devices, in particular navigation and anything else based on graphs, which is the more challenging type of workload that we are looking at.
Our accelerator is runtime reconfigurable. You can actually reconfigure on a per-cycle basis, though in practice you would not want to do it that often because the power consumption would be large. The trick is to keep the configuration stable for as long as possible, and to reconfigure only when you move from one loop kernel to another.
A huge challenge is how to map an application onto the hardware to get the best performance. This is why it is called a software-defined accelerator: you use a compiler to pre-calculate the mappings. It is computationally challenging; compilation can take days. However, we have developed techniques that bring that time down to just a couple of hours, which makes this approach feasible. Graph applications are still challenging for both the compiler and the hardware. We are taking a hardware-software co-design approach to tackle this.
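To make the mapping problem concrete, here is a deliberately tiny sketch of what "pre-calculating a mapping" can mean: assigning each operation of a loop kernel to a processing element (PE) and a cycle. Everything in it is an assumption made for illustration, including the function name `map_kernel`, the one-cycle latency, and the decision to ignore routing between PEs. It is a greedy list scheduler, not the compiler described above, which has to solve a much harder version of this problem.

```python
from collections import defaultdict

def map_kernel(ops, deps, num_pes):
    """Assign each op a (cycle, pe) slot so that it runs after all of its inputs.

    ops:  list of operation names
    deps: dict mapping an op to the ops it consumes
    Toy model (assumed): one op per PE per cycle, one-cycle latency, no routing cost.
    """
    placed = {}                  # op -> (cycle, pe)
    used = defaultdict(int)      # cycle -> number of PEs already occupied
    remaining = list(ops)
    while remaining:
        progressed = False
        for op in list(remaining):
            preds = deps.get(op, [])
            if all(p in placed for p in preds):
                # Earliest cycle at which every producer has finished.
                cycle = max((placed[p][0] + 1 for p in preds), default=0)
                # Slide to the next cycle that still has a free PE.
                while used[cycle] >= num_pes:
                    cycle += 1
                placed[op] = (cycle, used[cycle])
                used[cycle] += 1
                remaining.remove(op)
                progressed = True
        if not progressed:
            raise ValueError("unsatisfiable dependency in kernel")
    return placed

# Kernel body for y = a*b + c*d
ops = ["mul1", "mul2", "add"]
deps = {"add": ["mul1", "mul2"]}
print(map_kernel(ops, deps, num_pes=2))  # both multiplies share cycle 0
print(map_kernel(ops, deps, num_pes=1))  # everything is serialized
```

Even in this toy, fewer PEs stretch the schedule; a real mapper also has to account for interconnect, registers, and loop-carried dependencies, which is where the long compile times come from.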
More generally, we are looking at how to do this mapping efficiently for SoCs with a mixture of accelerators, both domain-specific and universal. Today, only specialists know how to use them. The challenge for accelerators is to create a suitable API to unlock their power for regular developers. It is really about democratizing heterogeneous SoCs for all software developers.
In the embedded space, Arm is the de facto standard. There are tons of Arm-based platforms available. We have free access to IP and toolchains through Arm Academic Access. The range of IP has been really useful: we are designing an accelerator, but we also need memory controllers and other IP. I cannot even think of making a choice – using Arm was the default choice.
Circuit designers are good at designing devices small enough that they are unobtrusive. I became interested in applying this to biomedical sensors, specifically to analyzing EEG signals to detect seizures and other chronic conditions. The earlier in life you detect them and attempt to manage them, the better the prognosis in general.
ECG (heart signal) is relatively easy to analyze from a biomedical perspective. EEG (brain signal) is much more challenging in both the analog and digital domains because the brain-wave signal is much weaker.
An early assumption that the EEG pattern would become noisier during seizures turned out to be false. In reality, each person's EEG pattern can behave differently. One patient's seizure pattern might correspond to a normal pattern for another patient, and vice versa. So we had to incorporate machine learning to enable a patient-specific approach. We used classical machine learning because neural networks would not work within the modest processing power available in patch sensors.
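The patient-specific idea can be illustrated with a very small classical model. The sketch below is an assumption throughout: the two features (mean energy and line length), the nearest-centroid classifier, the function names, and the synthetic signals are all chosen for illustration and are not taken from the research described here. It shows only the core point, that the same window can be labelled differently depending on whose model evaluates it.

```python
import math

def features(window):
    """Mean energy and line length of one window of samples."""
    n = len(window)
    energy = sum(x * x for x in window) / n
    line_length = sum(abs(b - a) for a, b in zip(window, window[1:])) / n
    return (energy, line_length)

def train_patient_model(windows, labels):
    """Fit one centroid per label using a single patient's labelled windows."""
    sums, counts = {}, {}
    for window, label in zip(windows, labels):
        f = features(window)
        acc = sums.setdefault(label, [0.0] * len(f))
        for i, value in enumerate(f):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def classify(model, window):
    """Return the label whose centroid is closest to the window's features."""
    f = features(window)
    return min(model, key=lambda label: math.dist(f, model[label]))

def synthetic_window(amplitude, n=64):
    """Synthetic alternating signal; a stand-in for EEG, not real data."""
    return [amplitude if i % 2 == 0 else -amplitude for i in range(n)]

labels = ["normal", "normal", "seizure", "seizure"]
# Hypothetical patient A: large amplitude during seizures. Patient B: the opposite.
model_a = train_patient_model([synthetic_window(a) for a in (1.0, 1.2, 3.0, 3.5)], labels)
model_b = train_patient_model([synthetic_window(a) for a in (3.0, 3.5, 1.0, 1.2)], labels)

probe = synthetic_window(3.2)
print(classify(model_a, probe))  # seizure
print(classify(model_b, probe))  # normal
```

A model of this kind needs only a handful of multiply-accumulate operations per window, which is the sort of budget a patch sensor can afford.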
Given the power constraints for this device, we needed access to both the front-end and the back-end design flow. The back end matters because noise coupling and detailed analysis are critical. Without Arm Academic Access it would have been very difficult to get both front- and back-end access for the IPs that we needed, as well as the packages, all of which was critical for my research to get started.
Device communication was initially a problem. The patch sensor has to transmit data to your phone or some other device. Conventional Bluetooth radio uses 2.4 GHz, which the human body strongly absorbs. If your transmitter is on your chest and your receiver is in your back pocket, then indoors the reception will be fine because of reflections. But outdoors, the signal may get lost because the transmit power is not high enough and the body absorbs too much of the energy.
We have solved this by using the human body as a coupling medium. Transmitting a signal across your skin is 20-30 dB more efficient than using a radio frequency channel for the same purpose.
In the future you might have 10-20 wearables, but you will not want to have to charge each of those devices every night. Wireless power transfer is an option, but you will hit the same problem - the human body absorbs a lot of the energy. Fortunately, we can use the same human body coupling to power these devices - literally passing current through the skin.
It is very safe. The level is about two orders of magnitude below the limit in the IEEE standard, and we do not feel it at all. Even when you are just sitting in an office, 50-60 Hz interference from power lines and appliances couples to your body. It is around 7-8 V peak-to-peak, and humans have been exposed to this type of field for 100 years. You should be concerned about appliances, lighting systems, and power lines before you start to worry about the safety of body-coupled power and communication.
It is a satisfying solution because bulky batteries cause so many problems, even degrading the signal quality. Eliminating sensor batteries will alleviate the main limitation of wearables – charging. Body-coupled powering of multiple wearables from a central power pack will help them become pervasive and hopefully reduce battery anxiety.
I am also looking at how we can model underlying micro-architectures to predict the worst-case execution time in a real-time system. Modern CPUs contain many non-deterministic features such as instruction reordering, branch prediction, and caches, so the execution time of an instruction depends on the context. On top of that, the operating system adds memory management and other sources of non-determinism.
Our challenge is to predict the shortest possible time within which execution is guaranteed to complete, in other words the tightest safe upper bound. Right now, commercial solutions provide either high performance or high determinism, but not both. We are looking at how we can add high determinism to an existing high-performance system.
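As a rough illustration of what a worst-case bound looks like once per-block costs are known, the sketch below takes the longest path through an acyclic control-flow graph. The block names, cycle counts, and loop bound are invented for the example, and the hard part described above, deriving trustworthy per-block costs in the presence of caches, branch prediction, and the operating system, is simply assumed to have been done.

```python
from functools import lru_cache

def wcet_bound(cost, succ, entry):
    """Upper bound on cycles from `entry` to any exit of an acyclic CFG.

    cost: block -> worst-case cycles for that block
          (loop bodies are assumed to be pre-multiplied by their loop bound)
    succ: block -> list of successor blocks
    """
    @lru_cache(maxsize=None)
    def longest(block):
        nexts = succ.get(block, [])
        return cost[block] + max((longest(n) for n in nexts), default=0)

    return longest(entry)

# Hypothetical program: a branch, then a loop of at most 10 iterations at 4 cycles each.
cost = {"entry": 5, "then": 20, "else": 8, "loop": 4 * 10, "exit": 3}
succ = {"entry": ["then", "else"], "then": ["loop"], "else": ["loop"], "loop": ["exit"]}
print(wcet_bound(cost, succ, "entry"))  # 68
```

The bound is only safe if every per-block cost is itself a worst case, for example by assuming a cache miss wherever a hit cannot be proven. Being that pessimistic everywhere is what makes bounds loose on high-performance cores.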
Inspired by the use of ultrasound in the biomedical field, I am looking at how we can use air-channel ultrasound to replace drone vision.
The goal is to enable drones to fly at night or in fog. The main challenge is the low frame rate. Car reversing ultrasound sensors have a delay of roughly 0.5 to 1 s before an object is detected. A drone would have crashed in that time.
Our research has focused on boosting the frame rate to around 24 frames per second, which is sufficient for drone navigation purposes, while using relatively low-cost hardware. To achieve this, we had to compromise on resolution.
The resolution of our system is minimal, but for collision avoidance you do not need to know what the object is, only that there is an object there. Low resolution is good enough for this.
There are other viable approaches we could have taken. Lidar offers higher resolution. Radar offers a faster response. But ultrasound is the only one that would fit into the cost envelope for a $200 drone.
The next step is to complement this sensor with a camera to create a multi-modal sensing system for night-time and foggy conditions. This work has relied heavily on Arm products and compilers. The Academic Access program has really benefitted me.
Tulika Mitra is Vice-Provost (Academic Affairs) and Provost’s Chair Professor of Computer Science at the National University of Singapore.
Jerald Yoo is an Associate Professor with the Department of Electrical and Computer Engineering at the National University of Singapore.
Arm offers free access to a wide range of commercially-proven Arm IP, tools, and other resources – to enable you to do your best research work, on your own terms.