Making FaaS aWsm: An efficient serverless Wasm runtime for the Edge

Ludmila Cherkasova
January 7, 2021

Serverless computing, also known as Function-as-a-Service (FaaS), offers a compelling new paradigm: it lets users run their code (a small application dedicated to a specific task) without being concerned about operational issues. Under this model, the Cloud provider is responsible for server provisioning and resource management. For the last two years, serverless has been the fastest-growing Cloud service, seeing 50% growth in 2019 over the previous year [1]. Since the launch of Amazon Lambda in 2014, numerous Cloud providers have released alternative serverless platforms.

Serverless applications are intended to be event-driven and stateless. A function instance is created when a triggering condition occurs, executes its predefined function, and is shut down when finished. In a commercial system, the user is charged on a per-invocation basis, without paying for unused or idle resources. The serverless model favors applications with good parallelism (for example, video encoding, where different frames can be processed concurrently) and workloads with intermittent activity, such as data processing triggered by Edge devices. This makes it a natural fit for Internet of Things (IoT) environments.
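
To make the model concrete, the sketch below shows what such a stateless function might look like in C. It is purely illustrative: the event structure and handler signature are hypothetical stand-ins, not any specific platform's API.

```c
/* Minimal sketch of a stateless FaaS handler (illustrative only;
 * the struct and handler signature are hypothetical, not a real
 * platform API). The platform instantiates the function when an
 * event arrives, invokes it once, and tears the instance down. */
#include <stdio.h>

/* Hypothetical event payload passed in by the platform. */
struct event {
    const char *body;   /* request data, e.g. one video frame's metadata */
    size_t      len;
};

/* The entire "application": one task, no state retained between calls. */
int handle(const struct event *ev, char *resp, size_t resp_len)
{
    /* All state lives in the event and the response buffer, so many
     * instances can safely run concurrently on independent inputs. */
    snprintf(resp, resp_len, "processed %zu bytes", ev->len);
    return 0;  /* billed per invocation; the instance may now be reclaimed */
}

int main(void)
{
    struct event ev = { .body = "frame-0042", .len = 10 };
    char resp[64];
    handle(&ev, resp, sizeof resp);
    puts(resp);
    return 0;
}
```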

Many novel IoT applications require low-latency data processing and near real-time responses, as in connected and autonomous cars. Imagine a world where cars, because of their ability to communicate, can alert drivers about dangerous road conditions through related data services. You might hear, “Black ice on the road in front of you - right lane in 200 meters”. Real-time performance is expected for detection and control in many industrial and enterprise systems, and some scenarios require a response within 10 milliseconds (ms). While Cloud computing provides a good solution for applications designed around human perception speeds, it becomes inadequate for novel latency-critical applications that rely on fast, automated decisions with no human in the loop. To satisfy the performance requirements of such workloads, the data must be processed closer to its source, an approach known as Edge computing.

Existing Cloud-based serverless frameworks execute function instances in short-lived Virtual Machines (VMs) or containers, which provide application process isolation and resource provisioning. These frameworks are too heavyweight for Edge systems: they have a large memory footprint (from 100 MB up to GBs) and high function invocation times (125 ms to 1 s). Such deployments involve substantial unnecessary redundancy and little resource sharing, as shown in Figure 1(a, b), where blue reflects the shared software and parts of the system. Another critical difference between Cloud and Edge computing is that the Cloud draws on the "unlimited" computing resources of multiple data centers, whereas the Edge is a limited, resource-constrained environment whose resources must be managed carefully. At the Edge, long-lived and over-provisioned containers/VMs can quickly exhaust a node's limited resources and become impractical for serving many IoT devices. Supporting a large number of serverless functions while providing low response times, say 10 ms, is one of the main performance challenges for resource-constrained Edge computing nodes.

Figure 1: (a) VM-based Serverless (for example, AWS Lambda using Firecracker, Microsoft Azure Functions using Hyper-V, and so on). (b) Container-based Serverless (OpenWhisk, Google Cloud Functions, and more). (c) Container + Processes-based Serverless (Nuclio). (d) Sledge: a Wasm-based Approach for Serverless at the Edge.

WebAssembly as a solution for the Edge

WebAssembly (Wasm) is a nascent but fast-evolving technology that provides strong memory isolation (through sandboxing) with a much smaller memory footprint than VMs and containers. Wasm lets users write functions in different languages (for example, C, C++, C#, Go, and Rust), which are compiled into a platform-independent bytecode. Wasm runtimes can leverage various hardware and software technologies to provide isolation and manage resource allocations.
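
As a small illustration of this workflow (not taken from the paper), a plain C function can be compiled to a portable .wasm module with a stock clang/LLVM toolchain; the exact flags can vary by toolchain version.

```c
/* add.c - a function compiled to platform-independent Wasm bytecode.
 * One way to produce a .wasm module from C with stock clang/LLVM
 * (command shown as a comment; flags may vary by toolchain version):
 *
 *   clang --target=wasm32 -nostdlib -O2 \
 *         -Wl,--no-entry -Wl,--export-all \
 *         -o add.wasm add.c
 *
 * The resulting add.wasm can then be interpreted, JIT-compiled, or
 * AoT-compiled by a Wasm runtime on any host architecture. */
int add(int a, int b)
{
    return a + b;
}
```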

Since many existing Wasm compilers and runtimes exhibit significant overheads compared to native execution, we implemented our own LLVM-based ahead-of-time (AoT) Wasm compiler, named aWsm (pronounced “awesome”). It offers configurable sandboxing and is optimized for performance. Several recent Wasm papers, written over a period of three years, are devoted to optimizing Wasm compilers and the performance of the resulting code. In 2017, only 7 out of 30 PolyBench/C benchmarks performed within 1.1 times of native execution [2]; by May 2019, improved Wasm compilers brought that number to 13 out of 30 [3]. In our paper, though the focus is on the serverless runtime, we demonstrate that the aWsm compiler performs within 1.1 times of native execution for 24 out of 30 PolyBench/C benchmarks. We evaluated the aWsm compiler and its runtime on the x86_64 and AArch64 architectures, showing average performance overheads for the PolyBench/C benchmarks (compared to native code execution) within 13% and 7%, respectively. Additionally, we compared aWsm with various existing LLVM- and Cranelift-based Wasm compilers and runtimes to demonstrate its efficiency and performance. Please see our ACM/IFIP/USENIX Middleware’2020 paper for these interesting details.

In our work, we propose a new serverless-first infrastructure, Sledge (ServerLess at the Edge runtime), optimized for the properties of low-latency serverless execution at the Edge. We focus on serverless runtimes for single-host servers, ranging from powerful multiprocessor servers to low-cost systems such as the Raspberry Pi. Sledge provides lightweight function instantiation and isolation facilities. The memory footprint of functions has a significant impact on “cold-start” performance and scalability at the resource-constrained Edge. The single-process Sledge runtime binary is only 359 KB. It lets functions share library dependencies while providing strong spatial and temporal isolation for multi-tenant function executions. The AoT-compiled shared objects are between 108 KB and 112 KB in size, significantly smaller than VM- and container-based function isolation, which often runs to tens or hundreds of MBs. Our framework enables lightweight (30 μs) function startup times and efficient handling of the high churn of request rates in Edge systems.
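
The spatial isolation Sledge relies on comes from Wasm's linear-memory model: every access a function makes is confined to its own sandbox. The sketch below illustrates that general idea in C; it is a simplified illustration, not Sledge's actual implementation (real runtimes typically use compiled-in bounds checks or guard pages rather than an assert).

```c
/* Simplified sketch of Wasm-style spatial isolation (illustrative
 * only, not Sledge's code). Each function instance gets its own small
 * linear memory, and every load/store is checked against that
 * instance's bounds, so one tenant cannot touch another's state even
 * though many tenants share a single runtime process. */
#include <stdint.h>
#include <stdlib.h>
#include <assert.h>

struct sandbox {
    uint8_t *linear_mem;  /* this instance's private linear memory     */
    size_t   mem_size;    /* kept small, so instantiation stays cheap  */
};

/* Bounds-checked store: traps instead of corrupting a neighbor. */
static void store_u32(struct sandbox *sb, uint32_t addr, uint32_t val)
{
    assert((size_t)addr + sizeof val <= sb->mem_size); /* the check */
    *(uint32_t *)(sb->linear_mem + addr) = val;
}

int main(void)
{
    struct sandbox sb = { .mem_size = 64 * 1024 };
    sb.linear_mem = calloc(1, sb.mem_size);
    store_u32(&sb, 128, 42);           /* in bounds: allowed          */
    /* store_u32(&sb, sb.mem_size, 1);    out of bounds: would trap   */
    free(sb.linear_mem);
    return 0;
}
```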

Sledge uses kernel bypass to optimize framework efficiency and to enable custom (specialized) serverless function scheduling. It leverages the short-lived execution properties of serverless to specialize system scheduling, decoupling work distribution and load balancing across system cores for scalability. The Sledge runtime focuses squarely on the efficiency of serverless functions and enforces strong spatial and temporal isolation of multi-tenant function executions. Its lightweight sandboxes are designed to support high-density computation, with fast startup and teardown times to handle high client request rates. An extensive evaluation of Sledge with varying workloads and real-world serverless applications demonstrates the effectiveness of this serverless-first runtime for the Edge: Sledge supports up to 4 times higher throughput and 4 times lower latencies than Nuclio, one of the fastest open-source, container-based serverless frameworks.
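
To illustrate the decoupled work-distribution idea (a generic sketch under our own assumptions, not Sledge's code), the example below gives each worker core its own run queue: a single listener thread assigns requests round-robin, and each core schedules only from its local queue, keeping the hot path free of global locks.

```c
/* Generic sketch of per-core work distribution (illustrative only).
 * Assumes one listener thread and one worker thread per core, so each
 * ring is single-producer/single-consumer. */
#include <stdatomic.h>
#include <stddef.h>

#define NCORES 4
#define QCAP   256

struct request { int id; };

/* One run queue per worker core; no shared global queue. */
struct runqueue {
    struct request slots[QCAP];
    atomic_size_t  head, tail;
} queues[NCORES];

/* Listener side: pick a core (round-robin here; a real system could
 * balance on load) and enqueue without touching other cores' queues. */
int dispatch(struct request r)
{
    static size_t next;  /* single listener thread assumed */
    struct runqueue *q = &queues[next++ % NCORES];
    size_t t = atomic_load(&q->tail);
    if (t - atomic_load(&q->head) == QCAP)
        return -1;                      /* queue full: shed load */
    q->slots[t % QCAP] = r;
    atomic_store(&q->tail, t + 1);
    return 0;
}

/* Worker side: each core drains only its own queue, so scheduling
 * decisions stay core-local and scale with core count. */
int next_request(int core, struct request *out)
{
    struct runqueue *q = &queues[core];
    size_t h = atomic_load(&q->head);
    if (h == atomic_load(&q->tail))
        return 0;                       /* nothing to run */
    *out = q->slots[h % QCAP];
    atomic_store(&q->head, h + 1);
    return 1;
}

int main(void)
{
    struct request out;
    dispatch((struct request){ .id = 1 });
    if (next_request(0, &out)) { /* core 0 would now run request 1 */ }
    return 0;
}
```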

This demonstrates that a serverless runtime optimized through lightweight Wasm-based isolation and a bypass of traditional kernel scheduling holds significant promise for the demanding requirements of future Edge computing solutions. The proposed framework opens a set of interesting opportunities for customized performance management of users’ serverless functions, which we plan to investigate in future work.

Moreover, the aWsm AoT compiler leverages LLVM to optimize code and to target different architectural backends. Using the PolyBench/C benchmarks, we evaluated aWsm on x86-64, AArch64 (Raspberry Pi), and Thumb (Arm Cortex-M4 and M7).

aWsm performance is within 10% of native on the microprocessors and within 40% on the microcontrollers. To explore aWsm on Cortex-M, see our paper published at EMSOFT 2020, “eWASM: Practical Software Fault Isolation for Reliable Embedded Devices”.

This work started during P. K. Gadepalli’s summer internship with Arm Research in 2019 and has evolved into a collaborative project with George Washington University.

  • Read the full paper
  • Explore the open-sourced aWsm
  • Questions? Contact me

References

[1] Flexera 2020 State of the Cloud Report

[2] Andreas Haas, Andreas Rossberg, Derek L. Schuff, Ben L. Titzer, Michael Holman, Dan Gohman, Luke Wagner, Alon Zakai, and JF Bastien. "Bringing the Web Up to Speed with WebAssembly". In Proc. of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’17).

[3] Abhinav Jangda, Bobby Powers, Emery D. Berger, and Arjun Guha. "Not So Fast: Analyzing the Performance of WebAssembly vs. Native Code". In Proc. of the USENIX Annual Technical Conference (ATC ’19).
