Authors: Jim Wallace, Arm; Joseph Byrne, NXP
Service providers and anyone else building out next-generation networks face complex challenges today as they seek to evolve, future-proof, and secure their networks to meet the ever-increasing needs of our hyper-connected world.
Addressing these challenges requires whole-network visibility and management, rapid deployment of new networking functions, and the ability to scale these functions up or down based on load. Three technologies collectively meet these requirements:
SDN, SD-WAN, and NFV complement each other and can be a powerful combination together. These technologies enable new network functions and services to be quickly deployed and dynamically updated, reducing cost while enabling new revenue streams for service providers and improving end-customer experience.
Google was an early user of SDN, deploying it within and between its data centers to simplify the configuration of network switches and optimally allocate network packet flow. NFV originated with a group of network operators; their initial use cases focused on mainframe-class, big-iron functions such as routing, the mobile packet core, and network signalling.
Figure 1 Virtual Network Services – improving business flexibility, revenue & TTM
However, right from the start of SDN and NFV, communications service providers believed that an SDN switch and virtual network functions (VNFs) hosted in the service provider's infrastructure could replace home, small-business, and branch-office routers.
On the surface, this approach made sense. Replacing a router with a dumb switch would lower customer premises equipment (CPE) cost, and saving a thousand dollars per unit is significant when multiplied by the number of customers a network operator serves. Moreover, operators could monetize new services more quickly: they could instantiate new features by adding VNFs, and customers could order them with a few mouse clicks. For example, operators could offer VPN, WAN optimization, and VoIP as VNFs, charging a monthly service fee. Customers could also consolidate onto a single box rather than buying separate boxes for each function and paying for their ongoing maintenance.
Reality, however, is a bit more complex. Cautious IT managers want their VPN tunnels terminated onsite to ensure security. WAN optimization, by definition, must be onsite too: its whole reason for being is to rationalize bandwidth over the last-mile link. VoIP also benefits from onsite termination. At the same time, NFV had been more about addressing operators' needs than customers'.
Figure 2 SD-WAN & traditional MPLS network
We’re therefore seeing the market adapt.
In cases where enterprises want to set up their own global network in house, SD-WAN is gaining interest. Although a branch-office router may implement SD-WAN, in many cases a low-cost generic system hosts the SD-WAN functions. IT managers like SD-WAN because it's easy to manage and saves money by shunting best-effort traffic off costly MPLS WANs onto low-cost broadband connections. These savings grow in importance as companies move more workloads to the cloud.
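The traffic-steering idea behind those savings can be sketched in a few lines. The sketch below is purely illustrative, assuming a hypothetical policy table and link names, not any vendor's SD-WAN API: best-effort classes go to broadband, latency-sensitive classes stay on the SLA-backed MPLS link, with failover if a link is down.

```python
MPLS, BROADBAND = "mpls", "broadband"

# Illustrative policy table: traffic class -> preferred WAN link.
POLICY = {
    "voip":       MPLS,       # latency-sensitive: keep on the SLA-backed link
    "erp":        MPLS,
    "email":      BROADBAND,  # best-effort: shunt to low-cost broadband
    "web_backup": BROADBAND,
}

def select_path(traffic_class, mpls_up=True, broadband_up=True):
    """Return the WAN link for a flow, falling back if a link is down."""
    preferred = POLICY.get(traffic_class, BROADBAND)  # default: best effort
    if preferred == MPLS and not mpls_up:
        return BROADBAND
    if preferred == BROADBAND and not broadband_up:
        return MPLS
    return preferred

print(select_path("voip"))                 # mpls
print(select_path("email"))                # broadband
print(select_path("voip", mpls_up=False))  # broadband (failover)
```

A real SD-WAN controller layers link-quality probes and application identification on top of such a policy, but the cost logic is the same: only traffic that needs MPLS consumes it.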
In cases where network functions must remain on premises, the NFV vision of being able to consume, configure, and manage networks easily by relying on software running on standard hardware has made NFV-based CPE an important use case. A networking device on premises can host virtual network functions (VNF), such as a router, firewall, VPN, WAN optimizer, or even SD-WAN. In these systems, an SDN switch chains these VNF-based services.
This switch can also add cloud-based services to the chain. For example, malware scanning can be done in the cloud while VPN termination is done locally. Typically, the SDN switch is implemented wholly in software, such as Open vSwitch (OVS) with its datapath built into the Linux kernel, as is common in infrastructure NFV.
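Conceptually, service chaining means steering each packet through an ordered list of VNFs, some local and some in the cloud. The minimal sketch below (not OVS itself; the VNF functions and packet fields are hypothetical stand-ins) shows the pattern: each function in the chain transforms or drops the packet.

```python
def firewall(pkt):
    # Drop traffic to a blocked port (telnet here); pass everything else.
    return None if pkt.get("dport") == 23 else pkt

def cloud_malware_scan(pkt):
    # Stand-in for a cloud-hosted service added to the chain.
    pkt["scanned"] = True
    return pkt

def vpn_encrypt(pkt):
    # Stand-in for local IPsec/VPN termination.
    pkt["encrypted"] = True
    return pkt

def run_chain(pkt, chain):
    """Pass a packet through each VNF in order; any VNF may drop it."""
    for vnf in chain:
        pkt = vnf(pkt)
        if pkt is None:  # dropped (e.g. by the firewall)
            return None
    return pkt

chain = [firewall, cloud_malware_scan, vpn_encrypt]
print(run_chain({"dport": 443}, chain))  # passes all three VNFs
print(run_chain({"dport": 23}, chain))   # None: dropped by the firewall
```

In a real deployment the SDN switch does this steering with flow rules rather than function calls, but the ordered-pipeline structure is the same.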
Figure 3 Virtualization used throughout the network from the end user’s premises to the datacenter
Almost any network function, from on-premises CPE to the datacenter, can be virtualized; however, the main limitations determining how much of the network, and which functions, can be virtualized are throughput, latency, performance, and power. If network functions are implemented using only general-purpose processors, latency, throughput, and power suffer so badly for many applications that in some cases virtualization would be impractical. The cost constraints of CPE exacerbate the situation.
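A back-of-envelope calculation shows why general-purpose processors struggle at line rate. Assuming an illustrative 2 GHz core (not a figure from the article), the cycle budget per minimum-size packet on a 10 Gb/s link is tiny:

```python
# How many CPU cycles does one core have per packet at 10 Gb/s line rate
# with minimum-size Ethernet frames?
LINE_RATE_BPS = 10e9   # 10 Gb/s link
FRAME_ON_WIRE = 84     # 64B frame + preamble, SFD, and inter-frame gap
CPU_HZ = 2.0e9         # assumed 2 GHz general-purpose core

pps = LINE_RATE_BPS / (FRAME_ON_WIRE * 8)  # packets per second
cycles_per_pkt = CPU_HZ / pps              # cycle budget per packet per core

print(f"{pps / 1e6:.2f} Mpps")            # 14.88 Mpps
print(f"{cycles_per_pkt:.0f} cycles")     # ~134 cycles per packet
```

Roughly 134 cycles must cover parsing, classification, queuing, and any tunnel or crypto work; a single cache miss can consume a large share of that budget, which is why hardware offload matters.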
Fortunately, in recent years the industry has made great strides toward a standardized processing environment. Processor suppliers have converged on designs implementing Arm's 64-bit instruction set, AArch64. Arm CPUs address a range of price-performance trade-offs to provide scalable compute platforms from edge to cloud. The fastest of these platforms are comparable to server CPUs but designed for power-efficient embedded use. The smallest occupy a fraction of the die size of their more powerful brethren and hence offer much lower cost. Importantly, all implement the Armv8 architecture with a unified software ecosystem and work with complementary Arm partner IP.
In a virtualized environment, each VNF can run in its own virtual machine (VM) that operates under the impression of having exclusive access to the processors, peripherals, system memory and IO devices.
In general terms, virtualization of the CPU, memory, and I/O supports multiple system images, or guests, for deploying VNF applications and for right-sizing system resources by carving one large system into smaller virtual systems.
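This right-sizing amounts to partitioning a fixed pool of cores and memory across VNF guests. The sketch below is a hypothetical example (the host size, VNF names, and resource requests are all illustrative, not a real deployment): each VM is admitted only while the host has capacity left.

```python
HOST = {"vcpus": 16, "mem_gb": 32}  # one large system to be carved up

# Illustrative per-VNF virtual machine requests.
REQUESTS = [
    ("router",        {"vcpus": 4, "mem_gb": 8}),
    ("firewall",      {"vcpus": 2, "mem_gb": 4}),
    ("wan_optimizer", {"vcpus": 4, "mem_gb": 8}),
    ("sd_wan",        {"vcpus": 2, "mem_gb": 4}),
]

def allocate(host, requests):
    """Admit each VM only if the host still has vCPUs and memory free."""
    free = dict(host)
    placed = []
    for name, need in requests:
        if need["vcpus"] <= free["vcpus"] and need["mem_gb"] <= free["mem_gb"]:
            free["vcpus"] -= need["vcpus"]
            free["mem_gb"] -= need["mem_gb"]
            placed.append(name)
    return placed, free

placed, free = allocate(HOST, REQUESTS)
print(placed)  # all four VNFs fit on the one box
print(free)    # headroom left for scaling a function up
```

Real orchestrators (OpenStack, Kubernetes) add scheduling policy and live migration on top, but the admission check is the core of right-sizing.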
To ensure optimal performance, Arm has added dedicated hardware extensions that accelerate switching between virtual machines and the hypervisor, or virtual machine manager (VMM). Arm has also made continuous improvements to its System Memory Management Unit (SMMU) and Generic Interrupt Controller (GIC) architectures.
In short, Arm-based IP delivers performance and virtualization capabilities comparable to server-derived designs but offers lower power and is ideal for integration in SoCs.
Arm IP is a key part of NXP's Layerscape processor designs. This family of processors ranges from a single-core Cortex-A53 device suited to uCPE to a 16-core Cortex-A72 device for high-performance NFV and SDN-control systems. NXP has also integrated accelerators in the Layerscape product line for packet and I/O processing, encryption, and compression (the DCE block). Whereas Arm's IP helps enable virtualization of compute resources, NXP's accelerators help enable network virtualization.
Pictured in Figure 4 is the eight-core LS2088A processor. One of its accelerators is the C-programmable AIOP packet engine, which can offload virtual switching from the CPUs. Virtual switching adds a protocol-processing burden on top of the baseline header/trailer processing and queuing required to switch network traffic, because packets must also be encapsulated to form VXLAN or GRE tunnels between endpoints; the AIOP absorbs this work. It can also fully offload IPsec for a VPN VNF and accelerate the gathering of network statistics. The DCE's compression function is useful in SD-WAN and WAN-optimization VNFs. The wire-rate I/O processor (WRIOP) can parse and classify packets, then distribute them to the Arm CPUs for further processing. The WRIOP also provides Ethernet L2 switch functionality.
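The encapsulation burden the AIOP offloads is easy to quantify. Tallying the standard header sizes a VXLAN tunnel over IPv4 adds in front of each inner Ethernet frame (sizes per RFC 7348) shows why tunneling is costly for small packets:

```python
# Per-packet byte overhead of VXLAN encapsulation over IPv4.
OUTER_ETH  = 14  # outer Ethernet header
OUTER_IPV4 = 20  # outer IPv4 header
OUTER_UDP  = 8   # outer UDP header
VXLAN_HDR  = 8   # VXLAN header (flags + 24-bit VNI)

overhead = OUTER_ETH + OUTER_IPV4 + OUTER_UDP + VXLAN_HDR
print(overhead)  # 50 bytes of encapsulation per packet

# For a small 128-byte inner frame that is a large relative cost:
inner = 128
print(f"{overhead / (inner + overhead):.0%} of bytes on the wire")  # 28%
```

Every one of those 50 bytes must be built, checksummed, and later stripped per packet; doing that in a hardware packet engine rather than on the CPUs frees cycles for the VNFs themselves.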
NXP supplies a Linux kernel, an Ubuntu-based user environment, and a software library for using its accelerators.
Figure 4 NXP LS2088A eight-core processor based on Arm Cortex-A72 CPU. Abbreviations: decompression/compression engine (DCE), pattern-matching engine (PME), wire-rate I/O processor (WRIOP).
The combination of Arm's scalable processors, virtualization extensions, SMMU, and GIC together with NXP's accelerators and offloads for network virtualization has been shown to significantly improve system performance. One example can be seen in Figure 5, which shows a significant uplift in throughput for a basic router VNF using NXP's LS2088 compared with a competing non-Arm processor implementation. That processor has four dual-thread cores running at 4.0 GHz.
Figure 5 The NXP LS2088A with eight Arm Cortex-A72 CPUs and accelerators outperforms a non-Arm competing eight-thread processor.[i]
Arm and NXP see these accelerated heterogeneous systems enabling next-generation networks, driving volumes for SDN, SD-WAN, and NFV systems from on-premises deployments (branch offices) to the datacenter. These new systems will improve latency, reduce bandwidth on network links, reduce costs, and deliver localised services to meet the ever-increasing demands of our future hyper-connected world.
[i] This comparison in Figure 5 uses only two virtualized CPUs, each pinned to a physical CPU core: one vCPU handles the control plane and the other the data plane.
Joseph Byrne is a senior strategic marketing manager for NXP's Digital Networking Group. He blogs about edge computing and security.