SmartNICs have become increasingly popular in datacenters and telco networks because they can deliver high-performance, feature-rich networking with minimal use of the host's CPU resources. In this blog, we discuss how Open vSwitch, one of the most common networking solutions, can be offloaded to SmartNICs. We also provide a detailed tutorial demonstrating the setup for Open vSwitch acceleration with the Mellanox ConnectX-5 on an Arm platform.
Traditionally, a Network Interface Card (NIC) has been a fixed-function device that supports only the offloading of checksum and segmentation functions. Since these fixed functions are built into the silicon at design time, they accelerate only specific network processing functions and offer no flexibility.
A SmartNIC, on the other hand, offers more programmable network processing features and intelligence than a traditional NIC. The extent of the supported capabilities tends to vary from vendor to vendor, but most vendors agree that a device must possess the following capabilities and features to be classified as a SmartNIC [1][2][3]:
In addition to the features listed previously, some vendors are also looking to offer even greater flexibility by allowing control plane functions to be offloaded alongside data plane functions.
Virtual switching was born out of hypervisors needing the ability to transparently bridge traffic between virtual machines (VMs) and the outside world. Open vSwitch (OvS) is the most popular and commonly used virtual switch for virtualized environments in data centers. In such environments, OvS runs on the host CPU cores to forward packets to VMs. This packet processing consumes a large number of host CPU cycles, taking expensive CPU resources away from other workloads running on the host. As a result, the host CPU cores cannot effectively support the data center applications for which they were deployed. A solution to this problem is to offload OvS packet processing to a SmartNIC, which can run at least the OvS datapath directly on the NIC. Such a deployment can improve packet throughput significantly while reclaiming precious CPU computing resources that would otherwise have been consumed.
SmartNICs are often categorized by their implementation, and these implementations come with various tradeoffs in cost, flexibility, and programmability. Mellanox provides three implementation flavors: ASIC-based, FPGA-based, and System on Chip (SoC) based SmartNICs. Their FPGA-based SmartNIC targets security and storage applications and is beyond the scope of this blog, which focuses solely on the networking applications of SmartNICs.
With an ASIC-based NIC, specific data plane functions of OvS are offloaded to the NIC while the control plane and slowpath still run on the host CPU. This type of NIC has a configurable data plane that can be programmed with new switching rules and protocols, but its functionality is limited to the functions that were offloaded. An example of this type of NIC is the Mellanox ConnectX-5. An SoC-based SmartNIC offers the most flexibility, since it integrates an ASIC-based NIC as described previously as well as programmable CPU cores. These programmable CPU cores can easily be reprogrammed on demand to add new data plane features using a standard language such as C, or even to run the networking control plane. Arm CPUs are a popular choice for such SoC-based SmartNICs because of their efficiency and performance, along with a well-supported software ecosystem that enables highly efficient implementation of complex data plane features. An example of this type of NIC is the Mellanox BlueField. It integrates an ASIC-based ConnectX-5 NIC to enable data plane offload and 16 programmable Armv8 Cortex-A72 cores to allow offload of the slowpath and the control plane. Another example of an SoC-based SmartNIC is the Broadcom Stingray family, where the SoC integrates a 100G NetXtreme Ethernet controller with a subsystem of eight Armv8 Cortex-A72 CPUs. The Arm CPU cores can be used for data plane offload and to configure the hardware accelerator.
Another way to categorize SmartNICs is by their functionality. The functionality provided by a SmartNIC defines the three offload models it can support: the partial data plane offload model, the full data plane offload model, and the complete data plane and control plane offload model. The partial data plane offload model offloads only some of the data plane networking features, such as packet classification and packet parsing. In contrast, the full data plane offload model offloads the host's entire networking data plane. Most vendors tend to focus on offloading the end-to-end packet processing pipeline, that is, full data plane offload, but profiling results have shown that most CPU cycles are spent on packet classification and parsing. So, instead of offloading the full range of data plane functions, offloading only packet classification and packet parsing can still be quite beneficial for conserving host CPU resources. ASIC-based SmartNICs such as the Mellanox ConnectX-5 support the partial and full data plane offload models. The third offload model assumes that no networking control or data plane runs on the host; both have been completely implemented in, and offloaded to, the SmartNIC. SoC-based SmartNICs such as the Mellanox BlueField and Broadcom Stingray are hardware capable of supporting this model.
In this section, we illustrate the steps to demonstrate OvS acceleration (with the DPDK datapath) using the Mellanox ConnectX-5. OvS uses the DPDK library to operate entirely in userspace. For the purposes of this blog, we are using a Marvell ThunderX2 server running Ubuntu 18.04, but you can replicate this tutorial on any other Arm-based platform of your choice.
As mentioned previously, the Mellanox ConnectX-5 is an ASIC-based NIC that allows partial and full offload of data plane functions. The data plane operations are offloaded to an Embedded Switch (E-Switch) in the ConnectX-5, while the SDN control plane operating on the host CPU remains unmodified. In the case of OvS, the SDN control plane stays the same: forwarding table and policy information is communicated from the corresponding SDN controller through the OvS daemon running in userspace. Data plane offload uses the switchdev mode of the SR-IOV implementation; in this mode, the E-Switch binds each Virtual Function (VF) to its representor.
[Diagram: OvS offload with the ConnectX-5 E-Switch in SR-IOV switchdev mode]
In the diagram above, the first packet of a new flow is sent to the OvS-DPDK application, which handles the packet and decides whether the flow should be offloaded to the E-Switch. If the application decides to offload the flow, subsequent packets belonging to the same flow are handled by the E-Switch instead of the OvS-DPDK application. This enables VM instances to connect to the ConnectX-5 NIC through SR-IOV and to send and receive data packets directly to and from the NIC itself.
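Once the full setup described later in this blog is running and traffic is flowing, you can observe this split from the host. The commands below are a hedged sketch: the type= filter for dpctl/dump-flows is available in recent OvS releases, and the exact paths and output format may differ on your system.
# Datapath flows offloaded to the ConnectX-5 E-Switch
$ sudo $HOME/usr/local/bin/ovs-appctl dpctl/dump-flows type=offloaded
# Datapath flows still handled in software by the OvS-DPDK PMD threads
$ sudo $HOME/usr/local/bin/ovs-appctl dpctl/dump-flows type=ovs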
1. To enable DPDK compilation for Mellanox NICs, the libibverbs package needs to be installed. libibverbs is the userspace verbs framework used by the librte_pmd_mlx5 library in DPDK. It provides a generic interface between the kernel and low-level userspace drivers such as libmlx5. On Ubuntu, it can be installed with the following command:
$ sudo apt install -y libibverbs-dev
2. To enable hardware acceleration, rdma-core 24.0 or higher must be built and later linked at the time of DPDK installation. rdma-core contains the userspace components for the Linux kernel's drivers/infiniband subsystem.
Install the following packages if they are not already present on the system.
$ sudo apt install libudev-dev libnl-3-dev libnl-route-3-dev ninja-build pkg-config valgrind python3-dev cython3 python3-docutils pandoc
Clone the rdma-core repository from GitHub.
$ git clone https://github.com/linux-rdma/rdma-core.git
Run the build script in the rdma-core repository. build/bin will contain the sample programs and build/lib will contain the shared libraries.
$ cd $HOME/rdma-core
$ bash build.sh
Build the static libraries as shown in the DPDK documentation for MLX5 Poll Mode Drivers (PMDs).
$ cd $HOME/rdma-core/build
$ CFLAGS=-fPIC cmake -DIN_PLACE=1 -DENABLE_STATIC=1 -GNinja ..
$ ninja
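As an optional sanity check, assuming the build completed without errors, you can confirm that the libraries were produced and that the verbs layer can enumerate devices. ibv_devices is one of the sample programs placed in build/bin; it may report an empty list until the Mellanox kernel drivers from the next section are installed.
# Shared and static libraries produced by the rdma-core build
$ ls $HOME/rdma-core/build/lib | grep -E 'libibverbs|libmlx5'
# Enumerate RDMA-capable devices through the verbs interface
$ $HOME/rdma-core/build/bin/ibv_devices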
1. Download the latest Mellanox OFED/EN software driver ISO image for your system. For this tutorial, the MLNX_EN 5.0-1 driver for Ubuntu 18.04 was used.
$ wget http://www.mellanox.com/downloads/ofed/MLNX_EN-5.0-1.0.0.0/mlnx-en-5.0-1.0.0.0-ubuntu18.04-aarch64.iso
2. Mount the ISO and install the required libraries and kernel modules.
$ sudo mkdir -p /mnt/mlnx-en
$ sudo mount mlnx-en-5.0-1.0.0.0-ubuntu18.04-aarch64.iso /mnt/mlnx-en
$ cd /mnt/mlnx-en
$ sudo ./install --upstream-libs --dpdk
3. Enable SR-IOV and configure the maximum number of VFs for the Mellanox NIC. The PCIe address used in these commands can be obtained with the lspci command:
$ lspci | grep Mellanox
85:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
85:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
$ sudo mlxconfig -d 85:00.0 set SRIOV_EN=true
$ sudo mlxconfig -d 85:00.0 set NUM_OF_VFS=8
4. Reset the device to make the configuration effective.
$ sudo mlxconfig -d 85:00.0 reset
5. Configure the actual number of VFs; two VFs are created here. The interface name associated with the PCIe address 85:00.0 above can be found with the dmesg command:
$ dmesg | grep 85:00.0
[   12.126544] mlx5_core 0000:85:00.0 enp133s0f0: renamed from eth3
$ echo 2 | sudo tee /sys/class/net/enp133s0f0/device/mlx5_num_vfs
After creating the two VFs, you can find the PCIe addresses associated with these VF interfaces using the dmesg command as shown previously. In this instance, the PCIe addresses for the VFs are 85:00.2 and 85:00.3, respectively. We will need these PCIe addresses later in the VM Setup section, when we assign these VFs to the VMs.
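If you prefer not to parse dmesg, lspci reports the same information. On our system the two new virtual functions show up as follows; the addresses will differ on other machines:
$ lspci | grep "Virtual Function"
85:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
85:00.3 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]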
6. Unbind the VF interfaces from the mlx5_core driver.
$ echo -n "0000:85:00.2" | sudo tee /sys/bus/pci/drivers/mlx5_core/unbind $ echo -n "0000:85:00.3" | sudo tee /sys/bus/pci/drivers/mlx5_core/unbind
7. Enable the switchdev mode for the ConnectX-5 NIC.
$ echo switchdev |sudo tee /sys/class/net/enp133s0f0/compat/devlink/mode
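To confirm that the mode change took effect, you can read the same sysfs attribute back, or query the eSwitch through the devlink utility from iproute2. This is an optional, hedged check, and the devlink output format varies with kernel version:
$ cat /sys/class/net/enp133s0f0/compat/devlink/mode
switchdev
$ sudo devlink dev eswitch show pci/0000:85:00.0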
1. Download and extract the DPDK 19.11 release.
$ cd $HOME/repos
$ wget http://fast.dpdk.org/rel/dpdk-19.11.tar.xz
$ tar xf dpdk-19.11.tar.xz
2. Modify the default configuration to build the Mellanox PMDs during the DPDK installation process.
$ cd $HOME/repos/dpdk-19.11
$ sed -i 's/CONFIG_RTE_LIBRTE_MLX4_PMD=n/CONFIG_RTE_LIBRTE_MLX4_PMD=y/g' config/common_base
$ sed -i 's/CONFIG_RTE_LIBRTE_MLX5_PMD=n/CONFIG_RTE_LIBRTE_MLX5_PMD=y/g' config/common_base
3. Create the ThunderX2 configuration file for Arm platforms.
$ make config T=arm64-thunderx2-linuxapp-gcc
4. Install DPDK with the following flags and link the rdma-core libraries.
$ sudo make T=arm64-thunderx2-linuxapp-gcc install DESTDIR=install \
    EXTRA_CFLAGS="-I$HOME/rdma-core/build/include" \
    EXTRA_LDFLAGS=-L$HOME/rdma-core/build/lib \
    PKG_CONFIG_PATH=$HOME/rdma-core/build/lib/pkgconfig
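Before moving on, it is worth confirming that the MLX5 PMD was actually built and linked against rdma-core. The check below is a sketch that assumes the make-based build layout used above, where the static PMD libraries land under the target directory:
# Expect librte_pmd_mlx5.a (and librte_pmd_mlx4.a) in the listing
$ ls $HOME/repos/dpdk-19.11/arm64-thunderx2-linuxapp-gcc/lib | grep mlx5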
1. Clone the latest version of OvS on the system.
$ git clone https://github.com/openvswitch/ovs.git
2. Run boot.sh in the top source directory to build the configure script.
$ cd $HOME/ovs
$ ./boot.sh
3. The official OvS documentation provides steps that install all the files under /usr/local. We will instead install OvS-DPDK in our home directory, so we create a usr/local directory structure under the home directory; at configuration time, the appropriate flags are then used to install the files there.
$ mkdir -p $HOME/usr
$ cd $HOME/usr
$ mkdir -p local
4. Configure OvS to use the DPDK datapath via the --with-dpdk flag. OpenSSL support is also disabled by passing the --disable-ssl parameter.
$ cd $HOME/ovs
$ export DPDK_DIR=$HOME/repos/dpdk-19.11
$ export DPDK_TARGET=arm64-thunderx2-linuxapp-gcc
$ sudo ./configure --with-dpdk=$DPDK_DIR/$DPDK_TARGET \
    --prefix=$HOME/usr/local --localstatedir=$HOME/usr/local/var \
    --sysconfdir=$HOME/usr/local/etc --disable-ssl
5. Run make and make install to install the executables under the $HOME/usr/local directory.
$ sudo make -j
$ sudo make install
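A quick way to verify that the binaries landed under the home-directory prefix and were built against DPDK is to print their version strings (paths assume the --prefix used above; a DPDK-enabled build typically also reports the DPDK version it was linked with):
$ $HOME/usr/local/sbin/ovs-vswitchd --version
$ $HOME/usr/local/bin/ovs-vsctl --version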
Before proceeding, please complete the exact steps listed under the following heading from my previous blog: Chapter 2: Setup for PHY-PHY Test
1. Set the environment variables.
$ export OVSSBIN=$HOME/usr/local/sbin
$ export OVSBIN=$HOME/usr/local/bin
$ export RUNDIR=$HOME/usr/local/var/run
$ export LOGDIR=$HOME/usr/local/var/log
$ export ETCDIR=$HOME/usr/local/etc
2. Clean the OvS environment by stopping any existing OvS daemons and removing files generated by a previous run.
$ sudo $HOME/usr/local/sbin/ovs-appctl -t ovs-vswitchd exit
$ sleep 1
$ sudo $HOME/usr/local/sbin/ovs-appctl -t ovsdb-server exit
$ sleep 1
$ sudo rm -rf $RUNDIR/openvswitch/*
$ sudo rm -rf $ETCDIR/openvswitch/conf.db
$ sudo rm -rf $LOGDIR/openvswitch/ovs-vswitchd.log
3. Initialize a database to be used by the ovsdb-server.
$ sudo $OVSBIN/ovsdb-tool create $ETCDIR/openvswitch/conf.db $HOME/usr/local/share/openvswitch/vswitch.ovsschema
4. Configure ovsdb-server to use the database created in the previous step.
$ sudo $OVSSBIN/ovsdb-server --remote=punix:$RUNDIR/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile=$RUNDIR/openvswitch/ovsdb-server.pid --detach
5. Initialize the database using ovs-vsctl. DPDK configuration arguments are passed to ovs-vswitchd via the other_config column of the Open_vSwitch table. At a minimum, the dpdk-init option must be set to either true or try. Defaults are provided for all configuration options that have not been set explicitly.
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:max-idle=500000
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x40000
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0x80000
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=2048
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:n-rxq=1
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:n-txq=1
In the above configuration, an isolated CPU has been assigned to the pmd-cpu-mask argument. The isolated CPUs on the system can be checked with the following command:
$ cat /etc/default/grub | grep isolcpus
isolcpus=14-27
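The two masks are simply bitmasks over CPU IDs: 0x40000 is bit 18 (CPU 18) for the dpdk-lcore-mask and 0x80000 is bit 19 (CPU 19) for the pmd-cpu-mask, both of which fall inside the isolated 14-27 range. If you want to target different cores, the small shell sketch below shows how such a hex mask can be computed; it is purely illustrative, so adjust the CPU list to your system:
# Build a hex CPU mask from a list of CPU IDs (here CPUs 18 and 19 combined)
$ mask=0; for cpu in 18 19; do mask=$((mask | (1 << cpu))); done; printf '0x%x\n' $mask
0xc0000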
6. Enable the hardware offloading feature in the OvS-DPDK datapath via the ovs-vsctl utility.
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:hw-offload=true
7. Configure the VF representors and enable dv_flow_en via the ovs-vsctl utility. A nonzero value for dv_flow_en enables DV (Direct Verbs) flow steering, which uses a combination of L2-L4 flow specifications to steer network flows to specific queues.
$ sudo $OVSBIN/ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-extra="-w 0000:85:00.0,representor=[0-1],dv_flow_en=1"
8. Start the OvS daemon, telling it to connect to the same Unix domain socket created earlier.
$ sudo $OVSSBIN/ovs-vswitchd unix:$RUNDIR/openvswitch/db.sock \
    --pidfile=$RUNDIR/openvswitch/ovs-vswitchd.pid --detach \
    --log-file=$LOGDIR/openvswitch/ovs-vswitchd.log
9. Add a userspace bridge under the DPDK datapath via the ovs-vsctl utility.
$ sudo $OVSBIN/ovs-vsctl add-br dpdk-br0 -- set bridge dpdk-br0 datapath_type=netdev
10. Create DPDK ports and attach them to the previously created bridge.
$ sudo $OVSBIN/ovs-vsctl add-port dpdk-br0 pf0 -- set Interface pf0 type=dpdk options:dpdk-devargs="0000:85:00.0"
$ sudo $OVSBIN/ovs-vsctl add-port dpdk-br0 vm1 -- set Interface vm1 type=dpdk options:dpdk-devargs="0000:85:00.0,representor=[0]"
$ sudo $OVSBIN/ovs-vsctl add-port dpdk-br0 vm2 -- set Interface vm2 type=dpdk options:dpdk-devargs="0000:85:00.0,representor=[1]"
11. Add OpenFlow rules to direct traffic to the VMs to be created in the following section.
$ sudo $OVSBIN/ovs-ofctl add-flow dpdk-br0 in_port=pf0,ip,nw_src=10.10.1.0/255.255.255.255,action=output:vm1
$ sudo $OVSBIN/ovs-ofctl add-flow dpdk-br0 in_port=vm1,action=output:pf0
$ sudo $OVSBIN/ovs-ofctl add-flow dpdk-br0 in_port=pf0,ip,nw_src=10.10.1.1/255.255.255.255,action=output:vm2
$ sudo $OVSBIN/ovs-ofctl add-flow dpdk-br0 in_port=vm2,action=output:pf0
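Before creating the VMs, you can check that the bridge, ports, and rules are in place. This is a simple sanity check; the flow counters will stay at zero until traffic actually flows:
# Show the bridge and its DPDK ports
$ sudo $OVSBIN/ovs-vsctl show
# Dump the OpenFlow rules installed on dpdk-br0
$ sudo $OVSBIN/ovs-ofctl dump-flows dpdk-br0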
1. Create a VM using an XML configuration file via libvirt.
$ virsh define ovs_vm_01.xml
You can use the following XML file as a reference for creating the VM.
<domain xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0" type="kvm"> <name>ovs_vm_01</name> <uuid>11e1fa3e-70cb-4e23-828c-2e7daa030364</uuid> <metadata xmlns:ns0="https://launchpad.net/uvtool/libvirt/1"> </metadata> <memory unit="KiB">4194304</memory> <currentMemory unit="KiB">4194304</currentMemory> <vcpu placement="static">4</vcpu> <resource> <partition>/machine</partition> </resource> <os> <type machine="virt-2.11" arch="aarch64">hvm</type> <loader type="pflash" readonly="yes">/usr/share/AAVMF/AAVMF_CODE.fd</loader> <nvram template="/usr/share/AAVMF/AAVMF_CODE.fd">/home/lance/pvp/nvram/ovs-sriov1_VARS.fd</nvram> <boot dev="hd"/> </os> <features> <acpi/> <apic/> <pae/> <gic version="host"/> </features> <cpu check="partial" match="exact" mode="custom"> <model fallback="forbid">host</model> </cpu> <clock offset="utc"/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <devices> <emulator>/usr/bin/qemu-system-aarch64</emulator> <disk type="file" device="disk"> <driver type="qcow2" name="qemu"/> <source file="/home/lance/pvp/images/ovs-sriov1.qcow2"/> <target dev="vda" bus="virtio"/> <address type="pci" bus="0x02" function="0x0" slot="0x00" domain="0x0000"/> </disk> <disk type="file" device="disk"> <driver type="raw" name="qemu"/> <source file="/home/lance/ovsdir/images/ovs-sriov1_init.img"/> <target dev="vdb" bus="virtio"/> <address type="pci" bus="0x03" function="0x0" slot="0x00" domain="0x0000"/> </disk> <interface type="hostdev" managed="yes"> <mac address="52:54:00:f0:16:f3"/> <source> <address type="pci" bus="0x01" function="0x1" slot="0x00" domain="0x0001"/> </source> <address type="pci" bus="0x01" function="0x0" slot="0x00" domain="0x0000"/> </interface> <interface type="network"> <mac address="54:52:00:de:a5:f0"/> <source network="virbr0"/> <model type="virtio"/> </interface> </devices> <qemu:commandline> <qemu:arg value="-object"/> <qemu:arg value="memory-backend-file,id=mem0,size=4096M,mem-path=/dev/hugepages,share=on"/> <qemu:arg value="-numa"/> <qemu:arg value="node,memdev=mem0"/> </qemu:commandline> </domain>
2. Edit the XML file to assign the VF interface to the VM. In this context, the PCIe address for the VF is 85:00.2, so the bus is 0x85, the slot is 0x00, and the function is 0x2. You will need to change this XML segment according to your VF's PCIe address.
$ virsh edit ovs_vm_01
<interface type='hostdev' managed='yes'> <mac address='54:52:00:f0:00:13'/> <source> <address type='pci' domain='0x0000' bus='0x85' slot='0x00' function='0x2'/> </source> <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/> </interface>
3. Start the VM.
$ virsh start ovs_vm_01
4. Once the VM has started, install the essential build packages as well as the extra kernel modules package for the running kernel. This package provides the mlx5_core driver, which will then be able to identify the VF that was passed through to the VM, as well as the userspace verbs module, ib_uverbs.
$ sudo apt install build-essential
$ sudo apt install numactl libnuma-dev bc device-tree-compiler dh-autoreconf curl
$ sudo apt install linux-modules-extra-`uname -r`
5. Load the kernel modules.
$ sudo modprobe ib_uverbs mlx5_core mlx5_ib
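You can verify that the modules loaded and that the passed-through VF is visible inside the guest. This is a minimal check; the interface name reported by the guest kernel may differ from system to system:
$ lsmod | grep -E 'mlx5|ib_uverbs'
$ ip link show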
6. Install the libibverbs package to enable DPDK compilation for Mellanox interfaces.
$ sudo apt install libibverbs-dev -y
7. Download and install DPDK 18.11 in the VM. As in the DPDK Compilation section on the host, modify the default configuration so that the Mellanox PMDs are built during the DPDK installation process.
$ wget https://fast.dpdk.org/rel/dpdk-18.11.tar.xz
$ tar xf dpdk-18.11.tar.xz
$ cd $HOME/dpdk-18.11
$ sed -i 's/CONFIG_RTE_LIBRTE_MLX4_PMD=n/CONFIG_RTE_LIBRTE_MLX4_PMD=y/g' config/common_base
$ sed -i 's/CONFIG_RTE_LIBRTE_MLX5_PMD=n/CONFIG_RTE_LIBRTE_MLX5_PMD=y/g' config/common_base
$ export DPDK_TARGET=arm64-armv8a-linuxapp-gcc
$ make config T=$DPDK_TARGET
$ sudo make T=$DPDK_TARGET install DESTDIR=install -j2 EXTRA_CFLAGS="-O2 -march=native"
8. Allocate hugepages for the DPDK testpmd application.
$ sudo mkdir -p /mnt/huge
$ sudo mount -t hugetlbfs nodev /mnt/huge
$ echo 1024 | sudo tee /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
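It is worth confirming that the allocation succeeded before launching testpmd, since a fragmented guest memory layout can leave fewer pages than requested:
$ grep Huge /proc/meminfo
# HugePages_Total should show the 1024 2MB pages (2GB) requested above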
9. Retrieve the PCIe address of the VF for whitelisting in the testpmd application.
$ lspci
02:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
10. Run the testpmd application, passing the Mellanox VF interface as an EAL argument.
$ sudo $HOME/dpdk-18.11/install/bin/testpmd -c 0x3 -n 2 -w 0000:02:00.0 -- -i --rxq=1 --txq=1 --rxd=2048 --txd=2048
Enable a forwarding mode of your choice in the testpmd application and configure your traffic generator to send traffic to the host server.
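As one possible example, assuming you are at the interactive testpmd prompt started by the command above, the following commands select MAC forwarding, start forwarding, and display port statistics; other modes such as io or macswap can be selected the same way:
testpmd> set fwd mac
testpmd> start
testpmd> show port stats all
testpmd> stop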
At this point, you can repeat the above procedure to create and set up a second VM, which can be assigned the VF at PCIe address 85:00.3. The OpenFlow rules for passing traffic between the second VM and the NIC have already been added. This is an intermediate step if your ultimate goal is to run traffic between the two VMs (VM-VM forwarding). On the other hand, if your goal is to test PHY-VM-PHY (vHost loopback) performance, your setup is now complete.
This blog has provided an overview of hardware offload to SmartNICs and a detailed tutorial on how to achieve OvS-DPDK acceleration with a SmartNIC. We used a Mellanox ConnectX-5 for our demonstration, but we encourage readers to also experiment with the Broadcom Stingray or Mellanox BlueField SmartNICs. We also suggest viewing Lance's session from Linaro Virtual Connect 2020 (linked below) to gain a deeper understanding of how different Arm-based SmartNIC SoCs support the different OvS offload models.
[CTAToken URL = "https://connect.linaro.org/resources/lvc20/lvc20-109/" target="_blank" text="Arm Core: Empower Networking on SmartNIC" class ="green"]
Hi, really nice job!
And a very interesting initiative when it comes to direct access to the physical interface from a VM. But concretely, what is the expected benefit in terms of transfer speed if the method increases the processing load on the system itself?
Do you have a benchmark test or something similar for a speed comparison? Regards
You could use the ovs_perf tool (GitHub / Conference talk) to get some performance numbers. I used it in the past to get various HW offload numbers when comparing TC to DPDK/Kernel (https://www.youtube.com/watch?v=87Mx547WEG8&t=1025s).