Out-of-band telemetry on Arm Neoverse based servers

Samer El-Haj-Mahmoud
September 17, 2025

This blog post is co-authored by Samer El-Haj-Mahmoud (Arm) and Tim Lewis (Insyde Software).


Modern datacenters run at massive scale. Even small inefficiencies in server management can multiply into higher costs, downtime, or missed performance targets. To meet strict Service Level Objectives (SLOs), operators need continuous, reliable visibility into platform health without disrupting workloads.

This is where out-of-band (OOB) telemetry comes in. By streaming real-time insights into thermal, power, and hardware subsystems—independently of the OS—OOB telemetry enables proactive management and automation at fleet scale.

In a previous Arm Community blog, we presented Arm’s work on advancing server manageability. In this blog post, we show how Arm, in collaboration with Insyde Software, is advancing OOB telemetry on Arm Neoverse-based servers. Built on open standards and supported by a working proof-of-concept, this architecture is designed to scale from silicon sensors all the way to datacenter analytics. This approach ensures interoperability and readiness for production environments.

Architecture at a glance

Figure 1: End-to-end OOB telemetry architecture

The above figure shows the end-to-end flow of the OOB Telemetry architecture on Arm servers. Key components are:

  1. SoC on-chip sensors: The System Monitoring Control Framework (SMCF) is a distributed framework of heterogeneous SoC sensors and monitors that are aggregated into coherent groups with DMA-based sampling.
  2. System Control Processor (SCP): The System Control and Management Interface (SCMI) provides a set of standard firmware interfaces for power, performance, and system management. SCMI v4.0 introduced the System Telemetry protocol to formalize discovery, configuration, and collection of SoC telemetry data, including from SMCF and other on-chip telemetry sources.
  3. Manageability Control Processor (MCP): The MCP consumes the telemetry data from the SCP over shared memory channels between the two processors. It then uses the side-band interface defined by Arm SBMR to send the data to the BMC, using the DMTF-standard Platform Level Data Model (PLDM) and Management Component Transport Protocol (MCTP).
  4. BMC northbound: The BMC exposes telemetry data remotely through the standard Redfish TelemetryService and publishes bulk telemetry payloads (as SCMI-formatted records) using the newly defined Redfish TelemetryData resources.
  5. Fleet management: An OpenTelemetry (OTel) Collector receiver decodes the bulk telemetry payload and exports it to a database (such as Victoria Metrics), to be consumed by the observability back-end, Grafana.

Inside the SoC: SMCF and SCMI

Figure 2: Telemetry Data Capture Format (TDCF)

As explained in a previous Arm Community blog post, SMCF normalizes on-chip monitors (thermal, voltage, activity, fabric, and third-party IP), schedules sampling, and writes compact samples to memory, so the SCP can serve telemetry without ad hoc register scraping.

On top of that, SCMI is the single contract the SCP exposes. With the System Telemetry protocol added in SCMI v4.0, agents can discover available Telemetry Data Events, configure cadence and filters, and collect data through snapshot or buffered methods from various on-chip sources, including SMCF. SCMI also defines the Telemetry Data Capture Format (TDCF), as shown in Figure 2. The same data is surfaced in-band (IB) to the OS using the ACPI PCC mailbox and out-of-band (OOB) to the MCP: identical payloads, two paths. That symmetry keeps the decoder portable and lets platform owners decide which data is available both IB and OOB, and which is IB only.
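To make the "one decoder, two paths" point concrete, here is a minimal Python sketch of decoding a binary telemetry container made of a header followed by fixed-size records. The layout used here (the `TLM0` magic, the field order, and the field widths) is purely hypothetical and chosen for illustration; the real TDCF layout is defined in the SCMI specification and is not reproduced here.

```python
import struct
from dataclasses import dataclass
from typing import List

# HYPOTHETICAL layout for illustration only. This is NOT the TDCF layout
# from the SCMI specification.
HEADER = struct.Struct("<4sHH")  # magic, format version, record count
RECORD = struct.Struct("<IQq")   # event ID, timestamp (ns), signed value


@dataclass
class Sample:
    event_id: int
    timestamp_ns: int
    value: int


def decode_container(blob: bytes) -> List[Sample]:
    """Decode a header followed by `count` fixed-size records."""
    if len(blob) < HEADER.size:
        raise ValueError("blob shorter than header")
    magic, _version, count = HEADER.unpack_from(blob, 0)
    if magic != b"TLM0":
        raise ValueError("unexpected magic")
    if len(blob) < HEADER.size + count * RECORD.size:
        raise ValueError("truncated container")
    samples = []
    for i in range(count):
        offset = HEADER.size + i * RECORD.size
        samples.append(Sample(*RECORD.unpack_from(blob, offset)))
    return samples


# The same function can be fed bytes obtained in-band or out-of-band.
blob = HEADER.pack(b"TLM0", 1, 1) + RECORD.pack(0x10, 1_000, 45)
print(decode_container(blob))
```

Because the decoder only sees bytes, it does not need to know whether the payload arrived through the in-band or the out-of-band path.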

Crossing the sideband: PLDM-over-MCTP

Figure 3: Arm SBMR architecture

As outlined in a previous Arm Community blog and in Figure 3, the Arm SBMR server management architecture relies on the DMTF MCTP and PLDM standards for communication between the MCP (which is a Satellite Management Controller, or SatMC) and the BMC.

  • MCTP defines a transport protocol that is agnostic to the physical medium and can run over various interfaces, for example SMBus/I2C, PCIe, I3C, USB, and Serial. The proof-of-concept prototype runs MCTP-over-Serial (which is readily available on the Arm Neoverse FVP model). Other bindings, such as MCTP-over-I3C or MCTP-over-USB, are common on real server platforms.
  • PLDM provides a standard data model and commands that can be sent over MCTP for representing platform management data. We use two PLDM mechanisms to represent telemetry data:
    • PLDM Numeric Sensors (from the PLDM for Platform Monitoring and Control specification): used for on-platform environmental telemetry such as temperatures, voltages, currents, fan targets, and power states. The BMC reads and acts on these values (alerts, local policies, user interfaces) for local platform monitoring.
    • PLDM File I/O (from the PLDM for File Transfer specification): used for bulk telemetry (dense time-series buffers, traces, snapshots). The BMC performs PLDM file operations to open, read (multi-part), and close the binary files. The BMC does not consume these blobs; it forwards them to fleet analytics, and the data should pass through the BMC without lossy transformation.
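The multi-part read described above can be sketched as a simple reassembly loop. The sketch below does not model PLDM or MCTP at all: `read_part` is a hypothetical callback standing in for whatever issues the PLDM file read on a real BMC, and `make_fake_device` is a stand-in used only for the demo.

```python
from typing import Callable

# HYPOTHETICAL callback: read_part(offset, max_len) returns the bytes at that
# offset, or b"" once the end of the file is reached. On a real BMC this
# would wrap the PLDM file read command, which is not modelled here.
ReadPart = Callable[[int, int], bytes]


def read_whole_file(read_part: ReadPart, part_size: int = 256) -> bytes:
    """Reassemble a file from successive parts without altering the bytes."""
    chunks = []
    offset = 0
    while True:
        part = read_part(offset, part_size)
        if not part:
            break
        chunks.append(part)
        offset += len(part)
    return b"".join(chunks)


def make_fake_device(payload: bytes) -> ReadPart:
    """Demo stand-in for the device side: serves slices of a byte string."""
    def read_part(offset: int, max_len: int) -> bytes:
        return payload[offset:offset + max_len]
    return read_part


payload = bytes(range(256)) * 3
assert read_whole_file(make_fake_device(payload), part_size=100) == payload
```

The loop only concatenates what it receives, which mirrors the passthrough requirement: the bytes forwarded northbound are the bytes the device produced.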

Northbound API: Redfish Telemetry + passthrough bulk data

Figure 4: Redfish and OTel handoff

The above figure illustrates an example of a typical out-of-band telemetry deployment in datacenters. The setup uses some platform components and off-the-shelf software packages:

  • Metrics and eventing: The BMC publishes a Redfish service that includes TelemetryService resources (MetricReportDefinition, MetricReport, Triggers) for sensor-based telemetry. Redfish can also stream alert events to subscribers using Server-Sent Events (SSE).
  • Bulk telemetry: Bulk payloads are published through the Redfish TelemetryData resource. Small payloads can be embedded in the resource's AdditionalData (Base64). Larger payloads are published as a stand-alone binary that can be downloaded using the AdditionalDataURI. This mechanism is designed specifically for passthrough bulk telemetry from devices or services, which makes it a good fit for SCMI-encoded blobs arriving over PLDM File I/O. See Figures 5 and 6 for examples.

Figure 5: Example Redfish TelemetryData resource and the attached SCMI telemetry

Figure 6: Example SCMI TDCF telemetry data
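As a rough illustration of how a client might handle the two delivery options, the Python sketch below takes a TelemetryData resource that has already been parsed from JSON and returns either the decoded inline payload or the URI to download. The property names `AdditionalData` and `AdditionalDataURI` come from the description above; the example resource contents, including the `Id` value and the URI path, are made up for the demo.

```python
import base64
from typing import Optional, Tuple


def extract_bulk_payload(resource: dict) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (inline_bytes, download_uri) for a TelemetryData resource.

    Inline Base64 data takes precedence; otherwise the URI (if any) is
    returned so the caller can download the binary separately.
    """
    inline = resource.get("AdditionalData")
    if inline is not None:
        return base64.b64decode(inline, validate=True), None
    return None, resource.get("AdditionalDataURI")


# Made-up example resources (not taken from a real BMC).
small = {"Id": "example-1", "AdditionalData": base64.b64encode(b"\x01\x02\x03").decode()}
large = {"Id": "example-2", "AdditionalDataURI": "/example/path/to/blob.bin"}

print(extract_bulk_payload(small))  # inline bytes, no URI
print(extract_bulk_payload(large))  # no inline bytes, URI to fetch
```

In both cases the caller ends up with the same opaque bytes, which can then be handed to the telemetry decoder unchanged.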


Fleet analytics: OpenTelemetry to Grafana

The BMC forwards PLDM File I/O payloads to an OpenTelemetry Collector with a custom receiver that parses the SCMI TDCF telemetry container (header, records, timestamps). From there, you can export via OTLP to a database (such as Victoria Metrics in our prototype) and visualize the results in Grafana, for example as socket and core thermal envelopes or workload-correlated power traces. The OpenTelemetry documentation shows how to build a receiver and assemble a custom Collector, and the Grafana documentation covers native OTLP ingestion.
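The conversion step inside such a receiver can be pictured as mapping decoded samples to named, timestamped data points. The Python sketch below only shows the shape of that mapping; it is not the prototype's receiver and does not use the OpenTelemetry SDK. The event IDs, metric names, units, and the `DataPoint` class are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

# HYPOTHETICAL mapping from telemetry event IDs to metric names and units.
EVENT_NAMES: Dict[int, Tuple[str, str]] = {
    0x10: ("socket_temperature", "Cel"),
    0x20: ("core_power", "W"),
}


@dataclass
class DataPoint:
    name: str
    unit: str
    time_unix_nano: int
    value: float
    attributes: Dict[str, str] = field(default_factory=dict)


def to_data_points(samples: Iterable[Tuple[int, int, int]], host: str) -> List[DataPoint]:
    """Turn (event_id, timestamp_ns, raw_value) tuples into data points."""
    points = []
    for event_id, timestamp_ns, raw in samples:
        # Unknown events keep a generic name instead of being dropped.
        name, unit = EVENT_NAMES.get(event_id, (f"event_{event_id:#x}", "1"))
        points.append(
            DataPoint(name, unit, timestamp_ns, float(raw), {"host.name": host})
        )
    return points


for point in to_data_points([(0x10, 1_000, 45), (0x30, 2_000, 7)], host="node-1"):
    print(point)
```

Keeping unknown event IDs under a generic name, rather than discarding them, is one way to preserve the passthrough character of the data until the back-end decides what to do with it.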

Figure 7: Visualizing bulk telemetry data using Grafana

Putting it together: Getting started with the prototype

Figure 8: OOB Telemetry prototype

The above figure shows a complete setup of this proof-of-concept using Arm Neoverse FVP, OpenBMC, OpenTelemetry, Victoria Metrics, and Grafana. All the source code of this proof-of-concept prototype is published here.

Insyde Software: Integration with Supervyse® OPF and InsydeH2O®

Insyde took the Arm prototype and made production-level modifications. First, the BMC model was changed from the Arm Base FVP to the QEMU emulation of the ASPEED AST2700. This model supports a wide variety of the standard AST2700 peripherals, which makes it possible to run the production OpenBMC code from Insyde’s Supervyse OPF product. Second, on the Neoverse FVP model from Arm, Insyde switched to its fully featured InsydeH2O UEFI firmware product, which allows simulation of both basic functionality and numerous extended features. Combined with the firmware images for Arm’s SCP and MCP, this provides a simulation of a powerful, fully featured Arm data center system.

Figure 9: OOB Telemetry with InsydeH2O UEFI firmware and Supervyse OPF OpenBMC firmware

Visit us at OCP Summit 2025 to Learn More

To learn more about this architecture and see a demo of the Arm and Insyde software prototypes, visit the Arm booth at the 2025 OCP Global Summit in San Jose, CA, October 13-16, 2025.

References

  • A System Monitoring Control Framework (SMCF) for Arm Neoverse CSS: https://community.arm.com/arm-community-blogs/b/servers-and-cloud-computing-blog/posts/system-monitoring-control-framework-arm-neoverse-css
  • Advancing server manageability on Arm Neoverse Compute Subsystem (CSS) with OpenBMC: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/advancing-server-manageability-arm-css-openbmc
  • OpenBMC proof-of-concept Arm FVP platform: https://gitlab.arm.com/server_management/PoCs/fvp-poc
  • Arm Server Base Manageability Requirements (SBMR) Specification: https://developer.arm.com/documentation/den0069/latest/
  • Arm System Control and Management Interface Specification: https://developer.arm.com/documentation/den0056/latest/
  • System Monitoring Control Framework Architecture Specification: https://developer.arm.com/documentation/den0108/latest/

 
