The amount of data being generated is exploding. The volume of data created, captured, or replicated is expected to increase from 33 zettabytes[1] in 2018 to 175 zettabytes in 2025[2], according to analyst firm IDC. To realize value from this data, we must be able to process it into meaningful insights. It is becoming clear that compute-centric architectures will not continue to scale, and the focus is now on generating insights from the vast volumes of data where it resides: in storage devices. This is driving the rapid development of data-centric computational storage.
Today, a growing amount of data is being stored on drives, but the storage and the compute that processes the data are almost never in the same place. Moving large amounts of data between storage and compute (drives commonly hold 16TB today, and capacities keep increasing) cannot scale, and it makes it difficult to extract insights from data that can be converted into added value for the organizations that hold it.
In the traditional storage model, data is stored on hard disk drives (HDDs) and solid-state drives (SSDs) and accessed and transferred to some external compute, typically a server. Computational storage puts data processing on the drive where the data is stored, enabling the generation of insights and value directly from the data.
Computational storage is all about making storage devices smarter to process the data directly where it is stored. This approach reduces the movement of large amounts of data to external processing and delivers myriad benefits, such as reduced latency and bandwidth usage, increased security and energy savings. In other words, data workloads are processed directly on the storage controller itself.
Applying computational storage is critical to address the real-time processing requirements of many machine learning (ML) and analytics applications, and other use cases from IoT to edge computing. In the case of IoT, the accelerating pace of deployments will produce huge amounts of raw, unstructured data that is moved to, stored in, and processed on a server. However, not all of the captured data is relevant.
Let us take an example: a surveillance camera system in a big parking lot records license plate numbers and the times when cars enter and leave, both to enable billing for parking time and for security purposes. The information of interest is the license plates, so it would be highly inefficient to move all of the large images or video streams to the server for image processing, whether or not cars are actually entering or leaving the lot.
With computational storage, each camera streams to its local drive, and the compute on that drive recognizes the license plates directly. Performing ML and image recognition on the storage drive and returning only the insight from that data, the license plate numbers and the times, to the server is highly efficient. And with one drive per camera, every camera added brings another drive and more compute, exactly where it is needed, making the system both efficient and highly scalable.
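To make this concrete, here is a minimal sketch of what such an on-drive workload might look like, assuming the camera streams JPEG frames into a directory on the drive. The paths and the recognize_plate() function are hypothetical stand-ins for the drive's actual layout and ML model:

```python
# Hypothetical on-drive workload: watch the frames the camera streams to
# the local drive and report only license plates and timestamps upstream.
import json
import time
from pathlib import Path
from typing import Optional, Set

FRAME_DIR = Path("/mnt/camera/frames")        # frames landing on this drive
RESULTS = Path("/mnt/camera/results.jsonl")   # tiny insight log the server reads

def recognize_plate(frame: Path) -> Optional[str]:
    """Stand-in for the on-drive ML/OCR inference step."""
    return None  # replace with a real model; returns e.g. "ABC-123"

def process_new_frames(seen: Set[Path]) -> None:
    for frame in sorted(FRAME_DIR.glob("*.jpg")):
        if frame in seen:
            continue
        seen.add(frame)
        plate = recognize_plate(frame)
        if plate:
            # Only the insight leaves the drive: a few bytes, not the image.
            event = {"plate": plate, "time": time.time()}
            with RESULTS.open("a") as f:
                f.write(json.dumps(event) + "\n")

if __name__ == "__main__":
    seen: Set[Path] = set()
    while True:
        process_new_frames(seen)
        time.sleep(1.0)
```

Each drive runs its own copy of this loop, which is what makes the per-camera scaling automatic.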
There are many other use cases where computational storage can have a significant impact.
Today, with a traditional storage drive, data is moved from the device all the way to the server to be computed, which adds latency, consumes bandwidth and energy, and exposes the data in transit.
If the backhaul, the connection to the servers, in these systems provides limited bandwidth or is expensive, then computational storage can reduce total cost of ownership (TCO) significantly. Additional benefits include:
Faster response time and reduced latency
Moving intelligence to where it is needed allows results to be delivered in near real-time. The data does not need to be encapsulated in protocols, then moved and copied through routers and switches, and unpacked on the server before it can be processed.
Reduced energy
No more huge data transfers that require energy and generate heat.
Security and privacy
The data does not leave the drive, only the insight is returned, reducing the risk of leaking information.
Scalability
Since the compute is on the drive, adding more drives means adding more compute where the data is stored.
A computational storage drive (CSD) is a storage device that provides persistent data storage and computational services. Computational storage is about coupling compute and storage to run applications locally on the data, reducing the processing required on the remote server and reducing data movement. To do that, a processor on the drive is dedicated to processing the data directly on that drive, which frees the remote host processor to work on other tasks.
In a traditional storage system, when the compute wants to do some processing on the data, it requests the data from the drive, waits for it to arrive over the interconnect, and processes it on the host.
In a computational storage system, the compute does not request the data. It sends the task to the drive instead: the drive processes the data in place, and only the result travels back.
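A toy model of the two flows makes the difference visible. The Drive class below is purely illustrative; a real CSD exposes this behavior through NVMe and SNIA-defined interfaces:

```python
# Toy contrast between the two models. The Drive class is illustrative only.
from typing import Callable, Iterable, List

class Drive:
    """Simulated drive holding a list of data blocks."""
    def __init__(self, blocks: Iterable[bytes]):
        self._blocks: List[bytes] = list(blocks)

    def read_blocks(self):
        # Traditional path: every block crosses the wire to the host.
        yield from self._blocks

    def run(self, task: Callable[[Iterable[bytes]], int]) -> int:
        # Computational path: the task executes next to the data and
        # only the small result travels back to the host.
        return task(self._blocks)

def count_pattern(blocks: Iterable[bytes], pattern: bytes = b"cat") -> int:
    return sum(block.count(pattern) for block in blocks)

drive = Drive([b"cat dog", b"catalog", b"dog"])

host_result = count_pattern(drive.read_blocks())  # host pulls all the data
csd_result = drive.run(count_pattern)             # host receives one integer

assert host_result == csd_result == 2
```

The answer is identical either way; what changes is how many bytes had to move to produce it.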
Read our guide to computational storage for more insight.
There are multiple ways of implementing computational storage. The main requirement, however, is embedding processing capability in the drive controller that can run a rich operating system, such as Linux, together with its software components. This has key benefits:
Open source software with a vast Linux developer community
Open source software with a vast Linux developer community, plus standard tools that are used industry-wide, makes the development experience easier. Developers can create workloads and deploy them to the drive using standard Linux-based systems, while still following the SNIA standard, which simplifies the system and eases software development.
Readily available tools
With Linux, the vast ecosystem of tools and open source software is available to develop, deploy, and manage computational storage workloads. This enables the developer community to quickly migrate tasks to computational storage drives.
Intelligent storage enabled
In a standard NVMe drive, the drive is sent blocks of data, which it breaks up and stores as pages in its NAND dies. When the server asks for a block of data, the drive fetches it from the NAND, reassembles it into a block, and sends it to the host. The drive never knows that these blocks make up, for example, a JPEG image, because it does not understand the file system. With Linux running on the drive, intelligent storage becomes possible: the drive can mount the standard file system, so CSD applications can understand what files the blocks of data actually represent and act on the data directly.
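As a rough sketch of what that file-level awareness buys, assume the drive's Linux environment mounts its own data partition; the device path and mount point below are illustrative and depend on the drive's actual partition layout:

```python
# Illustrative only: on-drive Linux mounts the file system read-only,
# turning anonymous blocks into files the CSD application can reason about.
import subprocess
from pathlib import Path

DEVICE = "/dev/nvme0n1p1"       # example data partition on this drive
MOUNTPOINT = Path("/mnt/data")  # example mount point inside the drive's OS

MOUNTPOINT.mkdir(parents=True, exist_ok=True)
subprocess.run(["mount", "-o", "ro", DEVICE, str(MOUNTPOINT)], check=True)

# The same blocks the NVMe layer shuffles around are now, for example,
# JPEG files that an on-drive application can open and process directly.
for jpeg in MOUNTPOINT.rglob("*.jpg"):
    print(f"found image {jpeg} ({jpeg.stat().st_size // 1024} KB)")
```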
The drive as a mini server
With Linux running on the drive, you can manage the drive, develop workloads, and download new workloads, all using current, standard open source systems. It turns the drive into a mini server at the lowest possible cost.
Now you may wonder: is Linux really suited to computational storage? The answer is yes.
Isn't it too big? The answer is no.
Storage drives today already have gigabytes of RAM, terabytes of storage, and fast compute to handle the massive data movement in and out of the drive. Linux may bring to mind large software installations for big servers, ill-suited to on-device storage and compute, but the requirements here are much smaller than for a big server: the software can be significantly reduced in size.
With Linux, there is no need for display drivers, several other functions are not applicable, and the system can be simplified and tailored to your controller. For example, Debian 9 requires only 512MB of RAM and 2GB of storage.
Administration of the CSD can be performed using the standard open source tools already used in these complex systems. Workloads can be downloaded and managed using common tools such as Kubernetes, Docker, or extended Berkeley Packet Filter (eBPF), enabling secure execution of applications and scripts.
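For example, if the drive runs Linux with a Docker daemon, a workload could be deployed using the standard Docker SDK for Python. The drive address and image name below are hypothetical placeholders:

```python
# Hedged sketch: deploying a CSD workload with the Docker SDK for Python
# (pip install docker). Address and image name are placeholders.
import docker

# Connect to the Docker daemon running in the drive's Linux environment.
# (A real deployment would secure this endpoint with TLS.)
client = docker.DockerClient(base_url="tcp://csd-drive-01:2375")

client.images.pull("registry.example.com/plate-reader:latest")
container = client.containers.run(
    "registry.example.com/plate-reader:latest",
    detach=True,
    # Give the workload read-only access to the data stored on the drive.
    volumes={"/mnt/data": {"bind": "/data", "mode": "ro"}},
)
print("workload started:", container.short_id)
```

From the orchestrator's point of view, the drive is just another small Linux node, which is exactly the point.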
The Arm storage solution offers easy, fast, and cost-effective technology, support, and a vast ecosystem for success in computational storage.
Arm processor portfolio
Arm Cortex-A processors are optimized for low power and high performance in complex computing tasks on storage devices. The Cortex-R82 processor, meanwhile, is optimized for high-performance real-time processing alongside high-level operating system applications. These application-capable processors enable computational storage with:
Porting and optimization of applications
Arm and its partners have ported and optimized all the leading Linux distributions and open source applications. These are actively maintained and run on any Cortex-A processor or the Cortex-R82 without adaptation. Arm has also tuned these applications, both internally and through Linaro, to make sure everything runs optimally on Arm.
Software ecosystem of tools and libraries
With support from the Arm software ecosystem, the programming work is minimal. ML software libraries that run on the Cortex-A processors and the Cortex-R82 accelerate searching through images and other files.
Arm’s partner, NGD Systems, is exploring the use of computational storage to help airlines improve the analysis of flight data. Today, airlines generate multiple terabytes of telemetry data per hour and offloading and analyzing that data can take hours, which is time operators cannot spare. With computational storage, flight analytics can be provided to the right people at the right time, helping to improve safety in the air.
There are many other, non-Linux, types of compute possible on CSDs. For fixed-purpose compute functions such as encryption, compression, or deduplication, low-level real-time software, hardware acceleration, or neural processing units (NPUs) can all be used in a CSD system. These specific functions and accelerators can be built into CSDs and accessed directly through the CSD protocol extensions defined by the industry.
These low-level functions can also be accessed from high-level operating systems where available. The flexibility and ease of customization that a high-level operating system provides, combined with a huge developer community and low-level accelerators, can deliver very high-performance, efficient CSD solutions.
Devices based on Arm processors are already available today from multiple partners, alongside an industry-wide effort to align all storage developers and players on a common implementation. Arm is actively involved in the SNIA Computational Storage Technical Working Group, working with 45 companies and 202 members to define a standard. This standard will encourage the adoption and development of computational storage by removing the risk of fragmentation and incompatibility.
Visit our computational storage website to learn more about the Arm storage solution.
[CTAToken URL = "https://www.arm.com/resources/contact-us/computational-storage-consultation" target="_blank" text="Talk to an expert" class ="green"]
[1] 1 zettabyte (ZB) = 1,000,000,000,000,000,000,000 bytes = 1,000 exabytes (EB) = 1 million petabytes (PB) = 1 billion terabytes (TB)
[2] Source: Data Age 2025. The Digitization of the World From Edge to Core. An IDC White Paper – #US44413318, sponsored by Seagate. https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf