Collaborating without divulging secrets with Veracruz

April 27, 2021


By the Arm Veracruz Development Team: Basma El Gaabouri, Christopher Haster, Derek Miller, Dominic Mulligan, Guilhem Bryant, Nick Spinale, Hugo Vincent, Shale Xiong.

Protecting data in use

Data exists in one of three modes: in transit, at rest, and in use. Today, we understand how to protect data in transit, that is, when it is being sent from computer to computer. This protection is achieved using protocols like Transport Layer Security (TLS), which is commonly deployed in web browsers to protect the confidentiality and integrity of our internet traffic. Likewise, we generally understand how to protect data at rest, that is, when it is stored on a computer’s disk or similar: we use standardized block ciphers like the Advanced Encryption Standard (AES), and full-disk encryption tools built around them.

However, how to protect data in use (that is, when data is being fed as input into a computation, in a potentially collaborative setting) is not well understood. Cryptographers have made great strides in developing a host of techniques for protecting data in use, for example Fully Homomorphic Encryption schemes and protocols for effecting Secure Multiparty Computations. But the unfortunate truth is that these techniques tend not to be deployed widely, barring exceptional cases. There are many reasons for this, not least that Advanced Cryptographic techniques are slow, hard to use, and even harder to understand. What is more, these techniques tend to be quite brittle, requiring significant amounts of reconfiguration if the underlying computation changes.

Strong Isolation Technologies pose a potentially interesting, and pragmatic, alternative to the use of pure cryptography for protecting data when in use. Here, we use the phrase Strong Isolation Technology to denote a range of hardware- and high-assurance software-based isolates. These isolates provide strong confidentiality and integrity guarantees to software, even in the face of a privileged attacker (for example, an attacker able to wield the capabilities of the Operating System or Hypervisor). Strong Isolation Technologies are also typically accompanied by a remote attestation procedure which allows third parties to reliably challenge the authenticity of an isolate, and the integrity of software loaded within it, from a potentially remote machine. Remote attestation, along with the confidentiality and integrity guarantees of isolates, allows a third party to establish an execution environment, safe from prying eyes or interference, in a known good state, on somebody else’s machine.

Simply secure with Veracruz

The Arm Veracruz project explores how novel, data-intensive distributed systems can be built using Strong Isolation Technologies and remote attestation.

Veracruz allows programmers to quickly (and easily!) design collaborative, privacy-preserving computations amongst a group of mutually mistrusting individuals, using Strong Isolation Technologies as a shared “neutral ground” within which a collaborative computation takes place. Participants in a Veracruz computation use standard transport-layer security to feed their secrets directly into the isolate after authenticating the isolate and its contents using remote attestation. Veracruz harnesses a range of Strong Isolation Technologies (including Arm TrustZone, AWS Nitro Enclaves, Intel SGX Secure Enclaves, and the seL4 high-assurance hypervisor) as a mechanism by which groups of collaborators can securely pool their data without necessarily revealing it to each other, or to anybody else. Once pooled inside an isolate, this data is fed as input to a program, with the result retrievable only by the principals named in a global policy file.
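To make the idea of a global policy more concrete, here is a minimal Python sketch of how a policy might name principals and the roles they hold. The field names (`program_hash`, `principals`, `roles`) and the role strings are assumptions made for illustration; this is not the actual Veracruz policy schema.

```python
# Illustrative only: a toy policy, not the real Veracruz policy format.
POLICY = {
    # Hypothetical placeholder for the hash of the agreed program.
    "program_hash": "<sha256 of the agreed program>",
    "principals": {
        "alice": {"roles": {"data_provider", "program_provider", "result_reader"}},
        "bob": {"roles": {"data_provider", "result_reader"}},
    },
}


def has_role(policy: dict, principal: str, role: str) -> bool:
    """Return True if the policy grants `role` to `principal`."""
    entry = policy["principals"].get(principal)
    return entry is not None and role in entry["roles"]


def result_readers(policy: dict) -> list:
    """List the principals allowed to retrieve the result, sorted by name."""
    return sorted(
        name
        for name, entry in policy["principals"].items()
        if "result_reader" in entry["roles"]
    )
```

Because every participant can inspect the same policy before provisioning anything, each of them can see in advance who will be able to read the result.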

Whilst Veracruz aims to provide strong security and privacy guarantees to principals engaging in collaborative computation, our guarantees are naturally not as strong as those offered by Advanced Cryptography. On the other hand, Veracruz is more efficient, easier to deploy and configure, and much easier to explain as compared to pure cryptography.

Note that Veracruz can be used to effect several interesting privacy-preserving collaborative computations, including:

Collaborative, privacy-preserving Machine Learning (ML),
Safely outsourcing computations from a computationally weak device (such as a microcontroller) to a more powerful edge device or server,
Protection of IP, for example, novel computer vision or ML algorithms,
Privacy-preserving social network and graph analytics.

Alice and Bob

Let us focus on one of the use-cases mentioned above, privacy-preserving ML, and describe how Veracruz can be used to design a distributed computation that allows Alice and Bob, representatives from two competing companies, to collaborate in a delimited manner.

Specifically, Alice and Bob want to pool their private customer click-through data to derive a more effective ML-based customer recommendation system than either could have hoped to achieve separately. Importantly, neither wishes to divulge their data set to the other, nor to anybody else: the only thing that should be divulged from the computation, and only to Alice and Bob, is the ML model learnt from their pooled data sets.


Figure 1: Alice and Bob securely provision their data, and their agreed algorithm, into the Veracruz runtime after verifying the authenticity of the runtime, and the isolate containing it, using remote attestation. Once the computation is complete, both Alice and Bob (and nobody else!) get access to the learnt ML model.

To achieve this, Alice and Bob first agree on the ML algorithm to use, its parameters, and the format that their datasets must be stored in. Arbitrarily, Alice implements this algorithm, in Rust say, and divulges it to Bob for vetting. Once Bob is happy with the algorithm, the two start an isolate on a host machine and load the Veracruz runtime into it. Alice and Bob then both use a remote attestation procedure to check that the isolate has indeed been started, and that the correct Veracruz runtime has been loaded within it.
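One part of this check, comparing the measurement reported for the loaded runtime against the value Alice and Bob expect, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, and a real remote attestation procedure also involves hardware-backed signatures and freshness checks, all of which are omitted here.

```python
import hashlib
import hmac


def measure(runtime_image: bytes) -> str:
    """Hash a runtime image; stands in for the measurement an isolate reports."""
    return hashlib.sha256(runtime_image).hexdigest()


def measurement_matches(reported: str, expected: str) -> bool:
    """Compare the reported and expected measurements in constant time."""
    return hmac.compare_digest(reported.encode(), expected.encode())


# Hypothetical usage: both parties know the hash of the runtime they vetted.
expected = measure(b"runtime image that Alice and Bob vetted")
reported = measure(b"runtime image that Alice and Bob vetted")
attested = measurement_matches(reported, expected)
```

If the host had loaded a different runtime, the reported measurement would differ and the check would fail, so neither party would go on to provision any secrets.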

Once the isolate and its software have been authenticated using remote attestation, Alice and Bob know that the isolate is indeed genuine and contains the software that they think it does. Accordingly, the two make a secure connection to the isolate itself using TLS and provision their data sets and the ML algorithm into it. Note that the host of the machine cannot see inside or influence the behavior of the isolate, nor can they break the encryption of the TLS connection used to provision the data sets or algorithm. At this point, both data sets and algorithm are now in one place without Alice or Bob, or the host of the computation, having learnt anything that they should not have. All that is left now is for the computation to trigger, and produce a result, which is then made retrievable by both Alice and Bob, again via a secure TLS link.
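The order of events described above (provision, compute, then retrieve) can be modelled with a small in-memory Python sketch. Everything here is an assumption made for illustration: `ToyIsolate` is a hypothetical stand-in that offers no actual isolation, TLS and attestation are left out, and a simple `mean` function takes the place of the agreed ML algorithm.

```python
from typing import Callable, Dict, List, Optional, Set


class ToyIsolate:
    """In-memory stand-in for an isolate; it models the order of steps only."""

    def __init__(self, data_providers: Set[str], result_readers: Set[str]):
        self.data_providers = data_providers
        self.result_readers = result_readers
        self._inputs: Dict[str, List[float]] = {}
        self._result: Optional[float] = None

    def provision(self, principal: str, data: List[float]) -> None:
        """Accept a data set, but only from a principal named as a provider."""
        if principal not in self.data_providers:
            raise PermissionError(f"{principal} may not provision data")
        self._inputs[principal] = list(data)

    def run(self, program: Callable[[List[float]], float]) -> None:
        """Run the agreed program once every provider has supplied data."""
        missing = self.data_providers - set(self._inputs)
        if missing:
            raise RuntimeError(f"still waiting for: {sorted(missing)}")
        pooled = [x for name in sorted(self._inputs) for x in self._inputs[name]]
        self._result = program(pooled)

    def retrieve(self, principal: str) -> float:
        """Release the result, and nothing else, to an authorised reader."""
        if principal not in self.result_readers:
            raise PermissionError(f"{principal} may not read the result")
        if self._result is None:
            raise RuntimeError("computation has not run yet")
        return self._result


def mean(values: List[float]) -> float:
    """Placeholder for the agreed ML algorithm."""
    return sum(values) / len(values)


isolate = ToyIsolate({"alice", "bob"}, {"alice", "bob"})
isolate.provision("alice", [1.0, 2.0])
isolate.provision("bob", [3.0, 6.0])
isolate.run(mean)
model = isolate.retrieve("alice")
```

Note that the class exposes no method for reading another principal's input: the only value that ever leaves is the result, and only to the readers named up front.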

Now open source

Veracruz is now an open source project, with all design and development discussion taking place in public. Moreover, Veracruz was recently adopted as a project by the Confidential Computing Consortium (CCC), an industry-led Linux Foundation consortium aiming to promote hardware-based confidential computing technologies. The Arm Veracruz team welcomes contributions from interested third parties, and we have listed several issues in our GitHub issue tracker suitable for newcomers to the project.

Find out more

To find out more about the project, and how to contribute, you can consult the following resources:

The Veracruz project homepage, which features news updates and links to upcoming talks and events, and the Veracruz project GitHub repository, which hosts our issue tracker, design discussions, and source code.
A more in-depth explanation of how Veracruz works from the Veracruz project wiki.
Catch up with talks by Dominic Mulligan at FOSDEM 2021 and Derek Miller at LCA 2021 and OC3 2021. Dominic’s talk focuses on providing a general overview of the project, whilst Derek’s talks delve deeper into technical aspects of Veracruz’s use of cryptography.

Veracruz on GitHub

Questions? Contact Dominic Mulligan
