Defacto SoC Compiler is a leading tool for System-on-Chip (SoC) integration. It allows users to bring together IP blocks such as CPU cores and interconnect fabrics according to the relevant constraints, and to create the RTL needed to stitch all these components together. SoC Compiler is used at Arm to accelerate the generation of top-level Verilog code while reducing the scope for errors.
SoC Compiler is an important component within key IP design flows, so ensuring that it turns around results as quickly as possible is critical to maintaining the productivity of our engineers. With that in mind, we have been reviewing the performance of the tool on the latest Arm Architecture instances available within our compute environment.
Arm performed an analysis to validate the performance of Defacto’s SoC Compiler on a spread of machines across our AWS environment.
We first looked at running SoC Compiler on its own. We used the largest size of each instance type for this test, to make sure that the hardware was not being shared with any other users. Here, we found that the high-clock-frequency R7iz instance type in AWS was about 20% faster than the Graviton3-powered R7g. However, the R7iz costs over 70% more than the R7g per vCPU hour. The R7g was faster than all the other tested instances, and at least 20% cheaper:
Figure 1. Unstressed Runtime, normalized to R7g instance type
This test allowed us to see the underlying performance of the instance types, but in real-world use we would rarely have a whole CPU dedicated to a lone single-threaded task. To find out how the tool behaves when our compute is shared by many users, we repeated the tests, this time running a background task on all the vCPUs not being used by SoC Compiler. The results were noticeably different:
Figure 2. Loaded Runtime, normalized to R7g instance type
On a busy instance such as we would typically see in our production use, the Graviton3 turns around results in the fastest time by some margin. Even the accelerated R7iz takes 30% longer to complete the task.
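As a rough illustration of the loaded-run setup described above, the following Python sketch times one foreground command while busy-loop processes occupy the remaining vCPUs. It is a minimal sketch under stated assumptions, not the harness used for these measurements: the busy-loop workload, the function name `timed_run_under_load`, and the choice to leave CPU placement to the operating system scheduler are all ours for illustration.

```python
import subprocess
import sys
import time


def timed_run_under_load(command, background_workers):
    """Time `command` while `background_workers` busy loops compete for CPU.

    Assumption: a plain Python busy loop stands in for the background task;
    the workload actually used in the tests is not specified in this post.
    """
    spin = [sys.executable, "-c", "while True: pass"]
    # Start one CPU-bound process per vCPU that should be kept busy.
    workers = [subprocess.Popen(spin) for _ in range(background_workers)]
    try:
        start = time.perf_counter()
        # Run the single-threaded foreground job and wait for it to finish.
        subprocess.run(command, check=True)
        return time.perf_counter() - start
    finally:
        # Always stop the background load, even if the job fails.
        for worker in workers:
            worker.terminate()
        for worker in workers:
            worker.wait()
```

To load every vCPU except the one running the job, the worker count would be `os.cpu_count() - 1`, and `command` would be whatever launches the tool in your environment (for example a hypothetical wrapper script such as `["./run_job.sh"]`).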
Having looked at the performance, we then investigated the cost. In a loaded run scenario, the figure of merit is the cost of purchasing a single vCPU for the duration of the run. This is because the test is single-threaded, so the costs of the other vCPUs would be assigned to the workloads running on them. Looking at AWS On-Demand pricing in the Ohio region, AWS Graviton3 again comes out as the most cost-effective platform for running SoC Compiler:
Figure 3. Runtime Cost, normalized to R7g instance type
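The per-vCPU figure of merit behind Figure 3 can be sketched in a few lines of Python: divide the hourly instance price by its vCPU count, multiply by the run duration, and normalize to a baseline. This is a minimal sketch rather than the calculation used to produce the figure. The class and function names are hypothetical, and the prices, vCPU counts, and runtimes in the example are placeholders, not AWS pricing or measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    hourly_price_usd: float  # price of the whole instance per hour (placeholder)
    vcpus: int               # number of vCPUs on the instance
    runtime_s: float         # loaded runtime of the job in seconds (placeholder)


def run_cost(inst):
    """Cost of one vCPU for the duration of a single-threaded run."""
    price_per_vcpu_hour = inst.hourly_price_usd / inst.vcpus
    return price_per_vcpu_hour * inst.runtime_s / 3600.0


def normalized_costs(instances, baseline):
    """Express each instance's run cost relative to the baseline instance."""
    base = run_cost(next(i for i in instances if i.name == baseline))
    return {i.name: run_cost(i) / base for i in instances}


if __name__ == "__main__":
    # Placeholder values for illustration only.
    fleet = [
        Instance("baseline", hourly_price_usd=8.0, vcpus=16, runtime_s=3600.0),
        Instance("candidate", hourly_price_usd=12.0, vcpus=16, runtime_s=4000.0),
    ]
    for name, ratio in normalized_costs(fleet, "baseline").items():
        print(f"{name}: {ratio:.2f}x baseline cost")
```

Normalizing to one instance type, as the figures in this post do, makes the comparison independent of the absolute job length.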
As well as being faster and lower cost, AWS Graviton3 is, according to Amazon, far more energy efficient than its non-Arm-based instance types. This means that running SoC Compiler on Arm-based instances not only saves time and money, but also reduces the environmental impact of the design flow.
Defacto’s success in bringing SoC Compiler to platforms such as AWS Graviton demonstrates the advantages of switching EDA flows to Arm Architecture compute while maintaining the quality of tool that its customers have come to expect. The partnership between Defacto and Arm is helping to deliver tools to our joint customers that optimize the development of the next generation of Systems on Chip.
More information about Defacto can be found at https://defactotech.com/.