The complex task of selecting which processor core to license can be simplified by using a standard checklist approach to evaluate all relevant criteria. This white paper goes through such a checklist in full detail to demonstrate the process. An objective checklist approach to the problem ensures that the interests of all stakeholders are taken into consideration.
Selecting a processor IP for a new device is a very complex task due to the number of factors involved in the decision-making process. One has to take into consideration a whole range of business and technology elements that will span the lifetime of the product. What may seem a logical choice at the start of the design cycle may end up being a bad choice once the product is in the hands of end customers. Focusing on the initial cash outlay may not be the best answer, nor may focusing on one specific technical aspect. The choice always involves multiple people, departments and business units.
This document was written with the intent of giving the reader an objective set of criteria to use in qualifying the offers from various vendors. The list allows all stakeholders to cover all factors in the selection process and to discuss objectively the end result.
One size does not fit all. Each project has specific focus points that have more weight in the decision process. These focus points will change between projects and between departments. What is important to project A is not necessarily important to project B. For this reason, the scoring matrix at the end of the document has a weight column where the importance of each item is entered to bias the results accordingly.
The proposed approach is first to discuss each item in the business and technology sections in order to allocate importance coefficients, then to create a scoring matrix for each vendor with a score for fit against each criterion. The total of the two tables gives the overall suitability of that vendor for that specific project.
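The weighted scoring described above can be sketched in a few lines of Python. This is only an illustration: the criteria names, weights, scores and vendor names are hypothetical placeholders, and the multiply-and-sum rule is an assumption about how the weight column biases the result. The scales follow the matrix footnotes (weight 0 to 10, score 1 to 100).

```python
# Hypothetical criteria, weights and scores; none of these values come from the paper.
# Weight: 0 (least important) to 10 (most important).
# Score: 1 (least fit) to 100 (best fit).

def weighted_total(weights, scores):
    """Sum weight * score over every criterion in one table."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


def vendor_suitability(business_weights, technology_weights, vendor_scores):
    """Add the business and technology table totals for one vendor."""
    return (weighted_total(business_weights, vendor_scores["business"])
            + weighted_total(technology_weights, vendor_scores["technology"]))


# Importance coefficients agreed by the stakeholders (assumed values).
business_weights = {"roadmap": 8, "license cost": 6, "ecosystem": 9}
technology_weights = {"performance": 7, "power consumption": 10, "debug": 4}

# One scoring matrix per vendor (assumed values).
vendors = {
    "Vendor X": {
        "business": {"roadmap": 80, "license cost": 50, "ecosystem": 90},
        "technology": {"performance": 70, "power consumption": 85, "debug": 60},
    },
    "Vendor Y": {
        "business": {"roadmap": 60, "license cost": 90, "ecosystem": 40},
        "technology": {"performance": 90, "power consumption": 55, "debug": 70},
    },
}

totals = {name: vendor_suitability(business_weights, technology_weights, scores)
          for name, scores in vendors.items()}
best = max(totals, key=totals.get)

for name, total in totals.items():
    print(f"{name}: {total}")
print("Best fit:", best)
```

Because the weights are set per project, the same vendor scores can produce a different ranking for a different project simply by changing the two weight dictionaries.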
Figure 1
Selecting a CPU core has major implications on the business side of things. Figure 1 shows the selection criteria that are useful for the final decision. The products created with that core do not exist in isolation. The design and fabrication of any product require tools and design engineers. The product then exists within a context of customers, users and suppliers. All these aspects have to be taken into account when selecting the CPU core.
Usually, a product line is desired as opposed to a single product. This offers the end customers the choice between low, medium and high performance units or different flavors for different applications requiring for example different interfaces. Given the choice, one would want to use the same tools for all products in the product line to simplify migration from one to the other. In the CPU arena, migration across devices is made easier if all cores offer a coherent instruction set. Normally, migration is made easier going higher up in performance so that customers may start with a low end product and naturally migrate to the next device as more performance is required for the application.
The item to avoid is a disruptive jump between products which opens a window of opportunity for the customer to switch to a different supplier. If different tools are required for different products from the same family then why not go for a totally different supplier?
ARM offers a coherent processor product family with a compatible instruction set using the same design and development tools. One or multiple processors may be used to create a complete product line that spans various application end points.
Roadmap
Is the product or product line being created an isolated case? Or is this the start of a new business that will grow and expand over time? For any business, a clear path to growth and expansion is necessary. One has to be confident that the selected supplier is able to deliver new products in the future for the business to build upon to create new products.
ARM maintains a clear product roadmap that extends out several years in the future. This plan is shared in advance with customers to permit them to plan their own products accordingly and allows them to have input into the feature set and the expected time line of deployment.
The license cost is the most visible part of the selection process, followed by royalties. One should keep in mind what the license gives access to. It is not just the processor IP but a whole set of services that support the design, deployment and marketing of the product. Royalties indicate that the supplier is a true partner: a partnership means sharing the business risk, so no royalties are paid until the completed product is delivered.
At ARM, the processor IP cost is split into license fees and royalties since the partnership model is the selected mode of operation.
Designing an SoC requires design engineers. This is a fact of life. The most common CPU architecture in the market will have the largest pool of design engineers available. Selecting the most common architecture gives access to that resource pool. Selecting a less common architecture makes it harder and more expensive to find qualified design engineers.
A device with a CPU requires software to create a complete product. Choosing a processor with an instruction set that is widely spread in the market provides access to a larger pool of software engineers. This competition helps keep the cost of recruiting qualified engineers in check.
Besides design engineers and software developers, one requires a whole set of tools and services that are used once in the lifetime of the product design. The presence of an ecosystem of suppliers around the selected processor IP means wider availability of tools and services at a competitive price.
ARM currently has the largest ecosystem in the industry with over 1000 companies due to the focus on a partnership business model as mentioned above.
Internal software engineering resources may be enough to build the software code base for the product but sometimes middleware is needed to complete the offering. In this case, internal resources may not be the best choice since middleware is not core business for the company. Access to multiple suppliers is ideal since it opens up the choice. Software companies normally target the largest possible markets which in this case translates to the most widely used instruction set. Selecting processor IP with that instruction set opens the door to a wide competitive selection of middleware suppliers.
If the product is a black box, then end customers will not add any applications to it. However, if end customers will add their own software to the device then having a source of multiple software development tools is essential. One tool does not fit all. Each end customer has their own views on what a tool should be and how it should interface to the user.
ARM, via the ecosystem, has the largest number of software development tool suppliers that compete in this market. The range includes free tools and professional commercial packages.
Upon completion, a product usually requires some marketing to start the sales process. All marketing plans have a budget which varies depending on the resources of each company.
ARM has a large marketing reach and offers its partners co-marketing activity where the end product is introduced into the market via the ARM marketing channels. For most product categories this is the equivalent of a very wide marketing campaign that would have required a large budget.
Following the same argument as for product roadmaps, one needs to consider the supplier from the angle of business continuity. It takes over a year to build, qualify and release a device. What happens if the supplier is no longer in business during that time or shortly afterwards? What will be the business impact of such an unexpected event?
Does the selected supplier have the financial and business strength to be around for the next two to three years? The answer to this question is probably more important than many others.
Once the design is complete, the device needs to be built. The initial choice of the processor IP and process node will now come into play. Not every foundry is qualified for every processor IP and every node. One can always be the first company to go through that process node at a specific foundry, at the cost of the additional risk of multiple device iterations. Choosing a supplier that is already proven at multiple foundries, for multiple process nodes, removes this additional risk and increases the probability of success from first silicon.
The worst thing that can happen is to discover quality issues in a device once it has hit mass production. Proper design methodology, process check point, validation and qualification are necessary and costly steps required to ensure the reliability of the block that will form the heart of the chip. Choosing an IP vendor with a sophisticated quality validation is essential.
Finally, the market position of a vendor is a clear indication of the value of its offer. In the technology space, there is always a need to identify the main company that drives the market. Collectively, customers elect such a leader by selecting its products. This is the case for each market segment, be it networking, enterprise software or processor IP. Selecting processor IP from the designated market leader is a decision associated with lower risk since the market players have already collectively voted for that player.
Figure 2
Selecting a CPU core involves the study and analysis of several technical aspects. Figure 2 shows the selection criteria that are useful for the final decision. The processor is only one part of a total system. All these aspects have to be taken into account when selecting the CPU core.
An important factor in all designs is the processor performance. One can claim that software will always expand to use all available processor performance. Nonetheless, the device has to be designed with a specific performance limit in mind, with some amount of headroom. The headroom should not be so small that it puts the product in danger, nor so large that it is wasted on idle loops. The item that is often overlooked is energy efficiency. A larger or faster processor gives more performance at increased power consumption. Energy efficiency has to be balanced against brute-force performance. For example, the Cortex-M0+ processor has the highest energy efficiency, measured in CoreMark/mW, of all processors in the same class on the market today.
Another factor is the memory subsystem. If an additional memory is required in order to reach the required performance then power consumption will go up and the die area will increase. Coprocessors and hardware multiply and divide provide a boost to performance with a relatively low increase in power consumption.
The size of the die dictates the final price for the device. The smaller the die the lower the cost.
As mentioned above, power consumption needs to be considered in tight coupling with processor performance. One can always reduce the clock frequency to reduce power at the expense of performance. A delicate balance is required to reach the point of maximum energy efficiency, where the processor delivers the most performance per unit of power consumed.
Target frequency will dictate the process node which will impact the total design. Target frequency is associated with peak performance and impacts the clock network and distribution across the device. More advanced nodes will reach higher frequencies with the same processor at the expense of increased static power.
A device needs interfaces to connect the CPU to the system. These interface blocks are available from various vendors for integration in any design. A common set of blocks is normally available from the processor IP vendor. The availability of peripheral IP blocks needs to be considered tightly along with the choice of the processor since any incompatibility or interconnect issues will impact the project in a severe manner.
Once the device is back from the fab, it needs to be validated at the hardware level and the software level. This validation process requires some sort of debug system that is embedded in the design. A debug port, a trace buffer and similar debug blocks are required so that designers can validate the operation of the device and software engineers can easily bring up the software.
To speed up device design, a vendor may offer a complete package where the processor is already integrated with common peripheral blocks. The package will cut a few weeks from the project schedule and allow the designers to focus totally on the key parts that differentiate the design.
A critical factor for the selection process is the availability of a design support team and a software support team. If the designers hit a blocking point then they need an expert to get them going again. Easy and rapid access to such experts translates into a gain in project time worth weeks if not months of design effort.
The right design tools reduce the development time for the project, especially when the design tools already support the processor of choice. The design complexity increases when the device has an analog component. Tool suppliers offer special tool combinations specific for mixed signal designs.
All major tool vendors are partners of ARM and supply preconfigured designs around Cortex-M series processor IP.
The device was designed with built-in debug components. To access these components via a debug port, a debugger is required. JTAG is a standard debug interface. There also exists a two-wire interface that reduces the number of pins required. Usually, the debugger integrates with the development environment. It is better to have a selection of units to choose from than to be forced into a specific unit. The unit needs to support access to trace buffers to stream instruction traces in real time or to dump pre-recorded traces. Breakpoint and watchpoint support is also important.
To reduce project risk, a prototype may be built using FPGAs. The prototype includes the processor, memory and the peripheral blocks. The selected vendor needs to provide an FPGA design that is big enough for the processor and has enough room left for the rest of the components. It is not a simple task to do this prototype from scratch. In many companies, the prototyping activity is seen as a standalone project that runs in parallel to the design of the device.
Software will always expand to fill all available processor headroom. Better tools help the developers to squeeze more performance out of the same device. For example, today’s compilers yield 30% more performance from the same processor compared to one or two years ago. An integrated software development tool set is essential to allow developers to focus on the real objective. Ideally, the chosen tool set needs to fully support the selected processor otherwise developers would be wasting their time to make the two elements work together.
The media always highlights the most advanced process nodes. These nodes are not necessarily required for the device in question. One of many criteria for selecting a process node is whether the foundry of choice has already released silicon in that node. The probability of success from first silicon increases with the number of devices that the foundry has already released in that node.
A vendor that has many foundry partners will offer a better choice for proven nodes.
It is easier to select a processor with a built-in power management scheme than to add such management to a standard processor. These days, clock gating and power gating techniques extend the reduction in power consumption for a processor. In clock gating, the clocks are stopped to certain blocks of the device, thus reducing power consumption. In power gating, power is shut down to whole blocks. For example, if the debug section is in a distinct power domain, then it can be totally shut down during regular operation.
Sleep modes are another factor in power management. A higher performance processor will spend most of the time in sleep mode, with brief bursts of execution followed by long periods of sleep. Its average power consumption can then be lower than that of a lower performance processor that would be awake most of the time executing code. A variety of sleep modes gives more options for power saving. Deep sleep, deep sleep with state retention, and wake on event or interrupt are examples of such power saving sleep modes.
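The burst-then-sleep argument can be checked with a time-weighted average. The sketch below is a simplified model that ignores wake-up latency and transition energy. All power and timing figures are hypothetical and chosen only to illustrate the calculation; they are not measurements of any real processor.

```python
def average_power_mw(active_mw, sleep_mw, active_time_s, period_s):
    """Time-weighted average power over one wake/sleep period."""
    if period_s <= 0 or not 0 <= active_time_s <= period_s:
        raise ValueError("active time must lie within a positive period")
    duty = active_time_s / period_s
    return duty * active_mw + (1 - duty) * sleep_mw


# Assumed workload: the same task must complete once per second.
period_s = 1.0

# Assumed faster core: draws more power while running but finishes in 50 ms.
fast = average_power_mw(active_mw=20.0, sleep_mw=0.01,
                        active_time_s=0.05, period_s=period_s)

# Assumed slower core: draws less power while running but needs 500 ms.
slow = average_power_mw(active_mw=4.0, sleep_mw=0.01,
                        active_time_s=0.5, period_s=period_s)

print(f"fast core average: {fast:.4f} mW")
print(f"slow core average: {slow:.4f} mW")
```

With these assumed figures the faster core averages about 1.01 mW against about 2.0 mW for the slower one. The outcome depends entirely on the ratio of active power to execution time, so the comparison should be redone with real figures for each candidate processor.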
Does the processor bus interconnect allow connectivity of all peripherals and memory subsystems? What sorts of bridges are required? What is the required bus performance efficiency given the selected memory?
The selected processor has to be coherent with the corresponding memory subsystem. There is no point in targeting a high frequency for maximum performance if the memory can’t keep up. System bottlenecks need to be identified and analyzed before a final selection is made.
Interrupt response time is critical in some applications. Response time is reduced if most of the setup for interrupt entry and exit, such as stack push and pop operations, is handled in hardware. Nested interrupts and various priority levels allow sophisticated response schemes. Finally, it used to be extremely complex to write interrupt routines in assembly, whereas nowadays the presence of a vectored interrupt table permits writing the routine in C, the same as the rest of the code.
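A rough way to see the memory bottleneck is to count the wait states a given memory imposes at each candidate clock frequency. The sketch below is a deliberately simple model under stated assumptions: one instruction per cycle when memory keeps up, every instruction fetched from the same memory, and no cache or prefetch buffer. The 30 ns access time and the function names are hypothetical.

```python
def wait_states(cpu_mhz: int, memory_access_ns: int) -> int:
    """Extra clock cycles the core stalls on each memory access."""
    # Clock cycles needed to cover one access, rounded up (integer ceiling).
    cycles_needed = -(-(cpu_mhz * memory_access_ns) // 1000)
    return max(cycles_needed - 1, 0)


def effective_mips(cpu_mhz: int, memory_access_ns: int) -> float:
    """Instruction rate once every fetch pays the wait-state penalty."""
    return cpu_mhz / (1 + wait_states(cpu_mhz, memory_access_ns))


MEMORY_ACCESS_NS = 30  # assumed access time of the selected memory

for mhz in (25, 50, 100):
    ws = wait_states(mhz, MEMORY_ACCESS_NS)
    rate = effective_mips(mhz, MEMORY_ACCESS_NS)
    print(f"{mhz} MHz: {ws} wait state(s), about {rate:.1f} MIPS effective")
```

Under these assumptions, raising the clock from 25 MHz to 50 MHz adds one wait state and leaves the effective instruction rate unchanged, which is the kind of bottleneck that should be found before the target frequency is fixed. Caches, prefetch units or wider memory interfaces change the picture and would need a more detailed model.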
* 1 is least fit, 100 is best fit
** 0 is least important, 10 is most important