
SoC Implementation


What to look out for when selecting an Embedded Process Monitor.


When selecting an embedded Process Monitor for use within your digital SoC it is important to make sure it ticks certain boxes.


Understanding how the chip has been made (process) has become a critical requirement for advanced node semiconductor design. Customer products are becoming ever more compelling and there is a greater need to overcome the physical challenges of advanced nodes. On a per die basis you need to carefully consider the following questions: What is your process type? What is the process variability likely to be and how will this impact your design? How is your silicon ageing? How can you dynamically optimise performance?

Listed below are the 5 things to consider:


1. Optimisation

“When things are measured, they can be optimised.” This couldn’t be more true in the case of Process Monitors. If an IC can determine its own manufactured process characteristics, it can provide critical information for system optimisation on a per-die basis. Process Monitors can also enable continuous Dynamic Voltage & Frequency Scaling (DVFS) optimisation schemes within the SoC design. A better measurement and representation of the process therefore brings with it a greater opportunity to optimise.
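As an illustration, the sketch below shows a closed-loop operating-point selection driven by an on-die delay monitor. The operating-point table, the function name and the linear margin-scaling rule are all illustrative assumptions for this sketch, not any vendor's actual API:

```python
# Hypothetical voltage/frequency operating points, slowest to fastest:
# (supply in volts, clock in MHz).
OPERATING_POINTS = [(0.70, 400), (0.80, 800), (0.90, 1200), (1.00, 1600)]

def select_operating_point(margin_pct_at_fmax):
    """Pick the fastest point whose projected timing margin stays above
    a guard band, given the slack percentage an on-die delay monitor
    reports at the maximum frequency."""
    GUARD_BAND_PCT = 5.0
    f_max = OPERATING_POINTS[-1][1]
    for volt, freq in reversed(OPERATING_POINTS):
        # Simplifying assumption: margin grows in proportion to how
        # much the clock slows down relative to f_max.
        projected = margin_pct_at_fmax * f_max / freq
        if projected >= GUARD_BAND_PCT:
            return volt, freq
    return OPERATING_POINTS[0]  # slowest point as the safe fallback
```

Under these assumptions, a fast die reporting 6% slack stays at the top operating point, while a slow die reporting 2% is stepped down to a lower voltage and frequency: the per-die optimisation described above.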


2. Critical Voltage & Timing Analysis

A process detector should allow you not only to analyse critical voltage but also to experiment with supply and logic levels. It should also allow you to run production test phases and optimise functionality for low power, all on a per-die basis. Detected logic speed and monitored supply levels can be used intelligently to vary system clock frequencies and the voltage levels of supply domains. An essential function of a process monitor is the ability to reveal how your circuit behaves under different voltage conditions and to identify the timing constraints.


3. Ease of Integration

Process Monitors should be easy to integrate, compatible with standard CMOS processes, and have several digital interfacing options for easy control and data capture.


4. Scan & Testability

A good process detector should also allow internal self-checks to identify fault conditions.


5. Application

The in-chip process monitoring and management of advanced node designs has become a critical consideration for SoC developers. Ensure you clearly understand your requirement in light of the end application.


Finally, be sure to choose an IP vendor that is able to demonstrate consistent and successful circuit performance in volume production. If you have selected well, you will soon build a rapport and a good relationship with the vendor's design team, as after all, the success of your product is a shared undertaking.


For more information about Moortec Process Monitors visit Embedded On Chip Process Monitoring Detector IP

Thermal Issues Associated with Modern SoCs - How Hot is Hot?


In this, the third instalment of the "Let's Talk PVT Monitoring" series, Moortec CTO Oliver King talks about the thermal issues associated with modern SoCs and ponders the question "How Hot is Hot?" Oliver has been leading the development of compelling in-chip monitoring solutions to address problems associated with ever-shrinking System-on-Chip (SoC) process geometries. An analogue and mixed signal design engineer with over a decade of experience in low power design, Oliver is now heading up the expansion of Moortec's IP portfolio into new products on advanced nodes.






1. What are the thermal issues of modern SoCs?

Gate density has been increasing with each node and that pushes up power per unit area. This has become an even more significant issue with FinFET processes, where the channels are more thermally isolated than planar processes before them.


Then there is leakage, which in the last few planar nodes was an issue that led to significant power consumption. That has been pegged back somewhat with the latest FinFET nodes but it will continue to be an issue going forward as we look toward the next generation FinFET nodes and beyond.


In addition to these issues, if you are developing for consumer products (smartphones, tablets and the like), then you are always limited in how much heat you can dissipate, because you don’t have active cooling systems such as fans, and the upper temperature limit of the product is quite low. Furthermore, the hotter things get, the bigger the issue of device reliability and lifetime becomes; this is perhaps the biggest issue going forward, as we are then talking about electromigration, hot carriers and BTI effects, which we have discussed in the past.


2. How hot is hot?

That all depends on the application! That said, one interesting thing now, with the growth in automotive applications such as ADAS and infotainment, is that we are starting to see that even 125°C is not high enough, as those markets demand higher-temperature operation.

So for those applications hot is hotter than it may be for say a consumer device where 40°C for the product might be your limit. Then there will be a thermal mass to factor in so you will have devices within that product which are much hotter.


But the key thing for our customers is knowing device temperature accurately. The more accurately they know the temperature, the closer to the limit they can operate. That is really what it is all about for modern SoCs: being as close as you can to the limit without stepping over it. And because temperature has an exponential effect on ageing, the accuracy of temperature sensors is correspondingly important.
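The exponential effect mentioned here is usually modelled with an Arrhenius acceleration factor. The sketch below shows why a few degrees of sensor error translates into a large error in predicted ageing; the activation energy and reference temperature are illustrative textbook values, not measured ones:

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def acceleration_factor(t_celsius, t_ref_celsius=55.0, ea_ev=0.7):
    """Arrhenius acceleration of ageing at t_celsius relative to a
    reference temperature (0.7 eV is a commonly quoted textbook
    activation energy, assumed here for illustration)."""
    t = t_celsius + 273.15
    t_ref = t_ref_celsius + 273.15
    return math.exp((ea_ev / K_BOLTZMANN_EV) * (1.0 / t_ref - 1.0 / t))
```

With these numbers, running roughly 10°C hotter about doubles the ageing rate, so a sensor that under-reads by a few degrees materially overstates the remaining lifetime.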


3. Trend in use

Certainly, a number of years ago when we started developing temperature sensors, they were generally used just for device characterisation, HTOL, burn-in tests and the like. Then they started to be used for high-temperature alarms, either to switch off the device or to turn on a fan. But over the last couple of years we have seen more applications which rely on these monitors, such as Dynamic Voltage and Frequency Scaling (DVFS), Adaptive Voltage Scaling (AVS) and lifetime reliability, all of which use the sensor data in a feedback control loop. So the use cases now are certainly much more varied.


The trend in the recent past has been driven by consumer electronics, where you are really trying to get a lot out of a device whilst not making it too hot, because it’s in your pocket or on your lap. I believe we are moving into a space where the sheer cost of the advanced node technologies means you want to get everything out of a device, and all of the different levels of over-design added through the process and the design flow take away performance. As a result, having sensors on chip, whether for temperature, process or voltage, allows you to get that little bit more performance out of your device and/or improve reliability.


4. What requirements does that place on temperature sensors?

The most important thing from where we sit is accuracy. The greater the uncertainty in the measured result, the less you can do with it. So for us the key motivation is accuracy. But beyond that the next thing is robustness and testability, because you are now using these sensors in application areas where their failure can cause system failure. This means you need to be able to test them, you need to be able to rely on them. So we are doing a lot in that sense to ensure that there is testability and there is robustness in our products.


5. How does Moortec address those requirements?

The first thing is that we meet the accuracy requirements and we aim to exceed them. In terms of testability and robustness we have done a lot of work to be able to provide online fault detection and diagnosis of our sensors.


This means you can interrogate them and understand if there is a fault. Firstly, a sensor will tell you if there is a fault, and secondly you can then ask it what is wrong and it can give you a certain amount of health diagnosis. In addition, we support scan chains to increase overall test coverage.


Then on top of that we believe ease of integration is an important factor. Not because it gives you a more accurate temperature sensor, but to make it easier for the customer to implement and use the product.


About the interviewee

Oliver King is the Chief Technology Officer of Moortec Semiconductor. Before joining Moortec in 2012, Oliver was part of the analogue design methodology team at Dialog Semiconductor and prior to that was a senior design engineer at Toumaz Technology. Oliver graduated from The University of Surrey in 2003 with a degree in Electrical and Electronic Engineering.


About Moortec Semiconductor

Moortec Semiconductor, established in 2005, provide high quality analog and mixed-signal Intellectual Property (IP) solutions world-wide specialising in die monitoring. Having a track record of delivery to tier-1 semiconductor and product companies, Moortec provide a quick and efficient path to market for customer products and innovations. For more information, please visit


Contact: Ramsay Allen, +44 1752 875133,

When the first car rolled off his production line in 1913, Henry Ford would have already envisioned just how prolific the automobile would become. However, would he have foreseen the extent to which monitors and sensors would become critical to the modern internal combustion engine?


The requirement for energy efficiency, power performance and reliability in high volume manufactured vehicles has caused monitoring and sensor systems to increase in number and complexity in order to manage dynamic conditions and understand how each engine has been made. By the same principle, in-chip monitors are here to stay.


Understanding dynamic conditions (voltage supply and junction temperature) as well as understanding how the chip has been made (process) has become a critical requirement for advanced node semiconductor design. So, we should not only get used to in-chip monitors and sensors but also understand the problems they solve and what the key attributes are for good in-chip monitors.




Here are five reasons why in-chip monitoring is here to stay for low geometry designs on technologies such as 40nm, 28nm and FinFET.


1. Gate Density

The benefits of increased gate density drive the modern world by allowing for increased complexity of our electronics for a given area. However, there are drawbacks as increased gate density leads to greater power density and hence localised heating within the chip, or hot spots. Increased density also leads to greater drops to the supply voltage feeding the circuits. High accuracy temperature sensors and voltage supply monitors throughout the design of the chip will allow the system to manage and adapt to such conditions.


2. Product Differentiation

Capitalising on high accuracy monitors and sensors which are proliferated throughout the design will give products a leading edge in the marketplace. Sure, semiconductor design teams will be judged by product features and other ‘bells and whistles’ but what also counts is reliability and comparable performance to competitors.


3. Accepting Greater Chip Process Variability

The variability in how each semiconductor device is manufactured widens as geometries shrink. We’ve discussed how there are benefits to understanding how the dynamic conditions of voltage and temperature change on chip but how about fixed conditions of how each device has been manufactured? Process monitoring, hence determining the speed of the digital circuits, how they react to dynamic changes and how they will age, will allow for optimisation and compensation schemes, making the most of how each particular chip has been made.


4. Increased Reliability

The innate intolerance to electronic faults within the automotive and telecoms sectors is an attitude now spreading to the enterprise and consumer sectors. Accurate monitoring allows for fault detection and lifetime prediction, primarily by sensing the main contributors to circuit stress, such as prolonged high supply voltage and the consequences of localised thermal heating with respect to electromigration.


5. Generational design improvement

Knowing how your last 10 million semiconductor devices served their market applications through their lifetimes is key information for your next generation of product design. How do your end customers utilise your devices? Understanding the environmental conditions under which the devices are placed allows designers to tolerance their designs appropriately next time round.


In summary, we should be prepared that our desire for more in-chip information, and not just simply data, to differentiate products will cause monitoring and sensing systems to evolve beyond what we can predict today. It is safe to say that in-chip monitoring, particularly for the advanced node technologies, is here to stay.


For more information about in-chip monitoring visit:

What to look out for when selecting an Embedded Temperature Sensor.




When selecting an embedded temperature sensor for use within your digital SoC it is important to make sure it ticks certain boxes.


Listed below are the 6 key things to look out for:



Accuracy

You can't beat accuracy! Improved accuracy is an opportunity for a higher level of optimisation and reliability. If your sensor is just 1 degree Celsius more accurate, you can expect reasonable power savings over the lifetime of your product. Be cautious, as some analog macros offering high thermal sensing accuracy may also consume more silicon real estate.



Silicon Area

Keep the cost of your silicon down. Increased area leads to increases in your bottom-line product cost. This is where sensor size and specification can be traded off to address the primary requirement demanded by the end application. Do you need a robust temperature sensor to raise an alert when things get too hot, or do you need highly accurate sensing for a fine-grain DVFS scheme?



Production Test

To keep production test costs down, ensure that any testing associated with the sensor is kept to a minimum. Ensure that the temperature sensor selected has easy-to-control test accesses (e.g. SCAN, JTAG and IEEE P1500) and that a high level of test coverage is offered. If the IP needs calibration to hit accuracy specs, be wary if two-point temperature calibration is required, as this will eat into your test time budget.



Self-Checking

During the lifetime of your chip, how do you know if your temperature sensor has failed? What would the consequences be if a stuck bit on your data output skewed your dynamic temperature measurements? A temperature sensor with self-checking on its critical functions will allow the system to make a fail-safe response. Otherwise your chip, and therefore the product, could be exposed to the risk of physical damage.


Calibration Schemes

Not all applications will require the higher temperature sensing accuracy offered through trim or calibration. However, if calibration is needed, then IP with complex and slow calibration schemes will increase the cost of your products and should be avoided. Ensure it has a quick, single-temperature-point calibration scheme that is easy for your production test team to implement. Better still, choose a temperature sensor whose calibration scheme does not require any knowledge of the ambient temperature and can be calibrated at any temperature.
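A single-point scheme of this kind can be as simple as storing one offset measured against a known reference. The sketch below assumes a sensor whose error is dominated by a constant offset; the function names are illustrative:

```python
def calibrate_offset(raw_reading_c, reference_c):
    """One-point trim: the offset to store in fuses/NVM, taken from a
    single measurement against a known reference temperature."""
    return reference_c - raw_reading_c

def corrected_temperature(raw_reading_c, trim_offset_c):
    """Apply the stored trim to a raw field reading."""
    return raw_reading_c + trim_offset_c
```

The appeal of such a scheme in production test is that only one insertion is needed, at whatever temperature the tester happens to provide, which is exactly the test-time saving argued for above.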


Easy to Integrate

Life is short and tape-out schedules even shorter. Designers are busy people, so digital interfacing offered with the thermal sensor will make the IP easier to integrate. If you want to be unpopular with your physical implementation team, choose a sensor that requires sensitive analogue signals to be routed! Optional interfaces such as AMBA APB and a pre-defined register map make it easier for your SoC team to integrate the IP without having to deal with proprietary interfacing and control schemes.


The in-chip thermal monitoring and management of advanced node designs has become a critical consideration for SoC developers. Ensure you clearly understand your requirement in light of the end application.


Finally, be sure to choose an IP vendor that is able to demonstrate consistent and successful circuit performance in volume production. If you have selected well, you will soon build a rapport and a good relationship with the vendor's design team, as after all, the success of your product is a shared undertaking.

Find out more about Moortec Embedded Temperature Sensors

As we quickly approach DAC, it's time to fill in your dance cards! Make sure to register today for our ARM-TSMC-Synopsys breakfast event at DAC on Monday and block out that breakfast slot in your calendar - it's a great way to kick off your week at DAC.

Last year's ARM-TSMC-Synopsys breakfast event was so popular, we had to turn a few people away at the door, so please register today!




As a reminder, last year we had a great event, including Denny Liu from MediaTek, talking about MediaTek's new 10-core tri-cluster Cortex-A72/Cortex-A53 SoC, enabled by ARM-TSMC-Synopsys collaboration. In addition, presenters from ARM, TSMC and Synopsys talked about 16nm and 10nm enablement as well as a Reference Implementation for the Cortex-A72 processor. You can view a recording of this session from last year.


This year, we'll again have expert technical presenters from ARM, TSMC and Synopsys talking about enabling design with advanced processors and process technology, and we'll also have a guest presenter talking about success based on ARM-TSMC-Synopsys collaboration.


In addition to this session, you'll also be able to catch some great ARM presenters in the Synopsys booth theater - I'll publish that schedule soon, so you can hold a few more slots in your calendar.


I look forward to seeing you there personally!

I'm looking forward to the Synopsys Users' Group (SNUG) Silicon Valley this week. I hope I'll get to see some of you there. Maybe 3GHz+ will get your attention? Read on...


It's always a great (and huge) gathering of leading edge designers sharing about their leading edge designs -- challenges, solutions, warts and all. That's the hallmark of SNUG - deep technical content, and a lot of it. SNUG is the event where designers learn, share and engage.


Thank you, ARM, for being a global Platinum Sponsor of SNUG - you'll see ARM participation in SNUG events worldwide. In preparation for SNUG, I sat down with Dipesh Patel, ARM EVP incubation businesses, and he shared his thoughts in two short videos, one on collaboration with Synopsys and another on what SNUG means to him and ARM.

In the videos, Dipesh comments on:


SNUG worldwide kicks off March 31, 2016 in Silicon Valley (Santa Clara Convention Center) for a two-day run. There's great content for everyone: implementation, verification, security and IP, and many presentations are especially relevant to ARM-based designs.


Looking forward to seeing you there, brendaw, sandralarrabee, sandrachang!


Here are a few ARM-related presentations that I plan to attend:

  • Optimized Implementation of 3GHz+ ARM® CPU Cores in FinFET Technologies (Broadcom)
  • Best Practices for High-Performance, Energy Efficient Implementations of the Latest ARM Processors in 16-nanometer FinFET Plus (16FF+) Process Technology using Synopsys Galaxy Design Platform [Ok, a mouthful, but this session will cover lessons learned not only from Cortex-A72 implementation, but also the latest, not-yet-announced ARM application CPU]
  • Best Practices for a Performance and Area Focused Implementation of High-Performance GPUs Using Galaxy Design Platform
  • Virtual Platform Methodology for Large Scale Pre-Silicon SW Development (Broadcom)
  • High Level Performance Estimation on Virtual Prototypes Employing Timing Annotation (ARM)
  • Writing Efficient Timing Constraints and Accelerating Timing Closure with PrimeTime (ARM)


Visit our SNUG Silicon Valley website to see the full technical and social program.


Hope to see you there! If you can't make it, check back at the Synopsys SNUG website to see archived papers and presentations.

Understanding Your Chip’s Age

By Ramsay Allen


In this, the second instalment of the "Let's Talk PVT Monitoring" series I chat with Oliver King about understanding your chip's age. As Moortec’s CTO, Oliver has been leading the development of compelling in-chip monitoring solutions to address problems associated with ever-shrinking System-on-Chip (SoC) process geometries. An analogue and mixed signal design engineer with over a decade of experience in low power design, Oliver is now heading up the expansion of Moortec's IP portfolio into new products on advanced nodes.






1. Why is understanding your chip's age important?

Semiconductor devices age over time; we all know that, but what is often not well understood are the mechanisms of ageing or the limits that will cause a chip to fail. In addition, there is bound to be a requirement for a minimum device lifetime, which will depend on the application but could be two or three years for a consumer device and up to twenty-five years for a telecommunications device. Given that lifetime requirement and an often poorly understood ageing process, many chips designed today are over-designed to ensure reliable operation. If you understand the ageing process, or better still can monitor it, then you can reduce the over-design and potentially even build chips that react and adjust for the ageing effect, or predict when the chip is going to fail.


Chips at the moment are not getting anywhere near their total lifespan because in most cases there isn’t any in-chip monitoring taking place. I sometimes use the analogy of a rental car which you want to give back with an empty fuel tank. If your chip has a defined lifetime, then you want to run it as hard as you can to just perform within spec for the lifetime, or looking at it the other way, you want to hand your rental car back just as you run out of fuel.


2. What are the effects and mechanisms of ageing?

There are a number of mechanisms which contribute; the most notable are electromigration, hot carrier effects and bias temperature instability. Whilst some of this can be mitigated through design techniques, and CAD tools exist to help with that, they can only go so far. In the case of bias temperature instability, the mechanisms are not fully known. Whilst traditionally only negative bias temperature instability (NBTI) was considered an issue, with the introduction of high-k metal gates at 28nm, positive bias temperature instability (PBTI) is now a problem as well. The result of BTI is to raise threshold voltages, and the effect is very temperature dependent, so without a good model of device use it is hard to predict and thus design for. In addition, ageing effects in general are, by nature, hard to measure, because even with acceleration techniques such as HTOL it takes a long time to get a device to end of life.


3. How can we help predict device lifetime?

From Moortec’s perspective, we are working on monitors that can measure the ageing process of a device in the field: by having reference structures alongside live structures, we can compare the two over time. This is one application being used at the moment, alongside using the information to adjust the supply to bring the chip back to the performance level that you expect, or need. This is actually quite common, particularly in devices where there is a requirement for a particular throughput.
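The reference-versus-live comparison can be pictured as two identical delay structures, for example ring oscillators, one gated off except during measurement and one continuously stressed. A sketch of the comparison, with stand-in frequencies and a made-up compensation threshold:

```python
def degradation_pct(ref_freq_mhz, stressed_freq_mhz):
    """Relative slowdown of the continuously stressed structure
    against the rarely enabled reference structure."""
    return 100.0 * (ref_freq_mhz - stressed_freq_mhz) / ref_freq_mhz

def needs_compensation(ref_freq_mhz, stressed_freq_mhz, threshold_pct=3.0):
    # Past the threshold, the system could nudge the supply up to
    # restore the expected performance level, as described above.
    return degradation_pct(ref_freq_mhz, stressed_freq_mhz) > threshold_pct
```

So a reference oscillator at 500 MHz against a stressed one at 480 MHz indicates 4% degradation, enough in this sketch to trigger a supply adjustment.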


4. How does this help with choosing the lifetime of your chip?

The thing is that ageing is complex and very dependent on use case and environment. In most modern applications neither of those is well known, and both will often vary over time.


If we take the smartphone as an example, there will be modes where it is doing very little, where the clock frequency is low and the voltage supply is low. At the other extreme it will be playing HD video, with the clock running at high rates and the supply correspondingly high. Obviously, if you took that device and left it in the low-power state it would age at a significantly lower rate than if you left it in the high-power state. The trouble is that at design time you don’t know what that ratio is. Of course, this example is already a simplified case, because more often than not there will be more than two states, so you have to make assumptions about the time spent in each state and build in margins to cope with the unknowns. By allowing the system to monitor that ageing, you can potentially optimise DVFS schemes, predict lifetime, or perhaps even rein in certain modes to ensure that a particular lifetime is met.
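The state-ratio assumption described here can be made concrete as a duty-cycle-weighted ageing estimate. The per-state rates below are invented for illustration only:

```python
# Hypothetical relative ageing rates per power state (1.0 = nominal).
AGING_RATE = {"idle": 0.2, "active": 1.0, "turbo": 4.0}

def effective_aging_years(hours_per_state):
    """Equivalent nominal-condition ageing in years, given assumed
    hours spent in each power state over some period of use."""
    weighted_hours = sum(AGING_RATE[s] * h for s, h in hours_per_state.items())
    return weighted_hours / (24 * 365)
```

One wall-clock year split as 7,000 idle, 1,500 active and 260 turbo hours consumes under half a year of equivalent nominal lifetime under these rates; shift more hours into turbo and the figure grows quickly, which is exactly the unknown that design margins, or in-field monitoring, must cover.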


Another example is the bitcoin mining application. This is at the other end of the scale, where devices are manufactured to sit in large arrays. Each chip will vary with process and they will age differently partly as a result of process variation, and partly because their loads won’t always be equal. If you can monitor all those conditions, then you can optimise each of those chips to run at peak performance.



SoC implementation

Posted by saiganeshk Feb 19, 2016

Good morning. I want a header file for the Cortex-M4 to cross-compile my C code with GCC, because the LR and PC are not getting updated when I cross-compile. Can anyone give me the link to download that header file? Thanks in advance.

Why pinpoint accuracy is so important when monitoring conditions on chip

By Ramsay Allen


Last week I had the opportunity to speak with Oliver King about accurate PVT monitoring as part of Moortec’s new “Let’s Talk PVT Monitoring” series. As Moortec’s CTO, Oliver has been leading the development of compelling in-chip monitoring solutions to address problems associated with ever-shrinking System-on-Chip (SoC) process geometries. An analogue and mixed signal design engineer with over a decade of experience in low power design, Oliver is now heading up the expansion of Moortec's IP portfolio into new products on advanced nodes.






1. Why is there an increasing requirement for monitoring on chip?


A. Since the beginning of the semiconductor industry, we have relied on a doubling of transistor count per unit area every 18 months as a way to increase the performance and functionality of devices. Since 28nm, that trend has broken down. As such, designers now need to find new ways to continue increasing performance.


Using the analogy of the internal combustion engine: for decades it was fine to have the fuel consumption and emissions that engines had, as innovation was limited. Improvements to cars focused on adding features, things which made the car a nicer place to be, and people bought new cars because they wanted the latest features. Then oil prices started going up and we became aware of the environmental impact, and with this, innovation aimed at improving the efficiency of the engine. The result is quite astounding: the modern car engine delivers more power whilst consuming less fuel and emitting fewer harmful gases.


The semiconductor industry is now in the position where it has to do the same. We can no longer rely on adding more transistors to make a better, faster, chip. The customer still wants their new computer, phone, or tablet, to be faster and have more storage than their old one.


One technique being deployed to provide the improvement is device optimisation. Awareness of a device's thermal and voltage environment, and of where a given device sits within the ever-increasing sphere of device variation, allows system architects and circuit designers to get more from a given piece of silicon. With the increasing cost of advanced nodes, this is becoming even more important, to ensure every last drop of performance is extracted from a die.


2. Is this just an issue for the advanced nodes?


A. In short, no. With the growth in the IoT market, we are going to see an explosion in the number of wireless devices. A majority of these will be on older process nodes, however, the same performance gains can be found on these nodes. These devices will typically be battery powered and sensitive to power consumption. Because of this there will be a drive to improve efficiency in these products rather than perhaps improving performance, but they are different names for essentially the same problem. By understanding where a given die is with respect to process, voltage, and temperature, a more optimum solution can be found, whether that optimum solution is measured by performance or efficiency doesn't matter.


Then we also have to consider automotive, which is a big growth area for the semiconductor industry as a whole. With the growth of the Advanced Driver Assistance Systems (ADAS) and Infotainment areas we will see more advanced nodes being used, certainly down to 28nm in the near future. Ideally some of these products would be on more advanced nodes but those are not qualified for automotive. As a result, the available technologies will have to be squeezed by designers to get the extra level of performance. In addition, the environment in automotive is harsh. So when you look at all of these together it is clear there is definitely room for die optimisation, as well as the requirement for basic monitoring purely for safety and reliability reasons.


3. How important is accurate monitoring?


A. Accurate PVT monitors are key to implementing die optimisation. We all know the relationship between the power consumption and supply voltage of CMOS logic. Being able to reduce the supply by even a few percent, based on that particular die's process point combined with whatever the environmental conditions allow, will result in power savings worth having. The same is true of performance, if a given clock speed can be met with a lower supply. But none of this is possible if the monitors are not accurate.
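The relationship referred to is the CMOS dynamic-power law, P ≈ C·V²·f: at a fixed frequency, power scales with the square of the supply. A quick worked example with illustrative numbers:

```python
def dynamic_power_ratio(v_new, v_old):
    """Ratio of CMOS dynamic power after a supply change, with
    switched capacitance and clock frequency held constant
    (P ~ C * V**2 * f)."""
    return (v_new / v_old) ** 2

# A 5% supply reduction, e.g. 1.00 V down to 0.95 V:
saving_pct = 100.0 * (1.0 - dynamic_power_ratio(0.95, 1.00))
```

Here the saving comes out just under 10%, which is why shaving "even a few percent" off the supply on a per-die basis is a saving worth having.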


4. How critical has monitoring become?


A. PVT monitors are not anything new. They have been used in the industry for a long time, however, not generally in what I call mission critical roles. Once you start looking at optimisation and potentially putting them into dynamic control systems, which is where we are now seeing customers use our latest generation of die monitors, then reliability and testability are absolutely critical.


Having features within the PVT monitors to ensure they can be tested easily in production is a minimum entry requirement, but beyond that, being able to know that the data from these monitors can be trusted is fundamental. As such, I believe having in-field fault detection and reporting built into the monitors is key. Consider the situation where a chip within a smartphone or tablet contains a temperature monitor which fails, and this failure effectively tells the system that the temperature is 0°C when it is actually 50°C; because of this, the system decides it can run with higher clock rates, pushing the die temperature up even further. The result could be very serious indeed. Because of this, the robust operation of PVT monitors is becoming a primary concern.
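One way a system can defend against exactly this failure mode is a plausibility check on the sensor stream before any DVFS decision uses it. A minimal sketch with invented thresholds; a real implementation would lean on the monitor's built-in fault reporting rather than heuristics alone:

```python
def safe_temperature(readings_c, rate_limit_c=20.0):
    """Return the latest reading if it looks trustworthy, else None,
    in which case the system should drop to a safe operating point.

    readings_c: recent consecutive samples from one sensor."""
    latest, previous = readings_c[-1], readings_c[-2]
    # A physically implausible jump between adjacent samples
    # (e.g. 50 degrees down to 0) suggests a fault, not real cooling.
    if abs(latest - previous) > rate_limit_c:
        return None
    # An exactly repeating code over many samples suggests a stuck output.
    if len(readings_c) >= 4 and len(set(readings_c)) == 1:
        return None
    return latest
```

With this guard, the stuck-at-0°C scenario above returns None and the control loop can fail safe instead of raising the clocks.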


Moortec's latest generation of monitors features exactly this level of robust operation and fault detection, complete with all the usual production testability that should be expected from such devices.


5. Where do you see the future of on chip monitoring?


A. On chip PVT monitoring is here now and it is here to stay. The costs of advanced node technologies are continuing to increase, and I think we are already starting to see a fragmentation, with the really advanced nodes becoming more niche for those devices which really need the performance. For those nodes, optimisation will be part of the architecture to ensure the cost of those expensive technologies is minimised.


As the rest of the industry moves down to smaller nodes, companies will look to differentiate their products from their competitors', and good die optimisation will play a part in that.


About the interviewee


Oliver King is the Chief Technology Officer of Moortec Semiconductor. Before joining Moortec in 2012, Oliver was part of the analogue design methodology team at Dialog Semiconductor and prior to that was a senior design engineer at Toumaz Technology. Oliver graduated from The University of Surrey in 2003 with a degree in Electrical and Electronic Engineering.


About Moortec Semiconductor


Moortec Semiconductor, established in 2005, provide high quality analog and mixed-signal Intellectual Property (IP) solutions world-wide specialising in die monitoring. Having a track record of delivery to tier-1 semiconductor and product companies, Moortec provide a quick and efficient path to market for customer products and innovations. For more information please visit


Contact: Ramsay Allen, +44 1752 875133,

Interesting article raising questions about how designers approach SoC architecture to achieve power and performance targets: Semiconductor Engineering's "Do Circuits Whisper Or Shout?"


If performance is your thing, see my earlier blogs:

Exploring the ARM CoreLink™ CCI-500 performance envelope – Part 1

Exploring the ARM CoreLink™ CCI-500 performance envelope – Part 2

Following on from Tuesday's announcement from ARM of the CoreLink CCI-550 interconnect and CoreLink DMC-500 memory controller products it is clear there are a lot of performance advantages to be had from a well matched interconnect and memory system. You can discover some of those benefits in eoin_mccann's blog The Foundation for Next Generation Heterogeneous Devices. For those in Silicon valley next Thursday, you can find out all about Exploring System Coherency from jdefilippi's talk on the new products at ARM TechCon.

Configuring your SoC infrastructure to match the number and type of processors and DDR channels present in your design is a vital step to ensuring your product is competitive. Measuring the resultant performance is the proof you need to know you have met your design objectives.

At ARM TechCon on Nov 12th I'll be teaming up with nickheaton from Cadence to discuss some of the key interconnect configuration options, demonstrating tools for automating the configuration process and verifying performance of the interconnect and memory system.


In our presentation on Architecting and Optimizing SoC Infrastructure I'll be discussing ways to optimise cache coherency, avoid stalls while waiting for transactions to complete, set up QoS contracts and keep the memory controller fully utilized, while Nick will use performance verification tools to demonstrate the importance of correct buffer sizing and of having multiple outstanding transactions in flight.



System optimization involves running Linux applications and understanding the impact on the hardware and other software in a system. It would be great if system optimization could be done by running benchmarks once and gathering all the information needed to fully understand the system, but anybody who has ever done it knows it takes numerous runs to understand system behavior and a fair amount of experimentation to identify corner cases and stress points.


Running any benchmark should be:

  • Repeatable for the person running it
  • Reproducible by others who want to run it
  • Reporting relevant data to be able to make decisions and improvements


Determinism is one of the challenges that must be understood in order to make reliable observations and guarantee that system improvements have the intended impact. For this reason, it’s important to create an environment which is as repeatable as possible. Sometimes this is easy and sometimes it’s more difficult.


Traditionally, some areas of system behavior are not deterministic. For example, the network traffic of a system connected to a network is hard to predict and control if there are uncontrolled machines on the network. Furthermore, even in a very controlled environment the detailed timing of individual network packets will always have some variance in when they arrive at the system under analysis.


Another source of nondeterministic behavior could be something as simple as entering Linux commands at the command prompt. How fast a user types will vary from person to person and from run to run when multiple test runs are required to compare performance. A solution is a script which automatically launches the benchmark at Linux boot so that no human input is needed.
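Such a launcher can be sketched in a few lines of shell. `BENCH_CMD` and `RUNS` are placeholders for whatever benchmark and repeat count apply; nothing here is specific to any one tool:

```shell
#!/bin/sh
# Minimal sketch of an automated benchmark launcher: no human typing is
# involved, so run-to-run variance from the operator disappears.
# BENCH_CMD and RUNS are placeholders -- substitute your own benchmark.
BENCH_CMD="${BENCH_CMD:-sysbench cpu run}"
RUNS="${RUNS:-5}"

run_benchmarks() {
    i=1
    while [ "$i" -le "$RUNS" ]; do
        start=$(date +%s)
        $BENCH_CMD > /dev/null 2>&1   # discard benchmark chatter, keep timing
        end=$(date +%s)
        echo "run $i: $((end - start))s"
        i=$((i + 1))
    done
}
```

Dropped into an init script or similar boot hook, this starts the measurement at boot with no command-prompt input at all.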


Understanding the variables which can be controlled and countering any variables which cannot be controlled is required to obtain consistent results. Sometimes unexpected things occur. Recently, I was made aware of a new source of non-determinism: ASLR.


Address Space Layout Randomization


Address Space Layout Randomization (ASLR) has nothing to do with system I/O, but with the internals of the Linux kernel itself. ASLR is a security feature which randomizes where various parts of a Linux application are loaded into memory. One of the things it can do is change the load address of the C library. When ASLR is enabled the C library will be loaded at a different memory address each time the program is run. This is great for security, but is a hindrance for somebody trying to perform system analysis by keeping track of the executed instructions for the purpose of making performance improvements.


The good news is ASLR can be disabled in Linux during benchmarking activities so that programs will generate the same address traces.


A simple command can be used to disable ASLR.


$ echo 0 > /proc/sys/kernel/randomize_va_space


The default value is 2. The Linux sysctl documentation is a good place to find information on randomize_va_space:

This option can be used to select the type of process address space randomization that is used in the system, for architectures that support this feature.

0 - Turn the process address space randomization off. This is the default for architectures that do not support this feature anyways, and kernels that are booted with the "norandmaps" parameter.

1 - Make the addresses of mmap base, stack and VDSO page randomized. This, among other things, implies that shared libraries will be loaded to random addresses.  Also for PIE-linked binaries, the location of code start is randomized. This is the default if the CONFIG_COMPAT_BRK option is enabled.

2 - Additionally enable heap randomization. This is the default if CONFIG_COMPAT_BRK is disabled.
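In practice it is worth restoring the original setting after a measurement session rather than leaving ASLR off. A minimal sketch follows; the `ASLR_CTL` variable is parameterised purely for illustration — the real control file is `/proc/sys/kernel/randomize_va_space` and writing it requires root:

```shell
# Sketch: save the current ASLR setting, disable it for a measurement
# session, then restore it afterwards. Writing the real control file
# requires root; ASLR_CTL exists only so the logic can be exercised
# against a scratch file.
ASLR_CTL="${ASLR_CTL:-/proc/sys/kernel/randomize_va_space}"

disable_aslr() {
    cat "$ASLR_CTL"             # emit the old value so it can be restored
    echo 0 > "$ASLR_CTL"
}

restore_aslr() {
    echo "$1" > "$ASLR_CTL"     # put back the saved value (2 by default)
}
```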

There is a file /proc/[pid]/maps for each process which has the address ranges where the .so files are loaded.


Launching a program and printing the maps file shows the addresses where the libraries are loaded.


For example, if the benchmark being run is sysbench, run it like this:


$ sysbench & cat /proc/$!/maps


Without setting randomize_va_space to 0, different addresses will be printed each time the benchmark is run; after setting randomize_va_space to 0, the same addresses are used from run to run.
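Comparing runs by eye gets tedious, so it can help to pull just the load address of the library of interest out of each maps file. The `lib_base` helper below is hypothetical, not part of any tool — a sketch of the idea:

```shell
# Sketch: extract the base load address of a named library from a maps
# file, so the addresses from two runs can be diffed mechanically.
lib_base() {
    # $1 = path to a maps file, $2 = library name substring (e.g. libc)
    # Each maps line starts "start-end perms ..."; keep the start address
    # of the first matching mapping.
    grep "$2" "$1" | head -n 1 | cut -d- -f1
}
```

For instance, `lib_base /proc/$!/maps libc` run against two invocations of the benchmark should print the same address once randomization is off.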

Below is example output from the maps file.




If you ever find that your benchmarking activity starts driving you crazy because the programs you are tracing keep moving around in memory it might be worth looking into ASLR and turning it off!

We are so pleased to have Atrenta people and technology join Synopsys!


The former Atrenta products, widely used by designers of ARM-based SoCs, complement Synopsys' solutions for ARM-based design.


Verification requirements have exploded as designs have become increasingly complex. Atrenta's early design analysis tools enable efficient, early verification and optimization of SoC designs at the RTL level. Combined with Synopsys' industry-leading verification technologies, Atrenta's leading static and formal technology further strengthens Synopsys' Verification Continuum™ platform and enables customers with this unique verification environment to meet the demands of today's complex electronic designs. Atrenta's SoC design analysis technology also fortifies the Synopsys Galaxy™ platform with additional power, test and timing-related analysis technologies. By integrating Atrenta's complementary technology into Synopsys' platforms, Synopsys can offer designers a more comprehensive, robust portfolio of silicon-to-software solutions for complex electronic systems.


The Atrenta products include:


More information on the former Atrenta products is available at

SAN FRANCISCO—Design complexity is soaring. Node-to-node transitions now take a year to a year and a half, not several years. Market pressures mount.

This means third-party IP integration is crucial not only to managing system-on-chip (SoC) design complexity but getting to market in a reasonable amount of time. But IP integration has often been easier said than done. If done improperly, design teams can experience schedule slips, which means added cost and lost market opportunity. So is it worth the risk? Are there any real alternatives? These were fundamental questions a panel of experts addressed here at the 52nd Design Automation Conference in June.


[Photo: DAC 2015 IoT integration panel]

The complexity problem

Albert Li, director with Global Unichip Corp, said his company is “hugely dependent” on IP but “there are a lot of problems with HW IP” in terms of implementation and verification.

“Things are getting complicated,” said Navraj Nanda, senior director of marketing for the DesignWare Analog and mixed-signal IP products at Synopsys. “In terms of technology nodes on one end they’re getting smaller. On the other end, (more technologically mature) nodes are getting refreshed.”

He said a key challenge is how does the industry serve those markets “with the same types of IP?”

Thomas Wong, director within strategic business operations at Cadence's IP Group, said with the node-to-node transition shrinking from two to three years to sometimes 18 months, that pace is “outstripping the capacity of smart engineers” to keep up and exploit the node benefits.

While it’s always cathartic to talk about the shared challenges when it comes to the evolution of electronics design, the panelists quickly coalesced around the notion that IP—for all its challenges—is here to stay but that optimization and efficiencies must be found.

“I don't think there's any other way of designing chips with a very small number of exceptions,” said Leah Schuth, director of technical marketing with ARM.

Schuth suggested that the industry address IP and tools the same way it looks at consumer devices. “We need to make the complexity almost invisible to the user” though increased standardization or some kind of certification, she said.

Feeling Bloated

File sizes are part of the integration problem, and here experts on the panel—which was moderated by Semiconductor Engineering Executive Editor Ann Steffora Mutschler—offered some jarring challenges as well as potential solutions.

Cadence’s Wong said that a customer recently told him that downloading libraries for a project was going to take seven days. Even when the terabytes of information were delivered on a hard drive, loading the data still took several days.

Schuth wondered how much data across IP is duplicated, bloating file sizes. Is there a way to not transmit “non-data” like header fields or duplicative data to cap file size, Schuth asked.

Nanda said he believes the file-size problem is actually worsening, even with EDA solutions to manage database sizes like OASIS (Open Artwork System Interchange Standard).

“You can be idealistic and say ‘hey let’s try to limit the data size because we understand the applications in the customer’s market,’” Nanda said, “but in reality our customers are in brainstorming mode so they want the whole enchilada.”

Wong noted that Cadence’s multi-protocol IP offering can be one way around the file-size problem, because you load a database once and can use various protocols with different designs.

“It was invented for that, but it’s a bonus,” he said.

Schuth said another way to improve IP integration challenges is to work hard to ensure the IP works “right out of the box” for customers, along the lines of ARM’s Socrates Design Environment or IP-XACT.

Wong suggested thinking about an integrated approach, and he summoned the ghosts of PC design past as an example. Chips & Technologies soared to prominence in the 1990s as a chipset vendor because it delivered complete motherboard reference designs into the market to ease and speed design, he said. This model carries over today into smart phone design, he added.

At the end of the day, in design engineering there are always challenges and usually gradual improvement. As the IP market and methodology mature, the integration stress eases and becomes a “100-piece puzzle instead of a 1,000-piece puzzle,” said Schuth. That’s because IP vendors are learning more and more about customer needs and then applying those lessons to subsequent engagements.

Related stories:

DAC 2015: IoT’s Long Winding Road

Interview: A brief history of IP integration

Interview: How to solve the IP integration problem

Recently, Carbon released the first ARMv8 Linux CPAK utilizing the ARM CoreLink CCN-504 Cache Coherent Network on Carbon System Exchange. The CCN family of interconnect offers a wide range of high bandwidth, low latency options for networking and data center infrastructure.


The new CPAK uses an ARM Cortex-A57 octa-core configuration to run Linux on a system with AMBA 5 CHI. Switching the Cortex-A57 configuration from ACE to CHI on Carbon IP Exchange is as easy as changing a pull-down menu item on the model build page. After that, a number of configuration parameters must be set to enable the CHI protocol correctly. Many of them were discussed in a previous article covering usage of the CCN-504. Using native AMBA 5 CHI for the CPU interface coupled with the CCN-504 interconnect provides high-frequency, non-blocking data transfers. Linux is commonly used in many infrastructure products such as set-top boxes, networking equipment, and servers so the Linux CPAK is applicable for many of these system designs.


Selecting AMBA 5 CHI for the memory interface makes the system drastically different at the hardware level compared to a Linux CPAK using the ARM CoreLink CCI-400 Cache Coherent Interconnect, but the software stack is not significantly different.


From the software point of view, a change in interconnect usually requires some change in initial system configuration. It also impacts performance analysis as each interconnect technology has different solutions for monitoring performance metrics. An interconnect change can also impact other system construction issues such as interrupt configuration and connections.


Some of the details involved in migrating a multi-cluster Linux CPAK from CCI to CCN are covered below.


Software Configuration

Special configuration for the CCN-504 is done using the Linux boot wrapper which runs immediately after reset. The CPAK doesn’t include the boot wrapper source code, but instead uses git to download it from and then patch the changes needed for CCN configuration. The added code performs the following tasks:

  • Set the SMP enable bit in the A57 Extended Control Register (ECR)
  • Terminate barriers at the HN-I
  • Enable multi-cluster snooping
  • Program HN-F SAM control registers


The most critical software task is to make sure multi-cluster snooping is operational. Without this Linux will not run properly. If you are designing a new multi-cluster CCN-based system it is worth running a bare metal software program to verify snooping across clusters is working correctly. It’s much easier to debug the system with bare metal software, and there are a number of multi-cluster CCN CPAKs available with bare metal software which can be used.


I always recommend a similar approach for other hardware-specific programming. Users often have hardware registers that need to be programmed before starting Linux, and putting this code into the boot wrapper is easy and less error prone than using simulator scripts to force register values.

The Linux device tree provided with the CPAK also contains a device tree entry for the CCN-504. The entry has a base address which must match the PERIPHBASE parameter on the CCN-504 model; in this case PERIPHBASE is set to 0x30, which means the address in the device tree is 0x30000000.
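The quoted values suggest PERIPHBASE supplies the top byte of a 32-bit address (0x30 becoming 0x30000000, i.e. a shift of 24 bits); that assumption can be sanity-checked with plain shell arithmetic, though the CCN-504 TRM remains the authoritative definition:

```shell
# Sketch: derive the device tree base address from a PERIPHBASE value,
# assuming PERIPHBASE forms bits [31:24] of the address (an inference
# from the 0x30 -> 0x30000000 example, not a documented formula).
periphbase_addr() {
    printf '0x%08x' $(( $1 << 24 ))
}
```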


All Linux CPAKs come with an application note which provides details on how to configure and compile Linux to generate a single .axf file.


GIC-400 Identification of CPU Accesses

One of the new things in the CPAK is the method used to get the CPU Core ID and Cluster ID information to the GIC-400.


The GIC-400 requires the AWUSER and ARUSER bits on AXI be used to indicate the CPU which is making an access to the GIC. A number between 0 and 7 must be driven on these signals so the GIC knows which CPU is reading or writing, but getting the proper CPU number on the AxUSER bits can be a challenge.


In Linux CPAKs with CCI, this is done by the GIC automatically by inspecting the AXI transaction ID bits and then setting the AxUSER bits as input to the GIC-400. Each CPU will indicate the CPU number within the core (0-3) and the CCI will add information about which slave port received the transaction to indicate the cluster.


Users don’t need to add any special components in the design because the mapping is done inside the Carbon model of the GIC-400 using a parameter called “AXI User Gen Rule”. This parameter has a default value which assumes a 2 cluster system in which each cluster has 4 cores. This is a standard 8 core configuration which uses all of the ports of the GIC-400. The parameter can be adjusted for other configurations as needed. 
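Under that default two-cluster, four-cores-per-cluster mapping, the 0–7 CPU number the GIC sees works out to cluster × 4 + core. A tiny sketch of that arithmetic (my reading of the default rule, not a formula from the GIC-400 documentation):

```shell
# Sketch: compute the 0-7 CPU number driven on AWUSER/ARUSER for the
# GIC-400, assuming the default 2-cluster x 4-core "AXI User Gen Rule".
gic_cpu_index() {
    # $1 = cluster ID (0-1), $2 = core ID within the cluster (0-3)
    echo $(( $1 * 4 + $2 ))
}
```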


The User Gen Rule does even more, because the ARM Fast Model of the GIC-400 uses the concept of a Cluster ID to indicate which CPU is accessing the GIC. The Cluster ID concept is familiar to software reading the MPIDR register, and it exists in hardware as a CPU configuration input, but it is not present in each bus transaction coming from a CPU and has no direct correlation to the CCI's behaviour of adding to the ID bits based on slave port.


To create systems which use cycle accurate models and can also be mapped to ARM Fast Models the User Gen Rule includes all of the following information for each of the 8 CPUs supported by the GIC:

  • Cluster ID value which is used to create the Fast Model system
  • CCI Port which determines the originating cluster in the Cycle Accurate system
  • Core ID which determines the CPU within a cluster for both Fast Model and Cycle Accurate systems


With all of this information Linux can successfully run on multi-cluster systems with the GIC-400.


AMBA 5 CHI Systems

In a system with CHI the Cluster ID and CPU ID values must also be presented to the GIC, in the same way as in ACE systems. For CHI systems, the CPU uses the RSVDC signals to indicate the Core ID. The new CCN-504 CPAK introduces a SoC Designer component to add the Cluster ID information. This component is a CHI-to-CHI pass-through with a parameter for Cluster ID, which it inserts into the RSVDC bits.


For CCN configurations with AXI master ports to memory, the CCN will automatically drive the AxUSER bits correctly for the GIC-400. For systems which bridge CHI to AXI using the SoC Designer CHI-AXI converter, this converter takes care of driving the AxUSER bits based on the RSVDC inputs. In both cases, the AxUSER bits are driven to the GIC. The main difference for CHI systems is the GIC User Gen Rule parameter must be disabled by setting the “AXI4 Enable Change USER” parameter to false so no additional modification is done by the Carbon model of the GIC-400.



All of this may be a bit confusing, but demonstrates the value of Carbon CPAKs. All of the system requirements needed to put various models together to form a running Linux system have already been figured out so users don’t need to know it all if they are not interested. For engineers who are interested, CPAKs offer a way to confirm the expected behavior read in the documentation by using a live simulation and actual waveforms.
