
ARM Processors


ARM's partners are getting great use from Juno as a development platform, helping them get ready for 64-bit capable ARMv8-A based platforms. You may have seen that ARM has supported this platform with ARM Trusted Firmware (a thin layer of secure 64-bit firmware running at Secure EL3). ARM TF has proven very popular with partners who want to implement trusted boot and integrate a Trusted OS to create a Trusted Execution Environment. So it's great to see that Juno is now being supported by OP-TEE, an open source TEE that Linaro has been working on. You can find out more details by searching for "Github OP-TEE", where you can find the relevant git commits.

I recently had the opportunity to sit down with David Murray and talk about the current state of affairs for IP integration in the context of building systems. For those of you who do not know David, he is an incredibly enthusiastic technologist who previously held the role of CTO at Duolog before gaining the impressive-sounding title of IP Tooling Architect in ARM. An energetic and articulate man, he is always interesting to listen to and I hope you enjoy the interview below. Feel free to ask questions in the comment space below and David will answer them ASAP.

This blog post follows up on the interview I conducted with Norman Walsh a couple of weeks ago. Norman spoke about the history of IP integration and how it has evolved to the point we are at currently. You can read that interview here: A brief history of IP integration.

Hi David, what’s going on in the IP integration space?

Well - IP integration continues to be a key challenge in SoC development. We’ve seen consistent increases in IP reuse, IP configurability and system complexity which, within tightly bound schedules, compound the problem of IP integration. The number of IP blocks in a system continues to grow, the complexity and configurability of that IP itself is growing, and the overall integration scope is growing as it affects more and more teams from front-end to back-end, e.g. software, RTL design, verification, physical implementation etc.


This is a problem area that we are very familiar with, and we have been architecting integration solutions for it over the last number of years. One of the fundamental pillars of improving the IP integration process is the standardization of the data in the process. As Norman Walsh mentioned in his interview, we need to standardize our IP data (particularly the interfaces) through the use of metadata. If an IP can communicate its interfaces in a standard way then that makes the whole IP and SoC integration process a lot easier. If we can have a formal definition (in some metadata format) of all the interfaces of an IP then we can apply more automated intelligence to how it should be hooked up and enable other crucial flows, for example being able to identify AMBA interfaces, clocks, resets, interrupts, DMA, debug and trace interfaces etc. Also, it’s not just the hardware interfaces I’m talking about; it’s equally important to have a good view of the hardware/software interfaces like the registers and memory maps within the IP.

So how is this interface information standardized?

Well, for me, the obvious first thing is to make it so that the IP actually uses industry standard protocols as much as possible such as AMBA (ACE, AXI, AHB, APB, etc).  These interfaces are quite configurable so it’s important to be able to define their content and configuration in a metadata format. The main standard that the industry uses is the IP-XACT format, originally developed under the SPIRIT consortium but now developed under Accellera.  This essentially specifies a definition in a machine-readable (XML) format that can describe the IP interfaces and memory maps as well as its contents and connectivity. We are currently working within ARM to increase the standardization of ARM IP so it will be easier to integrate.

As long as a design flow creates this IP-XACT then we can work from there and run queries on that IP-XACT. Because we know what the tool reads and interprets, we can work together with partners to help them define the necessary IP-XACT specs.
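To make the idea of running queries on IP-XACT concrete, here is a minimal Python sketch that lists the bus interfaces declared in a component description. The XML fragment is hand-written and heavily simplified for illustration (it is not a schema-valid IP-XACT file), the component and interface names are invented, and the namespace is assumed to follow the IEEE 1685-2014 style; older files use a `spirit:` namespace instead.

```python
import xml.etree.ElementTree as ET

# Assumed namespace (IEEE 1685-2014 style); adjust for the IP-XACT version in use.
NS = {"ipxact": "http://www.accellera.org/XMLSchema/IPXACT/1685-2014"}

# Hypothetical, simplified component description used only for this example.
COMPONENT_XML = """
<ipxact:component xmlns:ipxact="http://www.accellera.org/XMLSchema/IPXACT/1685-2014">
  <ipxact:vendor>example.org</ipxact:vendor>
  <ipxact:name>demo_timer</ipxact:name>
  <ipxact:busInterfaces>
    <ipxact:busInterface>
      <ipxact:name>apb_s</ipxact:name>
      <ipxact:busType vendor="example.org" library="bus" name="APB" version="1.0"/>
      <ipxact:slave/>
    </ipxact:busInterface>
    <ipxact:busInterface>
      <ipxact:name>irq</ipxact:name>
      <ipxact:busType vendor="example.org" library="bus" name="Interrupt" version="1.0"/>
      <ipxact:master/>
    </ipxact:busInterface>
  </ipxact:busInterfaces>
</ipxact:component>
"""


def list_bus_interfaces(xml_text):
    """Return (interface name, bus type, mode) for each bus interface found."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for bus_if in root.findall("ipxact:busInterfaces/ipxact:busInterface", NS):
        name = bus_if.findtext("ipxact:name", namespaces=NS)
        bus_type = bus_if.find("ipxact:busType", NS).get("name")
        # Only master/slave are distinguished in this simplified sketch.
        mode = "master" if bus_if.find("ipxact:master", NS) is not None else "slave"
        result.append((name, bus_type, mode))
    return result


for entry in list_bus_interfaces(COMPONENT_XML):
    print(entry)
```

A real flow would query many more elements (ports, parameters, memory maps), but the principle is the same: once the interfaces are in a machine-readable format, tools can discover them without reading the specification.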

Fast IP integration requires standardized IP







That sounds great but how does it work with lots of different 3rd party IP or a partner's internal IP?

ARM also produces IP-XACT standard bus definitions that can be downloaded from the ARM website for anybody to use. If other IP providers use these standard definitions then it will provide a much easier mechanism for connecting these IP blocks in a sub-system or at the top level. Also, don’t forget that this is not just enabling more efficient integration – our partners will also benefit from the provision of better EDA solutions that can leverage this metadata.




So there’s a lot of work being done in ARM at the moment?

This is something we’ve been working towards for the past 6 or 7 years, even within Duolog, because there is huge potential for reducing bugs and streamlining design and verification processes.  At the moment we’re very focused on increasing the level of standardization within ARM IP and even have an internal IP-XACT modelling definition group. We’re creating new bus definitions, new extensions and guidelines on IP-XACT usage and of course we’re leveraging ARM Socrates to create better IP-XACT flows. Also, from working in the Systems and Software Group at ARM we have a sub-system and SoC-level perspective so we’ve become avid consumers of IP-XACT which gives us a good feel for what our partners are experiencing. The main challenges that we face are probably fairly common in the industry. We’re trying to standardize all of the interfaces that we need in metadata format but firstly we need to understand all of the different stakeholders.


This is great because once we have standardized IP interfaces it makes the integration and verification process significantly faster. However, while you can standardize most interfaces with a relative minimum of fuss, we’re seeing a lot of IP blocks these days that can be tweaked in many different ways. In some ways we see IP configurability as the biggest integration challenge.


The level of IP configuration that is available these days poses a problem for integration, because you can take the same IP block and configure it in different ways, and it will look and act totally differently depending on the configuration. That makes it more difficult to then integrate it successfully into a system.


I liken this to a mixing desk that you would have in a recording studio, with hundreds of switches that can be turned one way or another to affect performance. The options enable designers to optimize their IP, but the amount of choice can also be confusing. What the user really wants is to be presented with the best configuration options for that particular IP block, ones that reflect the system constraints.





Multiple configuration options can often leave designers confused

So how are you tackling this configuration problem?

When we talk about IP configuration in general, there are three different types of configuration levels that an IP block can have. First off you have what is called ‘static IP’ which cannot be configured at all. This was what you would call ‘off the shelf’ IP that was more common in the past, where you would purchase it for a ‘plug and play’ type functionality. Nowadays even off-the-shelf IP requires a bit of user configuration according to each individual design.


The second type of configurable IP is a simplified version that has a fixed set of parameters that can be set. Having said that - it can be a challenge creating a configurable IP because let’s say for example you have 10 or even 20 parameters, the amount of possibilities makes it difficult to guarantee that your IP will work for every single configuration. Validation teams and modelling will ensure that the IP works fine for the most probable scenarios, but it’s hard to test for everything. You only have a finite amount of verification resources to ensure that it is all tested rigorously.
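A quick back-of-the-envelope calculation shows why even 10 or 20 parameters are hard to verify exhaustively. The sketch below simply multiplies the number of legal values per parameter; the option counts are made-up assumptions for illustration, not figures from any real IP block.

```python
from math import prod


def config_space_size(option_counts):
    """Number of distinct configurations when every parameter is independent."""
    return prod(option_counts)


# Hypothetical cases: all parameters are on/off switches.
ten_switches = config_space_size([2] * 10)
twenty_switches = config_space_size([2] * 20)

# Hypothetical case: 10 on/off switches plus 10 parameters with 4 values each.
mixed = config_space_size([2] * 10 + [4] * 10)

print(ten_switches, twenty_switches, mixed)
```

Ten binary parameters already give 1,024 configurations and twenty give over a million, which is why validation effort tends to focus on the most probable scenarios.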


The third type of configurable IP is heavily dependent on the system for its configuration. Examples of these would be system interconnects, debug and trace subsystems, power, clock and reset, interrupts, I/O and memory systems. They are super configurable and the number of permutations means you need a different type of strategy to properly handle them. Ideally you would have some form of highly intelligent solution that can interpret the system requirements and interfaces so that users can easily configure these types of IP. These are the challenges that we have been working through, steering the Socrates design environment towards providing solutions in this area.


ARM already provides a lot of IP in this area, including bus interconnect IP (such as ARM CoreLink NIC-400, ARM CoreLink CCI-400 Cache Coherent Interconnect and ARM CoreLink CCN-512 Cache Coherent Network) as well as ARM CoreLink GIC-500 and CoreSight Debug and Trace IP. These IP blocks consume vast amounts of system connectivity, e.g. a cascaded interconnect infrastructure and CoreSight Debug and Trace could consume upwards of 50% of a system’s connectivity, so in some ways highly configurable IP is one of the pillars of solving the integration problem.

The new key ingredient that we are bringing to the table is to help manage the configuration of these IP so that it is aligned with its system context. If we can understand the contents of a system and its different interface requirements we can help to guide the configuration of IP.

How do we do this? – By having all system components in a metadata format, of course, and to have intelligent flows that can extract this information and perform this guided configuration – really it’s intelligent IP Configuration.
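As a toy illustration of configuration guided by system context, the Python sketch below derives a few parameters for system-dependent IP from the interfaces that the other components declare. The data model, component names and parameter names are all hypothetical; this is not how ARM Socrates represents or processes metadata, only a sketch of the general idea.

```python
from collections import Counter

# Hypothetical system description: each component lists (bus type, role) pairs.
SYSTEM = [
    {"name": "cpu", "interfaces": [("AXI", "master")]},
    {"name": "dma", "interfaces": [("AXI", "master"), ("IRQ", "source")]},
    {"name": "ddr_ctrl", "interfaces": [("AXI", "slave")]},
    {"name": "uart", "interfaces": [("APB", "slave"), ("IRQ", "source")]},
    {"name": "timer", "interfaces": [("APB", "slave"), ("IRQ", "source")]},
]


def derive_configuration(components):
    """Derive parameters for system-dependent IP from component metadata."""
    counts = Counter(
        iface for comp in components for iface in comp["interfaces"]
    )
    return {
        # The interconnect needs one slave port per AXI master in the system.
        "interconnect_slave_ports": counts[("AXI", "master")],
        # ...and one master port per AXI slave.
        "interconnect_master_ports": counts[("AXI", "slave")],
        # An APB bridge needs one port per APB peripheral.
        "apb_bridge_ports": counts[("APB", "slave")],
        # The interrupt controller needs one input per interrupt source.
        "interrupt_inputs": counts[("IRQ", "source")],
    }


print(derive_configuration(SYSTEM))
```

Adding or removing a component in `SYSTEM` changes the derived parameters automatically, which is the knock-on effect that a metadata-driven flow is meant to capture.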




So this is how the IP integration problem can be solved?

Yes - The vision that we have been working towards with the ARM Socrates IP Tooling for the last number of years has been to create a ‘System in a Day’ by creating an intelligent IP configuration capability. Back when we first released Socrates, over 6 years ago, the integration task was taking people many months to get an initial RTL netlist and several more months thereafter to get a viable system up and running. With Socrates we began making significant reductions to that schedule, bringing it down to several weeks. We saw, however, that each piece of IP was designed, built and integrated independently of the others. So for example the interconnect was built from a specification, and then people attempted to integrate it into the system from the same (probably outdated) specification. The bottleneck of the ‘System in a Day’ was the creation and integration of these system-dependent IP blocks. The solution that we centred on was to seek an intelligent way of configuring these IP blocks … within the context of their system. We are arriving at a solution to the IP integration problem through intelligent configuration of the IP itself. I believe that configuring every aspect of the system correctly is a highly effective way of increasing its overall connectivity.


What we’re trying to do here is use the metadata to give a fast, correct configuration in a system context. What I mean by system context is that you can see how different system requirements have a knock-on effect on the configuration of each IP and the system as a whole. What that allows us to do is reduce the time that’s spent on actually integrating the parts into a system because 90% of that work will have been done through intelligent configuration. In order to realize our ‘System in a Day’ vision for IP integration we need to do it through intelligent configuration. You need to have a solution for these complex IP blocks so that they can reconfigure themselves as the system is being defined.


We’ve seen partners say that even just understanding the perspective of some of the more complex IP blocks within the system normally takes them several weeks to compile. In the past they have had to go through the TRMs and specs to understand what is required for the system.  We want to be able to provide this information instantly from the metadata of the IP in the system.

The IP integration problem will be solved through intelligent configuration


Verification is such a massive part of SoC design these days, how does Socrates fit into that story?

Going back to the IP-XACT metadata that I mentioned earlier, by working in this format we’re able to get a clear picture of the system very early on and in an easily readable format (XML). We can then hand off this rich information about the system, its interfaces, the register views and memory maps to our EDA partners and other ecosystem stakeholders. Because the information is presented in a format that is standard and machine-readable they can work on verifying it immediately. For example, Cadence Design Systems can take the metadata for its Interconnect Workbench and automatically create a verification environment and out-of-box performance scenarios and analysis. This feeds into our overarching goal of helping partners design and implement systems in a much shorter timespan.




This sounds very exciting for the future, but is any of that available today?

ARM Socrates is already a proven solution for IP standardization and integration and we’re now beginning to leverage an intelligent IP configuration methodology. Now that we’re part of ARM we feel that there is a much greater value that we can bring to our partners, as we’re working directly with the IP and can steer its standardization and streamline its integration. This is very exciting and you should see new solutions appear in the near future.




Final question for you here. IP Tooling Architect is an interesting title, how has your role changed from being the CTO at Duolog?

Well, when you are in a small company like Duolog – it’s more ‘roles’ than role. I would have had to keep constantly tuned into our customers’ design flows and look for any hotspots in their development process. Once the main problems were identified, we had to work on envisioning and architecting a solution to them, and eventually, with a committed team, realizing a high-value product such as Socrates. That’s how we got into IP Integration – we didn’t chase it – it came to us, through our customers. It was difficult however for a small company to carve out its niche, so there was also a lot of evangelizing, writing white papers, blogs, presenting at conferences etc. and plenty of customer meetings. Another one of the things I did within Duolog was to align ourselves with standards groups and try to progress them toward real solution areas. For a small company Duolog invested quite a lot of time and effort in driving the IP-XACT standard and this is definitely something I will continue to do within ARM, helping to progress both internal and industry standards for everyone’s benefit.


Now that I think about it, overall I have a pretty similar role in ARM.  The problem space is still IP integration – we’re still chasing the same dream of ‘System-in-a-day’ and I can boldly speculate that this WILL become  a reality – the big change is that now that we’re ARM,  we've got ARM IP in the equation and wow – that brings incredible potential.  Before, as Duolog, we had to partner with ARM to get limited access to the IP - Now we can work directly with the IP designers from a much earlier stage of development and facilitate intelligent IP integration from the bottom-up with standardised IP blocks – this will be a game-changer. 




OK great. Thanks for your time David

No problem, my pleasure. (Have I really been talking for 10 minutes?) -   I hope your readers find this interesting and ask them to leave a comment if they have any questions for me!





Samsung Electronics, a world leader in advanced semiconductor solutions, has released the latest chip in its flagship Exynos series. The Exynos 7 Octa is an octa-core SoC designed for use in mobile applications such as smartphones and tablets. Not only is it one of the first SoCs to be based on the ARMv8-A architecture, but the Exynos 7 Octa is also the first ARM-based SoC to feature a combination of four powerful Cortex-A57 cores and four efficient Cortex-A53 cores. It has been built with twin clusters of four Cortex-A57 and four Cortex-A53 processors to reap all the benefits of big.LITTLE™ processing, a power-optimization technology that delivers higher peak-performance capacity at significantly lower average power. It also combines big.LITTLE processing with Samsung's HMP (Heterogeneous Multi-Processing) solution, meaning every process can use the processing power so efficiently that, no matter what the multitasking needs are or what application is being run, there will be no lags and ultimately no drastic power consumption.

One of the key components that enable big.LITTLE processing is the ARM CCI-400 Cache Coherent Interconnect, which provides full cache coherency between two clusters of multi-core CPUs. The CCI-400 enables faster performance all across the chip through system-wide hardware coherency and virtual memory management. This combination, along with the improved feature sets of the Cortex-A57 and Cortex-A53 cores and the ARMv8 instruction set, has contributed to delivering a 57% performance uplift on the previous generation Exynos processor, as the Exynos 7 Octa brings advanced features to everyday mobile computing.


ARM CoreSight debug & trace technology was important to the Exynos 7 Octa’s successful release, as its real-time on-chip visibility was used to identify and eliminate bugs quickly. Using CoreSight minimized the risk of costly bugs, allowing more attention to be focused on maximizing performance on the SoC.


The graphics performance has been enhanced on the Exynos 7 Octa by the ARM Mali-T760 GPU, delivering stunning quality at even greater energy efficiency. This performance increase has paved the way for high-resolution games, face/eye recognition and image/video processing to all become a reality in Samsung’s next-generation devices. Users of a device containing the Exynos 7 Octa can simultaneously record high resolution video or pictures using both the front and rear cameras, and even output UHD quality video to their TVs. The first device containing the Exynos 7 Octa to reach the market is the Samsung Galaxy Note 4. It was released in October of this year, and is a very popular smartphone (source). It is poised to enjoy even more success in the final quarter of 2014, a period with typically high sales volumes of smartphone devices.


Current smartphone users demand a device that enables them to be constantly connected with powerful performance, while expecting its battery to last the entire day. In summary the Exynos 7 Octa takes advantage of new power management features in its ARM IP to improve power consumption, while simultaneously providing superior performance. These updates allow Samsung to stay at the forefront of the market as a total integrated solution provider for SoC designs.


For more information: Samsung Exynos 7 Octa

This has been a really exciting week for ARM and its partners at Supercomputing 2014 in New Orleans. The ARM ecosystem for HPC is gaining momentum and we have several partners announcing exciting collaborations and partnerships at the show.

ARM has been working with PathScale on ARMv8-A support in their EKOPath compiler, which supports our SIMD and AES instructions in addition to C99, C++ 2003, C++11, Fortran 90/95, and partial support for Fortran 2003 and 2008. The BLAS libraries have also been ported and optimized for the ARMv8-A architecture to provide developers with the framework to create ARM-based solutions for HPC applications.


ARM has also been working with Allinea to enable the Allinea DDT parallel debugger product on ARMv8-A which is a key ecosystem component for the HPC community of developers. The momentum has been building beyond the critical ecosystem components to key industry leaders in this space.


Cray and ARM have been working together to explore alternative architectures that can meet the challenging demands of next-generation HPC systems, looking at power-efficient and scalable architectures such as 64-bit ARMv8-A for supercomputers and data analytics systems. As a part of this exploration, Cray was awarded an R&D contract from the US Department of Energy’s Office of Science and the National Nuclear Security Administration under a program called FastForward 2.



Our partner Applied Micro announced that they are working with Lawrence Livermore National Laboratory to deliver a test platform for HPC and Big Data Analytics systems using a solution from innovative HPC OEM Cirrascale. ARM partner Red Hat is providing the software for this solution with their Partner Early Access Program for ARMv8-A.


The partner momentum continues with Cavium demonstrating their 48-core ThunderX silicon running key operating systems such as Red Hat and Ubuntu, along with Java applications and applications like Apache Web Server. This is a great proof point that additional innovative ARMv8-A architectures and platforms are becoming available to developers and end users.


The momentum and the excitement that the ARM ecosystem is bringing to the HPC community was perhaps best exemplified by HPCwire readers and editors giving an award to the ARMv8-A 64-bit Cortex-A57 and Cortex-A53 processors.


My colleague Darren Cepulis wrote a great blog (ARM Partnering for effect in HPC: SC14) that highlights all the partners that will be with us at the Supercomputing show this week. We hope you will stop by and visit us at Booth #3458 at the show.


The Internet of Things is a rising topic in the field of embedded systems. It is believed that more than 26 billion devices will be connected to each other via the internet by 2020. So why not make your things Smart Things?


A demo of IoT using the CC3200, which has an ARM Cortex-M4 and Wi-Fi on chip.


IoTs Smart Home - CC3200 Launch Pad interfacing with Android - YouTube

Software-defined networking (SDN) was one of the hot new technologies in 2013, as industry experts wanted to talk about its potential to revolutionize data processing. Not only that, but companies were keen to get in on the action that was forecasted by many to mushroom in demand over the next 5 years. The market size in 2018 is predicted to be somewhere between $8 billion and $35 billion depending on which analysts you pay attention to. This Google chart shows that while the level of interest in SDN may have dropped off somewhat over the last year, it still remains an area of interest for many people. Let’s go through some of the fundamentals of SDN and its benefits, along with the ways in which partners are going about implementing it.

From no interest until June 2012, SDN has spiked in interest and remained relevant over the past 2 years (source)


Traditional networking approach

The way in which people use computers has advanced rapidly over the past three decades due to the massive proliferation of mobile connected devices and the sheer number of computing devices that are now in existence. Indeed, last month the number of mobile devices officially surpassed the number of people in the world, and they are multiplying five times faster than we are (which sounds very much like the beginning of the Terminator movies, but I digress). The networks themselves have become a critical component of all infrastructures in society and an important part of the growing public and private clouds. Despite this exponential growth in computers, the way that networking is done has remained virtually unchanged from the 1980s. Traditional networking approaches have become too complex, closed, and proprietary. They have become a barrier to creating new, innovative services within a single data center, on interconnected data centers, or within enterprises, and an even larger barrier to the continued growth of the Internet. Unfortunately it is not just a question of building more data centers to solve this problem, as the ability to absorb the costs of network expansion in the current paradigm is lacking.

The root cause of a network’s limitation is that it is built using switches, routers, and other devices that have become exceedingly complex because they implement an ever-increasing number of distributed protocols and use closed and proprietary interfaces. Each of the many processes of a router or switch is assigned to one of the following planes of operation: Forwarding Plane, Control Plane or Management Plane. This environment of complexity has made it nigh on impossible for network operators, third parties and even vendors to innovate. Operators cannot customize and optimize networks for the use cases that are relevant to their business and cannot offer customized solutions to their customers. The net result is legacy networks that are difficult to optimize, difficult to customize, costly to run and inefficient.

SDN can be confusing because it’s used to create workload aware networks, so it has different implications depending on whether you’re running a Fortune 500 enterprise data center, a telecom carrier network, or some other application. Also SDN creates a set of abstractions for programming a data network, which can be subtle concepts. Goldman Sachs were one of the early adopters of this technology, dating back to before it was even called SDN.


SDN is a higher level of abstraction

In the simplest possible terms, SDN is a higher level of abstraction and entails the decoupling of the control plane from the forwarding plane and offloading its functions to a centralized controller. Rather than each node in the network making its own forwarding decisions, there is one centralized software-based controller (likely running on commodity server hardware) that is responsible for instructing subordinate hardware nodes on how to forward traffic. Because the controller effectively maintains the forwarding tables on all nodes across the network, SDN-enabled nodes don't need to run control protocols among themselves and instead rely upon the controller to make all forwarding decisions for them. The network, as such, is said to be defined by software running on the controller.
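The separation described above can be sketched in a few lines of Python. This is a toy model, not a real SDN stack: the `Controller` class, the switch names and the `forward` helper are all invented for illustration, and a plain breadth-first search stands in for whatever path computation a production controller would use.

```python
from collections import deque


class Controller:
    """Toy centralized controller that knows the whole topology."""

    def __init__(self, links):
        self.adj = {}
        for a, b in links:
            self.adj.setdefault(a, set()).add(b)
            self.adj.setdefault(b, set()).add(a)

    def next_hops_towards(self, dst):
        """Breadth-first search from dst; maps each node to its next hop."""
        next_hop = {}
        visited = {dst}
        queue = deque([dst])
        while queue:
            node = queue.popleft()
            for nbr in sorted(self.adj[node]):
                if nbr not in visited:
                    visited.add(nbr)
                    next_hop[nbr] = node  # nbr reaches dst via node
                    queue.append(nbr)
        return next_hop

    def build_tables(self):
        """Compute the forwarding table the controller would push to each node."""
        tables = {node: {} for node in self.adj}
        for dst in self.adj:
            for node, hop in self.next_hops_towards(dst).items():
                tables[node][dst] = hop
        return tables


def forward(tables, src, dst):
    """Nodes only look up their table; they run no control protocol themselves."""
    path = [src]
    while path[-1] != dst:
        path.append(tables[path[-1]][dst])
    return path


# Hypothetical topology: s2 is a hub connected to s1, s3 and s4.
controller = Controller([("s1", "s2"), ("s2", "s3"), ("s2", "s4")])
tables = controller.build_tables()
print(forward(tables, "s1", "s4"))
```

The forwarding nodes here hold nothing but a lookup table, while all path decisions live in the controller, which is the decoupling of control plane and forwarding plane that the paragraph describes.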


A centralized system controller that can view the entire topology can then make more informed decisions about the network flow


Whereas in a legacy network a node can only see other nodes that it is connected to, an SDN controller has a view of the entire network topology that allows it to provide better quality of service and optimize its configuration according to dynamic requirements. However there are still questions that need to be answered in the event of the controller losing connectivity to one or several switches, so there is potential for hybrid deployments where non-SDN switching could be used as a backup.


We don’t need to reinvent all the functions of an existing network, but if SDN allows us to perform some functions better, faster, or more efficiently, then it has tangible value. Because software is providing the commands, the network can be programmed. And as all software engineers know, anything programmable can be automated and later optimized. It’s even better if SDN allows us to do something innovative, something that wasn’t economically practical or technologically feasible until now. Therein lies the real benefit of SDN: moving from a static, ‘one-size-fits-all’ style to a network that is more agile and flexible. Because it is easier to service and update, it also allows you to offer new features and functions that would be either too difficult or expensive to include in a legacy data center network.

SDN requires great hardware

Just as the new version of Android Lollipop or iOS will run best on the mobile devices that are tailored to its design, the same is true for SDN. The rise of this new level of software-based abstraction actually means that the hardware involved in these networks needs to adapt as well to provide a truly integrated solution. For example, the ARM® CoreLink™ CCN Cache Coherent Network family are all highly configurable and provide balanced service for both low latency and high bandwidth data streams, which enables scalable system coherency in heterogeneous processor systems. Similarly, the ARM CoreLink GIC-500 Generic Interrupt Controller  and ARM CoreLink MMU-500 System Memory Management Unit  both support the virtualization that is needed to get the most out of the software. With the release of IP blocks like these that are built with SDN in mind, it makes it easier for infrastructure architects to make the shift from traditional legacy networks.


A white paper goes into some detail on the use cases of SDN, in particular highlighting that it is “highly applicable to carrier networks since they are typically composed of heterogeneous hardware platforms and protocols, and offer several benefits to the unified carrier-datacenter network”. In recent months there have been a number of announcements from ARM partners increasing the momentum in the shift to SDN for infrastructure solutions, such as HP, Freescale and Broadcom.


The good news is that network equipment OEMs now have a range of options to choose from when deciding how to realize their equipment designs. There are already a range of ASSP silicon providers who provide SoC designs for a whole range of applications, from enterprise networking through wireless infrastructure to core infrastructure networking applications. Announcements from silicon partners allow OEMs to benefit from integration of diverse functions and target extremely high performance compute subsystems that are ARM ISA compliant.


The SDN revolution that was on everybody’s minds a year ago is quietly gathering pace. Presently, its development is focused heavily on the large data center and virtualization space with some useful applications announced. It looks like it could very well evolve into a useful tool for the enterprise and service provider space in the near future, and I for one will be an interested observer to see what other applications will be found for this technology.

ARM is pleased to announce that Community favourite and Senior Embedded Technology Specialist Joseph Yiu has agreed to take part in a special interview to talk more about our Cortex-M processors.


Those of you who have had experience of using the Cortex-M processors might know Joseph from the publication of his book “The Definitive Guide to the Cortex-M3 and Cortex-M4 Processors”, as well as others. His books have proved invaluable to those working with ARM technology, and have been taken up by a number of universities as part of their curriculum. He is seen as the authoritative figure when it comes to these processors in particular, and without wanting to inflate Joseph’s ego (!), he is one of the world’s most knowledgeable people when it comes to embedded computing. We are really looking forward to hearing from him about the family, and I, personally, am very excited to be interviewing him!


As the ARM Connected Community turned one year old last month, it seemed like a great opportunity to open this interview up to our members. You are therefore encouraged to post any questions you might have for Joseph on this blog. While it may not be possible to answer every question, we will try our best to do so.


To give you some inspiration for your questions, the topics that are likely to be covered are:

  • Joseph’s engineering background
  • Cortex-M and the evolution of microcontrollers
  • Cortex-M and IoT
  • The ARM Ecosystem of hardware and software developers, and many others


Please leave your questions in the comments section below and we will be sure to have a look at them all. Joseph’s interview will be taking place at the end of November, with the video likely to be published in early to mid-December.


Thank you to everyone for your contributions over the past year, and I look forward to receiving your questions and posing them to Joseph!

I hate exams.

But I'm glad to be able to report that I took two this last year and passed! (Of course I passed, or this post wouldn't exist...!)


I've worked for ARM for over 14 years, in a number of engineering roles - customer consulting, SoC projects team lead, lead developer on ARM development ASICs, and finally in applications engineering in the San Francisco bay area.

Last year ARM introduced a program to test knowledge of its processors and create a learning path to an ARM accredited certification after successfully passing an exam at a Prometric facility.

As a developer using ARM cores in my own projects, as well as professional life, I undertook the exams for personal challenge, and used the learning (wasn't it supposed to be revision!?) to sharpen my knowledge and skills.

Out of the gate, I have to say as ARM proliferates, these are great qualifications to have to demonstrate detailed knowledge of the cores. Even if you hate exams like me, I would still consider these whether you're motivated to stand out in a busy jobs market, or simply as a personal challenge to improve skills as I did.


So to the exams.

The first test was the ARM Accredited Engineer (AAE), which focused on ARM's application processors, and I took that in December 2013.

Sure, I have lots of experience, but I have to be honest and say I learned a lot through the process, and as well as the learning, the collateral that came with it is a real plus...

I did gain practical details on programming the cores and coupled IPs, but also around the architecture. Unfortunately I can't give specifics, as it would be unfair, but the mock tests and syllabus give an indication - although I feel the mock tests are on the easier side of the real tests...


The second test to be developed was aimed at the embedded MCU devices (Cortex-M / v6M, v7M), and the aptly named ARM Accredited MCU Engineer (AAME) exam was created. I took that in June 2014 on a Friday. I know it was a Friday because I passed and celebrated all weekend... I told you I hate exams...


Again, heavy on software and architecture, but I did feel there were a few questions in this one that were trying to trip me up! I'll leave it at that - but again I got a huge amount out of the process of study for this one too.


So how did I prepare, and what would I recommend?

Start with the syllabus and a lot of caffeine of course, and break it down like any problem. For me, I went back to basics of assembler programming and actually enjoyed it! I'm a hoarder of boards, and used Odroids and Atmel Xplained to play and tinker. Once you're happy with the basics and joy of ARM assembler, challenge yourself with something sticky like DSP or FP. Obviously having a real problem to solve would make all this more practical, and allegedly fun...

Going section by section through the syllabus is a decent approach. I stepped over gnarly sections and learned all about the interesting stuff first - just like you're not supposed to. Big rocks? Eating your vegetables!? Whatever works for you, do that - there's lots to cover...


One thing in good supply is information and bedtime reading. Each exam has a good list of recommended learning resources - architecture docs, manuals, Joseph Yiu books... Get them all - I get no cut from Joseph! -  they are great reference you likely should have as an ARM developer anyway, and I would be recommending to you as a new developer. The Cortex-A programmer's guide is likely one you didn't know about...


Other reference I used: William Hohl's Assembly Language book, and a real printed v5 architecture manual - the ARM ARM! Still good for the basics of assembler, and yes, I do know that as a Cortex-M developer you would likely never write any assembler. It's just good for you - and always good for debug.

Do you have to read it all? No. I have colleagues who claimed they barely skim-read the materials and passed. I hope they were bullied in school. My original thought was to actually read nothing and see what my residual knowledge would do. I'm glad I didn't, but I do hope you can use the learning to achieve something more than a certificate.


What's next?

I've personally challenged myself to take every exam as they are developed - yes even the graphics one...

Tests in preparation now are Cortex-R (the unsung heroes of baseband and hard disks...) and SoC system design - applicable to FPGA as well as ASIC now of course.

I may or may not update you on my progress.



Ok, first some introductions…

So you might notice that I am not Andrew N. Sloss. After many years in the role of driving the conversation and ARM business development in HPC, Andrew Sloss has handed off that task to the ARM Server Segment team, and to me in particular, to keep the ARM HPC ball rolling. I am pleased to say that Andrew is still at ARM and still involved in HPC, with a much more strategic focus. He will be one of many ARM folks attending SC14 in New Orleans this November. At last count we had at least a dozen R&D and marketing folks signed up to attend, so please stop by our booth (#3458) to meet us, or contact us ahead of time so we can schedule a private chat (I can be reached via email at darren.cepulis@arm.com).


ARM Booth #3458 – where you can find us at SC’14



While our new and improved booth is located along the back right section of the exhibit hall, it is important to note that our partners will be hosting dozens of booths placed throughout the convention center.  ARM is a “behind the scenes” enabling technology for many HPC participants.  We provide the most prevalent industry-standard ISA, as well as the enabling technologies, ecosystem, and partnership for creating custom, application specific HPC SoCs.  The results of our technologies and our partners’ inspiration come in all shapes and sizes. At SC’14, we are excited to be demonstrating the latest 64-bit ARM-based HPC server hardware from an assortment of platform and SoC partners. Please see the Demo schedule below for more details…

ARM Booth Demo Schedule



Tuesday, Nov 18th 2014


  • Cirrascale highlights their RM1905D product that includes a pair of 64-bit ARM-based servers in 1U, each with an NVIDIA Tesla
    high-performance GPU.  Cirrascale is also hosting an Exhibitor booth, #2131 at SC14.
  • E4 Computer Engineering is in from Italy to display their short-depth 1U rack server, complete with Applied Micro X-Gene SoC, NVIDIA
    Kepler GPU, and InfiniBand. E4 can also be seen at their own Exhibitor’s booth, #4029 at SC14.
  • Pathscale shows off early work on their EKOPath Fortran compiler for ARMv8.


Wednesday, Nov 19th 2014


  • SoftIron touts their new production ***6408.2 server motherboard with integrated security and packet processing features.
  • MiTAC International Corp (parent company of Tyan) is on hand with their high-density and performance Datun (64-bit ARM-based) servers featuring 8x
    Applied Micro X-Gene SoCs in 1U.


Thursday, Nov 20th 2014


  • SoftIron with its ***6408.2 Server hardware.
  • Cirrascale highlights their RM1905D product.


ARM Booth Theatre

This year the ARM booth will host daily presentations by ARM and its partners discussing various HPC technologies and trends.  Please swing by our booth to check out the schedule.

SC14 Talks

ARM and its partners will be participating in several events as well as theatre talks throughout the week:


  • ARM’s John Hengeveld will present at the SC’14 Exhibitor Forum (Wednesday, Nov. 19th, 4:00pm – 4:30pm): HPC Futures and
  • ARM’s Wendy Elsasser will participate on a panel about the Future of Memory Technology for Exascale and Beyond (Wednesday, Nov
    19th, 3:30pm – 5:00pm)
  • Piero Altoe of E4 Company will present at the Exhibitor Forum (Tuesday, Nov. 18th, 11:00am – 11:30am): E4-ARKA: ARM64+GPU+IB
    is Here
  • ARM’s Eric Van Hensbergen will speak in the NVIDIA booth (Thursday, Nov. 20th, Noon – 12:30pm): The ARM Ecosystem from
    Sensors to Supercomputers


SC14 Announcements

As SC14 draws near, please keep an eye out for significant announcements and press releases regarding ARM, its partners, and HPC.  I will update this space as they occur.

A group of experts sat down last Friday to discuss some of the pressing topics surrounding the semiconductor IP industry. As is always the case when people from different companies get together, the exchanges were refreshingly honest and reflected various perspectives. In an interesting and vibrant discussion, the panel spoke about where subsystems will fit into the market, whether they will leave room for smaller companies to operate, and how the cloud might come into play in the near future. When the topic of IP management and security is brought up, you won't want to miss hearing how some IP customers unintentionally violate their licensing agreement! You can watch the full discussion in the video below, or watch it on YouTube: Enterprise IP Management eSeminar - YouTube



The key themes of the panel discussion are:

  • Required changes in design and verification flow
  • Efficient design data management
  • Approaches for IP security

  Respected Panelists Include:




ARM is well known for many things: not only does it design extraordinary processors and microprocessors (hint: you probably have a chip based on one of its designs in your phone), but it is also the champion of low power consumption and heterogeneous computing (with big.LITTLE). To further enhance the power efficiency of big.LITTLE processors, ARM has started to release patches for the Linux kernel (which is used by Android at its core) for a new piece of tech called Intelligent Power Allocation (IPA).

Keeping a SoC within a defined temperature range is essential for fanless designs (like your smartphone or tablet). The busier a processor gets, the more heat it generates. At the moment the Linux kernel has a simple thermal algorithm which basically throttles the processor when it gets too hot. However a modern ARM processor is a complex beast. It has high performance “big” cores (like the Cortex-A15 or the Cortex-A57), it has energy efficient “LITTLE” cores (like the Cortex-A7 or the Cortex-A53), and it has a GPU. These three different components can be controlled independently and by controlling them in unison a better power allocation scheme can be created.
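To make the contrast concrete, here is a minimal sketch of the two ideas described above: a blunt throttle that caps every component once a trip temperature is crossed, and a shared power budget divided between the big cores, the LITTLE cores and the GPU in proportion to what each one is asking for. This is not the IPA code from the kernel patches; the class, the function names, the 50% throttle factor and all of the milliwatt figures are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Actor:
    """One independently controllable component (hypothetical model)."""
    name: str
    requested_mw: float  # power the component would draw if left unthrottled
    max_mw: float        # ceiling for this component


def simple_throttle(temp_c, trip_c, actors):
    """Baseline: once the trip point is reached, cap every actor to half
    of its maximum, regardless of what it is actually doing."""
    scale = 0.5 if temp_c >= trip_c else 1.0
    return {a.name: min(a.requested_mw, a.max_mw * scale) for a in actors}


def allocate_budget(budget_mw, actors):
    """Divide one shared power budget between the actors in proportion
    to their requests. If the requests already fit, grant them as-is."""
    total = sum(a.requested_mw for a in actors)
    if total <= budget_mw:
        return {a.name: a.requested_mw for a in actors}
    return {
        a.name: min(a.max_mw, budget_mw * a.requested_mw / total)
        for a in actors
    }


# Illustrative numbers only: a CPU-heavy workload with a busy GPU.
actors = [
    Actor("big", requested_mw=2000, max_mw=3000),
    Actor("LITTLE", requested_mw=500, max_mw=800),
    Actor("gpu", requested_mw=1500, max_mw=2500),
]

throttled = simple_throttle(temp_c=90, trip_c=85, actors=actors)
granted = allocate_budget(budget_mw=2000, actors=actors)
```

With these made-up figures the throttle simply halves each ceiling, whereas the budget split keeps the total at the 2000 mW target and gives the largest share to the component requesting the most. A real scheme would also need to derive the budget from the measured temperature, which this sketch leaves out.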

Click the link Intelligent Power Allocation improves thermal management to see the full report.


Chinese Version 中文版:扩展 CoreLink 缓存一致性网络系列

It has been a busy month for ARM in the infrastructure space. ARM TechCon 2014 started it off with ARM silicon, OEM and ecosystem partners demonstrating their new SoCs, hardware and software platforms. The show also had several talks discussing the challenges within infrastructure and the need for innovation.  Neil Parris discussed one aspect in his recent blog, Heterogeneous Compute Requirements in Network Infrastructure, where he described how different cores for different tasks are required to optimize Software Defined Networking (SDN) and Network Functions Virtualization (NFV) solutions.


The month then wrapped up at the Linley Processor Conference where Ian Forsyth presented Scalable ARM-Based Solutions from Sensor to Server, from Edge to Core. Ian discussed how the growth of IoT and smart devices is stressing the current system and announced two new members of the CoreLink Cache Coherent Network (CCN) Family to help tackle the challenge, the CoreLink CCN-502 and CoreLink CCN-512.




So where do the new CoreLink CCN interconnects fit?

One common theme throughout the recent shows, blogs and talks - there is a need for more efficient, optimized solutions from edge to core.  ARM Cortex A-Series Processors with CoreLink System IP provide a common architecture across the spectrum, scaling from cost efficient home gateways to high performance core networking and server applications.  The CoreLink CCN-502 fits between the CoreLink CCI-400 and CoreLink CCN-504, enabling power and cost efficient small to mid-range solutions.  The CoreLink CCN-512 increases the compute density by supporting up to 48 cores.


CoreLink CCN-512 – Maximize Heterogeneous Compute

On the high end of the performance spectrum, macro base station and cloud applications require dense, efficient compute platforms with the right-sized cores to match the appropriate workload.  High performance cores are required for server compute and control plane processing, efficient small cores are required to maximize packet throughput and customized accelerators are needed for Layer-1, security and content delivery processing.


With up to 12 CPU Clusters (48 cores), 4 channels of DDR4-3200 memory, and 32MB of Level 3 System Cache, the CoreLink CCN-512 is well suited to maximize heterogeneous processing on a single SoC while maintaining bandwidth up to 1.8 Tb/s.
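As a rough sanity check on those headline figures, the arithmetic can be worked through in a few lines. The cluster count, core count and DDR4-3200 speed grade come from the text above; the 8-byte (64-bit) data bus per memory channel is an assumption about a typical DDR4 channel, not something stated here, and the function names are made up for the example.

```python
def total_cores(clusters, cores_per_cluster):
    """Core count for a system built from identical CPU clusters."""
    return clusters * cores_per_cluster


def peak_dram_bandwidth_tbps(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in terabits per second.

    bytes_per_transfer=8 assumes a 64-bit data bus per channel
    (an assumption for this sketch, not a figure from the post).
    """
    bytes_per_s = channels * mega_transfers_per_s * 1e6 * bytes_per_transfer
    return bytes_per_s * 8 / 1e12


cores = total_cores(clusters=12, cores_per_cluster=4)
dram_tbps = peak_dram_bandwidth_tbps(channels=4, mega_transfers_per_s=3200)
```

Under that assumption, four DDR4-3200 channels peak at roughly 102.4 GB/s, or about 0.82 Tb/s, which is well within the 1.8 Tb/s quoted for the interconnect itself.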


CoreLink CCN-502 – High Performance, Small Footprint

On the low power and cost efficient end, there is a need to deploy many smaller devices to fill gaps or connect devices on a budget.  If you look closely around office buildings or shopping malls, you’ll see cellular repeaters, small cell base stations, and WiFi access points scattered throughout to ensure our smart devices are always on, always available.

With up to 4 CPU clusters (16 cores) and optional Level 3 System Cache, the area-optimized CoreLink CCN-502 is the ideal interconnect for these small systems that still demand performance.  It is up to 70% smaller than the CoreLink CCN-504 (at the 1MB L3 System Cache design point), yet still capable of maintaining bandwidth up to 0.8 Tb/s.

CoreLink CCN Family Summary

With CoreLink CCN-504 SoCs already in production, the new family members build upon a proven architecture and offer the same enterprise class features: native AMBA 5 CHI interfaces for high frequency, non-blocking data transfers; end-to-end QoS (Quality of Service) and RAS with CoreLink DMC-520; and extensive clock gating and retention states for optimal power efficiency.


In summary, the CoreLink Cache Coherent Network Family provides a common platform for ARM silicon providers to customize scalable systems from edge to core; platforms with a common software framework across heterogeneous systems to meet diverse price, performance and environment requirements.


For more information on the CoreLink CCN Family, please visit the CoreLink Interconnect homepage.

Over the past two decades, we have seen a wide range of innovation in the devices we use to access the network, the applications and services we depend on to run our digital lives, and the computing and storage solutions we rely on to hold all that “big data” for us. However, the underlying network that connects all of these things has remained virtually unchanged. The reality is that the demands of the exploding number of people and devices using the network are stretching its limits. The networking world is undergoing a significant shift, moving from embedded systems to more open, virtualized systems based on common, standardized software stacks. Software-defined networking (SDN), network functions virtualization (NFV) and network virtualization (NV) offer new ways of designing, building and operating networks.

The introduction of SDN is also being promoted by the growing relevance of the Internet-of-Things (IoT), which will dramatically increase the number of network endpoints, while adding a huge amount of data that needs to be transferred in a secure manner. NFV further simplifies the deployment and management of such a large number of endpoints, helping providers to cope with the increased need for performance whilst reducing costs. NFV also brings more agile, serviceable networks to fruition by virtualizing entire parts of the network.  It is a multi-year evolution that is truly reshaping the industry.

Earlier this month, at ARM TechCon 2014, the Linux Foundation made an announcement about the formation of a new project named Open Platform for NFV. OPNFV will be a carrier-grade, integrated, open source reference platform intended to accelerate the introduction of Network Functions Virtualization platforms and architecture. ARM is a founding member of the OPNFV group. ARM has also been at the center of NFV definition activity through our work with standards bodies and with partners. The OPNFV project is the next step in advancing NFV with a common software base: bringing together multiple open source software blocks, integrating, testing, optimizing and also filling in gaps.  OPNFV is expected to increase performance and power efficiency; improve reliability, availability and serviceability; and deliver comprehensive platform instrumentation.

If we look now at the hardware platform, specifically an SoC, we can see there are four classes of compute required in infrastructure:

  • Control plane processing which requires high compute performance
  • MAC scheduling which requires low latency
  • Data plane processing which benefits from high efficiency small cores
  • Specialized processing including accelerators and DSPs

These are just a few examples of the types of challenges we will discuss in more detail at the upcoming Linley Networking conference on the 22nd of October, 2014. Ian Forsyth, Director of Infrastructure Product Marketing at ARM will take to the stage to highlight the role that ARM CoreLink™ Cache Coherent Network products play, offering high bandwidth, low latency connectivity and highly integrated solutions for the new deployments of NFV and SDN. The key here is the scalability offered by these interconnect solutions that help address intelligence at the network edge. His talk will further detail how the power-efficient yet high performance ARM architecture delivers benefits for networking infrastructure and high core-count solutions in servers.

UPDATED: Take a look at Jeff's blog about the product launch of CoreLink CCN-502 and CCN-512: Extending the CoreLink Cache Coherent Network Family


Further information:

It was the philosopher George Santayana who wrote “Those who cannot remember the past are condemned to repeat it” over a hundred years ago, and it remains highly relevant today to those who are tackling the issue of developing faster, smaller, cheaper SoCs. I recently sat down with Norman Walsh to discuss IP integration. As well as having 21 years in the IC design industry, Norman is a keen historian, so I figured he would be a fountain of knowledge when it came to piecing together how we have gotten to this point in time. As it turned out, he gave a very interesting overview of the challenges facing IP integration, and how it all got to this point. I’m shortly due to sit down with David Murray and talk about the future possibilities and probabilities in the industry, so be sure to check that out as well. If you have any questions you would like David to answer, please PM me or ask in the comments below.



Hi Norman, can you give me a brief overview of how IP integration has gotten to where it is right now?

Well to give a proper explanation we must bring it back to Moore’s Law. I’m sure everybody knows this already, but Gordon Moore was an engineer working for Fairchild Semiconductor who stated back in 1965 that the number of transistors in an integrated circuit will double approximately every 2 years. It’s a cliché but it’s testament to its accuracy that only in the last few years has there been a shift away from Moore’s Law. Now this was obviously going to grow exponentially, from the very small systems that were around back then until very quickly it grew into millions of transistors and nowadays we’re looking at a system that can have anything from 7 to 10 billion transistors. The backstory to Moore’s Law was that he was highlighting this phenomenon as a challenge for the semiconductor industry to keep up with the growth. The way the IC design industry kept up with this growth was through periodic shifts in the level of abstraction in the designs. At that time in the 60’s and 70’s a chip was first sketched out on paper, and each transistor was designed individually. The people within the companies doing this at the time soon realised that this was not a reasonable way to continue, so they developed CAD (computer aided design) tools in-house to automate the process of putting these transistors together, and to give them a schematic for putting them together. In the early 1980’s some of these design teams went off on their own and formed what is now a large and vibrant EDA industry. As well as developing the tooling to do this, the level of abstraction had to improve as well. In the late 70’s and early 80’s the industry went from a design that focused on individual transistors to what was called a standard cell-based design. This was a way of defining commonly-used features on an IC like an inverter or an AND gate so that they could be called out instead of constantly having to describe these functions.


(Source) How chip designers felt with the introduction of AND and OR gates

So that was the first level of abstraction then? The introduction of AND gates?


Yeah. It doesn’t sound too complicated now but at the time it was pretty revolutionary stuff. So they created these libraries of logic gates that gave designers the ability to keep pace with the growth in transistors. This kept everything on an even keel up until the late 80’s early 90’s when it became really difficult again as we’re talking about hundreds of thousands of transistors on a chip, and even with the logic gates people were struggling to keep up. This is when a hardware description language came in to raise the abstraction another level to what we now know as RTL. This time it modelled the flow of digital signals between hardware registers and the logical operations performed on those signals. This was done in conjunction with the emergence of two hardware description languages, Verilog and VHDL (which was actually developed by the US Department of Defense back in the 1980’s). Using the RTL along with the synthesis tools developed by some of the larger EDA companies gave some breathing space back to system designers and allowed them to keep ahead of the Moore’s Law curve. Predictably, within 15 years the problem was back on the table and had reached critical levels by the time IP-XACT came along. IP-XACT was developed by the now defunct SPIRIT Consortium around 5 or 6 years ago, and raised the level of abstraction up to where designers were now describing interfaces between individual blocks of IP. Over the last number of years it has taken a while to be adopted, just as the other abstractions did, but in the last 2 years it has definitely become the standard across the industry. The adoption of IP-XACT contributed to the rise of commercial IP as more and more teams have been incorporating 3rd party IP or reusing old IP in their designs, because they now have a way to standardise the interfaces and make the integration quicker and easier. A number of people don’t see this growth of commercial IP as a shift in abstraction, but I believe it is. 
It essentially allows SoC designers to leave behind a lot of the repetitive tasks in chip design that do not make any overall difference, and allows them to focus on real design decisions that can differentiate their SoC. But if you really want to take advantage of all of this IP, there are still some nuts to turn and things that are necessary to make them work together really well. That’s the real key point, as the end user cares more about the performance of the SoC (and the new device) than the performance specs of each individual piece of IP.

(Source) Successful SoC design requires the integration of various components



It sounds like the industry has changed a lot over the last five to ten years. What does the landscape look like now?

Newer companies don’t have the collateral or library of IP to make it all themselves. If a company started out now it would take decades of man hours to come up with the IP necessary to build a chip in-house. So in that sense the growth of commercial IP has reduced the barriers of entry for this industry, and we are seeing new companies come into the market and become competitive by focusing on a particular niche. However as with all of the levels of abstraction and solutions that I mentioned earlier, it has presented its own set of problems to be solved.

There is always a big question over the quality of 3rd party IP. In the past there were doubts about whether it had gone to silicon. It created a bit of a Catch-22 situation, as nobody would trust IP that hadn’t gone to silicon, and some of the newer designs struggled to get off the ground because of that. Once it gets to a silicon-proven stage, people are more likely to use commercial IP, as they know that there is less of a risk involved with it.

The interoperability between IP blocks is always an issue. It’s becoming less so over the years as a lot of people have moved to the AMBA protocols like APB or AXI, but it still takes a bit of time for companies to move away from something they’ve developed internally. It also took a while for standard interfaces to be recognized and become adopted by the majority of the industry. Five years ago, a lot of IP was internal rather than commercially available. Interoperability was a problem. You had this thing that you grew internally that didn’t connect to anything very easily. Then you had this other IP with a standard interface and you couldn’t connect them. Nowadays people still build internal IP, but they build them in a way that they can be connected easily. The ARM AMBA specifications help because they are a standard, but it’s more of a catalog of things you could do with the interface. So there is still a lot of tweaking. You can see a trend, with the chips becoming more complex and the levels of abstraction being raised. To be honest we are probably overdue for another level of abstraction at this stage.


(Source) Simplified version of the interoperability issue



And what do you think that level of abstraction will look like?

You have to think outside the box on this one. There’s no magic bullet here. ESL has been talked about for a while, but I’m not so sure; it just describes the chip in a different way. The way I see it, the answer lies in IP-XACT and standardisation: we need to standardise formats. A shift in abstraction is all about improving productivity, and making sure that formats are correct across different teams or different companies would absolutely make a difference. One of the results of commercial IP and IP reuse is that we have seen a growth in subsystems as more and more parts become standardized. To a certain extent this makes the integration side of things easier to manage, as there are fewer custom interfaces to deal with. It is essentially delivering an entire system within an IP block: the processor clusters, the interconnects and everything. There needs to be a more standardised methodology. There’s been a lot of talk about back-end methodology, and stitching things together in your block diagram. One of the challenges that people are going through right now is that the design on the front end is taking weeks and weeks to get right, and then it is being rushed through the back-end in a matter of days. If we could find a way to manage the front-end of design in a better way, that would cut down the design cycle considerably.


(Source) The next level of abstraction?


Sounds like you have plenty on your plate going forward. Finally, how important is it to have internal IP that can interact with ‘foreign’ IP?

The use of standard and intelligent formats is very important for the continued use of 3rd party IP. It’s one thing to have an IP-XACT language, but we need to have a consistent way of describing IPs using IP-XACT. This way IPs will come together a lot quicker and it needs to happen going forward. It’s something we’re working on in a big way at the moment, creating standardised formats for IPs to interact with each other. This needs to become standard across the industry but it needs to happen internally first. It needs to happen with the interfaces for all IPs to be 100% compatible and make it simpler for people who are not experts in this to input ARM IP or any other 3rd party IP. I think that once this happens we will see far more people open to the idea of truly vendor-neutral IP, and design times could be reduced dramatically.


I certainly found his answers quite helpful and enlightening to uncover some of the history within the IC design industry. If you have any questions for Norman, please enter them in the comments below and we'll get them answered as soon as possible.


The need for ever-connected devices is skyrocketing. As I fiddle with my myriad of electronic devices that seem to power my life, I usually end up wishing that all of them could be interconnected and controlled through the internet. The truth is, only a handful of my devices are able to fulfill that wish, but the need is there and developers are increasingly recognizing that we are moving to a connected life. The pressure to create such a connected universe is so immense that designers need a faster, more reliable way to fulfill our insatiable need.


One way to fulfill the need is for designers to adopt FPGA-based prototyping. This proven technique allows designers to explore their designs earlier and faster and thus proceed more quickly with hardware optimization and software refinement. In addition, recent capacity developments in prototyping have made it possible to realize the benefits of FPGA-based prototyping for even the largest designs. It has to be said that ARM and Xilinx have been at the forefront of enabling today’s embedded designs. It is critical that prototyping technology keep pace with the advancements from ARM and Xilinx.


S2C Inc. recently announced the availability of its AXI-4 Prototype Ready™ Quick Start Kit based on the Xilinx Zynq® device. The Quick Start Kit is the latest addition to S2C’s library of Prototype Ready IP and is uniquely suited to next-generation designs including the burgeoning Internet of Things (IoT).


The Quick Start Kit adapts a Xilinx Zynq ZC702 Evaluation Board to an S2C TAI Logic Module. The evaluation board supplies a Zynq device containing an ARM dual-core Cortex-A9 CPU and a programmable logic capacity of 1.3M gates. The Quick Start Kit expands this capacity by extending the AXI-4 bus onboard the Zynq chip to external FPGAs on the TAI Logic Module prototyping system. This allows designers to quickly leverage a production-proven, AXI-connected prototyping platform with a large, scalable logic capacity – all supported by a suite of prototyping tools.


Integrating Xilinx’s Zynq All Programmable SoC device with S2C’s Virtex-based prototyping system provides designers an instant avenue to large-gate count prototypes centered around ARM's Cortex-A9 processor.


To learn more about how S2C’s FPGA-based prototyping solutions are enabling the next generation of embedded devices, visit Rapid FPGA-based SoC & ASIC Prototyping - S2C.
