
ARM Processors


For many, Tetris is simply a tile-matching video game originally designed and programmed by Alexey Pajitnov in 1984. For others, however, it inspires endless possibilities for Maker projects. Most recently, AdaCore’s Tristan Gingold and Yannick Moy have implemented the highly popular puzzle game on an Atmel | SMART SAM4S ARM Cortex-M4 microcontroller.


“There are even versions of Tetris written in Ada. But there was no version of Tetris written in SPARK, so we’ve repaired that injustice. Also, there was no version of Tetris for the Atmel SAM4S ARM processor, another injustice we’ve repaired,” the duo writes.

The concept first stemmed from their colleague Quentin Ochem, who had been searching for a flashy demo for GNAT using SPARK on ARM, to run on the SAM4S Xplained Pro Evaluation Kit. Luckily, this kit features an OLED1 extension with a small rectangular display, which, sure enough, immediately ‘SPARKed’ the idea of Tetris. Add the five buttons spread across the main card and the extension, and the team had all the necessary hardware to bring the project to life.

In total, the entire build took approximately five days to complete. Both Gingold and Moy advise, “Count two days for designing, coding and proving the logic of the game in SPARK, another two days for developing the BSP for the board, and a half day for putting it all together.”

For those unfamiliar with SPARK, it is a subset of Ada that can be analyzed very precisely for checking global data usage, data initialization, program integrity and functional correctness. Mostly, it excludes pointers and tasking, which proved not to be a problem for Tetris.

While we’ve seen the retro game played on everything from t-shirts to bracelets, we’ve never experienced the game literally on an MCU. As the team notes, all of the necessary sources can be downloaded in the tetris.tgz archive, while those interested in designing one of their own can find a detailed breakdown of the entire build here.

This post was originally shared on Atmel Bits & Pieces.

In an effort to make FPGA-based prototyping available to any engineer, S2C is offering its popular ProtoBridge AXI FPGA-accelerated verification tool along with its Virtex 7 SingleE and Kintex 7 Logic Modules for a limited time at an accessible entry price point.

ProtoBridge AXI enables designers to read and write data from computers to AXI-based designs mapped to FPGA-based prototypes. By utilizing a rich set of C subroutine calls, ProtoBridge AXI users can easily implement algorithm validation, block-level prototyping, full-chip simulation acceleration, corner case testing and early SoC software development.

ProtoBridge AXI consists of a computer software component and an FPGA design component. The computer software component contains Linux/Windows drivers and a set of C-API/DPI routines to perform AXI transactions. The FPGA design component contains a PCIe interface, an interconnection module and AXI transactors to be instantiated in the user's design-under-test (DUT). With these enhanced product features, users can read and write at speeds of up to 500 megabytes per second through the PCIe interface, connect 16 Master devices and 16 Slave devices on the AXI bus, and take advantage of the patent-pending Shared Memory technology that links the FPGA prototype with third-party design tools.

Virtex 7 SingleE Logic Module


The S2C SingleE V7 Logic Module is the industry’s smallest form-factor (260mm X 170mm), all-purpose, stand-alone prototyping system based on Xilinx’s Virtex-7 2000T FPGA. The system utilizes S2C’s 5th generation technology, can handle up to 20M gate designs and features:

  • 960 I/Os on 8 high-speed connectors
  • Access to a library of over 70 daughter cards for quickly building a prototype target
  • On-board DDR3 SO-DIMM socket extending to 8 GB of memory running at 1600MB/s
  • Remote control management through Ethernet and USB for programmable clock generations, design resets, virtual I/Os and switches, I/O voltage setting, and monitoring of voltage/current/temperature and read back hardware status
  • Multiple SingleE V7 Logic Module management from one PC


Kintex 7 Logic Module


The K7 Logic Module features the largest number of user I/Os in its class with 432 I/Os on four Dedicated I/O connectors and 16 channels of GTX transceivers on two Differential I/O connectors. The GTX transceivers are capable of running up to 10Gbps with -2 grade FPGA devices. Users can easily download to FPGAs, generate programmable clocks, adjust I/O voltages and run self-test on hardware from S2C’s TAI Player Runtime software via a straightforward USB2.0 interface. With S2C K7 TAI Logic Module’s affordable pricing, project managers can deploy a large number of FPGA-based prototypes to accelerate hardware verification and software development in parallel.


Ideal Solution for Block-Level and Algorithm Development


Coupled with S2C’s ProtoBridge™ software that accelerates FPGA verification using co-modeling technology, the SingleE V7 and Kintex 7 Logic Modules are the perfect platform for IP and algorithm creation. Engineers are able to leverage the strengths of system-level simulation and RTL-level design accuracy, shorten design and verification time, and ensure higher product quality through improved test coverage.

Designers can achieve these goals with the ability to

  • Link FPGA prototyping to ESL simulations for model availability and accuracy
  • Validate SoC software on the target architecture and design algorithm in real hardware
  • Read/write data at speeds of up to 500MB/second through the PCIe Gen2 interface from computers to AXI-based designs mapped to the logic module
  • Create corner test cases in software and run exercises on the prototype
  • Run regression tests on their FPGA prototype utilizing vectors stored in the host computer


To learn more about these bundled solutions, please visit Rapid FPGA-based SoC & ASIC Prototyping - S2C or contact us at S2C Contact Information.

Frustrating, isn't it? You're using your new smartphone or tablet to view pages on the Internet, watch a video or get the latest traffic information and the mobile communications just can't handle it. You look at your screen and see a little symbol showing that the signal is dropping in and out of 2G, 3G or HSPA 3.5G connections, and then the device gives up altogether. Unfortunately, this scenario is still all too common because, despite having the latest applications processor, graphics and software in your phone, we still often have to rely on patchy, low data-rate wireless coverage.

But all this is changing with the advent of new 4G LTE and LTE-Advanced communications which cellular operators are now busy deploying, and our ARM® Cortex®-R real-time processors are powering the latest wireless modem chips in your handset to deliver data faster and more reliably. Take, for example, the new Samsung Exynos Modem which has just been announced. The Exynos Modem 300 series uses a Cortex®-R to run the 4G-LTE software protocols and manage signal processing for transmitting and receiving data. In fact, the Exynos Modems aren't the only ones using a Cortex-R for this task; there are hundreds of millions of similar chips in phones and tablets already in use throughout the world.


Cortex-R processors are often hidden from view in applications like this, running underlying communications and control tasks in applications ranging from flash memory or hard disc storage to automotive braking, steering or instrument clusters. Designers choose a Cortex-R processor because its microarchitecture and memory system are specifically designed for these tasks, where many hard real-time events must be serviced within microseconds to maintain accurate control and signal processing.


However, technology marches on and the next generation of wireless modems will soon deliver even higher data rates of 300 Mbits per second or more and support so-called ‘carrier aggregation’ which lets wireless operators use a mix of different frequencies to reach all the devices connecting to a cell.  This will provide even more reliable communications and it enables operators to make best use of their precious wireless spectrum allocation.  Of course this requires yet more real-time processing throughput and the latest Cortex-R7 real-time processor fits the bill here, without increasing the energy consumption for battery-powered devices.  Modems for this have been developed and are currently in silicon and going through the testing and approvals process which they must pass before they're allowed to connect to the cellular network.  I’m looking forward to getting my next 4G phone in 2015 that will have one inside.


Thanks for reading. Chris Turner

DAC IP Track Submission Deadline January 20th

Don't miss your opportunity to deliver a compelling technical paper at the

Design Automation Conference, June 7-11, 2015.


Watch this short video by the DAC IP Track Committee Chair to learn more.

DAC 2015 IP Track Submission – Mac McNamara - YouTube


Click here to submit your paper abstract -- 100 words is all you need!




Netgear's major announcement at CES 2015 was the ReadyNAS 200 series of NAS units targeting the SOHO market and power users. This lineup has two members, a 2-bay RN202 and a 4-bay RN204. The ReadyNAS 200 series is based on a dual-core Cortex-A15 SoC from Annapurna Labs. The system has 2 GB of RAM and two GbE ports. 802.3ad dynamic link aggregation is supported, and transfer rates of around 200 MBps are possible (similar to what QNAP claims for their TS-x31+ series). The units run ReadyNAS OS 6.2 and have an MSRP of $360 and $500 for the 2-bay and 4-bay variants respectively.

Source: AnandTech | Netgear Launches ARM Cortex-A15-based ReadyNAS 200 Series


QNAP has decided to use an Annapurna Labs SoC without the integrated 10G ports. We have two Cortex-A15 cores running at 1.4 GHz in the 28nm SoC that is part of the TS-231+ and TS-431+. The SoC also has two native GbE ports with enough performance for full-scale link aggregation.

Source: AnandTech | QNAP Releases Haswell-based TVS-x71 and Cortex-A15-based TS-x31+ NAS Lineups

Hello Everyone,


I hope you enjoyed the first part of the interview! If you are yet to see it, then you can find it here: Interview with Joseph Yiu: Part I

Here we look at why entrepreneurs are using ARM Cortex-M, as well as answering some of the questions sent in December on this post: Interview and Question Time with Joseph Yiu. Thank you to Joseph Yiu again, and many thanks to all of you who submitted questions. If you have any further questions then do not hesitate to comment below.


ARM Connected Community Interview with Joseph Yiu: Part II - YouTube

Hello everyone,


Thank you for your patience in waiting for this interview. I am pleased to say that the first part of it is here!


In this first part, I ask Joseph Yiu a number of questions about the ARM Cortex-M Series and the Internet of Things. The second part of the interview which features questions from a number of our users from Interview and Question Time with Joseph Yiu will be published early next week.


I hope you find Joseph's insights interesting and that you learn a little bit more about the ARM Cortex-M series. If you have any further questions, then do not hesitate to leave a comment below and Joseph will be sure to get back to you as soon as possible.


I would like to thank Joseph for taking the time to record this interview, and we will be doing more interviews like these over the coming months to give you an insight into what we are doing here at ARM.


ARM Connected Community Interview with Joseph Yiu: Part I - YouTube



Have you ever wondered how your day-to-day lifestyle might affect your sleep quality? For many years, studying sleep was something only healthcare professionals could undertake. Nowadays, we can monitor the sleep cycle much more effectively. Sensor technology has made a huge leap in the last decade, moving from clinical set-ups of helmets, wired electrodes and cameras to something everyone can do in an affordable way.


Improved sensor technology has been complemented superbly by improved microcontroller (MCU) technology. An ARM Cortex-M processor can now be simply integrated into an SoC alongside advanced sensors such as accelerometers and gyroscopes, with the device used as a so-called ‘sensor hub’ to compile and process sensory data. These sensor hubs have been commonplace on SoCs designed for smartphones for a number of years, but we are now seeing the emergence of standalone wearable devices, with power and form-factor benefits making them well-suited to activity tracking. One of the best examples on the market of such a sensor hub helping with self-monitoring of sleep patterns is the Misfit Shine (Figure 1).


(Figure 1: Misfit Shine in use)

Cortex-M sensor hubs: simple but smart, secure and small; ideal for devices like the Misfit Shine


Inside the Misfit Shine is a Cortex-M3 processor. The Cortex-M3 processor is the industry-leading 32-bit processor for highly deterministic real-time applications, designed for the challenges of very low-power constraints. The processor also features an integrated Memory Protection Unit (MPU) and an integrated nested vectored interrupt controller (NVIC), which allows interrupt handlers to be written entirely in C, and it uses the Thumb®-2 instruction set to achieve higher code density. This makes it ideal for wireless battery-operated devices like the Misfit Shine. The device is powered by a standard 3V coin cell watch battery which lasts for up to four months before a replacement is needed.


The sensors are always on and continuously sampling data, with the Cortex-M processor remaining in deep sleep mode until a change in physical state requires attention. This event-driven design keeps power consumption as low as possible. The Misfit Shine is one of the few wearable products on the market which does not require charging, so I need not worry about removing it to charge overnight, which makes it ideal for sleep tracking.
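As a rough, host-side illustration of the event-driven idea (this is not Misfit's firmware), the sketch below counts how often a processor would need to wake for a stream of sensor samples if it is only woken when a reading moves away from the last reported value by more than a threshold. The function name `count_wakeups`, the threshold and the sample values are assumptions made up for this example; on a real device the comparison would typically be done by the sensor itself, which raises an interrupt to wake the sleeping core.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model: the processor sleeps until a sample differs from
 * the last value that woke it by more than `threshold`. Returns the
 * number of wake-up events for the given sample stream. */
static inline size_t count_wakeups(const int16_t *samples, size_t count,
                                   int threshold)
{
    if (count == 0) {
        return 0;
    }

    size_t wakeups = 0;
    int reference = samples[0];          /* last value that was reported */

    for (size_t i = 1; i < count; i++) {
        if (abs(samples[i] - reference) > threshold) {
            wakeups++;                   /* an interrupt would fire here */
            reference = samples[i];
        }
        /* Otherwise the core stays asleep and the sample is ignored. */
    }
    return wakeups;
}
```

With seven samples {0, 1, 2, 50, 51, 49, 0} and a threshold of 10, only two samples cause a wake-up; the processor would stay asleep for the rest.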

The challenge for all tracking devices is form factor. Size dictates both the number of sensors that can be integrated and the battery size. The Cortex-M processor family is ideally suited to these situations, with a range of processors providing solutions for different requirements. As a result, a huge number of wearable devices currently on the market use different Cortex-M based processors to meet their requirements.

(Figure 2: Cortex-M Processor as a Sensor Hub)


If you are interested in a more detailed explanation of the features of Cortex-M processor based wearables, you may want to check Diya Soubra’s recent article on his experience of sleep tracking.


And one Cortex-M3 based Misfit Shine can be yours this week!


The good news for everyone is that the Cortex-M3 based Misfit Shine has just entered ARM’s 2014 Epic Giveaway! In partnership with HEXUS, ARM is giving you the chance to win amazing new prizes this holiday season. Each prize draw will be open for seven days, so visit the dedicated competition page to keep tabs on what's up for grabs and what's coming soon.

Looking back at the continuously evolving smartphone market, it is amazing to note the role that the ARM Cortex-A7 core has played.


Previously the processor which initiated the uptake of multicore processing in mobile, the ARM Cortex-A7 is now an increasingly popular choice in energy-efficient mobile computing, enabling devices to achieve high-end functionality alongside all-day battery life and a very competitive price point. Devices based on the mature Cortex-A7 can now be typically found for well under $200, and this is driving widespread uptake in emerging markets such as Brazil and India.


(Figure 1: ARM Cortex-A7 processor design)


On top of its extreme energy efficiency, the Cortex-A7 incorporates many features of the high-performance Cortex-A15 and Cortex-A17 processors, including virtualization support in hardware, Large Physical Address Extensions (LPAE), NEON®, and a 128-bit AMBA® 4 AXI bus interface. It provides up to 20% more single thread performance than the Cortex-A5 and provides similar performance to mainstream Cortex-A9 based smartphones in 2012. This profile makes it an ideal choice for smartphones aimed at providing an excellent specification within a limited cost envelope. As one of ARM’s most mature and energy-efficient processors, the Cortex-A7 has long been one of the most popular choices in this market.


One device which has successfully utilized a Cortex-A7 processor design is the Moto G, which this year became the most successful, highest-selling smartphone in Motorola's history. Inside it is a quad-core ARM Cortex-A7 based Qualcomm Snapdragon 400 SoC, with 1GB of RAM and 8GB of storage. The Cortex-A7 processor is a very energy-efficient applications processor designed to provide rich performance in entry-level to mid-range smartphones, high-end wearables and other low-power embedded and consumer applications. Figure 2 below shows the excellent score of the Moto G in a battery benchmark test.


(Figure 2: Ars Technica battery benchmark test)



Want to get your hands on a Moto G?


ARM has partnered with Hexus to give away a market-leading ARM-powered device every day over the festive period. If you want to win a Moto G smartphone, head over to the Epic Giveaway page on Hexus.com where you can be in with a chance of winning!

Here comes a great campaign on Indiegogo: Atom PC


Atom PC is a high-performance Android & Ubuntu Mini PC, Gaming Console, Home Theater and Smart TV.


Overview – What’s Atom PC

Atom PC is a Mini Desktop PC that is easy to set up and use, powered by a quad-core Cortex-A17 1.8 GHz processor. It supports dual OS (Android & Ubuntu). It's not only a computer, but also a Home Media Center, Gaming Box, Portable Linux Workstation, Skype & Video Conference tool and more.


Atom PC features a Quad-Core Cortex-A17 CPU, Mali-T764 3D GPU, 2 GB of RAM, up to 32 GB of storage, VGA and HDMI output, 2.4 GHz & 5 GHz dual-band Wi-Fi, and 4 USB ports.


Atom PC supports 4K Ultra HD and huge 3D games, and suits a variety of occasions and needs. It is also designed to run the XBMC media center app (soon to be known as Kodi), which supports most common audio, video, and image formats, playlists, audio visualizations, slideshows, weather forecast reporting, and third-party plugins.

Key Features

1. Quad-Core Cortex-A17 CPU

Atom PC is equipped with the Rockchip RK3288 Quad-Core Cortex-A17 CPU, which has been called the King of Performance, King of Ultra HD and King of Games. The Cortex-A17 processor offers over 60% performance uplift over the Cortex-A9 processor, the current leader in the mid-range mobile market. With such a powerful CPU, Atom PC provides excellent speed, broader compatibility and better energy efficiency.


                  The first SoC based on Quad-core Cortex-A17 in the world

2. Mali-T764 3D GPU

Atom PC’s ARM Mali-T764 3D GPU is a powerful, recent graphics processor. The hardware video decoder supports H.265, MPEG-2, MPEG-4, AVS, VC-1, VP8 and MVC at up to 1080p@60fps, and multi-format video decoding at up to 4Kx2K. It can be regarded as a leader in the current Smart TV world, and it smoothly supports high-resolution (3840x2160) displays and mainstream games.

3. Android & Ubuntu Dual OS

Atom PC’s dual OS lets it carry out a variety of tasks such as web surfing, photo editing, emailing, social networking and much more.

Running Android, there are over a million apps and games readily available. You can also watch high quality TV programs and movies from around the world.

Atom PC also supports the official release of Ubuntu 12.04.5 LTS (Precise Pangolin), with a friendlier Unity welcome interface and a vastly improved Ubuntu Software Center, which makes installing large amounts of software easier and more user-friendly.

4. Flexible Placement

There are three different ways to place the device. You have the flexibility to select the appropriate placement according to your needs.

  • If you like, you can just lay it flat on the desk.
  • You can also add a base to stand it vertically, which looks more attractive.
  • To save space, you can hang it on the back of the monitor.


What can you do with Atom PC?

Android Home Computer

If you have a spare monitor, keyboard and mouse, then it’s time to set up a new computer. What’s more, compared with a traditional desktop tower, it will save a lot of space. And you do not need to worry about performance issues. The Mini PC, which boasts a Quad-Core Cortex-A17 SoC, is much more powerful than you might imagine.

Mobile Linux Workstation

Atom PC is an external device that can be configured for use in the home or office as a compact computer. It is able to run any of a number of Linux distributions and is best suited to running media server, back-up, file sharing and remote access functions.

You can run your office software, work on documents and even develop an application on the Ubuntu OS. Because it is small and easy to carry, you can catch up on unfinished work anywhere and anytime, with no need to stay late at the office.

Android Gaming Console with Millions of Apps on Google Play

Connect the device with your TV or Monitor, pair with a wireless gamepad (such as race Wise Free Wireless Game Controller), and then enjoy your game! With a Quad-Core Cortex-A17 CPU and Mali-T764 GPU, you can even play huge 3D games on it, like "Clash of Clans" and so on.

Home Theater and Multimedia Center

Atom PC supports 4K Ultra HD; just connect a wireless remote control and you can enjoy online video (like Netflix, YouTube, Google Play and many more), listen to online radio or smoothly browse local and cloud photos on your big-screen TV from your sofa.

Video Chat and Conference Call

Connect a web camera to easily access your Skype, Google+, Hangouts or ooVoo account and start a video chat with your friends, family, or colleagues. No matter how far away they may be, you can easily get in touch with them on your big screen TV in real time. You can even have a video conference at home. Take advantage of the versatility of this MINI PC and feel the convenience it brings to your personal and work life.


More info on Indiegogo:


Giayee Android Tablet, Thin Clients and Mini PCs, OEM/ODM

Atomwear by Shuwen Liu — Kickstarter

In the recent ARM Connected Community event Interview and Question Time with Joseph Yiu, community member Gopal Amlekar asked the following question:


"How are ARM processors and especially the Cortex-M processors helping in making the IoT more secure, reliable and not prone to hacking?

Is it something to do with the TrustZone?

Even with all these, what care should be taken by developers to make their device more secure in the WWW of things?"


I recently recorded this interview and members should expect to see it very soon! However I would like to elaborate further on this question, and explain in detail about how Cortex-M is approaching security.


Security management on existing Cortex-M processors

In a large part of the microcontroller application space, the most likely security issue is with software. For example, there could be vulnerabilities in the application code or at the communication protocol stack.

Typically, some form of security management can be implemented using the privileged and unprivileged execution levels. By executing protocol stack and application code at unprivileged level, and by using the Memory Protection Unit (MPU), we can significantly reduce the risk of any hacking instance or efforts gaining full control of the device. The MPU can ensure that the stack and critical data used by the OS kernel are not corrupted by a rogue application task. It can also make the SRAM region non-executable so that even if malicious code is injected into the SRAM (e.g. if part of the SRAM can be used to store received packets), such code cannot be executed.
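As a rough illustration of the non-executable SRAM point, the sketch below builds the value that would be programmed into an ARMv7-M (Cortex-M3/M4) MPU region attribute and size register (RASR) for a read/write, execute-never SRAM region. The helper name `mpu_rasr_sram_xn` and the macro names are made up for this example, and the cacheable/shareable attributes are an assumption based on the usual recommendation for internal SRAM. On real hardware the value would be written to the MPU registers (for example through the CMSIS `MPU->RASR` field) from privileged code, rather than simply returned.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7-M MPU_RASR bit fields (names are local to this sketch). */
#define RASR_ENABLE   (1u << 0)             /* region enable             */
#define RASR_SIZE(n)  ((uint32_t)(n) << 1)  /* region is 2^(n+1) bytes   */
#define RASR_C        (1u << 17)            /* cacheable                 */
#define RASR_S        (1u << 18)            /* shareable                 */
#define RASR_AP_FULL  (3u << 24)            /* read/write at any level   */
#define RASR_XN       (1u << 28)            /* execute never             */

/* Hypothetical helper: RASR value for a read/write, non-executable SRAM
 * region of `size_bytes`, which must be a power of two of at least 32. */
static inline uint32_t mpu_rasr_sram_xn(uint32_t size_bytes)
{
    assert(size_bytes >= 32u && (size_bytes & (size_bytes - 1u)) == 0u);

    /* Find log2(size_bytes); the SIZE field stores log2(size) - 1. */
    uint32_t log2_size = 0;
    while ((1u << log2_size) < size_bytes) {
        log2_size++;
    }

    return RASR_XN | RASR_AP_FULL | RASR_S | RASR_C |
           RASR_SIZE(log2_size - 1u) | RASR_ENABLE;
}
```

For a 64 KB region, `mpu_rasr_sram_xn(65536u)` evaluates to 0x1306001F. Because the XN bit is set, any attempt to fetch an instruction from that region faults, even if an attacker has managed to place code there.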

In the mbedOS, which will be available in Q4 2015, the µVisor also uses the MPU for its security management. On top of that, the mbedOS adds many other security features that enable software developers to create applications that need to communicate securely with other devices and servers. For example, Datagram Transport Layer Security (DTLS) can be used to securely handle data communications.



Software components in the mbedOS


Can TrustZone for Cortex-A be used for Cortex-M?

The software execution environments for Cortex-M processors are often quite different from Cortex-A processors. In the Cortex-A processors, the OS environment (e.g. Android, iOS) allows you to download applications from third parties, meaning you have multiple secure domains within the system. The secure contents need to be completely hidden away from these applications, making TrustZone technology the best way to manage security.

For microcontroller-type applications based on the Cortex-M processors, however, software components are often compiled and linked together during the software development stage. As the software components are essentially "trusted", there is no need to hide contents from them. Given that the MPU can prevent hackers from injecting code and executing it, the risk is more about how the on-chip software handles secure contents, and whether it is possible for the software to leak secure contents accidentally.


Multi-core approach

In complex SoC designs, Cortex-M processors might be used for various subsystems (e.g. I/O subsystem, power management). In these systems, third-party software components could be downloaded into the SRAM in the Cortex-M subsystems and executed from there. In these cases, additional security arrangements might be needed. For example, a number of SoC designs use multiple Cortex-M processors, with at least one of them always in a secure domain and the others in a non-secure domain. This arrangement can work well with a TrustZone based (e.g. Cortex-A processor) system.


What next

We are continuously investigating future technology to see how we can provide better solutions for a wider range of applications.

In addition to the processors, the mbedOS will be an important part of the picture. The mbedOS will make it easier to develop secure IoT applications because the OS is designed with security management from the ground up. A wide range of secure communication technologies will be integrated into the OS so that application developers can deploy these technologies easily, securely and efficiently. The mbedOS will be free to use, and the applications created can be exported into other toolchains for further modifications and optimizations if required.

If you have further questions about Cortex-M security then do not hesitate to comment below and I will get back to you as soon as possible.

I was delighted to see the announcement of the ODROID-C1 by Hardkernel last week!


The ODROID-C1 is the latest addition to a growing number of ARM-based Single-Board Computers (SBCs). At just $35, the ODROID-C1 is one of the lowest-cost SBCs on the market, but also comes with a high-performance specification. The board uses the Amlogic S805 SoC with four ARM Cortex-A5 CPUs, each capable of clocking up to 1.5GHz (translating to over 2300 DMIPS per CPU). Alongside the ARM CPUs are two ARM Mali-450 MP GPUs, each capable of clocking up to 600 MHz, which fully support OpenGL ES 1.1/2.0. This certainly makes the ODROID-C1 one of the most cost-effective SBCs, providing maximum compute power per dollar spent. The ODROID-C1 also supports 1GB of DDR3, has a MicroSD slot to support an 8GB or 16GB UHS-1 card, and is capable of running the Ubuntu 14.04 or Android KitKat operating systems. It also packs several other features that can be found on the ODROID-C1’s webpage and in the December 2014 issue of ODROID magazine. The block diagram below gives more details of the board’s key components.



Source: hardkernel.com

Powering the ODROID-C1 is the Cortex-A5 processor, one of ARM’s most power-efficient and proven ARMv7-A processors. It has shipped in millions of smartphones and other devices since first being introduced to market in 2011. The Cortex-A5 enabled the entry-level smartphone revolution, bringing a high-end mobile experience into low-cost smartphone devices. With the Cortex-A5 now powering the ODROID-C1, it is starting a new trend of powerful, cost-effective single-board computing.


The ARM Mali-450 MP GPU has experienced tremendous success since its launch in 2012. Millions of smartphones, tablets and set-top-boxes are powered by Mali-450 MP, which has been designed for volume markets and optimized with a focus on energy and bandwidth savings. Now, the Mali-450 brings full OpenGL ES 1.1/2.0 support for enabling 2D/3D graphics applications in ODROID-C1.


There are a number of application use-cases, ranging from professional software engineering, through to modern computers built for work or gaming. An interesting application would be a low-cost but performance packed IoT gateway using this SBC. I am curious to see what innovative uses the DIY community will find for the raw compute power provided by this tiny but powerful SBC. With affordable SBCs now becoming increasingly powerful and feature-packed, it will be exciting to see some of the ideas developers are able to conceive in the near future.  I cannot wait to get an ODROID-C1 for my projects!


How about you?

A good article describing ARM-powered server chips can be found at the link below.

AnandTech | ARM Challenging Intel in the Server Market: An Overview

I was recently asked about the shift and extend operations in the A64 instruction set, and I realised that the ARMv8 ARM doesn't have a simple description of what each operation does. The ARMv8 ARM is very precise if you have the time to read and understand the pseudo-code descriptions for each instruction, but there is no quick reference.


This post tries to be just that: a quick reference for the shift and extend modifiers. I want to describe the various options so that when you see them, you know what they mean. I'm going to restrict this post to the operand modifiers. These modifiers are A64's equivalent of the flexible operand – often called Operand 2 – from ARM and Thumb.


Many of the operations are also available as standalone instructions, and there are several bitfield manipulation operations that are only available as standalone instructions. I won't describe these here, partly because the ARMv8 ARM already describes what they do, but also because it would make this article rather long. I might cover the standalone operations as a follow-up if there is enough demand.


General Form


The general form of these modifiers is quite simple: <operation> {#imm}


  • <operation> is one of the operations described in this article.
  • #imm is usually optional, and defaults to 0.


The modifier affects the register or immediate value that appears immediately before it in the instruction mnemonic. Here are a few examples:


// Subtract a shifted register.
sub x0, x1, x2, LSR #8          // x0 = x1 - ((uint64_t)x2 >> 8)

// Add a shifted immediate.
add x5, x6, #10, LSL #12        // x5 = x6 + (10 << 12)

// Load from an array using a signed index.
ldr w10, [x11, w12, SXTW]      // w10 = *(uint32_t*)(x11 + (int32_t)w12)


Note that not every modifier is available in every context. I won't try to explain what's available where; the ARMv8 ARM's instruction descriptions are quite clear in this regard.
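For readers who prefer to experiment on a host machine, the three examples above can be modeled in C roughly as follows. The function names are invented for this sketch, and the load example only computes the effective address rather than performing the memory access. Sign extension is written with unsigned arithmetic so that the behaviour does not depend on the compiler.

```c
#include <stdint.h>

/* sub x0, x1, x2, LSR #8: the modifier applies to x2 only. */
static inline uint64_t sub_lsr8(uint64_t x1, uint64_t x2)
{
    return x1 - (x2 >> 8);
}

/* add x5, x6, #10, LSL #12: the immediate is shifted before the add. */
static inline uint64_t add_imm10_lsl12(uint64_t x6)
{
    return x6 + (UINT64_C(10) << 12);
}

/* Address used by ldr w10, [x11, w12, SXTW]: w12 is sign-extended from
 * 32 to 64 bits and then added to the base register. */
static inline uint64_t ldr_sxtw_address(uint64_t x11, uint32_t w12)
{
    /* Flip bit 31, then subtract 2^31: a portable 32-to-64 sign extend. */
    uint64_t index = ((uint64_t)w12 ^ UINT64_C(0x80000000))
                     - UINT64_C(0x80000000);
    return x11 + index;
}
```

For example, `ldr_sxtw_address(0x1000, 0xFFFFFFF8)` gives 0xFF8, because the index register holds -8 when interpreted as a signed 32-bit value.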


Shift Operations


Shifts take a source register and shift it left or right by the specified number of bits, with optional sign extension. The shift operations are mostly the same as they are in 32-bit ARM and Thumb so if you're familiar with those, the A64 versions shouldn't be surprising.


The shift amount is encoded in the instruction (and is therefore constant). A significant difference from ARM is that there are no register-shifted-by-register forms. Such operations are possible in A64, but as in Thumb, they are standalone instructions with slightly different syntax.


For all shift modifiers, the size of the result is the same as the size of the source; there is no implicit widening or narrowing[1].


LSL: Logical Shift Left




Shift bits left by the amount specified, and fill the new bits with zeroes.


  • Bits shifted out of the left are discarded.
  • New bits shifted into the right are set to 0.


This is equivalent to multiplication by 2^n, where 'n' is the shift amount. This works for both signed and unsigned inputs, provided that the result still fits in the register.
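To make the LSL behaviour concrete, here is a minimal C sketch that models the modifier on 64-bit and 32-bit register values. The function names lsl64 and lsl32 are hypothetical helpers for illustration, not part of any ARM library, and the model assumes the shift amount is smaller than the register width.

```c
#include <stdint.h>

/* Model of LSL on an X register: bits shifted out of the left are
 * discarded, and zeroes are shifted in at the right.
 * Assumption: 0 <= n <= 63. */
uint64_t lsl64(uint64_t x, unsigned n) {
    return x << n;
}

/* Model of LSL on a W register. Assumption: 0 <= n <= 31. */
uint32_t lsl32(uint32_t x, unsigned n) {
    return x << n;
}
```

Because the registers hold two's complement bit patterns, the same helper covers signed inputs: lsl64 applied to the bit pattern of -3 with a shift of 4 gives the bit pattern of -48.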


LSR: Logical Shift Right




Shift bits right by the amount specified, and fill the new bits with zeroes.


  • Bits shifted out of the right are discarded.
  • New bits shifted into the left are set to 0.


This is equivalent to unsigned division by 2^n, where 'n' is the shift amount. The result is rounded towards zero, like the udiv instruction, and like unsigned integer division in most languages (including C).
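A similar sketch for LSR, again using a hypothetical helper name (lsr64) and assuming a shift amount below 64:

```c
#include <stdint.h>

/* Model of LSR on an X register: bits shifted out of the right are
 * discarded, and zeroes are shifted in at the left.
 * Assumption: 0 <= n <= 63. */
uint64_t lsr64(uint64_t x, unsigned n) {
    return x >> n;
}
```

For unsigned values this matches C's unsigned division, so lsr64(100, 3) equals 100 / 8. A negative two's complement bit pattern is simply treated as a large unsigned number, so LSR is not a signed division.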


ASR: Arithmetic Shift Right




Shift bits right by the amount specified, and sign-extend to fill the new bits.


  • Bits shifted out of the right are discarded.
  • New bits shifted into the left are set to the same value as the source value's leftmost bit. (This is the two's complement sign bit.)
    • If the source value is non-negative, the leftmost bit is 0, so ASR and LSR are equivalent for non-negative inputs.


This is similar to signed division by 2^n, where 'n' is the shift amount. However, with signed division, some care is needed to handle rounding when negative values are involved. For example, C's signed integer division almost always rounds towards zero[2], but a naive ASR-based division will round towards minus infinity.


// C-style signed integer division by a power of two (2^n, where n > 0).
// Correct the result (by incrementing it) only if the bits shifted out
// are non-zero and the sign is negative.
tst x0, #((2^n)-1)
ccmp x0, #0, #0, ne
asr x0, x0, #n
cinc x0, x0, lt
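The rounding difference, and the correction applied by the sequence above, can be modelled in C. This is only a sketch: asr64 and sdiv_pow2 are hypothetical names, the shift is built from unsigned operations because right-shifting a negative signed value is implementation-defined in C, and the final conversion back to int64_t assumes the usual two's complement behaviour of mainstream compilers.

```c
#include <stdint.h>

/* Model of ASR on an X register. Assumption: 0 <= n <= 63. */
int64_t asr64(int64_t x, unsigned n) {
    uint64_t shifted = (uint64_t)x >> n;
    if (x < 0 && n > 0) {
        /* Copy the sign bit into the vacated upper bits. */
        shifted |= ~(UINT64_MAX >> n);
    }
    /* Assumption: two's complement conversion, as on mainstream compilers. */
    return (int64_t)shifted;
}

/* Model of the tst/ccmp/asr/cinc sequence: division by 2^n that rounds
 * towards zero. Assumption: 0 < n <= 63. */
int64_t sdiv_pow2(int64_t x, unsigned n) {
    uint64_t mask = (UINT64_C(1) << n) - 1;
    int inexact = ((uint64_t)x & mask) != 0;   /* tst: any bits shifted out? */
    int64_t q = asr64(x, n);                   /* asr */
    if (inexact && x < 0) {                    /* ccmp + cinc ... lt */
        q += 1;
    }
    return q;
}
```

With these helpers, asr64(-7, 1) is -4 (rounded towards minus infinity), whereas sdiv_pow2(-7, 1) is -3, which matches -7 / 2 in C99.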


ROR: Rotate Right




Rotate bits right.


  • Bits shifted out of the right are shifted in at the left.
  • Conceptually, a rotation isn't a simple shift operation, but the ARM architecture includes it in the set of shifted-register modifiers.
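C has no rotate operator, so a rotation is usually written as two shifts combined with OR. The sketch below models ROR for both register widths; ror64 and ror32 are hypothetical helper names, and the masking keeps every shift amount within the range that C defines.

```c
#include <stdint.h>

/* Model of ROR on an X register. The rotate amount is reduced modulo 64. */
uint64_t ror64(uint64_t x, unsigned n) {
    n &= 63;
    return (x >> n) | (x << ((64 - n) & 63));
}

/* Model of ROR on a W register. The rotate amount is reduced modulo 32. */
uint32_t ror32(uint32_t x, unsigned n) {
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}
```

For example, ror32(0x12345678, 8) gives 0x78123456: the low byte leaves on the right and reappears at the top.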


MSL: Masking Shift Left




This is just like LSL, but instead of inserting zeroes into the rightmost bits, it inserts ones.


This somewhat unusual shift form is only supported by one or two NEON instructions for forming immediate arguments, so you probably won't see it often. It is actually used by the 32-bit NEON instructions too, but it isn't given an explicit name and its use is always derived from a single immediate operand.
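As a rough illustration, MSL on a 32-bit element can be modelled in C as below. The name msl32 is a hypothetical helper. As far as I'm aware, the NEON immediate forms only accept shift amounts of 8 and 16, but the sketch accepts any amount below 32.

```c
#include <stdint.h>

/* Model of MSL on a 32-bit element: shift left, then fill the vacated
 * low bits with ones instead of zeroes. Assumption: 0 <= n <= 31. */
uint32_t msl32(uint32_t x, unsigned n) {
    return (x << n) | ((UINT32_C(1) << n) - 1);
}
```

For example, msl32(0x12, 8) gives 0x12FF, where LSL would have given 0x1200.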


Extend Operations


The general principle is to take a sequence of consecutive bits from the source register, then sign- or zero-extend it to make it the required size. The result size is implied by the context.


Most of these extend operations exist in 32-bit ARM and Thumb, but A64 makes them more flexible and in many cases allows them to be used as an operand modifier.


Several contexts also allow extend modes to take an additional immediate left shift (like LSL). This shift has a very limited range (of 0-4 bits), and it applies after the extend operation. For example, SXTB #2 means "sign extend from the 8-bit source, then shift it left by two bits."
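The order matters, and a small C model makes that visible. The helper names below are hypothetical, the result is modelled as a 64-bit register value, and shift_then_sxtb exists only for contrast; it does not correspond to an A64 modifier.

```c
#include <stdint.h>

/* Sign-extend the low byte of src to 64 bits (the SXTB part). */
uint64_t sign_extend_byte(uint64_t src) {
    uint64_t byte = src & 0xFF;
    /* Unsigned wrap-around sets all the upper bits when bit 7 is set. */
    return (byte ^ 0x80) - 0x80;
}

/* SXTB #shift: extend first, then shift left.
 * Assumption: 0 <= shift <= 4. */
uint64_t sxtb_then_shift(uint64_t src, unsigned shift) {
    return sign_extend_byte(src) << shift;
}

/* The opposite order, shown only for contrast. */
uint64_t shift_then_sxtb(uint64_t src, unsigned shift) {
    return sign_extend_byte(src << shift);
}
```

With a source byte of 0x40 and a shift of 2, sxtb_then_shift gives 0x100, whereas shift_then_sxtb gives 0 because the interesting bit has already moved out of the low byte.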


Because extend operands can take a shift, UXTX, and in some cases UXTW, are functionally identical to LSL for shifts 0-4. These are actually aliases in a few corner cases where extend modes are available but shift modes are not. For details, refer to the instruction descriptions in the ARMv8 ARM. The meaning of the assembly (and disassembly) does not change.


UXTB, UXTH, UXTW, UXTX: Unsigned extract from 8-bit byte, 16-bit halfword, 32-bit word, or 64-bit doubleword.




Extract the least significant byte, halfword, word or doubleword, zero-extend it to the result size, then (optionally) shift left.


  • These operations are the same as the SXT* operations, except that they do zero extension.
  • UXTX by itself has no effect (if it has no shift).
  • If the destination type is a W register, UXTW also has no effect (if it has no shift).
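The UXT* family can be sketched as a single C helper that takes the source width as a parameter. The name uxt and its parameters are hypothetical, and the model assumes a 64-bit (X register) result.

```c
#include <stdint.h>

/* Model of UXTB/UXTH/UXTW/UXTX with an optional left shift.
 * Assumptions: width_bits is 8, 16, 32 or 64; 0 <= shift <= 4;
 * the result is an X register. */
uint64_t uxt(uint64_t src, unsigned width_bits, unsigned shift) {
    uint64_t mask = (width_bits >= 64)
                        ? UINT64_MAX
                        : (UINT64_C(1) << width_bits) - 1;
    /* Keep the low bits (zero extension), then shift. */
    return (src & mask) << shift;
}
```

Calling uxt(x, 64, 0) returns x unchanged, and uxt(x, 64, 3) is the same as shifting x left by three bits, which is why UXTX can stand in for LSL.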


SXTB, SXTH, SXTW, SXTX: Signed extract from 8-bit byte, 16-bit halfword, 32-bit word, or 64-bit doubleword.




Extract the least significant byte, halfword, word or doubleword, sign-extend it to the result size, then (optionally) shift left.


  • These operations are the same as the UXT* operations, except that they do sign extension.
  • SXTX by itself has no effect (if it has no shift).
  • If the destination type is a W register, SXTW also has no effect (if it has no shift).
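The SXT* family can be modelled in the same way. In this sketch, sxt is a hypothetical helper, the result is a 64-bit register value returned as a bit pattern, and the sign extension uses unsigned arithmetic so that the C behaviour is fully defined.

```c
#include <stdint.h>

/* Model of SXTB/SXTH/SXTW/SXTX with an optional left shift.
 * Assumptions: width_bits is 8, 16, 32 or 64; 0 <= shift <= 4;
 * the result is an X register, returned as a bit pattern. */
uint64_t sxt(uint64_t src, unsigned width_bits, unsigned shift) {
    if (width_bits >= 64) {
        return src << shift;            /* SXTX: nothing to extend */
    }
    uint64_t sign = UINT64_C(1) << (width_bits - 1);
    uint64_t field = src & ((sign << 1) - 1);
    /* Unsigned wrap-around fills the upper bits when the sign bit is set. */
    return ((field ^ sign) - sign) << shift;
}
```

For example, sxt(0x80, 8, 0) gives 0xFFFFFFFFFFFFFF80 (the bit pattern of -128), while sxt(0x7FFF, 16, 0) leaves 0x7FFF unchanged.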



[1] Some NEON instructions explicitly do a shift-and-widen or shift-and-narrow operation, but the instruction descriptions explain the details so I won't cover them here.


[2] The situation in C is fairly complicated. Very roughly:

  • In C89, the rounding behaviour of signed divisions involving negative values – even `-a/-b` – is implementation-defined.
  • In C++98, round-towards-zero was recommended, but not required.
  • C99 and C++11 simplify matters by requiring round-towards-zero for all integer divisions.

Both ARM Compiler 5 and GCC-4.9.2 follow this convention for all C variants. ARM Compiler 6 is based on Clang, and it defers to the Clang documentation for such matters. The Clang documentation doesn't seem to cover this language detail, but given that Clang tries to be compatible with other compilers (including GCC), and it has to support round-towards-zero for C99 and C++11 anyway, I would be surprised to see any behaviour other than round-towards-zero.
