
ARM Processors


Valgrind is a GPL'd framework for building simulation-based debugging and profiling tools.  The best known of these is Memcheck, a memory error detector, but it is in fact only one of eight tools in the standard distribution: two memory checkers, two thread checkers, two performance profilers and two space profilers.


The Valgrind trunk sources now contain a port to 64-bit ARMv8.  The port has reached the stage of being useful for real debugging and would now benefit from wider testing.


Memcheck works and has a noise (false-error) level that is comparable with other targets.  All the basic functionality -- detection of uninitialised value uses, detection of invalid memory accesses, detection of leaks, and origin tracking -- is available.  Other tools -- Lackey, Cachegrind, Callgrind, Massif, DRD and Helgrind -- work to some extent, but have not been extensively tested.


Current limitations are:


  • Incomplete support of vector (SIMD) instructions.  Anything created by gcc-4.8.2 -O3, including SIMD code from autovectorization, works.  Completion of SIMD support is ongoing.


  • Incomplete syscall support.  Enough system calls are supported for large programs to work, but some are still unsupported.


  • Integration with the built-in GDB server is incomplete.


There has been extensive testing of the supported integer, FP and SIMD instructions.  At least one large application -- Firefox 26 -- is able to start up and quit.  The port is under active development and will ship as part of the next release, Valgrind 3.10.


In general, if you have used Valgrind on 32-bit ARMv7 or other targets, you should find this new port works in much the same way.


You can get hold of the trunk sources via Subversion, using the URL svn://svn.valgrind.org/valgrind/trunk.  See README.aarch64, at the root of the source tree, for instructions on how to build and run the port.


If you have questions or difficulties, feel free to mail our users or developers mailing lists, as described at http://www.valgrind.org/support/mailing_lists.html.  Bugs may be reported at https://bugs.kde.org/enter_bug.cgi?product=valgrind&format=guided.

Freescale have outdone themselves again this year at the 2014 Freescale Technology Forum (#FTF2014). As I mentioned yesterday in my first blog from the show, Collaboration at its best at Day One of Freescale Technology Forum with Crank Software, Carbon Design Systems, Green Hills Software and Freescale WaRP, there are over 280 technical sessions. Add to these more than 250 Technology Lab demonstrations, 400 combined hours of sessions and an expected attendance well over 2,000, and you have a pretty successful set of show statistics.


After yesterday's press announcements of the QorIQ LS2 ARM Cortex-A57 based networking devices and the second generation of the Kinetis MCU family, Freescale followed these up with two more announcements: the introduction of the industry's highest-performance MCUs designed for automotive instrument clusters, enabling a new generation of premium graphics using an ARM Cortex-M4 and a Cortex-A5, and the demonstration of an IoT Gateway Platform, a one-box solution to speed services deployment for the Internet of Things (IoT).


The first of these, the MAC57D5xx MCU family, enables independent operation of an AUTOSAR OS on the ARM Cortex-M4 core and a graphics OS on the ARM Cortex-A5 core, allowing for enhanced safety in next-generation instrument cluster designs.


To talk more about the LS2 platform news I met with Matt Short, Senior Marketing Manager from Freescale and Ian Forsyth from ARM. Matt explained how the introduction of the 64-bit ARMv8-A architecture benefited the LS2 family and Ian explained why the Cortex-A57 was a perfect match for the networking market requirements.



The broadening of the networking solutions based on ARM technology is also aligned with the recent news that Freescale 'has entered into a definitive agreement to purchase the Comcerto® CPE communications processor business of Mindspeed Technologies, Inc. This business includes a series of multicore, ARM®-based embedded processors and associated software ...' (See Freescale enters into definitive agreement to purchase Mindspeed ARM processor business from MACOM)


I also met with Shawn A. Prestridge, Senior Field Applications Engineer at IAR Systems to talk about the solutions that IAR provide for Freescale ARM based products, and some of the safety critical aspects of the tools solutions.



Keith Sugawara from Silex Technology talked with me about what Silex Technology does and how the partnership with Freescale enables OEMs to get the best wireless solutions for both i.MX and Kinetis families.



As the lunchtime Technology Lab session wrapped up I also met with Stuart Fisher from SYSGO, to talk about the markets implementing the PikeOS product, and about SYSGO having been the first to achieve multi-core SIL4 certification.




For more videos from FTF take a look at the FTF2014 Playlist over on the ARMFlix YouTube channel.




This week I'm in Dallas, Texas, for the Freescale Technology Forum, colloquially known as FTF2014 (watch out for news as it happens on Twitter with the hashtag #FTF2014).


Tuesday afternoon saw the first series of technical sessions (of which there are a staggering 283 in total), with the afternoon rounded off by a drinks and dinner reception in the Technology Lab, an area filled with the largest number of sponsors and exhibitors ever for FTF.


The first keynote on Wednesday was given by Freescale CEO Gregg Lowe, who gave an insight into Freescale's vision for a world of connected intelligence where big data, security and cloud computing meet. Fast-forward to where embedded technology is heading: that's the best way to describe the keynote. Nothing short of inspiring and motivating, it gave design engineers a fresh outlook on industry and market trends. For 90 minutes, engineers and marketers were captivated by the possibilities of their creations. Freescale-powered applications were demonstrated live by chief technology experts from Alcatel-Lucent, GM, Goji, Oracle, ARM, OrCam, GenShock and Line 6. Under the umbrella of the Internet of Things, Freescale showed its embedded leadership and ability to touch all areas of IoT: from microcontrollers to digital networking, secure data, connected cars, the power in electric cars, AirFast RF power solutions, RF lighting, defense, aerospace, and land mobile and cellular infrastructure.


The show started off with two ARM® related press releases, the first being the introduction of the new QorIQ LS2 family of ARM-based network processors, designed to support SDN and network function virtualization. The QorIQ LS2 architecture incorporates a processing domain built around the industry-leading 64-bit ARM Cortex®-A57 core. Take a look at the blog by my colleague John Fry titled Realizing Software-defined and Virtualized Networking with Freescale's LS2 Family.


Freescale also announced the second generation of its Kinetis ARM Cortex-M4 based MCUs, 'taking scalability, performance, power efficiency and enablement to new levels', said Geoff Lees, senior vice president and general manager of Freescale's MCU group, in the press release.


At the show I met with Robert Redfield, Director, Partner Business Development, Green Hills Software, who talked with me about the solutions that Green Hills have for Freescale's ARM based product lines, including their comprehensive support for functional safety and security, and recent support for the newly announced Freescale QorIQ Layerscape LS2 platform.



I particularly like the 'Make IT Challenge' at this show, which offers all attendees the chance to take their favourite 'thing' and make it a real part of the Internet of Things (IoT). The participants get the latest Internet-ready, mbed™-enabled, ARM Cortex-M4 Freedom development platform. Also on hand are some quick presentations and code samples, mentors for answering questions, some very basic tools, food to keep you going and, what some engineers might deem most important, caffeine to keep you awake. The contest will be judged on creativity, difficulty and execution. I think this is a great way of motivating people to make use of the latest technology in a forum where they can really show off their ideas, freely ask what might seem the dumbest of questions, and possibly even win some great prizes.


Over at the ARM booth there are three demo pods in the Technology Lab: one highlighting the mbed platform with the Freedom platform, one showing ARM DS-5 being used to debug a dual-core ARM Cortex-A5 and Cortex-M4 Vybrid platform, and the other showing the Keil MDK-ARM and ULINKpro with a Kinetis MCU development platform. This all aligns nicely with the ARM announcement of the DS-5 Ultimate Edition, which adds support for the ARMv8 architecture and the new ARM Compiler 6 to the already popular DS-5 Professional Edition. DS-5 Ultimate Edition provides a complete development environment for ARMv8-A, including debug, compilation, performance analysis, and models, all incorporated into the Eclipse IDE.


I met with Sujata Neidig who chatted with me about 'wearables' and the WaRP Wearable Reference Platform, explaining how the 'wearables' story spans the complete range of processors available from ARM with processors such as the Cortex-M0+ for the fitness tracker type of application, up to the Cortex-A Series for the more demanding applications, such as those running Android.



Getting to market faster is always critical in any design, and the ability to start writing software before hardware is available has always been an important part of this. I met with Brad Perdue, Strategic Design Manager at Carbon Design Systems, to find out how they have been collaborating with Freescale, both to get products to market faster and to help developers start software development earlier.



Jason Clarke, VP Sales and Marketing at Crank Software, spent some time with me explaining the benefits of the Storyboard Suite, a UI software suite that helps embedded developers creating touch-screen devices get to market faster.



For more videos from FTF take a look at the FTF2014 playlist over on the ARMFlix YouTube channel.


Thursday's keynote will talk about the Internet of Things already being here, with Freescale's leadership team bringing to life the vast array of devices enabled by Freescale microcontrollers and the network that connects it all together: my good friend Geoff Lees, senior vice president and general manager of Microcontrollers, alongside James Bates, senior vice president and general manager of Analog & Sensors, and Tom Deitrich, senior vice president and general manager of Digital Networking.




As the second day of EELive! comes to an end in a sunny and gently warming up San Jose, California, I thought I’d take a few minutes to give a recap of some of the show floor activity.


This morning I started off the day talking with Andreas Eieland (@AndreasMCUguy), who looks after the Atmel SAM D Cortex-M0+ based family of devices. I got a great picture of Andreas standing next to the Atmel Tech On Tour truck, which has been travelling around the USA visiting cities and promoting the Atmel product lines, including those based on the ARM Cortex-M series and the ARM Cortex-A5. The truck is huge and took up a large area of the show floor, with a large demonstration area inside the extending unit.




Yesterday, also at the Atmel booth, Ronan Synnott presented a talk on ARM DS-5 support for Atmel SAMA5D3 devices. Ronan described how, with DS-5 Professional Edition, ARM provides a leading-edge software development toolchain for bare-metal, RTOS, and Linux based projects. For the SAMA5D3 devices it provides full debug support out of the box when used in conjunction with DSTREAM or ULINKpro D JTAG debug units, the Streamline System Performance Analysis tool, and the highly optimizing ARM C compiler.


NXP had a classroom set up at the show, similar to last year, offering a number of sessions throughout the day on a variety of subjects, including 'Rapid Embedded Development for LPCXpresso' and 'USB - it doesn't get easier than NXP'. The second of these runs again on Thursday at noon, and covers the LPC11U6x product, which is based on the ARM Cortex-M0+. Another class session tomorrow is 'Simplifying BLDC and FOC motor control with NXP's LPC1500', based on the Cortex-M3. Take a look at the video I shot with Ross Bannatyne back at Embedded World earlier this year.




NXP launched the new ARM Cortex-M0+ based LPC11E6x family at the show, expanding its portfolio of LPC microcontrollers rated for temperatures up to 105 degrees centigrade - see the press release here: New LPC Microcontroller Portfolio Expands Devices Rated up to 105°C


NXP also announced that the version 2 release of its popular LPCXpresso development boards now fully supports the ARM® mbed™ platform, giving embedded engineers more choice and flexibility when developing advanced applications for LPC and mbed microcontroller platforms. More details are available here: NXP LPCXpresso and ARM's mbed platform now fully aligned


STMicroelectronics also had a series of technical sessions taking place at their booth. One session described STM32Cube, which includes STM32CubeMX, a graphical software configuration tool that generates C initialization code via a graphical wizard.




ST also had a full Anki drive setup which I got to play with at CES earlier this year - a definite crowd puller.



I also met with Chad Jones, Vice President, Product Strategy at Xively, whose Xively Cloud Services support ARM mbed. You can quickly create an IoT device with ARM mbed and connect it to the Xively cloud infrastructure services with the minimum of effort.


Check out this video of Chad explaining more about Xively, and keep an eye out on the community for more details of what was being shown at the Xively booth.






Smart and Connected

There is currently a lot of talk around Advanced Driver Assistance Systems (ADAS) in the car market. Car makers are trying to differentiate their vehicles to end customers by implementing more and more so-called driver assistance features in the name of increased road safety - but what are those car electronics assistants doing in detail? Where are they located? What is being assisted? Where does ARM® come into play? What is the future of autonomous driving?



The “Eyes” of a Car – Sensors …


Modern cars contain a lot of sensors, located at many positions on the outer surface of the car. These sensors can be thought of as the eyes of the car, giving it a continuous picture of the outside world. The graphic below illustrates where these sensors are and which areas they supervise.



Typically these sensors work in different physical domains: ultrasonic sound in the near field for parking-aid beepers; infrared light and cameras in the mid range to control wipers during rain or to detect pedestrians; and radar for long-distance control, used to stop the car for obstacles such as a pedestrian stepping into your path, or to keep a safe distance from the car in front. Sensors have some local intelligence to pre-process their detection tasks, as in many cases long wiring cables rule out transporting the full-bandwidth data to a centralized processing unit. ARM Cortex®-M cores address areas like ultrasonic parking detection, whereas ARM Cortex-A cores are used where high performance is required, for example in Field Programmable Gate Arrays (FPGAs) for camera systems.


Finally, all the information delivered by the sensors is processed in a centralized high-performance computing-cluster ECU (data fusion). ARM addresses this level of performance requirement with its 64-bit ARM Cortex-A50 processor family.



The Fundamentals of ADAS …


ADAS systems, which help drivers in various driving conditions, currently provide assistance functions only, leaving final decision making to the driver. The driver remains the topmost-ranking decision maker and must be able to override the electronic assistance in all conditions to prevent failures, because the driver is legally responsible for the driving (for example, you can still use the brakes to come to a full stop even if cruise control wants to accelerate, and you can still accelerate if the on-board radar confuses a traffic sign with a pedestrian).


This leads us to the most important thing (don't smile, it's vital): there has to be an active driver in the car while it is moving. Sorry - no sending the car to the drive-in bakery in the morning, no nap while driving to the mother-in-law yet, no Knight Rider & K.I.T.T. today ...


This is mainly required by the Convention on Road Traffic, signed by many states at a UN conference in Vienna, Austria on 8 November 1968 and deposited in the archive of the United Nations Secretary-General. Consequently, almost every autonomous prototype car driving on public roads today does so under an exceptional permit granted by the local authorities.


The fundamental premise of ADAS is that, in some emergency conditions, car electronics can already do better than the average commuter - but are electronics and the related software reliable and safe enough to be trusted? Currently this is not the case for all driving conditions. ARM addresses this shortcoming with the ARM Cortex-R cores, which are designed for safety-related ADAS applications. Current ARM developments go even beyond that level, introducing ARMv8-R architecture features such as virtualization to support different operating system flavors on one core. ARM Cortex-R already provides a reliable, centralized brain-in-hardware for silicon inside an ECU box in the car today.



The Route …


Car makers have decided to build trust in ADAS functionality by introducing it step by step. They have been increasing their own experience levels for a while now and are currently focused on intermediate assistance functions such as highway-driving assistance or traffic-jam guidance, with full autonomous driving as the overall final goal.


Premium cars will be the first to enjoy such technology, with limited autonomous driving in production by ~2020.

ARM is ready to go today …

With ARM entering the server space, a key technology in this segment is virtualization. Virtualization is not a tool solely for servers and the data center; it is also used in the embedded space, in segments such as automotive, and it is starting to be used in mobile.


This is not a new technology; IBM pioneered it in the 1960s, and there are many different hypervisors implementing different methods of virtualization. In the open source realm there are two major hypervisors: KVM and Xen. Both interact directly with the Linux kernel; however, KVM is solely in the Linux domain, whereas Xen works with Linux, *BSD and other UNIX variants.


In the past it was generally accepted that there are two types of hypervisor: Type 1 (also known as bare metal or native), where the hypervisor runs directly on the host server, controls all aspects of the hardware and manages the guest operating systems; and Type 2 (also known as hosted), where the hypervisor runs within a normal operating system. Under this classification Xen falls into the Type 1 camp and KVM into the Type 2 camp. However, modern implementations of the hypervisors have blurred the lines of distinction.




This time round I’ll be taking a look at the Xen Hypervisor, which is now one of the Linux Foundation’s collaborative projects. Here is a brief overview of some of Xen’s features:


  • Small footprint. Based on a microkernel design, it has a limited interface to the guest virtual machine and takes up around 1MB of memory.
  • Operating system agnostic. Xen works well with BSD variants and other UNIX systems, although most deployments use Linux.
  • Driver Isolation. In the Xen model the majority of the device drivers run in virtual machines rather than in the hypervisor. As well as allowing existing OS driver stacks to be reused, this means a VM containing a driver can be rebooted in the event of a crash or compromise without affecting the host or other guests. Individual drivers can even be run in separate VMs to improve isolation and fault tolerance, or simply to take advantage of differing OS functionality.
  • Paravirtualization (PV). This style of port enables Xen to run on hardware that doesn't have virtualization extensions, such as Cortex-A5/A8/A9 in ARM's case.  There can also be performance gains for some PV guests, but this requires the guests to be modified and prevents running unmodified operating systems “out of the box”.
  • No emulation, no QEMU. Emulated interfaces are slow and insecure. By using hardware virtualization extensions and IO paravirtualization, Xen removes any need for emulation. As a result you have a smaller code base and better performance.


The Xen hypervisor runs directly on the hardware and is responsible for handling CPU, memory, and interrupts. It is the first program to run after the bootloader exits. Virtual machines then run atop Xen. A running instance of a virtual machine in Xen is called a DomU, or guest. The guest VMs are controlled by a special host VM called Dom0, which contains the drivers for all the devices in the system, as well as a control stack to manage virtual machine creation, destruction, and configuration.


Diagram showing the Xen architecture (courtesy of the Xen Project)


Pieces of the puzzle:

  • The Xen Hypervisor is a lean software layer that runs directly on the hardware and as mentioned is responsible for managing CPU, memory, and interrupts. The hypervisor itself has no knowledge of I/O functions such as networking and storage.
  • Guest Domains/Virtual Machines (DomU) are virtualized environments, each running their own operating system and applications. On other architectures Xen supports two different virtualization modes: Paravirtualization (PV) and Hardware-assisted or Full Virtualization (HVM), and both guest types can be used at the same time on a single Xen system. On ARM there is only one virtualization mode, a hybrid of the two: effectively hardware-based with paravirt extensions. Some call this mode PVHVM. Xen guests are totally isolated from the hardware and have no privilege to access hardware or I/O functionality directly - hence the name DomU, which stands for unprivileged domain.
  • The Control Domain (or Dom0) is a specialized Virtual Machine with special privileges, such as the capability to access the hardware directly; it handles all access to the system’s I/O functions and interacts with the other Virtual Machines. It also exposes a control interface to the outside world, through which the system is controlled. The Xen hypervisor will not function without Dom0, which is the first VM started by the system.
  • Toolstack and Console: Dom0 contains a control stack (known as Toolstack) that allows a user to manage virtual machine creation, destruction, and configuration. The toolstack exposes an interface that is either driven by a command line console, by a graphical interface or by a cloud orchestration stack (OpenStack or CloudStack.)
  • Xen-enabled operating systems: Running an operating system as Dom0 or DomU requires the operating system kernel to be Xen enabled. However porting an operating system to Xen on ARM is simple: it just needs a few new drivers for the Xen paravirtualized IO interfaces. Existing open source PV drivers in Linux and FreeBSD are likely to be reusable. Linux distributions that are based on a recent Linux kernel (3.8+) are already Xen enabled and usually contain packages for the Xen hypervisor and tools.
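To give a concrete sense of how the toolstack is driven, a DomU is described by a small plain-text configuration file handed to `xl create`. A minimal sketch for an ARM guest might look like the following (the file name, kernel path, disk image and guest name are all hypothetical, for illustration only):

```
# guest.cfg -- hypothetical minimal xl configuration for an ARM DomU
kernel = "/root/Image"                  # Dom0 path to a Xen-enabled guest kernel
name   = "guest0"                       # name shown by "xl list"
memory = 256                            # guest memory in MB
vcpus  = 1
disk   = [ 'phy:/dev/loop0,xvda,w' ]    # guest rootfs image, writable
extra  = "console=hvc0 root=/dev/xvda"  # guest kernel command line
```

The guest would then be started with `xl create guest.cfg` and its console reached with `xl console guest0`.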


The latest version of Xen is 4.4.0, which was released in March and has support for both ARMv7 and ARMv8. For this exercise I’ll be looking at using Xen on ARMv8 with the Foundation Model.

Please consult the Xen Wiki for more information on using Xen with Virtualization Extensions and using Xen with models. For discussion, review, information and help there are mailing lists and IRC.


Development set up:

You can use whichever Linux distribution you prefer, so long as you have a suitable cross-compilation environment set up. I’m using openSUSE 13.1 with the Linaro Cross Toolchain for AArch64.


Typographic explanation:

host$ = run as a regular user on host machine

host# = run as root user on host machine (can use sudo if you prefer)

chroot> = run as root user in chroot environment

model> = run as root user in a running Foundation Model


The first steps are to build Xen and a Linux kernel for use in both Dom0 and DomU machines. We then package Xen and Linux along with a Device Tree together for Dom0 to be used in the model using boot-wrapper.


Build Xen:

If using Linaro’s toolchain, ensure the /bin directory is in your $PATH

host$ git clone git://xenbits.xen.org/xen.git xen

host$ cd xen

host$ git checkout RELEASE-4.4.0


There is a small build bug due to use of older autotools which will be fixed in the 4.4.1 release. Rather than wait for the next release, we’ll just backport it now.

host$ git cherry-pick 0c68ddf3085b90d72b7d3b6affd1fe8fa16eb6be


There is also a small bug in GCC with PSR_MODE; see bug LP# 1169164. Download the attached PSR_MODE_workaround.patch.

host$ patch -i PSR_MODE_workaround.patch -p1


host$ make dist-xen XEN_TARGET_ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CONFIG_EARLY_PRINT=fastmodel

host$ cd ..


Build Linux:

host$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

host$ cd linux

host$ git checkout v3.13


Create a new kernel config:

host$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig

host$ sed -e 's/.*CONFIG_XEN is not set/CONFIG_XEN=y/g' -i .config

host$ sed -e 's/.*CONFIG_BLK_DEV_LOOP is not set/CONFIG_BLK_DEV_LOOP=y/g' -i .config

host$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- oldconfig

Make sure to select Y to all Xen config options

I have attached a kernel.config which has all the required options enabled for reference.


host$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- Image

host$ cd ..


Obtain the Foundation Model:

In a browser go to http://www.arm.com/products/tools/models/fast-models/foundation-model.php

Scroll to the bottom and select “Download Now”

This should provide FM000-KT-00035-r0p8-52rel06.tgz

Extract the tarball

host$ tar xaf FM000-KT-00035-r0p8-52rel06.tgz

Build the boot wrapper and device tree:

It is common to run the models without real firmware. In this case a boot-wrapper is needed to provide a suitable boot-time environment for Xen, booting into Non-Secure HYP mode, providing boot modules and so on:

host$ git clone -b xen-arm64 git://xenbits.xen.org/people/ianc/boot-wrapper-aarch64.git

host$ cd boot-wrapper-aarch64

host$ ln -s ../xen/xen/xen Xen

host$ ln -s ../linux/arch/arm64/boot/Image Image


Use the attached foundation-v8.dts to build the device tree blob

host$ dtc -O dtb -o fdt.dtb foundation-v8.dts

host$ make CROSS_COMPILE=aarch64-linux-gnu- FDT_SRC=foundation-v8.dts IMAGE=xen-system.axf

host$ cd ..


Run the model to make sure the kernel functions; it will panic, as we haven't set up the rootfs yet:

host$ ./Foundation_v8pkg/models/Linux64_GCC-4.1/Foundation_v8 \

     --image boot-wrapper-aarch64/xen-system.axf


Chroot Build Environment

Next we create a suitable chroot build environment using the AArch64 port of openSUSE. We will use the qemu-user-static support for AArch64 to run the chroot on the (x86) host.

First we build the qemu binary, then construct the chroot, finally we build Xen in the chroot environment.


Building qemu-aarch64-user

host$ git clone git@github.com:openSUSE/qemu.git qemu-aarch64

host$ cd qemu-aarch64

host$ git checkout aarch64-work


Install some build dependencies:

host# zypper in glib2-devel-static glibc-devel-static libattr-devel-static libpixman-1-0-devel ncurses-devel pcre-devel-static zlib-devel-static


host$ ./configure --enable-linux-user --target-list=arm64-linux-user --disable-werror --static

host$ make -j4

host$ ldd ./arm64-linux-user/qemu-arm64

     not a dynamic executable

This last step is to verify that the resulting binary is indeed a static binary. We will copy it into the chroot later on.


We now need to tell binfmt_misc about AArch64 binaries:

On openSUSE:

host# cp scripts/qemu-binfmt-conf.sh /usr/sbin/

host# chmod +x /usr/sbin/qemu-binfmt-conf.sh

host# qemu-binfmt-conf.sh


On Debian:

host# update-binfmts --install aarch64 /usr/bin/qemu-aarch64-static \

   --magic '\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7' \

   --mask '\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff'

host$ cd ..


Build the chroot environment

host$ wget http://download.opensuse.org/ports/aarch64/distribution/13.1/appliances/openSUSE-13.1-ARM-JeOS.aarch64-rootfs.aarch64-1.12.1-Build32.2.tbz

host$ mkdir aarch64-chroot

host# tar -C aarch64-chroot -xaf openSUSE-13.1-ARM-JeOS.aarch64-rootfs.aarch64-1.12.1-Build32.2.tbz


Install the qemu binary into the chroot environment

host# cp qemu-aarch64/arm64-linux-user/qemu-arm64 aarch64-chroot/usr/bin/qemu-aarch64-static


host# cp /etc/resolv.conf aarch64-chroot/etc/resolv.conf


Build the Xen tools in the chroot environment (finally)

Copy the Xen sources into the chroot

host# cp -r xen aarch64-chroot/root/xen


Chroot into the aarch64 environment

host# chroot aarch64-chroot /bin/sh


We now need to install some build dependencies

chroot> zypper install gcc make patterns-openSUSE-devel_basis git vim libyajl-devel python-devel wget libfdt1-devel libopenssl-devel

If prompted to trust the repository key, you can choose whether to trust it permanently or just this time (personally I chose to always trust it).


chroot> cd /root/xen

chroot> ./configure

chroot> make dist-tools

chroot> exit


The Xen tools are now in aarch64-chroot/root/xen/dist/install


Root filesystem and image:

We will create an ext3-formatted filesystem image, and will also use a simplified init script to avoid long waits while running the model.

host$ wget http://download.opensuse.org/ports/aarch64/distribution/13.1/appliances/openSUSE-13.1-ARM-JeOS.aarch64-rootfs.aarch64-1.12.1-Build32.2.tbz

This is the same rootfs tarball as used for the chroot; you can re-use the previously downloaded tarball if you wish.


host$ dd if=/dev/zero bs=1M count=1024 of=rootfs.img

host$ /sbin/mkfs.ext3 rootfs.img

Say yes, we know it’s not a block device


host# mount -o loop rootfs.img /mnt

host# tar -C /mnt -xaf openSUSE-13.1-ARM-JeOS.aarch64-rootfs.aarch64-1.12.1-Build32.2.tbz


Install the Xen tools that we built earlier

host# rsync -aH aarch64-chroot/root/xen/dist/install/ /mnt/


Create the init script:

host# cat > /mnt/root/init.sh <<'EOF'

#!/bin/sh


set -x

mount -o remount,rw /

mount -t proc none /proc

mount -t sysfs none /sys

mount -t tmpfs none /run

mkdir /run/lock

mount -t devtmpfs dev /dev

/sbin/udevd --daemon

udevadm trigger --action=add

mkdir /dev/pts

mount -t devpts none /dev/pts


mknod -m 640 /dev/xconsole p

chown root:adm /dev/xconsole


/sbin/klogd -c 1 -x



cd /root

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

exec /bin/bash
EOF


host# chmod +x /mnt/root/init.sh


Get missing runtime dependencies for Xen

host$ wget http://download.opensuse.org/ports/aarch64/distribution/13.1/repo/oss/suse/aarch64/libyajl2-2.0.1-14.1.2.aarch64.rpm

host$ wget http://download.opensuse.org/ports/aarch64/distribution/13.1/repo/oss/suse/aarch64/libfdt1-1.4.0-2.1.3.aarch64.rpm

host# cp libyajl2-2.0.1-14.1.2.aarch64.rpm libfdt1-1.4.0-2.1.3.aarch64.rpm /mnt/root/

host# umount /mnt


Start the model

host$ ./Foundation_v8pkg/models/Linux64_GCC-4.1/Foundation_v8 \
     --image boot-wrapper-aarch64/xen-system.axf \
     --block-device rootfs.img



Silence some of the harmless warnings

model> mkdir /lib/modules/$(uname -r)

model> depmod -a


Install the runtime dependencies:

model> rpm -ivh libfdt1-1.4.0-2.1.3.aarch64.rpm libyajl2-2.0.1-14.1.2.aarch64.rpm

model> ldconfig


Start the Xen daemon. You can ignore the harmless message about the i386 qemu if it appears.

model> /etc/init.d/xencommons start

If /etc/init.d/xencommons fails with a missing-file error, re-run ldconfig.


Confirm that Dom0 is up:

model> xl list

Name                                         ID   Mem  VCPUs  State   Time(s)

Domain-0                                      0   512      2  r-----     13.9


Congratulations, you now have a working Xen toolstack. You can shut down the model for now.


Creation of a DomU guest

For the guest rootfs we will use a smaller OpenEmbedded-based Linaro image rather than a full openSUSE image, purely to save space.

host$ wget http://releases.linaro.org/latest/openembedded/aarch64/linaro-image-minimal-genericarmv8-20140223-649.rootfs.tar.gz


host$ dd if=/dev/zero bs=1M count=128 of=domU.img

host$ /sbin/mkfs.ext3 domU.img

Again say yes, we know it’s not a block device


host# mount -o loop domU.img /mnt

host# tar -C /mnt -xaf linaro-image-minimal-genericarmv8-20140223-649.rootfs.tar.gz

host# umount /mnt


Make the DomU rootfs and kernel available to the Dom0

host# mount -o loop rootfs.img /mnt

host# cp domU.img /mnt/root/domU.img

host# cp linux/arch/arm64/boot/Image /mnt/root/Image


Create the config for the guest

host# cat > /mnt/root/domU.cfg <<EOF

kernel = "/root/Image"

name = "guest"

memory = 512

vcpus = 1


extra = "console=hvc0 root=/dev/xvda ro"


disk = [ 'phy:/dev/loop0,xvda,w' ]
EOF


host# umount /mnt


Start the model again:

host$ ./Foundation_v8pkg/models/Linux64_GCC-4.1/Foundation_v8 \
     --image boot-wrapper-aarch64/xen-system.axf \
     --block-device rootfs.img



model> losetup /dev/loop0 domU.img

model> /etc/init.d/xencommons start


Create the DomU using the config

model> xl create domU.cfg


View the guest’s info on the Xen console

model> xl list


Screenshot of the Dom0 host


Connect to the console of the DomU guest

model> xl console guest


Screenshot of the DomU guest


Now all that’s left is to have a lot of fun!


The march of ARM big.LITTLE™ multi-processing technology has been gaining further pace, with multiple devices now shipping and new names joining in. Companies including Allwinner, Fujitsu Semiconductor, LG, Mediatek and Samsung Electronics announced their plans in 2013, and recently we have seen many announcements by other leading mobile chip manufacturers of new SoCs based on big.LITTLE processing. These developments indicate the inevitable wide uptake of this technology in leading devices across the mobile computing market.

Here’s a roundup of how big.LITTLE processing technology has been making waves:

Chip related announcements






  • Allwinner announced a new octa-core chip, the A80. Featuring a quad-core ARM Cortex®-A15 and a quad-core Cortex®-A7 in a big.LITTLE implementation, this device is aimed at tablets and digital TV applications.
  • Following the successful MT8135, MediaTek announced its octa-core MT6595 SoC, featuring four ARM Cortex®-A17 CPU cores running at 2.2GHz and four Cortex-A7 cores at 1.7GHz.
  • Samsung announced the Exynos 5 Octa (5422) processor, equipped with a quad-core Cortex-A15 (up to 2.1GHz) and a quad-core Cortex-A7 (up to 1.5GHz), with GTS enabled as the HMP implementation. This will power some international versions of the Galaxy S5.
  • Samsung also introduced a new SoC from the Exynos family, the Exynos 5260. This is the first hexa-core mobile CPU: it uses two big cores and four LITTLE cores in a similar big.LITTLE pairing of Cortex-A15 and Cortex-A7 CPUs.

The big.LITTLE software implementation of Global Task Scheduling (GTS), released last year, is now fully deployed in production devices supporting the fully heterogeneous mode of operation. GTS has been integrated into the Linux kernel scheduler, so that software can support asymmetric topologies like 2x4 without any application-level changes.


Device related announcements

  • One of the first devices in 2014 to feature the Allwinner A80 will be the Teclast T97 Air, an Android tablet with a 9.7” 2048x1536 display.
  • MediaTek has shown outstanding performance of the MT8135 tablet SoC under heavy web downloading, hardcore gaming, high-quality premium video viewing and rigorous multitasking.
  • The new hexa-core Samsung Exynos 5260 is already shipping in the Galaxy Note 3 Neo.

In addition, we were excited to hear Samsung announce that the 11.6-inch model of the new Samsung Chromebook 2 is powered by an Exynos 5 Octa with four big cores (Cortex-A15 processors @ 1.9GHz) and four LITTLE cores (Cortex-A7 processors @ 1.3GHz), and that the 13.3-inch model is also powered by an Exynos 5 Octa configured with big.LITTLE technology (Cortex-A15 processors @ 2.0GHz and Cortex-A7 processors @ 1.3GHz). Both models also incorporate the ARM Mali™-T628 GPU and ARM Artisan™ physical IP.


“We’re particularly excited to see the new 13-inch form factor from Samsung, powered by Exynos 5 Octa processor, which brings Chromebooks up market with high quality components and capabilities, like a full, bright HD screen and top-notch audio for a rich Google+ Hangouts experience.” said Caesar Sengupta, vice president of product management, Google.

Many of these announcements were made at Mobile World Congress (MWC) 2014, where ARM had a prominent presence across the show (more on this in Ian Ferguson’s MWC summary, and some interesting mobile trends from Brian Jeff).

big.LITTLE technology developments news

    • TSMC recently announced the tape-out of a 64-bit ARMv8 processor in a big.LITTLE combination, featuring ARM Cortex-A57 and Cortex-A53 CPUs on TSMC’s leading-edge 16nm FinFET manufacturing process. This can potentially offer >40% higher speed at the same total power, or alternatively reduce total power by >55% at the same speed, compared with 28HPM. The new SoC closely resembles the kind of test chip that a customer might build into a shipping product.

Why is this important?

ARM big.LITTLE technology addresses one of the biggest industry challenges: how to create a System on Chip (SoC) that provides both high performance and improved energy efficiency. big.LITTLE processing brings scalable and efficient performance to a tightly coupled combination of two ARM CPU clusters. This arrangement is transparent to application programs, with the big.LITTLE multi-processing (MP) software automatically choosing the right processor for the right job. This is one of the most efficient ways to build an ARM multi-core system, delivering up to 40% SoC energy savings while still providing peak performance.

big.LITTLE offers SoC and device designers a vast range of choice in power gating and core configuration, with a wide variety of topologies from three- or four-core solutions up to hexa-core and octa-core solutions. This allows SoC vendors to balance performance against silicon-area savings for low-cost, high-performance SoCs, and ultimately enables consumer devices with significant efficiency advantages at all price points. Core configuration is an area I shall personally be watching quite closely in the near future.

What interests you most regarding recent big.LITTLE technology developments?

We unveiled the ARMv8 architecture back in 2011, and since then much activity has been going on at ARM and our partners. It is no surprise, therefore, that this year has already seen a number of partner announcements around plans for 64-bit ARMv8-A architecture-based devices, and by the end of 2014 we will certainly see ARMv8 silicon and numerous real devices in the market.


Recently Nandan Nayampally, vice president of marketing in the CPU Group, presented a webinar and Q&A session to the financial analyst community explaining the history and benefits of the ARMv8-A architecture.


The target audience was the financial community, including the 35 or so sell-side analysts that cover ARM. The call was attended by more than 80 investors and analysts.


The reaction from the audience was very positive. Here are a few excerpts from notes we have seen so far:

  • Espirito Santo: “More to v8 than just 64-bit”; “A wide addressable market”; “Higher royalty rates”; “v8 Licensing still early days”.
  • CENKOS TECH: “ARM’s presentation on the v8 architecture yesterday highlighted several key points. It’s a 64-bit future-proof platform spanning the design requirements of smartphones, desktop computing and high-end enterprise servers and networking equipment. What it means really is faster data for applications in the Cloud and Mobile, which in turn will allow OEMs to innovate. It also features beefed-up security features and up to 2x higher performance (in terms of processor speed) and energy efficiency. All this ought to cement ARM’s technical and competitive leadership in processor design. Meanwhile Intel looks hamstrung as over 200 chipmakers continue to innovate around ARM.”

With interest building very fast in the ARMv8-A architecture, I have just published a guide to porting to the 64-bit A64 instruction set. This is the 64-bit instruction set supported by processors like the Cortex-A57 and Cortex-A53 from ARM. If this features in your roadmap, please do take a look and let me know if you find it useful.


You can find it here: Porting to ARM 64-bit



Linaro's Connect event was held in Macau, where plans for open source software on ARM were discussed. It included this presentation on the work the Security Working Group intends to do on an open source Trusted OS.



Secure World source code is often not easily available to developers, so Linaro's efforts in this area are newsworthy. The work planned by Linaro will build on top of the ARM Trusted Firmware open source project, which focuses on low-level trusted firmware. You can find the 0.3 release here (including SMC code):

ARM-software/arm-trusted-firmware · GitHub


What do you think about Linaro working on open source Trusted OS projects?

Almost 30 delegates attended the ARM Global Training Partners Conference in Kuala Lumpur earlier this month.  On Tuesday 11th March, Daniel Dearing, Program Manager of the ARM Accredited Engineer Program, gave a presentation and led a workshop about the AAE program.  Chris Shore, ARM Training Manager, gave an overview of ARM and a training update to the ARM Approved Training Centers.


Delegates also enjoyed hearing from Steve Carr, Director of Vertical Markets at Future Electronics.  Steve explained why Future Electronics decided to become the first distributor to sign up to the AAE program: they believe the accreditation will help them gain a competitive advantage, as customers will involve them in projects at an earlier stage because of their stronger ARM technical knowledge.


On the 12th and 13th March, Ed Player and Chris Shore gave two days of training on Cortex-M and Cortex-A respectively.  These ‘train the trainer’ sessions are an important part of fully enabling our training partners, and are invaluable in ensuring that ARM training partners are technically proficient, have a high level of ARM knowledge and will provide good-quality training courses to their students. All of the delegates gave very positive feedback on the training and the overall event, which was heralded as a great success.


The day before the ARM conference, on Monday 10th March, Daniel and Chris gave presentations to over 60 delegates at a ‘Techtalk’ seminar in Kuala Lumpur organised by the Multimedia Development Corporation (MDec). MDec directs and oversees Malaysia’s National ICT (Information & Communication Technology) Initiative, MSC Malaysia (formerly known as the Multimedia Super Corridor in Malaysia). Chris and Daniel agreed to participate in the seminar as MDec kindly provided the facilities for the ARM conference. The event was attended by local industry and faculty members from regional universities.


My Personal View of ARM

Posted by farabionan Mar 20, 2014

After a few years of research and study about ARM, I was impressed by the ARM Cortex, not just because it can be manufactured very cheaply by highly advanced machines, but by the simplicity of the design and the performance. I have experience using Intel processors at a low level, meaning programming them in assembler; there was even a time when I could memorize the opcode of each instruction. Basically, ARM has an understandable bitfield layout in its opcodes. Over time ARM has grown and changed, from 26-bit addressing to 32-bit and now 64-bit. For those who do not know, 26-bit addressing means you can use approximately 64 MB of RAM, and 32-bit addressing means 4 GB of address space. Even on a 32-bit device, it does not mean all of the address lines on the processor are used for RAM. On ARM, hardware ports are not like Intel's, which are accessed using separate instructions and are not treated as part of addressable memory. On ARM, when you use, say, r0 as a base address, you can point it at a hardware port's address just as if the port were part of memory. This is very elegant, because it reduces complexity. I don't know about you, but for me ARM is like a dream-come-true machine. I found programming port I/O on Intel quite difficult (honestly, because of my own lack of comprehension). Basically it is the same thing, but for me it looks much nicer to check a certain part of memory for the I/O status than to read a port and then store the result in memory. By treating the hardware status as if it were part of memory, you don't need to buffer your data any more, and the hardware status is freely accessible whenever the kernel wants it.

Intel and ARM are wholly different things, and it is difficult to compare them. I did not know which criteria to use to decide which is better, because in some areas Intel has more power on certain instructions and ARM is better on others, and we can't judge a processor on only one or a few instructions; we need to compare them thoroughly. I personally do not know much about ARM yet, but it is very interesting to learn. I have lots of devices powered by ARM, and most of them have a face recognition system, something that really amazes me. I once wrote a face recognition application for an Intel processor, and on my old laptop it ran at only 10 FPS just to count the faces on screen, yet my Samsung device, with an 800 MHz processor, is capable of doing the same thing. I know it is hard to compare devices with different instruction sets, because an instruction can be specific to doing particular things well but not others; but if we measure on real-life applications, ARM wins my heart. Now I'm going to do the same on ARM: I'll start studying everything, and even try to design my own OS, using freely available parts from the internet or making everything on my own. There is no particular reason I'm doing this, just for fun. In real life I work as a PHP programmer.



I guess that is all I wanted to say. To the ARM designer team: you created a state-of-the-art design. Amazing.

VEX Robotics is an organization that promotes robotics education from as young as 8 years old through college. Every year, thousands of teams from around the world participate in the robotics competition, and the best of the best get to compete at the World Championship. This year it takes place on April 24-26 at the Anaheim Convention Center. Last year, over 700 teams converged on Anaheim for the World Championship, some coming from as far away as New Zealand. The brain of the VEX Cortex Microcontroller is an STMicroelectronics STM32, powered by an ARM Cortex-M3 core.


Check out the video from the California State Championship which took place this past weekend.


Also, VEX Robotics is always looking for mentors and judges. Those of you die-hard engineers should be out there mentoring the next generation of engineers, and in the process you will be caught up in the excitement as well. Check them out at VEX Robotics Competition and see if there's a VEX Robotics team near you.

I thought it might be useful to put a summary of all of the content from the Embedded World show into one list.


Daily ARM Blogs:


ARMFlix Embedded World 2014 video playlist


Embedded World 2014 Wrap up video on YouTube



Awards from the show:

ARMv8-R architecture wins hardware award at Embedded Systems conference


Interviews from the show:

ARM's v8-R architecture to enable new types of MCU? New Electronics talks with Chris Turner


Papers from the show:

Embedded World 2014 - ARM® Cortex®-M Processor based System Prototyping on FPGA by Joseph Yiu


Daily blogs from Mark Saunders from Cypress


Pre-show blogs by Philippe Bressy

Embedded World - Ich bin bereit !

Embedded World - The countdown has started


ARM Connect Community partner new product releases:

New LPC Microcontroller Streamlines Motor Control from NXP - see the video here NXP LPC1500 series (EW 2014) - YouTube

STM32L0 Series from  STMicroelectronics - see the video here STMicroelectronic's STM32 L0 (EW 2014) - YouTube

Freescale Shrinks World’s Smallest ARM-Based MCU by an Additional 15 Percent from Freescale - see the video here Freescale Kinetis KL03 MCU (EW 2014) - YouTube


Did I miss anything? Feel free to comment below and I'll add it to the list.




1) Cortex-R processors are widely used across many embedded applications

The Cortex-R series is often used in devices such as storage controllers, LTE modems, and industrial and automotive applications, where these key attributes are needed:

    • Fast: High processing performance at high clock frequencies
    • Real-time: Deterministic processing always meets real-time constraints
    • Reliable: Dependable with safety features and high error resistance

The Cortex-R series is not always as visible as the Cortex-A series application processors or the Cortex-M microcontrollers, where the ARM brand adds value to our partners’ products and demonstrates that there is a wide ecosystem of engineers with the skills to program them.

The safety features are especially important when implementing automotive and industrial embedded control systems, where features such as memory protection, error-correcting codes and lock-step (using a redundant copy of the processor to detect errors) deliver high error resistance.

Many LTE modems use Cortex-R processors, and in storage the Cortex-R series is very popular. To date (3Q13), more than 900 million devices incorporating Cortex-R processors have shipped, proving the processors to be very mature and reliable.



2) Tightly Coupled Memory (TCM) for performance and determinism

TCM is memory connected closely to the processor core, making it very fast for the processor to access. Typically it holds interrupt service routines and data tables that need to be accessed quickly. As soon as an interrupt arrives, the Cortex-R processor can switch to interrupt privilege mode and immediately start executing the handler code held there. Without TCM, if the interrupt service routine code, or any data it needs, is not already in the cache, the cache must fetch it from main memory, and this may take many clock cycles while the processor waits for the code and data to become available. With TCM, the worst-case number of cycles to start running the interrupt code is known, and hence the Cortex-R processors are deterministic.

Memory accesses above the dotted line are always fast and deterministic on the Cortex-R processor

In a system with a Memory Management Unit, if the code or data is not available in the cache then a page table walk may be required, and this can take hundreds of cycles. TCM enables a fast, deterministic response to interrupts, which makes the Cortex-R series ideal for real-time systems.


3) SIMD instructions and CMSIS-DSP Library functions add DSP capabilities

The Cortex-R series provides the native ability to perform Single Instruction Multiple Data (SIMD) and Multiply-Accumulate (MAC) instructions. These enable multiple operations to be performed in a single clock cycle, and include saturating maths that clips, rather than overflows, results that are too large.

The CMSIS-DSP library is a collection of 61 algorithms that utilise the SIMD capabilities, including:

  • Basic maths: vector multiply, add, subtract, scale, shift, negate...
  • Statistics: root mean square, mean, standard deviation...
  • Fast maths: sine, cosine, square root...
  • Complex maths: conjugate, dot product, magnitude, multiply by real...
  • Filters:  FIR, IIR, convolution, correlation..
  • Matrix algebra: addition, multiplication, scale...
  • Transforms: Fast Fourier, discrete cosine...
  • Controller: PID motor control, (Inverse)Park transform, (Inverse)Clarke transform...
  • Interpolation: linear and bilinear...
  • Support functions: type conversion, copy, fill...

By including these capabilities in the processor, a much simpler, more cost-effective and easier-to-debug system can be created than by adding a separate DSP. The performance and width of SIMD data processed are not as advanced as in some very high-end standalone DSPs, but in many applications the use of these capabilities can make the system more efficient and lower power.


Example motor control application where Park and Clarke transforms are handled by the SIMD/DSP capabilities through the CMSIS-DSP library


4) Branch shadow and branch prediction


The Cortex-R series enhances performance through advanced branch prediction techniques. In a pipelined processor, multiple actions happen in each clock cycle. In Cortex-R, both instruction fetch and data read/write accesses are extended to two cycles, allowing longer memory access times and so enabling either larger memories or slower memories that can be denser or lower power. This removes memory-system limitations on processor clock frequency. An additional decode stage accommodates branch prediction (conditionals, loops and function returns), and an instruction queue keeps the data processing unit fed with instructions. Without prediction, when a branch occurs the processor must stall until the pipeline is refilled with instructions from the new address. Branch prediction determines the most likely outcome of any branch instruction and either continues as normal, if it predicts the branch will not be taken, or starts loading the pipeline with instructions from the branch address so that the data processing unit will not stall. Branch prediction can significantly improve processor performance: the Cortex-R7 approaches 100% branch prediction accuracy, compared with ~80% for Cortex-R4/R5.


5) Error Correcting Code (ECC) generation/checking is built into the processor pipeline

ECC is a method of checking that data read from a memory location is correct and has not been corrupted. The memory has additional bits, and a code is generated and stored in these bits whenever information is written to memory. When the memory is read back, the code is checked to ensure the data and code still match; if a single-bit error is detected, it can be automatically corrected and written back to the memory location. A mismatch could arise from a Single Event Upset (SEU), such as radiation hitting the memory location and flipping a bit, or from a physical error in the memory. In the Cortex-R series, ECC code generation and checking are done automatically and cause no performance impact, unless of course an error is detected. ECC is an optional feature on all of the Cortex-R series.


Example of ECC on TCM as part of the Cortex-R Series pipeline
