
ARM and Keil Tools


You may have seen the announcements of Texas Instruments' new and exciting MSP432P4x MCUs based on the ARM Cortex-M4 core. Keil MDK Version 5 offers out-of-the-box support for these devices with TI's MSP432 Device Family Pack. Learn how to use the Pack to develop, program and debug applications using µVision on our Cortex-M Learning Platform. Refer to the news on keil.com for more information.


We received hundreds of project proposals and have already shipped more than 200 boards to participants.

The discussions in the contest forum are gaining steam, with the technical questions rolling in.

 

Now with one more week to go to register your project, I have great news to share:

 

Würth Elektronik, one of the world’s leading manufacturers of electronic and electromechanical components, is offering free components for all contest participants. By sponsoring the contest, Würth Elektronik allows participants to design the most efficient boards and present their innovative solutions. Accepted entrants can request power and filter inductors, wireless charging coils, capacitors, LEDs and connectors. Check out www.we-online.com for details about their portfolio.

 

We look forward to receiving the final entries by the 1st of April.

Hello,

 

After the huge success of the XMC™ Developer Days last year, Infineon is running this event again. There will be two one-day training events: one in Milano on 16 April 2015 and one in Munich on 12 May 2015. If you want to register for the event (required, but free of charge), please visit: www.infineon.com/xmcdeveloperday. Participants will receive an XMC4500 Relax Kit and an XMC1200 Boot Kit for free. You will be able to try the hardware in technical training sessions using various development tools, for example Keil MDK Version 5. The interface between MDK and the all-new DAVE will also be explained. Each participant will receive a time-limited license for MDK-Professional.

 

ARM will participate in both events. Milano will be supported by our Italian distributor Tecnologix and Munich will be supported by the local German ARM team.

 

See you in Milano and Munich!


The main function of the compiler is to translate source code to machine code but, during the development of a project, we inevitably make some mistakes.

An important aspect of a good compiler is its ability to generate clear and precise error and warning messages. Clear error messages help to quickly identify mistakes in the code, whereas warning messages point out potential issues the compiler found in the code that might be worth investigating.

ARM Compiler 6 is based on the well-known Clang compiler, which provides more precise and accurate warning and error messages compared to other toolchains. Let's have a look at some examples:

 

 

Assignment in condition

 

Let’s start with some example code:

#include <stdio.h>

int main() {
    int a = 0, b = 0;
    if (a = b) {
        printf("Right\n");
    } else {
        printf("Wrong\n");
    }
    return 0;
}
 

We made a mistake in the code by using ‘=’ instead of ‘==’. The code above is legitimate C language; it is the error in logic that will trip you up. Let’s see how different compilers help the developer identify the logic error:

ARM Compiler 5

"main.cpp", line 6: Warning:  #1293-D: assignment in condition

      if (a = b) {

          ^

ARM Compiler 6

main.cpp:6:11: warning: using the result of an assignment as a condition without parentheses [-Wparentheses]

    if (a = b) {

               ~~^~~

main.cpp:6:11: note: place parentheses around the assignment to silence this warning

    if (a = b) {

                  ^

       (     )

main.cpp:6:11: note: use '==' to turn this assignment into an equality comparison

    if (a = b) {

          ^

          ==

GCC 4.9

main.cpp: In function ‘int main()’:

main.cpp:6:14: warning: suggest parentheses around assignment used as truth value [-Wparentheses]

As you can see from the different outputs, ARM Compiler 6 not only indicates what it thinks is wrong in the code, but it also suggests two different ways to resolve the issue. The warning messages quote the entire line and highlight the specific part which requires attention from the user.
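For completeness, here is what the snippet looks like after applying the compiler's suggestions (a minimal sketch; the first form is the likely intent, the second shows how to keep a deliberate assignment):

#include <stdio.h>

int main() {
    int a = 0, b = 0;
    if (a == b) {               /* equality comparison: the likely intent */
        printf("Right\n");
    } else {
        printf("Wrong\n");
    }
    /* If the assignment really was intended, extra parentheses document
       that intention and silence the warning: */
    if ((a = b)) {
        printf("b was non-zero\n");
    }
    return 0;
}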

 

Templates

Templates are a great feature of C++ but they are sometimes a source of headaches when troubleshooting problems. Let’s have a look at another example, this time involving C++ templates:

 

#include <vector>

int sum_vector(const std::vector<int> &input) {
    std::vector<float>::const_iterator i = input.begin();
    int sum = 0;
    for (; i != input.end(); ++i) {
        sum += *i;
    }
    return sum;
}
 

The root of the error is that the template type used for the const_iterator is float whereas the vector requires int. How do different compilers identify this error?

ARM Compiler 5

"main.cpp", line 5: Error:  #144: a value of type "const int *" cannot be used to initialize an entity of type "const float *"

      std::vector<float>::const_iterator i = input.begin();

                                             ^

ARM Compiler 6

main.cpp:5:40: error: cannot initialize a variable of type 'std::vector<float>::const_iterator' (aka 'const float *') with an rvalue of type 'const_iterator' (aka 'const int *')

       std::vector<float>::const_iterator i = input.begin();

                                    ^   ~~~~~~~~~~~~~

GCC 4.9

main.cpp: In function ‘int sum_vector(const std::vector<int>&)’:

main.cpp:5:56: error: conversion from ‘std::vector<int>::const_iterator {aka __gnu_cxx::__normal_iterator<const int*, std::vector<int> >}’ to non-scalar type ‘std::vector<float>::const_iterator {aka __gnu_cxx::__normal_iterator<const float*, std::vector<float> >}’ requested

main.cpp:7:26: error: no match for ‘operator!=’ in ‘i != (& input)->std::vector<_Tp, _Alloc>::end<int, std::allocator<int> >()’

main.cpp:7:26: note: candidates are:

In file included from /usr/include/x86_64-linux-gnu/c++/4.7/./bits/c++allocator.h:34:0,

                 from /usr/include/c++/4.7/bits/allocator.h:48,

(additional 98 lines following)

The error messages produced by ARM Compiler 5 and ARM Compiler 6 are self-explanatory and give the user a clear suggestion on how to fix the issue: the mismatching template types are clearly indicated in the error message and the caret points the user to the precise point within the code. GCC, on the other hand, generates about 100 lines of hard-to-understand error messages.
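For reference, the fix is simply to declare the iterator with the matching element type (or use auto in C++11 and later); a corrected sketch:

#include <vector>

int sum_vector(const std::vector<int> &input) {
    std::vector<int>::const_iterator i = input.begin();  /* iterator type now matches the vector's element type */
    int sum = 0;
    for (; i != input.end(); ++i) {
        sum += *i;
    }
    return sum;
}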

 

 

Automatic Macro Expansion

Another very useful feature of diagnostic messages in ARM Compiler 6 is the automatic macro expansion. Consider the following code:

 

#define LOG(PREFIX, MESSAGE) fprintf(stderr, "%s: %s", PREFIX, MESSAGE)

#define LOG_WARNING(MESSAGE) LOG("Warning", MESSAGE)

…
    LOG_WARNING(123);
…
 

LOG_WARNING has been called with an integer argument even though, by expanding the two macros, you can see that the fprintf function expects a string. When the macros are close together in the code it’s easy to spot these errors: what if they are defined in different parts of the source code or, worse, in an external library?

ARM Compiler 6 automatically expands each macro involved so that you can easily see what’s wrong in the call chain as shown below:

ARM Compiler 6

main.cpp:8:14: warning: format specifies type 'char *' but the argument has type 'int' [-Wformat]

        LOG_WARNING(123);

        ~~~~~~~~~~~~^~~

main.cpp:5:45: note: expanded from macro 'LOG_WARNING'

#define LOG_WARNING(MESSAGE) LOG("Warning", MESSAGE)

                                            ^

main.cpp:3:64: note: expanded from macro 'LOG'

#define LOG(PREFIX, MESSAGE) fprintf(stderr, "%s: %s", PREFIX, MESSAGE)

^
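The fix is simply to pass a string so the argument matches the "%s" format specifier, for example (the message text here is arbitrary):

    LOG_WARNING("buffer almost full");   /* the argument is now a char*, matching "%s" */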

 

Summary

In summary, we have seen how ARM Compiler 6 is able to give precise and detailed warning and error messages, simplifying the important task of troubleshooting coding errors. The quality of these messages not only helps the developer spot potential bugs in the final product, but also helps them understand and fix issues quickly thanks to sensible, easy-to-decipher suggestions.

I hope that you found this blog post useful and you are looking forward to using ARM Compiler 6! If you still don’t have DS-5, download a free 30-day evaluation.

Feel free to post any questions or comments below.

 

Ciao,

Stefano

Recently I have been preparing one of the demos for ARM's booth at Embedded World 2015 (which took place last week, 24th - 26th February) to showcase ARM Development Studio 5’s (DS-5) debug and trace functionality. This demo makes use of Freescale's SABRE board based on their newly announced i.MX 6SoloX Applications Processor, containing an ARM Cortex-A9 and an ARM Cortex-M4. The image below shows the board with an attached LCD screen, connected to an ARM DSTREAM - ARM’s high-performance debug and trace unit.

ds5-imx6sx-photo.jpg

 

The i.MX 6SoloX is a multimedia-focused processor aimed at a wide variety of applications - achieved by combining the high performance of a Cortex-A9 and low power of a Cortex-M4 to maximise power efficiency.

 

Thanks to the co-operation between ARM and Freescale I was able to bring up the board very quickly, allowing me to get Linux and Android both booting on the board within a day. This is especially impressive given the board was pre-production at that point.

 

Once I had Linux successfully booting, I went about connecting the DS-5 Debugger to it using the DSTREAM via the board's JTAG-20 connector. DS-5 Debugger support for the board was also added (via the configuration database) during the board's pre-production stages, allowing full DS-5 support from release. This makes connecting the debugger as simple as selecting the platform in the Debug Configuration editor, choosing the appropriate connection to use and configuring it. This also enables a smooth, rapid transition from receiving the board to debugging or profiling a target application from the product's launch.

 

As the board has both a Cortex-A9 and a Cortex-M4, it was a good candidate to demonstrate DS-5’s multicore debugging. These cores use Asymmetric Multiprocessing (AMP), as opposed to Symmetric Multiprocessing (SMP), meaning they run completely independently (rather than under a single operating system). I used DS-5 to connect to and debug Linux on the Cortex-A9 while simultaneously using a separate connection to load a small Freescale MQX RTOS based example onto the Cortex-M4.

 

 

DS-5 Functionality

When debugging, we have access to DS-5's full suite of tools including, among others:

 

Debug

DS-5 allows multicore debug of targets (in both SMP and AMP configurations) using hardware and software breakpoints, with tools to provide a wide range of additional functionality from MMU page table views to custom peripheral register maps.

ds5-imx6sx-debug.png

 

Instruction Trace

Collecting trace from the target using the DSTREAM allows non-intrusive instruction-level trace, which can be used to help with debug (especially when halting the core is undesirable), post-crash analysis and profiling the target.

ds5-imx6sx-trace.png

 

RTOS OS Awareness

DS-5 offers additional information views (e.g. memory use, running tasks, etc.) for a number of the most popular RTOSes (pictured below is one of the views for MQX).

ds5-imx6sx-mqx-awareness.png

 

Linux OS Awareness

Specialised connections are available to Linux targets to allow debug and trace of the Linux Kernel and Kernel Modules as well as visualisations of the threads, processes and resources.

 

 

Linux Application debug

Applications may be debugged by downloading the application and a compatible gdbserver to the target and debugging through the DS-5 interface (concurrently with bare-metal or OS-level debug and/or trace as desired).

ds5-imx6sx-app-debug.png

 

Streamline

Streamline allows visualisation of the target’s performance for profiling and optimizing target code. Requiring no additional hardware or debug connections, Streamline operates only via TCP/IP (or optionally an ADB connection in the case of Android) and can profile many aspects of a CPU/GPU/interconnect in high detail at varying resolutions, including power analysis (via external or on-chip energy probes). The image below shows a Streamline trace from Linux on the i.MX 6SoloX under heavy load.

ds5-imx6sx-streamline.png

 

 

Summary

Thanks to a stable pre-production platform and Freescale's assistance adding the new board to the debug configuration database, the whole bring-up experience was seamless. For information on creating debug configurations for new platforms, see the "New Platform Bring-Up with DS-5" post. ARM encourages platform owners to submit any debug configurations for their platforms back to the DS-5 team so we can include them by default in subsequent releases - thus allowing the same fluid out-of-the-box experience for end users.

February has been an exciting period for ARM: from the announcement of new products for the Premium Mobile segment to the release of the new DS-5 v5.20.2.

DS-5 v5.20.2 includes ARM Compiler 6.01, the latest LLVM-based ARM Compiler. The main highlights of ARM Compiler 6.01 are:

  • Support for the latest ARM CPUs, including Cortex-A72
  • Extended support for the Cortex family of processors
  • Support for bare-metal Position Independent Executables (PIE)
  • Support for link time optimization (LTO)

 

Support for new CPUs

This release brings to market the most advanced compilation technology in ARM Compiler 6.01 for the new ARMv8-A Cortex-A72 processor (-mcpu=cortex-a72). With support for the Cortex-A72, ARM Compiler continues to provide early support for new cores, enabling our customers to start developing as soon as possible and reduce their time-to-market.

 

Extended support for the Cortex family of processors

The new release of ARM Compiler adds support for ARMv7 Cortex-A class processors, enabling customers to take advantage of the new compiler for a wider range of products.

ARM Compiler 6.01 brings the advanced code generation used to build ARMv8 code to the well-established 32-bit world, speeding up the adoption of the new compiler for companies not yet using ARMv8.

There’s more: ARM Compiler 6.01 adds initial (alpha quality) support for both Cortex-M and Cortex-R families to let engineers familiarise themselves with the new features and start the evaluation as soon as possible.

For more details see the release notes for ARM Compiler 6.01.

 

Bare-metal PIC support

Security has become a crucial aspect of applications, especially when connected to the network or available on the internet. One of the most common attacks to gain privilege on a system is through buffer overflows: this anomaly could potentially lead to the execution of malicious code, jeopardizing the security of the entire system through code injection.

Different techniques have been created to make a hacker’s life harder; one of the most commonly used to reduce the risk of attack is Address Space Layout Randomization (ASLR). This technique is widely used in several high-level operating systems such as Android, iOS, Linux and Windows. With ARM Compiler 6.01 it’s possible to extend this protection to bare-metal applications.

ARM Compiler 6.01 allows the creation of bare-metal Position Independent Executables (PIE), which can be loaded anywhere in memory; the code automatically recalculates the new addresses. More details are available on Infocenter under the compiler command line option -fbare-metal-pie and the linker command line option -fpic.

 

Link Time Optimization

ARM Compiler is able to optimise code generated from each source file to get the best performance out of ARM processors. But what about optimising the different modules? ARM Compiler 6.01 introduces initial support for Link Time Optimization (LTO), which extends the capability of the compiler and the linker to perform optimizations by looking at the whole program and not just a single compilation unit, giving an extra performance boost!

To enable Link Time Optimization in ARM Compiler 6.01, take a look at the documentation on Infocenter about the linker command line option -lto.

 

I hope that you found this information useful and you are ready to use ARM Compiler 6.01! If you still don’t have DS-5, download a free 30-day evaluation.

Feel free to post any questions or comments below.

 

Ciao,

Stefano


4 Years On - Fast Models 9.2

Posted by robkaye Feb 23, 2015

At the end of last week (20/Feb/2015) we released Fast Models 9.2. This year we are moving to a quarterly release cycle. The more frequent releases support the accelerating rate at which we are developing and deploying new Fast Models, and the speed at which partners pick them up and put them to use. The main focus for Fast Models 9.2 was the release of the Cortex-A72 and CCI-500 models, following the announcement of the IP earlier in the month. Lead partners had already been developing virtual prototypes with these models for several months. We also included critical fixes for partners and completed the next stage of performance improvement work. For the latter, our emphasis this release cycle has been on how the models behave when used in SystemC simulations with many Fast Model components: something that is important to many of our partners.

 

It's been just over 4 years since I joined the Fast Models team. I've just been working on a summary of how the solution has evolved in that time for an internal conference, and thought it would be interesting to look at how things have moved ahead in those four years.

 

Firstly, we have seen a rapid growth in usage: more and more partners are leveraging virtual prototypes as part of their SoC development process. We are also seeing the models used in many more ways. Early software development remains front and center in our thoughts, but we have seen increasing use of the models in software-driven validation of the hardware, in performance estimation and in device compliance validation.

 

In 2011 we were working on the first models for ARMv8 cores. That year we introduced models for four new cores at either beta or release status. In 2015 it will be close to treble that number. On the System IP front it's the same story: approximately three times as many models will be rolled out this year compared to 2011. Fast Models for Media IP (GPU, video and display processors) were just a concept on a road map in 2011, but this year we have several in the works, along with a range of platform models that combine the media models with CPUs and System IP. These platforms are aligned with the availability of IP and software stacks from sister teams in ARM to provide partners with a complete solution.

 

The underlying tools that support these models must move forward with the model deliveries. I've already mentioned the burgeoning use cases; combine that with the increasing complexity of the platforms being designed and the advance of host workstation operating systems. To support this we have a comprehensive road map of feature support (such as checkpointing and timing annotation) to complement the continuous improvements in performance, quality and OS/tools support.

 

It's definitely an exciting and challenging part of the ARM story to be involved with.  I'm looking forward to 2015 and beyond with great anticipation.

 

Two ends of the spectrum: Virtual Prototypes for an ARMv8 big.LITTLE mobile platform and a Cortex-M7 MCU.

Performance and power optimization are critical considerations for new Linux and Android™ products. This blog explores the most widely used performance and power profiling methodologies, and their application to the different stages in the product design.

 

The need for efficiency

In the highly competitive market for smartphones, tablets and mobile Internet devices, the success of new products depends strongly on high performance, responsive software and long battery life.

 

In the PC era it was acceptable to achieve high performance by clocking the hardware at faster frequencies. However, this does not work in a world in which users expect to always stay connected. The only way to deliver high performance while keeping a long battery life is to make the product more efficient.

 

On the hardware side the need for efficiency has pushed the use of lower silicon geometries and SoC integration. On the software side performance analysis needs to become an integral part of the design flow.

 

Processor instruction trace

Most Linux-capable ARM® processor-based chipsets include either a CoreSight Embedded Trace Macrocell (ETM) or a Program Trace Macrocell (PTM).

 

The ETM and PTM generate a compressed trace of every instruction executed by the processor, which is stored on an on-chip Embedded Trace Buffer (ETB) or an external trace port analyzer. Software debuggers can import this trace to reconstruct a list of instructions and create a profiling report. For example, DS-5 Development Studio Debugger can collect 4GB of instruction trace via the ARM DSTREAM target connection unit and display a time-based function heat map.

 


Figure 1: Instruction trace generation, collection and display

 

Instruction trace is potentially very useful for performance analysis, as it is 100% non-intrusive and provides information at the finest possible granularity. For instance, with instruction trace you can measure accurately the time lag between two instructions. Unfortunately, trace has some practical limitations.

 

The first limitation is commercial. The number of processors on a single SoC is growing and they are clocked at increasingly high frequencies, which results in higher bandwidth requirements on the CoreSight trace system and wider, more expensive, off-chip trace ports. The only sustainable solution for systems running at full speed is to trace to an internal buffer, which limits the capture to less than 1ms. This is not enough to generate profiling data for a full software task such as a phone call.

 

The second limitation is practical. Linux and Android are complex multi-layered systems, and it is difficult to find events of interest in an instruction trace stream. Trace search utilities help in this area, but navigating 4GB of compressed data is still very time-consuming.

 

The third limitation is technical. The debugger needs to know which application is running on the target and at which address it is loaded in order to decompress the trace stream. Today’s devices do not have the infrastructure to synchronize the trace stream with kernel context-switch information, which means that it is not possible to capture and decompress non-intrusively a full trace stream through context switches.

 

Sample-based profiling

For performance analysis over long periods of time sample-based analysis offers a very good compromise of low intrusiveness, low price and accuracy. A popular Linux sample-based profiling tool is perf.

 

Sample-based tools make use of a timer interrupt to stop the processor at regular intervals and capture the current value of the program counter in order to generate profiling reports. For example, perf can use this information to display the processor time spent on each process, thread, function or line of source code. This enables developers to easily spot hot areas of code.

 

At a slightly higher level of intrusiveness, sample-based profilers can also unwind the call stack at every sample to generate a call-path report. This report shows how much time the processor has spent on each call path, enabling different optimizations such as manual function inlining.

 

Sample-based profilers do not require a JTAG debug probe or a trace port analyzer, and are therefore much lower cost than instruction trace-based profilers. On the downside they cause a target slow-down of between 5 and 10% depending on how much information is captured on every sample.

 

It is important to note that sample-based profilers do not deliver “perfect data” but “statistically relevant data”, as the profiler works on samples instead of on every single instruction. Because of this, profiling data for hot functions is very accurate, but profiling data for the rest of the code is not accurate. This is not normally an issue, as developers are mostly interested in the hot code.

 

A final limitation of sample-based profilers is related to the analysis of short, critical sequences of code. The profiler will tell you how much processor time is spent on that code. However, only instruction trace can provide the detail on the sequence in which instructions are executed and how much time each instruction requires.

 

Logging and kernel traces

Logging or annotation is a traditional way to analyze the performance of a system. In its simplest form, logging relies on the developer adding print statements in different places in the code, each with a timestamp. The resulting log file shows how long each piece of code took to execute.
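As a minimal sketch of this approach (assuming a POSIX monotonic clock via clock_gettime; process_frame is just a hypothetical function being measured), a timestamped log macro might look like this:

#include <stdio.h>
#include <time.h>

/* Print a tag together with a monotonic timestamp (seconds.milliseconds).
   On older glibc versions clock_gettime may require linking with -lrt. */
#define LOG_EVENT(tag)                                          \
    do {                                                        \
        struct timespec ts;                                     \
        clock_gettime(CLOCK_MONOTONIC, &ts);                    \
        fprintf(stderr, "%s: %ld.%03ld s\n", (tag),             \
                (long)ts.tv_sec, ts.tv_nsec / 1000000L);        \
    } while (0)

void process_frame(void) {
    LOG_EVENT("process_frame enter");
    /* ... the code being timed ... */
    LOG_EVENT("process_frame exit");
}

The difference between the two timestamps in the log gives the execution time of the code in between.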

 

This methodology is simple and cheap. Its major drawback is that in order to measure a different part of the code you need to instrument it and rebuild it. Depending on the size of the application this can be very time consuming. For example, many companies only rebuild their software stacks overnight.

 

The Linux kernel provides the infrastructure for a more advanced form of logging called “tracing”. Tracing is used to automatically record a high number of system-level events such as IRQs, system calls, scheduling and even application-specific events. Lately, the kernel has been extended to also provide access to the processor’s performance counters, which contain hardware-related information such as cache usage or the number of instructions executed by the processor.

 

Kernel trace enables you to analyze performance in two ways. First, you can use it to check whether some events are happening more often than expected. For example, it can be used to detect that an application is making the same system call several times when only one is required.  Secondly, it can be used to measure the latency between two events and compare it with your expectations or previous runs.

 

Since kernel trace is implemented in a fairly non-intrusive way, it is very widely used by the Linux community, using tools such as perf, ftrace or LTTng. A new Linux development will enable events to be “printed” to a CoreSight Instrumentation Trace Macrocell (ITM) or System Trace Macrocell (STM) in order to reduce intrusiveness further and provide a better synchronization of events with instruction trace.

 

Combining sampling with kernel trace

Open source tools such as perf and commercial tools such as the ARM DS-5 Streamline performance analyzer combine the functionality of a sample-based profiler with kernel trace data and processor performance counters, providing high-level visibility of how applications make use of the kernel and system-level resources.

 

For example, Streamline can display processor and kernel counters over time, synchronized to threads, processes and the samples collected, all in a single timeline view. This information can be used, for instance, to quickly spot which application is thrashing the cache memories or creating a burst in network usage.

 


Figure 2: Streamline Timeline View

 

Instrumentation-based profiling

Instrumentation completes the picture of performance analysis methodologies. Instrumented software can log the entry and exit of every function (or potentially every instruction) to generate profiling or code coverage reports. This is achieved by instrumenting, or automatically modifying, the software itself.
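As an illustration of how such tools work under the hood (a sketch, not tied to any particular product): GCC and Clang can automatically instrument every function entry and exit when code is built with -finstrument-functions, calling two well-known hooks that the profiler provides:

#include <stdio.h>

/* With -finstrument-functions, the compiler inserts calls to these hooks
   at every function entry and exit. They must not be instrumented
   themselves, or they would recurse; a real profiler would log to a
   buffer rather than stderr to limit the slow-down. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    fprintf(stderr, "enter %p (called from %p)\n", this_fn, call_site);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    fprintf(stderr, "exit  %p (called from %p)\n", this_fn, call_site);
}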

 

The advantage of instrumentation over sample-based profiling is that it gives information about every function call instead of only a sample of them. Its disadvantage is that it is very intrusive, and may cause substantial slow-down.

 

Using the right tool for the job

All of the techniques described so far may apply to all stages of a typical software design cycle. However, some are more appropriate than others at each stage.

 

 

Methodology       | Low Cost | Low Intrusiveness | Accuracy | Granularity | System Visibility
Logging           | •••      | •••               | •••••    | ••          |
Kernel trace      | •••••    | ••••              | •••••    | •••         | •••
Instruction trace |          | •••••             | •••••    | •••••       |
Sample-based      | •••••    | •••               | •••      | ••          | ••••
Instrumentation   | •••••    |                   | •••••    | ••••        |

 

Table 1: Comparison of methodologies

 

Instruction trace is mostly useful for kernel and driver development, but has limited use for Linux application and Android native development, and virtually no use for Android Java application development.

 

Performance improvements in kernel space are often in time-critical code handling the interaction between the kernel, threads and peripherals. Improving this code requires the high accuracy, fine granularity and low intrusiveness of instruction trace.

 

Secondly, kernel developers have enough control of the whole system to do something about it. For example, they can slow down the processors to transmit trace over a narrow trace port, or they can hand-craft the complete software stack for a fast peripheral. However, as you move into application space, developers do not need the accuracy and granularity of instruction trace, as the performance increase achieved by software tweaks can easily be lost to random kernel and driver behaviour totally outside of their control.

 

In the application space, engineering efficiency and system visibility are much more useful than perfect profiling information. The developer needs to find quickly which bits of code to optimize, and measure accurately the time between events, but can accept a 5% slow-down in the code.

 

System visibility is extremely important in both kernel and application space, as it enables developers to quickly find and kill the elephant in the room. Example system-related performance issues include misuse of cache memories, processors and peripherals not being turned off, inefficient access to the file system or deadlocks between threads or applications. Solving a system-related issue has the potential to increase the total performance of the system ten times more than spending days or weeks writing optimal code for an application in isolation. Because of this, analysis tools combining sample-based profiling and kernel trace will continue to dominate Linux performance analysis, especially at application level.

 

Instrumentation-based profiling is the weakest performance analysis technique because of its high level of intrusiveness. Optimizing Android Java applications has better chances of success by using manual logging than open-source tools.

 

High-performance Android systems

Most Android applications are developed at Java level in order to achieve platform portability. Unfortunately, the performance of the Java code has a random component, as it is affected by the JIT compiler. This makes both performance analysis and optimization difficult.

 

In any case, the only way to guarantee that an Android application will be fast and power-efficient is to write it - or at least parts of it - in native C/C++ code.  Research shows that native applications run between 5 and 20 times faster than equivalent Java applications. In fact, most popular Android apps for gaming, video or audio are written in C/C++.

 

For Android native development on ARM processor-based systems Android provides the Native Development Kit (NDK). ARM offers DS-5 as its professional software tool-chain for both Linux and Android native development.

 

By Javier Orensanz, Director of Product Management - Tools at ARM

Here at ARM we’re very excited about the launch of the ARM MCU Design Contest. In cooperation with Elektor Magazine, we’re calling all engineers, hobbyists and enthusiasts to create impressive, fun and sophisticated MCU applications. Enter the competition and you could be in with a chance to win one of our cash prizes ($5,000, $3,000, $1,000, 2x $500).

 

Courtesy of the participating ARM partners Freescale, Infineon, NXP and STMicroelectronics, we are providing a total of 400 ARM Cortex-M4 development boards. Together with a free 6-month license of Keil MDK-Professional, we’ll equip you with all that you need so that you’re ready to dive into your project right away.

 

The focus of this contest is on CMSIS software components and middleware. CMSIS, the Cortex Microcontroller Software Interface Standard, has recently been expanded with the CMSIS-Pack and CMSIS-Driver specifications to simplify development, management and deployment of software components for embedded applications. We want you to make use of the existing components, expand them or write your own from scratch.

 

Here are the hardware platforms you can choose from:

 

All of these boards have different capabilities, sensors and peripherals, making them suitable for a wide range of applications. We provide several example projects on the ARM MCU Design Contest webpage that show you how to use the drivers and middleware together.

 

How can I participate?

In three easy steps:

  1. Submit your project proposal via the Elektor ARM MCU Design Contest website
  2. Once accepted, receive your free board and MDK-Pro license
  3. Get developing!

 

Proposals are accepted until 1st of April, but we expect high demand for the boards, so the sooner you sign up, the better.

The winners will be selected by a panel of ARM engineers and Elektor editors and will be announced in the Elektor September issue and online.

 

Have any questions about the contest?

Comment on this blog, or join the ARM MCU Design Contest Community forum.

 

Good luck!

Hello,

We can now offer a detailed MDK workshop called:

 

"USB Host Application with File System and Graphical User Interface"


Please enter our website at:

http://www2.keil.com/mdk5/learn/usb_host

 

If you have any questions, please contact me at:

support.intl@keil.com

 

Have fun, best regards,

Ralf

 

Ralf Kopsch

Senior Applications Engineer

ARM Germany GmbH

Chinese Version 中文版: Cocos Code IDE 1.1.0:集成ARM DS-5,高效调试C++

As an important product of the Cocos Developer Platform, Cocos Code IDE has finally reached version 1.1.0, integrating with ARM DS-5 and enabling efficient C++ debugging. In this new version, Cocos Code IDE is authorized by ARM® to distribute the ARM Development Studio 5 (DS-5™) Community Edition, aiming to further smooth the development process and enhance the user experience.

 


 

 

DS-5 is a powerful toolchain that integrates many ARM-exclusive features into the Eclipse platform. Based on the Eclipse development environment, DS-5 offers superior window management, project management and C/C++ source code editing tools, and supports C++ development and debugging on Android devices.

 

In version 1.1.0, with ARM's authorization, Cocos Code IDE aims to offer great help and convenience to developers by providing the DS-5 Community Edition for free.

 


 

Cocos Code IDE is a cross-platform IDE based on Eclipse, designed especially for Cocos2d-x Lua & JavaScript developers. With the IDE, developers can easily create game projects, compile code and debug on different platforms. Moreover, developers can check the results in real time and, in the end, publish a ready-to-go package.

 

ARM DS-5 Community Edition has now been fully integrated into Cocos Code IDE, along with its feature for debugging C++ code on Android devices, meaning that you can debug game logic written in scripting languages or C++ in the same environment. At the same time, DS-5 provides a C++ development environment for Cocos Code IDE, so developers can now write key or performance-sensitive logic in C++ and then compile, package and debug it on Android devices.

 


 

The DS-5 Community Edition toolkit, based on the Professional Edition, provides the debugging and system analysis needed to create reliable and highly optimized applications for ARM processor-based devices, without the complexity and inefficiency often found in scattered open-source tools. With DS-5, Cocos Code IDE gains powerful C++ debugging on the Android platform, supporting debugging on Android devices based on ARM9/ARM11 and Cortex®-A (ARMv7-A) processor architectures. Efficient debugging of C++ logic will greatly accelerate the Android application development process.

 


 

Android platform performance analysis is another highlight of Cocos Code IDE 1.1.0. Developers just need to click “Start Capturing” in the ARM Streamline data view to collect information from the target for analysis, and click “Stop” to check the Performance Analysis Report when the analysis is finished. A simple GPU/CPU functional analysis lists key information such as the most time-consuming code segments and the most time-consuming functions, helping to lower the load on the GPU/CPU and improve the user experience.

 

The world of mobile Internet is changing dramatically every day, and so is mobile development technology. Cocos Code IDE focuses on script-based game development with the Cocos2d-x engine, and aims to enable a smoother, faster development process and a better development experience; Cocos Code IDE also actively embraces opportunities to combine advanced features from others to help developers gain a favorable position in this rapidly changing market.

 

Chinese Version 中文版: 使用DS-5从FVP中收集Trace数据

One of the new features for ARM DS-5 Development Studio in v5.20 is instruction trace for our Fixed Virtual Platform (FVP) simulation models. This enables you to capture a trace of program execution in the models that are included by default in DS-5: an ARMv7 FVP (in Professional Edition and Ultimate Edition) and ARMv8 FVP (in Ultimate Edition). If you want to try it out, you can download DS-5 Ultimate Edition and generate a 30-day eval license.

 

What is trace and why is it useful?

 

Trace is the continuous collection of information which represents the execution of software on a system. In real hardware, it is non-invasive, meaning it doesn't slow a processor down. The raw trace data is highly compressed and must be decompressed to be useful. In the case of DS-5, it helps us to see the proportion of time spent in a function, along with the machine instructions which were executed at any point in the trace capture. ARM's debug and trace infrastructure is called CoreSight (which is not modeled directly in our Fast Models and FVPs).

 

It is used at every stage of a design process, from modelling the system through to in-field failure analysis. Trace is particularly useful for bare-metal or Linux kernel debugging.

 

Since ARM Fast Models and FVPs are instruction accurate, collecting instruction trace is a natural extension to the debug functionality that DS-5 already provides for models, making the experience as close as possible to writing software for a real device. It's important to remember that Fast Models and FVPs are not cycle accurate, so the actual time it takes to execute a function won't correlate with the time it would take on the real silicon (though the proportion of time spent in that function would remain fairly consistent). Now that debug configurations have been added for the three previously mentioned FVPs, it’s easy to use it in practice. Currently, there is no support for adding trace to other models in DS-5 or for collection of data trace (where address and register values are also recorded from load and store instructions). Edit - if you have your own Fast Model based platform and would like to see trace supported on it in DS-5, then please contact ARM.

 

Making use of model trace

 

You will notice in the DS-5 Debug Configurations panel that model trace is only available for bare-metal and Linux kernel debug connections. Trace isn't the right solution for debugging Linux applications, as the extra level of complexity that an OS adds would mean sifting through an unmanageable amount of trace data.

 

Model-Trace-Debug-Configuration.png

 

The best way to test out model trace is to import an example. There are bare-metal examples for everything from simple Hello World programs to more complex RTOS programs. In the example below, I've imported the "traffic lights" program which runs on the Keil RTX RTOS.

 

All of our FVP bare-metal examples have been reconfigured to collect model trace, but you can also use your own images. Once trace is configured, it behaves exactly like trace on real hardware, using the same trace view in DS-5.

 

Our trace is being collected into a circular buffer. This is fairly common in ARM SoCs, which use an Embedded Trace Buffer (ETB) to collect a record of software execution, which is constantly overwritten and refreshed (or alternatively, just filled once).

 

Trace will start automatically whenever you run through a debug session, unless you set trace start and stop points manually. This can be useful for just tracing the function that you’re interested in. In the screenshot below, you can see the trace collected on start-up of the RTX Traffic Lights example (which was set to debug from main):

 

Model-Trace-Initial-Collection.png

 

It’s important to note that DS-5 doesn't automatically overwrite the contents of the trace view each time you collect more trace. If you've set trace start and stop points, clearing the trace view before running will show only the trace between these two points. In this example, I've started collecting trace when the traffic light timer is between the defined start and end times that the user sets in the program.

 

Model-Trace-Start-Point.png

 

How does model trace work?

 

Model trace isn't a direct model of CoreSight. Instead, it collects model events (instructions executed and exceptions) directly. DS-5 then interprets this and displays it in the same trace view used for CoreSight trace. Model trace will slow down the execution of the model on your host machine, unlike a real life system, where the CoreSight infrastructure reports trace non-invasively.

My family are American, and this is a time of year when their thoughts turn towards the family and friends in the USA who are celebrating the Thanksgiving holiday.  For me, late November also happens to coincide with the biannual release of Fast Models.

 

It has been a busy six months for the team, who have been working on a wide and varied range of models, as well as new functionality and product enhancements. As you can infer from the title, there has been a big emphasis on Cortex-M class models in this release cycle. Although the majority of Fast Model licensees are deploying them into platforms utilizing the Cortex-A models, there is a sizable contingent of users for the Cortex-M family as well. Cortex-M7 is the recently announced high-end microcontroller, whereas Cortex-M0 and Cortex-M0+ are more mature cores with very small footprints. From a modelling point of view they both leverage the new Cortex-M architecture model, which will also form the basis of other new models to be announced in 2015.

 

There are several other new models being made available alongside this release, for Media and System IP products.  These are available to lead partners, and will be in due course introduced to the standard portfolio.

 

Outside of the new models, the main focus for work in this release has been around performance improvements.  There have been three aspects to this: improvements to the underlying simulation engine, improvements in the bridges to SystemC and improvements in the way that Fast Models interacts with the host workstation keyboard and mouse.  The results of this work - and it's an ongoing task to maintain performance as the systems being simulated become more complex - will yield benefits for most partners and most applications.

 

We also added support for Visual Studio 2013 and gcc 4.7.2 as tool chains for building the simulation platforms.  Leveraging newer compilers also provides performance improvements as they generate more optimal code.

 

Another area that has been worked on is the link with the ARM DS-5 tool suite. The latest release of DS-5 (5.20) provides support for viewing trace information generated by a Virtual Platform with Fast Models.

 

2014 was a busy year, both in product development and in supporting the rapidly growing adoption of Virtual Prototypes as part of the SoC development process. Our ecosystem partners have continued to integrate Fast Models into their solutions in a variety of ways. One that has generated a lot of interest in 2014 has been "Hybrid" virtual platforms, otherwise known as co-emulation. In these, a processor subsystem running in the virtual prototype is bridged to an emulator which is used to simulate other parts of the system. A typical scenario would be for platforms that have a GPU. The hybrid approach has yielded impressive performance gains for simulating these complex systems.

 

You can get an overview of what we are talking about here (a joint presentation with Cadence at the ARM TechCon last month): Reducing Time to Point of Interest With Accelerated OS Boot

 

Now we have a moment or two to draw breath before diving into the development cycle for the 2015 releases. We have a full road map of new products to model, a focus on providing more hooks in the models for profiling the software running on them and, of course, the ongoing performance work.

 

Happy Thanksgiving!

R for Real-time

We are very excited to have partnered with Renesas Electronics to introduce support for the recently announced Renesas RZ/T1 product series in ARM DS-5 Development Studio. The new device family comprises ten ARM Powered® products aimed at industrial applications that require both high performance and real-time predictability. Based on an ARM® Cortex®-R4 processor operating at up to 600 MHz, the product line also includes configurations that feature a Cortex-M3 core to enable highly integrated asymmetrical multi-processing (AMP) applications. Visit the Renesas website (English/Japanese) if you want to learn more about the RZ/T1 series.

 

Renesas RZ/T1 in DS-5

ARM DS-5 is a complete software development tools solution for RZ/T1 users. It includes efficient C/C++ code generation for both ARM cores and full support for synchronous and asynchronous AMP debug.

 

Some benefits of DS-5 for RZ/T1:

  • ARM Compiler 5: industry reference C/C++ compiler for Cortex-R4 and Cortex-M3 processors, compatible with the widest range of RTOS, middleware and third-party tools
  • Simultaneous debug connections to both ARM processors
  • Collection, decode, synchronization and visualization of trace data from ARM CoreSight™ ETM (Embedded Trace Macrocell) and ITM (Instrumentation Trace Macrocell) units for faster bug finding
  • TÜV SÜD certified compiler and compiler qualification documentation for functional safety certification
  • Built-in OS awareness for leading commercial real time operating systems (RTOS)


Renesas-RZT1-configDB.PNG



Target connections for your every need

Depending on the stage of your software development project and your budget, you may select different technologies to connect DS-5 to your RZ/T1 target. See the summary of target connection options below to pick the right one for your needs.

 

Target connection type | Best for                                                                                        | Trace capture
DSTREAM                | Board bring-up, high-performance debug and ETM trace-based analysis                            | off-chip (4 GB DSTREAM) or on-chip (4 KB ETB)
ULINKpro               | Fast software debug with on-chip trace (Note: ULINKpro trace is not supported in DS-5)         | on-chip (4 KB ETB)
ULINKpro D             | Fast software debug with on-chip trace                                                         | on-chip (4 KB ETB)
ULINK2                 | Basic software debug                                                                           | on-chip (4 KB ETB)
CMSIS-DAP              | Silicon evaluation on development boards (USB connection to board, no debug hardware required) | on-chip (4 KB ETB)

Renesas-RZT1-target-connection.png

 

Availability

The RZ/T1 platform configuration file is available to DS-5 version 5.20 users upon request. If you require access to it now, get in touch.

Hi, it would be really useful for the team here at ARM if you could take a few moments to complete a short survey of your experience with Juno, ARM's Development Platform for ARMv8. I can feed this back into the requirements for future platforms and also try to address any issues you encountered with either the software or the hardware. Click here to complete the survey.

 

Thanks,

Liam
