Browse By Tags

  • Vulkan Samples: Bandwidth and Throughput Optimizations for Mobile

    Vulkan is changing the landscape of graphics, ushering in a new age of visual fidelity for Android devices. While powerful, the Vulkan API can be quite complex for mobile developers. Therefore, at GDC 2019, Arm released a set of Vulkan samples that illustrated…

  • Arm Mali Best Practices 2.1 Released

    The latest update of the Arm Mali Best Practices Developer Guide is now available. It has a number of important updates for mobile developers to read through, so they can get the most out of their projects.

    First, the guide has been released alongside…

  • Automated Performance Advice for Android Games

    Mobile gaming is advancing rapidly in terms of sophistication, with increasing numbers of successful high-fidelity 3D games exploiting the performance capabilities of the latest smartphones. As the complexity of content rises, the traditional approach…

  • Development tools support for Arm Custom Instructions

    In 2019, Arm Custom Instructions were announced. This is a new standard feature of the Armv8-M architecture, allowing developers to implement use case-specific workload acceleration, pushing performance and longevity of devices ready for the fifth wave of…

  • Partial register dependency neon

    I'm having trouble finding any information on partial NEON register dependencies.

    Take for example the following code:

    ld2 {v0.16b, v1.16b}[0], [x0]
    ld2 {v0.16b, v1.16b}[1], [x1]
    ld2 {v0.16b, v1.16b}[2], [x2]
    ...

    Does the second load have to wait…

  • Accelerate your shaders with Mali Offline Compiler 7.0

    Shader programs for OpenGL ES and Vulkan are one of the most important inputs an application provides to render a scene because they define the processing operations executed by the GPU shader core hardware. They are also one of the hardest aspects of…

  • compile time constant expressions

    Note: This was originally posted on 12th March 2009 at http://forums.arm.com

    I am seeing code like this generated from the armcc v4.0 compiler (just downloaded it a couple hours ago):

      MOV r0,#5
      CLZ r0,r0
      RSB r7,r0,#0x3f

    So I would think…

  • Optimization difference between C and C++

    I noticed an optimization difference when compiling the same simple source code with ARM GCC as C and as C++. The C++ version seems to optimize stack usage a lot less.

    To demonstrate this problem, I compiled the following code with arm-none-eabi-gcc version gcc…

  • Stack usage on function call

    Hello,

    I am encountering a stack usage problem: if a function (or statement expression or lambda) returning a structure is called directly as an argument of another function, a new structure is added on the stack and the memory is not reused.

    The example…

  • How to use #Pragma to change the specific file optimization level in IAR

    Static clustering for a specific file is causing an issue. I want to change that specific file from high-level optimization to no optimization.

    #pragma optimize=none: I tried this before the function definition, but it does not give the same effect as excluding…

  • ARM Compiler 6 - Optimization guidelines

    The V5 to V6 compiler migration document says the following with respect to optimizations:

    -O0 No Optimization. Not recommended for use in ARM Compiler 6.6

    -O1 Limited Optimization. This is currently the recommended level for source level debugging.

  • BaseFVP: undef behavior or emulation bug?

    Version: Fast Models [11.4.37 (Jun 19 2018)] (Free) on Linux x86_64

    Repro: here.

    The program creates two binaries s.bin and ns.bin, and concatenates them into pkg.bin. The package is provided to the BaseFVP as its secureflashloader's file.

    All (four…

  • Arm keil optimization

    Hi sir,

    My IDE is Arm Keil v5, and I want to know what the compiler really does when optimization is turned on (-O1, -O2, -O3, and cross-module optimization).

    I have only found some brief descriptions. Is there any documentation that describes it in detail…

  • Code not working on LPC2148 board

    I bought an ARM LPC2148 development board and started uploading code to it. I wrote the code to blink the 4 on-board LEDs. It uploads to the board successfully, but the LEDs never blink (let alone blink according to the code). I have rechecked…

  • Optimize Your Linux/Android System

    In this video, learn how to use ARM DS-5™ Streamline to find out which processes, threads, functions, and even source code lines are slowing down your Linux/Android system by generating an excessive number of CPU performance events such as cache misses and…

  • Android NDK options: What compiler flags should I use for my libraries and apps to get the best performance across the widest range of SoCs?

    Compiler options are one of those subjects that can get decidedly more complicated as you descend the rabbit hole. Undoubtedly, developers using or creating C/C++/Assembly libraries in Android are seeking to compile the most optimal binary for as many…

  • Using Streamline to Guide Cache Optimization

    Introduction

    Poor cache utilization can have a big negative impact on performance, and improving it typically involves little or no trade-off. Unfortunately, detecting poor cache utilization is often difficult to do…

  • Arm Guide for Unity Developers v3.1 is available

    Unity is a multi-platform game development engine used by the majority of game developers. It enables you to create and distribute 2D and 3D games and other graphics applications.

    At ARM, we care about game developers. We know we can now achieve console…

  • ARM NEON optimization

    Welcome to the ARM NEON optimization guide!

    1. Introduction

    After reading the article ARM NEON programming quick reference, I believe you have a basic understanding of ARM NEON programming. But when applying ARM NEON to real-world applications, there…

  • Bridging the Gap Between Arm Physical IP and Academic Research

    Logic IP, such as Standard Cell (SC) libraries, is the foundation for the entire backend design and optimization flow in modern application-specific integrated circuit designs. Arm has long been building high-quality logic IP, such as SC libraries, with well…

  • A Deep Learning Survival Guide for Computer Architects

    Over the last few years, there’s been a terrific amount of interest in artificial intelligence, and specifically the branch of machine learning known as 'deep learning'. To help computer architects get “up to speed” on deep learning, I co…

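  • Using Arm Forge in Continuous Integration Systems

    This article is translated from Continuous Integration with Arm Forge

    Continuous integration (CI) is widely used in software engineering to improve software integration and quality, especially for large software projects with many contributors. As code grows and optimization deepens, high performance computing (HPC) applications can also use CI frameworks such as Jenkins to ensure that the software meets accuracy and performance requirements.

    CI tools are essentially robots that manage parallel builds and tests of a project across a collection of code and dependent resources. They can interface with version control software, build systems, or unit testing frameworks to achieve better integration in the development workflow. Finally, they can also collect and aggregate test data to show the health of the application…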

  • Continuous Integration with Arm Forge

    Continuous integration or CI is widely used in software engineering to improve software integration and quality, especially for large projects that involve a lot of developers. Naturally, high performance computing (HPC) applications can benefit from…

  • Three Dimensions in 3DIC - Part III

    In part I and part II of this blog series, Greg used Figure 1 below to partition the 3D space (no pun intended) into three parts, and discussed the left and middle portions encompassing '3D-SIC' and '3D-SoC'. In this third and final part of the blog, he…

  • Three Dimensions in 3DIC - Part II

    In the second of a three-part series exploring three-dimensional integrated circuits, Arm Research Fellow Greg Yeric moves onto what the future could hold for this technology.

    In the first part of this series, I used Figure 1 below to partition the 3D…