  • Coding for Neon - Part 4: Shifting Left and Right

    Martyn

    Chinese Version 中文版:NEON编码 - 第4部分: 左右移位

    This article introduces the shifting operations provided by Neon, and shows how they can be used to convert image data between commonly used color depths.

    Previous articles in this series:

    • Part 1: Loads and S…
    • over 7 years ago
    • Processors
    • Processors blog
  • What is eXecute-Only-Memory (XOM)?

    Joseph Yiu

    An introduction to eXecute-only-Memory

    eXecute-Only-Memory (XOM) is a firmware protection technique that helps prevent third parties from stealing or reverse engineering firmware, while still allowing third parties to add additional software to the…

    • over 3 years ago
    • Processors
    • Processors blog
  • Function Parameters on 32-bit Arm

    Niall Cooling

    Function call basics

    Typically when teaching a class about embedded C programming, one of the early questions we ask is "Where does the memory come from for function arguments?"

    Take, for example, the following simple C function:

    void test_function…

    • over 6 years ago
    • Processors
    • Processors blog
  • LZ4 decompression routine for Cortex-M0 and later

    Jens Bauer

    Introduction

    Recently I spoke about an LZ4 decompression routine I converted from 6502 code into Arm Cortex-M0 code.

    For some reason, I could not find my decompression routine, so I decided to convert it again. The result is below; the routine is now…

    • lz4cut.tar.bz2
    • over 4 years ago
    • Processors
    • Processors blog
  • Statistical Profiling Extension for ARMv8-A

    Michael Williams

    The Statistical Profiling Extension is an optional feature in ARMv8.2. This article will provide an overview of the Extension, describe how it works, and the advantages it provides over other profiling mechanisms.

    Recently, Will Deacon posted a request…

    • over 3 years ago
    • Processors
    • Processors blog
  • A fairly quick Count Leading Zeroes for Cortex-M0

    Jens Bauer

    The Basics

    Some of us need to find out how many leading zero-bits there are in a 32-bit word. Such a feature is useful on many occasions, especially when writing a fast divide subroutine.

    The Cortex-M3 and later have a CLZ instruction which can count…

    • over 6 years ago
    • Processors
    • Processors blog
  • How to debug: CoreSight basics (Part 3)

    Eoin McCann

    This is the third in a series of blogs that gives a technical introduction to the ARM CoreSight Debug and Trace technology and architecture. You can check out my previous blogs How to debug: CoreSight basics (Part 1) and How to debug: CoreSight basics…

    • over 5 years ago
    • Processors
    • Processors blog
  • Detecting Overflow from MUL

    Jacob Bramley

    Detecting Overflow from Arithmetic Operations

    I discussed in a previous blog post that it is possible to set some condition flags based on the result of an arithmetic operation. Consider the following code:

    adds    r0, r0, r1
    bvs     <some_address>
    
    …
    • over 7 years ago
    • Processors
    • Processors blog
  • Extended System Coherency: Part 2 - Implementation, big.LITTLE, GPU Compute and Enterprise

    Neil Parris

    Chinese Version 中文版:扩展系统一致性 - 第 2 部分 - 实施、big.LITTLE、GPU 计算和企业级应用

    This is the second part of a series of blogs about hardware coherency. In the first blog I introduced the fundamentals of cache coherency. This part talks about the implementation of hardware…

    • over 6 years ago
    • Processors
    • Processors blog
  • How to debug: CoreSight basics (Part 1)

    Eoin McCann

    Let's be honest, debug can be a bit of a pain. At the best of times it's a nuisance and in the worst case scenario a complex web of wires that need to be configured properly in order to diagnose and solve your SoC design problems.

    A study conducted…

    • over 5 years ago
    • Processors
    • Processors blog
  • Page Colouring on ARMv6 (and a bit on ARMv7)

    Jacob Bramley

    Page colouring is a technique for allocating pages for an MMU such that the pages exist in the cache in a particular order. The technique is sometimes used as an optimization (and is not specific to ARM), but as a result of the cache architecture some…

    • over 7 years ago
    • Processors
    • Processors blog
  • Divide and Conquer

    Chris Shore

    Division on ARM Cores

    “At the end of the day, we must go forward with hope and not backward by fear and division.” – Jesse Jackson.

    It often surprises me how many people believe that “ARM doesn’t do division” or “ARM cores don’t have…

    • over 6 years ago
    • Processors
    • Processors blog
  • Virtualization on ARM with Xen

    Andrew Wafaa

    With ARM entering the server space, a key technology in play in this segment is virtualization. Virtualization is not a tool solely for servers and the data center; it is also used in the embedded space in segments like automotive and it is also starting…

    • 2972.zip
    • over 6 years ago
    • Processors
    • Processors blog
  • Condition Codes 1: Condition Flags and Codes

    Jacob Bramley
    This post is part of a series:
    • Condition Codes 1: Condition Flags and Codes
    • Condition Codes 2: Conditional Execution
    • Condition Codes 3: Conditional Execution in Thumb-2
    • Condition Codes 4: Floating-Point Comparison Using VFP

    Every practical…

    • ccdemo.tar.gz
    • over 7 years ago
    • Processors
    • Processors blog
  • Memory access ordering part 2: Barriers and the Linux kernel

    Leif Lindholm

    My previous post provided an introduction to the concept of memory access ordering. It did not however provide any solution to the problem, or necessarily specify where such ordering can be significant.

    Now, not all software developers need to be deeply…

    • 1940.zip
    • over 7 years ago
    • Processors
    • Processors blog
  • Using the Stack in AArch64: Implementing Push and Pop

    Jacob Bramley

    As described in my last article, AArch64 performs stack pointer alignment checks in hardware. In particular, whenever the stack pointer is used as the base register in an address operand, it must have 16-byte alignment.

    The alignment checks can be very…

    • over 5 years ago
    • Processors
    • Processors blog
  • Condition Codes 3: Conditional Execution in Thumb-2

    Jacob Bramley
    This post is part of a series:
    • Condition Codes 1: Condition Flags and Codes
    • Condition Codes 2: Conditional Execution
    • Condition Codes 3: Conditional Execution in Thumb-2
    • Condition Codes 4: Floating-Point Comparison Using VFP

    Note: Armv8 deprecates…

    • over 7 years ago
    • Processors
    • Processors blog
  • Getting Started with Arm Microcontrollers and Assembly Programming

    Laxmi Kant Tiwari

    Hello, and welcome to my Arm programming tutorial series. I would like to give a big thank you to Abhishek Agrawal, a final-year undergraduate student at IIT Kharagpur, for his help in completing this blog.

    Let’s start with basics. RISC machines have…

    • over 6 years ago
    • Processors
    • Processors blog
  • Branch and Call Sequences Explained

    Jacob Bramley

    What Does a Branch Do?

    A branch, quite simply, is a break in the sequential flow of instructions that the processor is executing. Some other architectures call them jumps, but they're essentially the same thing. The following is a trivial, and hopefully…

    • over 7 years ago
    • Processors
    • Processors blog
  • Coding for Neon - Part 3: Matrix Multiplication

    Martyn

    In part 1 of this series we dealt with how to load and store data with NEON, and part 2 involved how to handle the leftovers resulting from vector processing. Let us move on to doing some useful data processing - multiplying matrices.

    Matrices

    In this…

    • matrix_asm_sched.s.txt.zip
    • over 7 years ago
    • Processors
    • Processors blog
  • Coding for Neon - Part 5: Rearranging Vectors

    Martyn

    This article describes the instructions provided by Neon for rearranging data within vectors. Previous articles in this series:

    • Part 1: Loads and Stores 
    • Part 2: Dealing with Leftovers
    • Part 3: Matrix Multiplication
    • Part 4: Shifting Left and Right

    In…

    • over 7 years ago
    • Processors
    • Processors blog
  • How to Call a Function from Arm Assembler

    Dave Butcher

    Once you move beyond short sequences of optimised Arm assembler, the next likely step will be managing more complex, optimised routines using macros and functions. Macros are good for short repeated sequences, but often quickly increase the size of…

    • over 7 years ago
    • Processors
    • Processors blog
  • Coding for Neon - Part 1: Load and Stores

    Martyn

    Arm's Neon technology is a 64/128-bit hybrid SIMD architecture designed to accelerate the performance of multimedia and signal processing applications, including video encoding and decoding, audio encoding and decoding, 3D graphics, speech and image…

    • over 7 years ago
    • Processors
    • Processors blog
  • Coding for Neon - Part 2: Dealing With Leftovers

    Martyn

    In part 1 of this series on Neon about loads and stores we looked at transferring data between the Neon processing unit and memory. In this post, we deal with an often encountered problem: input data that is not a multiple of the length of the vectors…

    • over 7 years ago
    • Processors
    • Processors blog
  • Running AlexNet on Raspberry Pi with Compute Library

    Gian Marco Iodice

    If you’d like to develop your Convolutional Neural Networks using just the Compute Library and a Raspberry Pi, this step-by-step guide will show you how… and it comes complete with all the tools you’ll need to get up and running.

    If…

    • over 2 years ago
    • Processors
    • Processors blog