• floating point performance benchmark
    My benchmark of an Arm processor (arm64-v8a) on a Householder QR decomposition, written in assembly language, is yielding a computation rate of about 2 Gflops. This seems exceptionally high and...
  • Performance of Memory Benchmark Slowly Improves on Cortex-A72
    I have a 16-core machine with Cortex-A72 processors. The physical layout is shown at the end of the post. Each core has its own 48KB L1i cache and 32KB L1d cache. Clusters of 4 cores have a shared 2048KB...
  • Cortex-M3 - Conditions for IT folding
    Hi folks, Some weeks ago, I discovered the mechanism of IT instruction folding supported by the Cortex-M3. As mentioned in 'Cortex-M3 Devices Generic User Guide', "In some situations, the processor can...
  • Cortex-M3: Little endian
    Hi All, The Cortex-M3 in STM32F100xx devices stores data in little-endian format. Does this mean that even the memory locations which I see in the Keil window are in little-endian format? For example: 0x20000AFF...
  • Instruction timings - ARM Cortex-M3
    I am using the following 3 assembly sections to read a memory-mapped I/O to multiple registers and to read the same I/O and save it to RAM, respectively, on an ARM Cortex-M3. I want to know exactly how many...