What is considered "good" and "average" pipeline utilization?

yaniv.sapir 2 months ago

We are using an ARM Cortex-M4 in our application. Recently I have been working on optimized, performance-critical DSP code. The code is written in C and compiled for the target using ARM Compiler 6 (armclang). When testing, I get a cycle count which is considerably higher than expected. Looking at the disassembly, the generated code seems pretty good, which makes me assume the difference comes from pipeline stalls.

Assuming a small function is executed in an interrupt-free environment, what would be considered good pipeline utilization that I can expect from compiled code?

As a baseline - in our case, the C code is translated to ~3200 (LDRSH, SMLABB) instructions, while the execution time is ~5000 cycles.
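To put those figures in perspective, here is a minimal sketch in C of the arithmetic behind them. The function names are hypothetical, made up for illustration, and the "ideal" of one cycle per instruction is only an assumed reference point, not something the Cortex-M4 guarantees for every instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Average cycles per instruction (CPI) over a measured region. */
double cycles_per_instruction(uint32_t cycles, uint32_t instructions)
{
    assert(instructions > 0);   /* avoid dividing by zero */
    return (double)cycles / (double)instructions;
}

/* Average instructions per cycle (IPC), the inverse of CPI. */
double instructions_per_cycle(uint32_t cycles, uint32_t instructions)
{
    assert(cycles > 0);         /* avoid dividing by zero */
    return (double)instructions / (double)cycles;
}

/*
 * Cycles spent beyond an ASSUMED ideal of exactly one cycle per
 * instruction. Returns 0 if the measured count is not above that ideal.
 */
uint32_t extra_cycles(uint32_t cycles, uint32_t instructions)
{
    return (cycles > instructions) ? (cycles - instructions) : 0u;
}
```

With the approximate numbers above, cycles_per_instruction(5000, 3200) gives about 1.56, instructions_per_cycle(5000, 3200) gives 0.64, and extra_cycles(5000, 3200) gives 1800, i.e. roughly 1800 cycles that I cannot attribute to a one-cycle-per-instruction model.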

While we are at it: using the Arm Development Studio environment and a DSTREAM debugger, is it possible to observe individual pipeline stages and see where bubbles are formed?

Top replies

vstehle 2 months ago +2 suggested

Dear yaniv.sapir, the Cortex-M4 TRM has a table of instruction timings, where you can see that cycles per instruction vary from 1 to more than 10 in some cases. The memory operations might also...