My question to you is: why do you care?
The important question isn't how many cycles pass from the "start" to the "end" of the pipeline; it's how many cycles need to pass to traverse a critical path from one pipeline stage to another. The fetch stage F1 has its PC as its input dependency, so the worst-case latency depends on when the PC is resolved. That happens by stage E5, so you won't see a critical path from N6 to F1.
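To make the arithmetic concrete, here is a rough sketch of the comparison. It assumes the numeric suffix in each stage name (F1, E5, N6) is that stage's position in the pipeline and that each stage takes one cycle; the helper name `feedback_distance` is made up for illustration and isn't taken from any Arm documentation.

```python
# Assumption: the number in each stage name is its position in the pipeline,
# and every stage takes exactly one cycle.
STAGE_POSITION = {"F1": 1, "E5": 5, "N6": 6}


def feedback_distance(producer: str, consumer: str) -> int:
    """Stage boundaries crossed between the consumer stage and the later
    stage that produces the value it depends on (hypothetical helper)."""
    return STAGE_POSITION[producer] - STAGE_POSITION[consumer]


# F1 depends on the PC, which is resolved by E5.
pc_feedback = feedback_distance("E5", "F1")

# Full pipeline length, which is not the distance that matters for F1.
start_to_end = feedback_distance("N6", "F1")

print(f"E5 -> F1 feedback distance: {pc_feedback} stages")
print(f"F1 -> N6 pipeline length:   {start_to_end} stages")
```

Under those assumptions the path that matters for F1 ends at E5, which is shorter than the full F1-to-N6 length.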