I am a bit confused by the NEON pipeline stages. Can anyone explain the Cortex-A8 NEON pipeline with some examples? I want to know how multi-cycle instructions are paired (scheduled) with other instructions
in the NEON pipeline, and how the instruction cycle counts are accounted for by the pipeline stages and pairing.
Suppose I have two independent vector multiplies, one after the other. The Cortex-A8 has a 10-stage NEON pipeline and the multiplies are independent, so my understanding is that when the first VMUL is at
stage N6, the second VMUL should be at N5. If that is the case, each multiply effectively takes 1 cycle. But each VMUL takes 4 cycles. How is this explained by the pipeline? I couldn't find any related information.
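To make the two readings I am comparing concrete, here is a toy sketch in Python. It is not a model of the real Cortex-A8 pipeline: the function `start_cycles` and its `issue_cycles` parameter are hypothetical names I made up, and the only assumption is that each instruction occupies the issue slot for a fixed number of cycles.

```python
def start_cycles(n_instr, issue_cycles):
    """Return the cycle at which each of n_instr independent instructions
    enters the pipeline, assuming (hypothetically) that every instruction
    holds the issue slot for issue_cycles cycles."""
    return [i * issue_cycles for i in range(n_instr)]

# Reading 1: fully pipelined, a new VMUL can enter every cycle.
pipelined = start_cycles(2, 1)

# Reading 2: each VMUL holds the issue slot for 4 cycles.
multicycle = start_cycles(2, 4)

print(pipelined)   # [0, 1]
print(multicycle)  # [0, 4]
```

With reading 1 the second VMUL trails the first by one stage, which is what I expected. With reading 2 it trails by four cycles, which is what the 4-cycle figure seems to imply.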
Can anyone help me understand multi-cycle instruction timing? Can these extra cycles be hidden by placing other NEON instructions between the multi-cycle instructions?
What does this statement mean: "The NEON engine can potentially dual issue on both the first and last cycle of a multi-cycle instruction, but not on any of the intermediate cycles."? What are the first and last cycles here?
Can someone explain with an example?
Thanks
Ramesh