The Cortex-M4F has separate hardware for integer and floating-point arithmetic. Both integer and floating-point divide instructions take up to 12 clock cycles to complete. I've verified that integer instructions immediately following a VDIV can execute while the VDIV is still finishing. However, the reverse does not seem to be true: floating-point instructions immediately following an integer divide (SDIV or UDIV) must wait for the divide to complete before they proceed. Does anyone know why you can't overlap the execution of an integer divide with the execution of floating-point instructions?
I am pretty sure it is a pipeline thing. I remember a Doulos webinar explaining the CM4 pipeline, but I am not sure if there is a public version.