ASIMD multiply-accumulate instruction

fansi over 9 years ago

Instruction Group                    | AArch64 Instructions | Exec Latency | Execution Throughput | Utilized Pipelines
ASIMD FP multiply accumulate, Q-form | VMLA, VMLS, VFMA,    | 9 (4)        | 1                    | F0/F1

ASIMD multiply-accumulate pipelines support late-forwarding of accumulate operands from similar μops, allowing a typical sequence of floating-point multiply-accumulate μops to issue one every four cycles (accumulate latency shown in parentheses).

(1) In the description above, what does "late-forwarding" mean?

(2) What does "allowing a typical sequence of floating-point multiply-accumulate μops to issue one every four cycles" mean?

Top replies

venkataramanan over 9 years ago

I believe that in order to issue an FMA operation you don't need all three input operands to be ready. It can start when the multiply operands are ready. Later it can take the accumulate operand when it finished...