Thank you for your reply. A few more questions:
Are Dn and Dd both 128-bit-wide registers? (I am referring to the diagram in the original question.)
Also, the diagram shows 4 parallel operations. Is this the actual number of parallel operations…