    smlal r0, r1, r3, r4
    smlal r0, r1, r3, r4

smlal takes 3 cycles, and its destination registers are available in E5. So the first instruction releases r0 and r1 at cycle 3 + 5 = 8.
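
The arithmetic above can be written out as a small sketch. It only restates the reasoning in this post (issue cycles plus result stage) and is not a cycle-accurate model of the pipeline. The helper name `release_cycle` and the two constants are hypothetical, introduced here for illustration; the values 3 and 5 are the ones quoted above.

```python
# Hypothetical helper: mirrors the arithmetic in the post, nothing more.
def release_cycle(issue_cycles: int, result_stage: int) -> int:
    """Cycle at which the destination registers are released, per the post's reasoning."""
    return issue_cycles + result_stage


# Figures taken from the post: smlal takes 3 cycles, result available in E5.
SMLAL_ISSUE_CYCLES = 3
SMLAL_RESULT_STAGE = 5

first_release = release_cycle(SMLAL_ISSUE_CYCLES, SMLAL_RESULT_STAGE)
print(first_release)  # prints 8
```

If the real pipeline counts the result stage from a different starting cycle, only the body of `release_cycle` would need to change.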