Questions about Write Interleaving Exclusion in AXI4 Protocol

I would like to understand the actual reason why write data interleaving is not supported in the AXI4 protocol.

As far as I know, AXI4 no longer supports write data interleaving. Some argue that the effort and resources required to support it were too high, which led to its removal. However, when multiple masters with different transfer rates attempt write transactions at the same time, the interconnect can become congested without write data interleaving, because a slow master's burst can hold the write data channel while faster masters wait. This could reduce the bandwidth and throughput of the bus.
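To make the scenario concrete, here is a rough toy model of a single shared write data channel. It is only a sketch under my own assumptions, not a model of any real interconnect: `simulate`, the `(name, beats, period)` tuples, and the fixed grant order are all made up for illustration, and the slow master is assumed to be granted the channel first.

```python
def simulate(masters, interleave):
    """Toy model of one shared write data channel (one beat per cycle).

    masters: list of (name, beats, period); beat j of a master becomes
    available at cycle j * period. Returns {name: cycle of its last beat}.
    """
    beats = {name: n for name, n, _ in masters}
    period = {name: p for name, _, p in masters}
    sent = {name: 0 for name in beats}
    done_at = {}
    owner = None  # master currently holding the channel (no-interleave mode)
    cycle = 0
    while len(done_at) < len(masters):
        pending = [m for m in beats if sent[m] < beats[m]]
        ready = [m for m in pending if sent[m] * period[m] <= cycle]
        if interleave:
            # Any master with a beat available may use this cycle.
            chosen = ready[0] if ready else None
        else:
            # The channel stays with one burst until its last beat,
            # even on cycles where that master has no data ready.
            if owner is None:
                owner = pending[0]
            chosen = owner if owner in ready else None
        if chosen is not None:
            sent[chosen] += 1
            if sent[chosen] == beats[chosen]:
                done_at[chosen] = cycle
                owner = None
        cycle += 1
    return done_at


masters = [("slow", 4, 4), ("fast", 4, 1)]
print(simulate(masters, interleave=False))  # {'slow': 12, 'fast': 16}
print(simulate(masters, interleave=True))   # {'fast': 5, 'slow': 12}
```

In this toy case the fast master's last beat moves from cycle 16 to cycle 5 when its beats can fill the gaps in the slow burst, which is the kind of degradation I am asking about.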

So, what is the actual reason that write interleaving is not supported in the AXI4 protocol if there is potential performance degradation in some common scenarios? Are there any other pros and cons that need to be considered? And what exactly are the efforts and resources needed to support write interleaving?

  • I think your second paragraph already contains the answers to your questions.

    Write data interleaving was dropped from AXI4 because it was too complex to implement. Instead, designers tended to buffer up write data before sending it all in one burst of consecutive transfers, minimising the time the transfers might clash with another manager attempting to send data.

    Interleaving was first proposed in AXI3 for the performance/bandwidth reasons you listed, but the option was not frequently implemented, so it was dropped to simplify the protocol.

    As for what would be required to implement support for write data interleaving, I've never implemented such a system. I'd imagine that AXI managers have the slightly easier job, as they just merge together the active write data streams from each processing thread. AXI subordinates have a much more difficult job, because they need to support as many concurrent data streams as their write interleaving depth. They would either have to accept and store a data transfer for any of the active transactions as it arrives (so no use of bursts), or else buffer the received data locally for each active stream and then use burst writes once all the data for a transaction has been received.

    I would imagine most systems have more subordinates than managers, so supporting interleaving has a bigger complexity impact than the simpler approach of buffering up write data in the manager before issuing the transaction.
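
The second subordinate option described above (one local buffer per active stream, released as a burst once complete) might be sketched roughly as follows. This is a behavioural illustration only, not RTL and not taken from any real design. The class `InterleavingSubordinate` and its `accept_beat` method are hypothetical names, and returning `False` stands in for stalling a beat that would exceed the interleaving depth. The `wid` and `last` arguments play the role of the AXI3 WID and WLAST signals.

```python
class InterleavingSubordinate:
    """Hypothetical subordinate that reassembles interleaved write data."""

    def __init__(self, depth):
        self.depth = depth      # write interleaving depth (number of buffers)
        self.buffers = {}       # wid -> data beats received so far
        self.completed = []     # (wid, beats) in the order bursts completed

    def accept_beat(self, wid, data, last):
        """Store one write data beat; return False if it cannot be accepted."""
        if wid not in self.buffers:
            if len(self.buffers) >= self.depth:
                return False    # all buffers in use: this beat must wait
            self.buffers[wid] = []
        self.buffers[wid].append(data)
        if last:
            # Whole burst received: hand it on as one contiguous write.
            self.completed.append((wid, self.buffers.pop(wid)))
        return True


sub = InterleavingSubordinate(depth=2)
stream = [(0, "A0", False), (1, "B0", False), (0, "A1", True), (1, "B1", True)]
for wid, data, last in stream:
    assert sub.accept_beat(wid, data, last)
print(sub.completed)  # [(0, ['A0', 'A1']), (1, ['B0', 'B1'])]
```

Even in this simplified form, the storage scales with the interleaving depth times the maximum burst length, and it has to be paid in every subordinate that supports the feature.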