A53: PMU - BUS_ACCESS_LD - Write-Streaming

I am using a Cortex-A53 processor (Xilinx Zynq UltraScale+ SoC).

My problem is that I get a high BUS_ACCESS_LD count in write-streaming/read-allocate mode when I run a memset (a self-written memset in assembly). On the Xilinx chip I can also measure the write and read byte counts at the DDR memory controller ports, and there I can see that the actual read byte count is not that high.

Testcase 1: memset of 1.085.440 bytes, write-streaming disabled:

L2D_CACHE: 32776
BUS_ACCESS_LD: 65547
L1D_CACHE_REFILL: 16386
L1D_CACHE_WB: 16386
L2D_CACHE_REFILL: 16388
L2D_CACHE_WB: 16375


DDRC.S1 Write Byte Count: 524160
DDRC.S1 Read Byte Count: 524480
DDRC.S2 Write Byte Count: 523840
DDRC.S2 Read Byte Count: 524416

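One cache line is 64 bytes. BUS_ACCESS counts beats, and the data width of the bus is 16 bytes, so one line refill takes 4 beats. 16386 L1D refills × 4 beats = 65544, which is close to the measured BUS_ACCESS_LD of 65547. These values seem to make sense.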
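As a rough sanity check of this reasoning, the short Python sketch below converts the Testcase 1 counters into bus beats. It assumes only what is stated above (64-byte cache lines, 16 bytes per beat); the function and variable names are purely illustrative.

```python
CACHE_LINE_BYTES = 64   # cache line size, as stated in the post
BUS_BEAT_BYTES = 16     # bus data width, as stated in the post

def beats_for_lines(lines):
    """Bus beats needed to transfer `lines` full cache lines."""
    return lines * CACHE_LINE_BYTES // BUS_BEAT_BYTES

def beats_for_bytes(nbytes):
    """Bus beats corresponding to `nbytes` seen at the DDR controller."""
    return nbytes // BUS_BEAT_BYTES

# Testcase 1 (write-streaming disabled), values copied from the measurement
bus_access_ld = 65547
l1d_cache_refill = 16386
ddr_read_bytes = 524480 + 524416  # DDRC.S1 + DDRC.S2 read byte counts

beats_from_refills = beats_for_lines(l1d_cache_refill)
beats_from_ddr = beats_for_bytes(ddr_read_bytes)

print("beats implied by L1D refills:", beats_from_refills)
print("beats implied by DDR reads:  ", beats_from_ddr)
print("measured BUS_ACCESS_LD:      ", bus_access_ld)
```

All three numbers land within a few beats of each other, which is what one would expect if every written line is first read from memory.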

Testcase 2: memset of 1.085.440 bytes, write-streaming enabled:

L2D_CACHE: 16388
BUS_ACCESS_LD: 16419
L1D_CACHE_REFILL: 6
L1D_CACHE_WB: 0
L2D_CACHE_REFILL: 9
L2D_CACHE_WB: 16255


DDRC.S1 Write Byte Count: 520128
DDRC.S1 Read Byte Count: 384
DDRC.S2 Write Byte Count: 520192
DDRC.S2 Read Byte Count: 64

The counts for L2D_CACHE, L2D_CACHE_WB and BUS_ACCESS_LD are close together. It makes sense that the cache refill counts are low and that the L1 write-back count is also low. But I do not understand why BUS_ACCESS_LD is so large in this case, since I can see that only a few bytes are read at the DDR memory controller ports.
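To put a number on the mismatch, here is a small Python sketch using only the Testcase 2 values above and the 16-bytes-per-beat assumption from Testcase 1. The variable names are illustrative, and the script only quantifies the discrepancy; it does not identify its cause.

```python
BUS_BEAT_BYTES = 16  # bus data width assumed in the post

# Testcase 2 (write-streaming enabled), values copied from the measurement
bus_access_ld = 16419
l2d_cache = 16388
l2d_cache_wb = 16255
ddr_read_bytes = 384 + 64  # DDRC.S1 + DDRC.S2 read byte counts

# Read beats that the DDR controller traffic would account for
expected_read_beats = ddr_read_bytes // BUS_BEAT_BYTES

# BUS_ACCESS_LD counts not explained by observed DDR reads
unexplained = bus_access_ld - expected_read_beats

# How many BUS_ACCESS_LD counts occur per L2 access
per_l2_access = bus_access_ld / l2d_cache

print("read beats implied by DDR reads:", expected_read_beats)
print("unexplained BUS_ACCESS_LD counts:", unexplained)
print("BUS_ACCESS_LD per L2D_CACHE access: %.3f" % per_l2_access)
```

The DDR reads account for only a few dozen beats, while BUS_ACCESS_LD comes out at almost exactly one count per L2D_CACHE access, so the counter appears to track the streamed lines rather than actual read traffic.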

There is an erratum notice for the Cortex-A53 titled "PMU counter values might be inaccurate when monitoring certain events", but it only mentions BUS_ACCESS and BUS_ACCESS_ST. Is there also an error in BUS_ACCESS_LD when write-streaming is enabled?