Hey,
on our development board we use PCIe to exchange data between the two Tegras on an NVIDIA Drive PX2.
Basically, the data coming in across the NT ports behaves like a DMA engine writing to system RAM. We allocate memory with an interface function from the PCIe chip's API; internally, that function uses "dma_alloc_coherent" from the Linux kernel. In our application we can then use the address of the allocated memory area and do our work. Memory barriers guarantee the right order of execution between reads and writes.
The problem we are facing is that, as far as we can tell, new data is not fetched from RAM and we read old data from the CPU's cache instead. Unfortunately the MMU is disabled, as we can't use the PCIe driver when it is activated.
I have come across this document about cache coherency, but I am not sure whether it can help us. In addition, I am a complete newbie at programming ARM at such a low level.
Any help is appreciated, thanks in advance.
Jan
Thank you all for your help!
With the following instruction it works now:
asm volatile("dc civac, %0" : : "r" (addr) : "memory");
Note: Only cleaning or only invalidating the cache didn't do the trick; we must do both.
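For anyone who needs this for a whole buffer rather than a single address: "dc civac" acts on one cache line, so it has to be repeated for every line the buffer overlaps. Below is a minimal sketch of how that could look. The helper names dcache_line_size and dcache_clean_invalidate_range are made up for illustration, the 64-byte fallback for non-AArch64 builds is an assumption, and it relies on CTR_EL0 and "dc civac" being usable at the exception level you run at (Linux normally allows both from user space, as far as I know).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smallest data cache line size in bytes.
 * On AArch64 it is read from CTR_EL0.DminLine (bits [19:16], log2 of the
 * number of 4-byte words). The fallback of 64 is only an assumption so the
 * sketch also builds on other hosts. */
static inline size_t dcache_line_size(void)
{
#if defined(__aarch64__)
    uint64_t ctr;
    __asm__ volatile("mrs %0, ctr_el0" : "=r"(ctr));
    return (size_t)4 << ((ctr >> 16) & 0xF);
#else
    return 64;
#endif
}

/* Hypothetical helper: clean and invalidate every cache line that overlaps
 * [buf, buf + len). Returns the number of lines visited. On non-AArch64
 * builds the instructions are compiled out and only the loop runs. */
static size_t dcache_clean_invalidate_range(const volatile void *buf, size_t len)
{
    size_t line = dcache_line_size();
    assert(line != 0 && (line & (line - 1)) == 0); /* must be a power of two */

    if (len == 0)
        return 0;

    /* Round the start down to a line boundary so partial lines are covered. */
    uintptr_t start = (uintptr_t)buf & ~((uintptr_t)line - 1);
    uintptr_t end = (uintptr_t)buf + len;
    size_t count = 0;

    for (uintptr_t addr = start; addr < end; addr += line) {
#if defined(__aarch64__)
        __asm__ volatile("dc civac, %0" : : "r"(addr) : "memory");
#endif
        count++;
    }

#if defined(__aarch64__)
    /* Wait for the maintenance operations to complete before continuing. */
    __asm__ volatile("dsb sy" : : : "memory");
#endif
    return count;
}
```

Usage would be something like dcache_clean_invalidate_range(rx_buf, rx_len) on the receive buffer, where rx_buf and rx_len stand for whatever the allocation function returned and the size that was requested.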
That sounds odd to me, unless you run "dc civac" before getting the data from the PCIe. If you ran it afterwards and the PCIe isn't part of the coherency domain, you may have overwritten what the PCIe sent over. The other exception: if you also have an outer cache such as an L3, "dc civac" should push the data out to L3, and to clean and invalidate from there you should use the power transition mechanism of the cache coherent interconnect. On my CCN, for example, I use a power transition from FAM (Fully Associative L3 Memory) to NOL3 (No L3).
Either way, good that "dc civac" worked for you.
It is indeed used before getting the data.
Like I said before, I am a newbie with low level programming on ARM. I should read some docs on caches and co...