I'm currently using a Xilinx SoC to develop software that shares memory between the Cortex-A cores and the FPGA.
I've tried reserving memory in Linux and mapping it with mmap() on /dev/mem. The problem is that with O_SYNC it is
very slow, since my software accesses every byte computed by the FPGA many times, and it seems that when O_SYNC is
passed to open(), the physical memory is mapped as uncacheable.
So I want to use cacheable memory and manage synchronization in code. Is there any way to flush/invalidate
the cache from user space on a Cortex-A9 running Linux?
No; you'll need a kernel-side device driver if you want to do cache maintenance.
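
As a rough illustration of what the user-space half of that could look like, here is a minimal sketch. It assumes a custom driver that exposes flush/invalidate requests through ioctl(). The struct shm_sync_req, the SHM_IOC_* request numbers, and both helper functions are hypothetical names made up for this example, not an existing kernel interface. The only hardware fact it relies on is the 32-byte L1 cache line of the Cortex-A9. The driver itself would do the actual maintenance in kernel space, for instance through the kernel's DMA mapping API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Cortex-A9 L1 data cache line size in bytes. */
#define CACHE_LINE 32u

/* HYPOTHETICAL request layout for an imagined driver; not a real kernel API. */
struct shm_sync_req {
    uint32_t offset;  /* byte offset into the shared buffer */
    uint32_t length;  /* number of bytes to synchronize */
};

/* HYPOTHETICAL ioctl numbers; a real driver would define its own. */
#define SHM_IOC_MAGIC      'S'
#define SHM_IOC_FLUSH      _IOW(SHM_IOC_MAGIC, 1, struct shm_sync_req)
#define SHM_IOC_INVALIDATE _IOW(SHM_IOC_MAGIC, 2, struct shm_sync_req)

/*
 * Widen [offset, offset + length) to whole cache lines, since cache
 * maintenance operates on full lines. Assumes offset + length does not
 * overflow 32 bits.
 */
struct shm_sync_req shm_align_range(uint32_t offset, uint32_t length)
{
    assert(length > 0);
    uint32_t start = offset & ~(CACHE_LINE - 1u);
    uint32_t end = (offset + length + CACHE_LINE - 1u) & ~(CACHE_LINE - 1u);
    struct shm_sync_req req = { start, end - start };
    return req;
}

/*
 * Ask the (hypothetical) driver to invalidate the lines covering the range
 * before the CPU reads data written by the FPGA. Returns the ioctl() result:
 * -1 on failure with errno set.
 */
int shm_invalidate(int fd, uint32_t offset, uint32_t length)
{
    struct shm_sync_req req = shm_align_range(offset, length);
    return ioctl(fd, SHM_IOC_INVALIDATE, &req);
}
```

With such a driver, the application would call shm_invalidate() on the region the FPGA has just written before reading it, and issue the matching flush request before handing CPU-written data to the FPGA. Because invalidating a partially used line can discard unrelated dirty data that shares it, it is safer to lay out the shared buffer so that each region starts and ends on a 32-byte boundary.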