ARM Linux: Can I control cache flush and invalidation in user space?

I'm currently using a Xilinx SoC to build software that shares memory between the Cortex-A cores and the FPGA.

I've tried reserving memory in Linux and mapping it with mmap() on /dev/mem. The problem is that with O_SYNC it is very slow, since my software accesses every byte computed by the FPGA many times, and it seems that passing O_SYNC to open() makes the mapped physical memory uncacheable.
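
For reference, here is a minimal sketch of the kind of mapping I mean. The helper name map_shared and its parameters are made up for illustration, and the physical address and length in the usage note are placeholders, not my real values. The comment about caching reflects what I seem to observe, not something I have confirmed in the kernel source.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `len` bytes starting at offset `phys` of `path` (normally "/dev/mem").
 * `phys` must be page-aligned. If `use_o_sync` is non-zero the file is opened
 * with O_SYNC, which on my system appears to give an uncached mapping
 * (assumption based on observed speed, not verified).
 * Returns MAP_FAILED on error.
 */
static void *map_shared(const char *path, off_t phys, size_t len, int use_o_sync)
{
    int flags = O_RDWR | (use_o_sync ? O_SYNC : 0);
    int fd = open(path, flags);
    if (fd < 0)
        return MAP_FAILED;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, phys);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);
    return p;
}
```

A call would look like map_shared("/dev/mem", 0x1f000000, 0x100000, 1), where both numbers are hypothetical and stand in for the reserved region. It needs root and a kernel that permits /dev/mem access to that range.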

So I want to use cacheable memory and manage synchronization in code. Is there any way to flush/invalidate the cache from user space on a Cortex-A9 running Linux?