Cortex-A9 performance monitor unit (PMU): is the event counter updated in real time?

Background: I tried to count the number of data cache accesses by setting a PMU counter's event to 0x4 on a Cortex-A9 processor.

Question:

  • The total count seems to be right (reading and writing many times in a loop),
  • but when I check the counter after each read/write, I get a constant value.
  • If I then keep running for a while, the counter's value starts increasing again.
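
For reference, here is a minimal sketch of the kind of measurement described above. It is not the exact code from my setup. It assumes the ARMv7-A PMU is accessed through CP15 from a privileged mode and that event counter 0 is free. All helper names (pmu_setup_counter0, pmu_read_counter0, pmu_delta, sample_per_access) are hypothetical, and the non-ARM branch is only a stub so that the file compiles off-target.

```c
#include <stddef.h>
#include <stdint.h>

/* Event number from the question: data cache access. */
#define PMU_EVENT_DCACHE_ACCESS 0x04u

#if defined(__arm__)
/* Program event counter 0 and enable it (needs privileged mode,
 * or user access enabled through PMUSERENR). */
static inline void pmu_setup_counter0(uint32_t event)
{
    uint32_t pmcr;
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 5" :: "r"(0u));    /* PMSELR: select counter 0 */
    __asm__ volatile("isb");                                      /* selection must take effect first */
    __asm__ volatile("mcr p15, 0, %0, c9, c13, 1" :: "r"(event)); /* PMXEVTYPER: event type */
    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(pmcr));  /* read PMCR */
    pmcr |= (1u << 0) | (1u << 1);                                /* E: enable, P: reset event counters */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(pmcr));  /* write PMCR */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(1u));    /* PMCNTENSET: enable counter 0 */
}

/* Read the current value of event counter 0. */
static inline uint32_t pmu_read_counter0(void)
{
    uint32_t value;
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 5" :: "r"(0u));    /* PMSELR: select counter 0 */
    __asm__ volatile("isb");
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 2" : "=r"(value)); /* PMXEVCNTR */
    return value;
}
#else
/* Host stub (assumption for off-target builds): no PMU, counter reads 0. */
static inline void pmu_setup_counter0(uint32_t event) { (void)event; }
static inline uint32_t pmu_read_counter0(void) { return 0u; }
#endif

/* Event counters are 32 bits wide; unsigned subtraction handles wrap-around. */
static inline uint32_t pmu_delta(uint32_t before, uint32_t after)
{
    return after - before;
}

/* Do one read and one write per element and record how much the
 * counter moved after each pair of accesses. */
static void sample_per_access(volatile uint32_t *buf, size_t n, uint32_t *deltas)
{
    uint32_t prev = pmu_read_counter0();
    for (size_t i = 0; i < n; i++) {
        buf[i] = buf[i] + 1u;               /* one load, one store */
        uint32_t now = pmu_read_counter0();
        deltas[i] = pmu_delta(prev, now);
        prev = now;
    }
}
```

Calling pmu_setup_counter0(PMU_EVENT_DCACHE_ACCESS) once and then sample_per_access() on a small buffer reproduces the pattern in question: the per-access values in deltas are what I am looking at, as opposed to the total over the whole loop.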

Is there a timing constraint between a read/write and the counter value being updated? Can I flush the counter manually?