
Time cost changes hugely for some function routines on RealView Versatile EB with ARM11 MPCore

The time cost of some heavy functions (executed 1000 times), such as math library functions (take acos for example), changes hugely from run to run.

I run this test on an RTOS. Although it is SMP, I enable only one core, and before invoking the function I disable interrupts and lock the task switcher, so there should be no interrupts during execution of the function.
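
For reference, my measurement loop looks roughly like this (a sketch: rtos_lock_scheduler(), rtos_unlock_scheduler(), irq_disable(), irq_restore() and timer_read() are placeholder names for the RTOS primitives, not the real API names):

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical RTOS/board hooks -- the names are placeholders. */
extern void     rtos_lock_scheduler(void);
extern void     rtos_unlock_scheduler(void);
extern uint32_t irq_disable(void);           /* returns saved flags        */
extern void     irq_restore(uint32_t flags);
extern uint32_t timer_read(void);            /* free-running timer ticks   */

volatile double sink;                        /* keep the result live       */

uint32_t time_acos_1000(void)
{
    uint32_t flags, start, end;
    int i;

    rtos_lock_scheduler();                   /* no task switch during test */
    flags = irq_disable();                   /* no interrupts either       */

    start = timer_read();
    for (i = 0; i < 1000; i++)
        sink = acos((double)i / 1000.0);     /* routine under test         */
    end = timer_read();

    irq_restore(flags);
    rtos_unlock_scheduler();

    return end - start;                      /* elapsed ticks              */
}
```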

I added some monitoring code to this test. It seems that when the time cost gets longer, the instruction cache miss count gets larger, and the count for "Stall because instruction buffer cannot deliver an instruction" gets larger too, so this appears to be related to the instruction cache.
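
For reference, those counts come from the ARM11 MPCore performance monitor, which is programmed through CP15 c15. A minimal sketch in C with GCC inline assembly is below; the register encodings and the event numbers (0x00 = instruction cache miss, 0x01 = instruction-buffer stall) follow the ARMv6/ARM11 performance monitor description, so please check them against your TRM revision before relying on them:

```c
#include <stdint.h>

/* Event numbers per the ARM11 performance monitor (verify in your TRM). */
#define EVT_ICACHE_MISS  0x00  /* instruction cache miss                  */
#define EVT_IBUF_STALL   0x01  /* instr. buffer cannot deliver an instr.  */

static inline void pmu_start(void)
{
    /* PMNC: PMN0 counts I-cache misses, PMN1 counts I-buffer stalls;
     * reset both event counters and the cycle counter, then enable.  */
    uint32_t pmnc = (EVT_ICACHE_MISS << 20)  /* EvtCount0 -> PMN0 */
                  | (EVT_IBUF_STALL  << 12)  /* EvtCount1 -> PMN1 */
                  | (1u << 0)                /* E: enable          */
                  | (1u << 1)                /* P: reset PMN0/PMN1 */
                  | (1u << 2);               /* C: reset CCNT      */
    __asm__ volatile("mcr p15, 0, %0, c15, c12, 0" :: "r"(pmnc));
}

static inline uint32_t pmu_read_ccnt(void)   /* cycle counter      */
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c15, c12, 1" : "=r"(v));
    return v;
}

static inline uint32_t pmu_read_pmn0(void)   /* I-cache misses     */
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c15, c12, 2" : "=r"(v));
    return v;
}

static inline uint32_t pmu_read_pmn1(void)   /* I-buffer stalls    */
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c15, c12, 3" : "=r"(v));
    return v;
}
```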

Then I added an instruction cache invalidation operation before the function, and the time cost became steady.
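
The invalidation is essentially the standard ARMv6 CP15 "invalidate entire instruction cache" operation, followed by a prefetch-buffer flush; something like:

```c
#include <stdint.h>

void icache_invalidate_all(void)
{
    uint32_t zero = 0;
    /* ARMv6 CP15: invalidate entire instruction cache */
    __asm__ volatile("mcr p15, 0, %0, c7, c5, 0" :: "r"(zero) : "memory");
    /* Flush the prefetch buffer so no already-fetched instructions remain */
    __asm__ volatile("mcr p15, 0, %0, c7, c5, 4" :: "r"(zero) : "memory");
}
```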

I see that the ARM11 MPCore uses a round-robin cache replacement policy, so I thought adding the cache invalidation operation should make no difference to the timing.

Can anyone help me with this issue? Thanks!


Reply
  • It's perhaps worth a general comment that cached A-class cores are non-deterministic by nature. What's in the cache at a given point isn't guaranteed, because of things like speculative accesses and prefetching (i.e. things outside your control in software). It's unsurprising that invalidating the cache gives more consistent (but presumably worse) performance. Generally, cached cores aim to give *much* better performance in the general case. However, it's possible to have a pathological bit of code that demonstrates occasional poor performance. Possibly it's something specific about the code or code structure that you can change, possibly not.
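
One way to see that trade-off concretely is to run the timed loop many times with and without the invalidation and compare the spread. A sketch, reusing the hypothetical time_acos_1000() and icache_invalidate_all() helpers from the question above (assuming they are externally visible):

```c
#include <stdint.h>
#include <stdio.h>

extern uint32_t time_acos_1000(void);        /* timing sketch above        */
extern void     icache_invalidate_all(void); /* invalidation sketch above  */

static void report(const char *label, int invalidate)
{
    uint32_t min = UINT32_MAX, max = 0;
    int run;

    for (run = 0; run < 100; run++) {
        uint32_t t;
        if (invalidate)
            icache_invalidate_all();         /* start from a cold I-cache  */
        t = time_acos_1000();
        if (t < min) min = t;
        if (t > max) max = t;
    }
    printf("%s: min %lu, max %lu, spread %lu ticks\n", label,
           (unsigned long)min, (unsigned long)max,
           (unsigned long)(max - min));
}

int main(void)
{
    report("warm cache ", 0);  /* faster on average, larger spread expected */
    report("invalidated", 1);  /* slower but more consistent expected       */
    return 0;
}
```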
