I am studying cache effects using a simple micro-benchmark.
My understanding is that if the array is larger than the cache, the first read of every cache line results in a cache miss. (See 1.)
On my board (Arndale-5250) the cache line size is 64 bytes, so I expect about N/8 cache misses in total, and cachegrind shows that. (See 2.)
However, the Streamline tool shows a different result: it reports only 21,373 cache misses. (See 3.)
I suspect hardware prefetching, but I can't find any counter in Streamline that would let me check this.
I really don't understand why Streamline reports so many fewer cache misses than cachegrind. Could someone give me a reasonable explanation?
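#include <stdio.h>

#define N 10000000        /* number of array elements */

static int A[N];          /* zero-initialized global array */

int main(void)
{
    int i;
    double temp = 0.0;

    /* Read every element once, in order.
       temp is never used afterwards, so an optimizing compiler
       may remove this loop entirely. */
    for (i = 0; i < N; i++) {
        temp = A[i] * A[i];
    }
    return 0;
}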