Is the FVP accurate for measuring the performance of programs? Is it cycle accurate? If I use clock_gettime to measure the time taken by an application, is the result meaningful? If not, is there an accurate way to measure the performance of programs on the FVP?
Hello Ronan, as for relative comparisons: I'm wondering whether the performance of vector instructions is correctly captured by the Fast Models. To be more specific: I'm currently comparing an implementation of an algorithm that uses regular vector loads to one that uses gather memory accesses on the Corstone300 MPS2. Would the performance penalty that gather loads typically incur already be included in the number of cycles reported by the Fast Model in this case? I'm currently comparing the cycle counts reported by the PMU_CCNTR register, if it makes any difference.
Hi Eltro, I would not rely on the Fast Model for that level of accuracy. You may see that one implementation is 'better' than the other, but I would not rely on 'how much better', precisely for reasons like this (unfortunately, this is the cost of keeping these models "fast").
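For anyone wanting to try this kind of comparison, here is a minimal sketch in C of how the two variants could be measured and compared by ordering only. Everything in it is an assumption for illustration: `read_counter`, `sum_contiguous`, `sum_indexed`, `compare_counts` and `run_comparison` are hypothetical names, the kernels are scalar stand-ins for the real vector-load and gather-load code, and the counter uses a `clock_gettime` fallback so the sketch builds on a host. On the target, `read_counter` would presumably be replaced with a read of the PMU cycle counter (for example through the CMSIS `ARM_PMU_Get_CCNTR()` accessor, after enabling the PMU).

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define N 1024

/* Hypothetical counter hook. Host fallback based on clock_gettime;
   on the target this would read the PMU cycle counter instead. */
static uint64_t read_counter(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Variant A: contiguous accesses (stand-in for regular vector loads). */
static uint32_t sum_contiguous(const uint32_t *data, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += data[i];
    }
    return acc;
}

/* Variant B: indexed accesses (scalar stand-in for gather loads). */
static uint32_t sum_indexed(const uint32_t *data, const uint32_t *idx, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += data[idx[i]];
    }
    return acc;
}

/* Returns -1 if a is lower, 1 if b is lower, 0 if equal.
   Deliberately reports the ordering only, never a ratio. */
static int compare_counts(uint64_t a, uint64_t b)
{
    return (a < b) ? -1 : (a > b) ? 1 : 0;
}

/* Runs both variants once and reports which one took fewer counts. */
int run_comparison(void)
{
    static uint32_t data[N], idx[N];
    for (uint32_t i = 0; i < N; i++) {
        data[i] = i;
        idx[i] = (N - 1u) - i; /* reversed order, so accesses are non-contiguous */
    }

    uint64_t t0 = read_counter();
    volatile uint32_t a = sum_contiguous(data, N);
    uint64_t t1 = read_counter();
    volatile uint32_t b = sum_indexed(data, idx, N);
    uint64_t t2 = read_counter();

    assert(a == b); /* both variants must compute the same result */

    int order = compare_counts(t1 - t0, t2 - t1);
    printf("contiguous: %llu, indexed: %llu, lower: %s\n",
           (unsigned long long)(t1 - t0), (unsigned long long)(t2 - t1),
           order < 0 ? "contiguous" : order > 0 ? "indexed" : "tie");
    return order;
}
```

Calling `run_comparison()` from `main` prints both counts and which variant came out lower. The sketch intentionally stops at the ordering: on the Fast Model the size of the gap should not be read as a prediction of the gap on real hardware.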
Great, that is good to know. Of course I'll need to benchmark on real hardware in the end, but knowing that the numbers from the Fast Model are not completely off in this case is already quite helpful.