Accurate cycles measurement

Dear experts,

I am currently trying to measure the cycles required to context-switch between two Linux processes and the cycles required to world-switch between two Linux VMs running on top of a thin bare-metal hypervisor. For this purpose, I am using PMCCNTR (the cycle counter register), enabling and disabling it through the PMCR and configuring it to increment once every 64 cycles.
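For reference, the setup described above can be sketched roughly as follows. This is only a minimal illustration, assuming an ARMv7-A core, privileged execution (or user access enabled through PMUSERENR), and GCC-style inline assembly. The helper names `pmcr_value`, `ccnt_start` and `ccnt_read` are hypothetical and not taken from any library; the bit positions and CP15 encodings are the ones given in the ARMv7-A architecture manual.

```c
#include <stdint.h>

/* PMCR bit positions on ARMv7-A. */
#define PMCR_E (1u << 0)        /* enable all counters */
#define PMCR_C (1u << 2)        /* reset the cycle counter */
#define PMCR_D (1u << 3)        /* cycle counter ticks once every 64 cycles */
#define PMCNTEN_CCNT (1u << 31) /* cycle counter bit in PMCNTENSET */

/* Hypothetical helper: compute the PMCR value that enables and resets the
 * cycle counter, with or without the divide-by-64 setting. All other bits
 * of the old value are preserved. */
static uint32_t pmcr_value(uint32_t old, int div64)
{
    uint32_t v = old | PMCR_E | PMCR_C;
    if (div64)
        v |= PMCR_D;
    else
        v &= ~PMCR_D;
    return v;
}

#if defined(__arm__)
/* Enable and reset PMCCNTR (only compiled for 32-bit ARM targets). */
static inline void ccnt_start(int div64)
{
    uint32_t pmcr;
    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(pmcr));         /* read PMCR */
    pmcr = pmcr_value(pmcr, div64);
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(pmcr));        /* write PMCR */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(PMCNTEN_CCNT)); /* PMCNTENSET */
    __asm__ volatile("isb");
}

/* Read PMCCNTR with an ISB on either side of the read. */
static inline uint32_t ccnt_read(void)
{
    uint32_t v;
    __asm__ volatile("isb");
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(v));
    __asm__ volatile("isb");
    return v;
}
#endif
```

The value computation is kept separate from the register access so that it can be checked on any host; the inline assembly is guarded by `__arm__` and is not compiled elsewhere.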

- Does the value of PMCCNTR reset when switching from user mode to kernel mode, or even to the hypervisor, or does it keep its value across privilege levels until it is reset by the user? What about when switching between different contexts at the same privilege level?

- To get accurate and reliable results, I have to disable interrupts while measurements are made, disable preemption (in the case of Linux processes), and insert serializing instruction barriers before and after the profiled code. What else should I pay attention to?

- When calculating the total time, how do Linux kernel optimizations such as DVFS influence the result? Do I need to turn DVFS off? Any suggestions?

- PMCCNTR is a 32-bit register, so with the divide-by-64 setting it can count up to 4294967296 * 64 = 274877906944 cycles before wrapping. Is this a concern (e.g. overflow)? Is it advisable to use the 64-bit architected counter/timer instead? Is there an advanced tutorial on using the 64-bit architected counter for accurate profiling? Does the 64-bit architected system counter run at the frequency of the processor? If not, how can I get processor cycles from the architected counter, which to my knowledge runs at a lower frequency?
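To make the overflow and conversion questions concrete, here is a small sketch of the arithmetic involved. It assumes the divide-by-64 setting, at most one wrap of PMCCNTR between the two samples, and a CPU frequency that stays constant during the measurement (so DVFS is pinned or disabled). The function names are hypothetical; `cntfrq_hz` stands for the value read from CNTFRQ, and `cpu_hz` has to be obtained separately and is assumed to be below 2^32.

```c
#include <stdint.h>

#define CCNT_DIVIDER 64u /* PMCR.D set: one tick per 64 CPU cycles */

/* Elapsed CPU cycles between two PMCCNTR samples. The unsigned 32-bit
 * subtraction is modulo 2^32, so the result stays correct across a
 * single wrap of the counter. */
static uint64_t ccnt_elapsed_cycles(uint32_t start, uint32_t end)
{
    uint32_t ticks = end - start;
    return (uint64_t)ticks * CCNT_DIVIDER;
}

/* Approximate CPU cycles for a number of architected-counter ticks,
 * i.e. floor(ticks * cpu_hz / cntfrq_hz). The computation is split into
 * whole seconds and a remainder so the intermediate product does not
 * overflow 64 bits for long measurements. */
static uint64_t arch_ticks_to_cpu_cycles(uint64_t ticks, uint32_t cntfrq_hz,
                                         uint64_t cpu_hz)
{
    uint64_t whole = ticks / cntfrq_hz;
    uint64_t rest = ticks % cntfrq_hz;
    return whole * cpu_hz + rest * cpu_hz / cntfrq_hz;
}
```

For example, `ccnt_elapsed_cycles(0xFFFFFFF0u, 0x10u)` gives 32 ticks, i.e. 2048 cycles, even though the counter wrapped. At an assumed 1 GHz clock, 2^38 cycles is roughly 275 seconds, so a wrap only matters for long-running measurements. The conversion from the architected counter is only as precise as the counter's resolution: at an assumed 24 MHz counter and a 1 GHz core, one tick corresponds to about 42 CPU cycles.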

Thank you so much.