program execution time in ARM Cortex-A9 processor
Irina Fedotova
over 12 years ago
Note: This was originally posted on 4th January 2013 at http://forums.arm.com
I'm using an ARM Cortex-A9 and trying to read the CCNT counter from assembly code. I am following this post: http://stackoverflow.com/questions/3247373/how-to-measure-program-execution-time-in-arm-cortex-a8-processor?answertab=oldest#tab-top
According to it, before I can read the counter I have to enable it, enable the divide-by-64 divider, and clear the overflow flags. These operations are performed by writing to the appropriate registers (for instance, PMCR, the Performance Monitor Control Register). I printed the counter values in a loop to see how the overflow occurs, and I get this behavior:
1 (starts incrementing after being reset to zero)
4650
4858
4943
5023
...
... (incrementing...)
...
4293939054
4293939128 (overflow happens)
1602570
1602703
1602788
...
...
4293522911
4293522987
4293523062
4293523137
1186243
1186367
1186453
1186536
1186612
1186686
...
4293536300
4293536377
4293536456
4293536533
4293536612
1199090
1199209
1199295
1199373
1199453
1199530
....
and so forth.
Accordingly, I have a set of questions:
a) Which of the registers mentioned above are used by the Linux kernel? (How reliable is that information for future kernel versions?) How safe is it to change their values?
b) What is the exact CCNT frequency, and how can I get it? Unfortunately, I can't find the value in the processor spec. However, dmesg says:
[ 0.000000] OMAP clocksource: GPTIMER2 at 24000000 Hz
[ 0.000000] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956ms
[ 0.132855] Switching to clocksource gp timer
But measuring it manually against clock_gettime gives me 7 MHz. So why is it not 24 MHz as expected?
c) In my first output, why does the counter restart after the overflow not from zero, but from about 1 million?
d) Why do I get wrong results without the divide-by-64 divider? The value starts to jump like this:
...
134110099
134114934
134119656
302352300
302361825
302367135
...
2885588930
2885593776
2885598630
3053958670
3053966752
3053972232
...
261130096
261134909
429343853
429351487
429356735
I'd appreciate any help. Thanks
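The two listings in questions c) and d) can be compared with a little arithmetic. The sketch below is not from the original post: the helper names `ccnt_delta`, `ccnt_ticks_to_cycles` and `ccnt_rate_hz` are hypothetical. It relies on two facts: CCNT counts processor clock cycles (every 64th cycle when the PMCR divider bit is set), independently of GPTIMER2, and unsigned 32-bit subtraction stays correct across a single wrap.

```c
#include <stdint.h>

/* Ticks elapsed between two 32-bit CCNT samples. Unsigned subtraction is
 * performed modulo 2^32, so the result is still correct if the counter
 * wrapped once between the two samples. */
static inline uint32_t ccnt_delta(uint32_t earlier, uint32_t later)
{
    return (uint32_t)(later - earlier);
}

/* Convert CCNT ticks to CPU cycles. With the PMCR divider enabled one tick
 * stands for 64 cycles; without it one tick is one cycle. */
static inline uint64_t ccnt_ticks_to_cycles(uint32_t ticks, int divider_enabled)
{
    return divider_enabled ? (uint64_t)ticks * 64u : (uint64_t)ticks;
}

/* Estimate the tick rate in Hz from a tick delta and the wall-clock time
 * (in nanoseconds) that elapsed while those ticks were counted. */
static inline double ccnt_rate_hz(uint32_t ticks, uint64_t elapsed_ns)
{
    return elapsed_ns ? (double)ticks * 1e9 / (double)elapsed_ns : 0.0;
}
```

Applied to the numbers above, `ccnt_delta(4293939128u, 1602570u)` is 2630738 ticks, or 168367232 cycles with the divider enabled, while `ccnt_delta(134119656u, 302352300u)` from the run without the divider is 168232644 cycles. Both gaps are about 168 million cycles, so one possible reading is that the same kind of pause between two consecutive prints shows up in both runs, and that in the first listing the counter did pass through zero during that pause without being sampled. This is an interpretation of the data, not a confirmed diagnosis. Similarly, if the 7 MHz figure was measured with the divider enabled, it would correspond to a core clock of about 448 MHz rather than to the 24 MHz GPTIMER2 clock.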
Peter Harris
over 12 years ago
Note: This was originally posted on 14th January 2013 at http://forums.arm.com
One additional point on context switching:
Each core in an SMP system has its own set of performance counters, so threads which move between cores are going to see inconsistent results.
It may be useful to try "perf" on Linux: the "perf" infrastructure wraps the performance counters and saves and restores the counter data when threads or processes are context-switched or migrate between cores.
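As a rough illustration of the perf route, the sketch below counts user-space CPU cycles for the calling thread through the `perf_event_open` system call, which is the interface underneath the perf tool. The helper names `count_cycles` and `spin_work` are hypothetical, glibc provides no wrapper so the call goes through `syscall`, and the call can fail (for example when the kernel restricts access to the PMU or no hardware counters are exposed), in which case the helper just returns -1.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical workload: a busy loop of *(unsigned *)arg iterations. */
static void spin_work(void *arg)
{
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < *(unsigned *)arg; i++)
        sink += i;
}

/* Hypothetical helper: counts user-space CPU cycles spent in fn(arg) by the
 * calling thread. Returns 0 on success, -1 if the counter is unavailable. */
static int count_cycles(void (*fn)(void *), void *arg, uint64_t *cycles)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;        /* start stopped, enable explicitly below */
    attr.exclude_kernel = 1;  /* count user-space cycles only */
    attr.exclude_hv = 1;

    /* pid = 0, cpu = -1: follow this thread on whichever core it runs. */
    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    fn(arg);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    uint64_t value = 0;
    ssize_t n = read(fd, &value, sizeof value);
    close(fd);
    if (n != (ssize_t)sizeof value)
        return -1;

    *cycles = value;
    return 0;
}
```

A caller would pass a workload and check the return value before trusting the count, for example `count_cycles(spin_work, &iterations, &cycles)`. Because the kernel owns the counter, the count follows the thread across context switches and core migrations.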
Second point - why are you trying to second-guess the OS in this case? OSes have time functions for a reason, and Linux time functions usually give down to microsecond granularity. Exposing peripherals directly to user-space is usually a "bad idea" - the global timer is probably used by the OS itself if it is available.
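For the OS-timer approach, a minimal sketch using POSIX `clock_gettime` with `CLOCK_MONOTONIC` might look like this. The helper names `monotonic_ns` and `time_call_ns` are hypothetical, and the effective resolution depends on the platform's clocksource (it can be queried with `clock_getres`).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds, or 0 if the call fails. */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Wall-clock nanoseconds spent in fn(arg), as seen by CLOCK_MONOTONIC. */
static uint64_t time_call_ns(void (*fn)(void *), void *arg)
{
    uint64_t start = monotonic_ns();
    fn(arg);
    return monotonic_ns() - start;
}
```

Unlike a raw per-core counter, this clock is maintained by the kernel, so the reading does not depend on which core the thread happens to run on. It measures elapsed wall time, though, so time spent descheduled is included.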
HTH,
Iso