
Cycle Measurement for CortexA8 on Beagle Board

Note: This was originally posted on 12th May 2010 at http://forums.arm.com

Hi all,
I'm working on Cortex-A8 hardware (Beagle Board). I want to measure how many cycles my function takes, using the cycle count register in the CP15 coprocessor. I mapped the whole memory with 1MB descriptors (4096 in total), set the permissions for each descriptor, and enabled the MMU and cache. But when I measure my function this way, I get more cycles than I expect.

My function is
 
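unsigned int read_ccount(void);   // Returns the cycle count value from the performance monitor register
int func(void);                   // Assembly routine shown below

unsigned int TotalCycles;         // Unsigned, so the subtraction below survives a counter wrap

void main(void)
{
      unsigned int count1, count2;

      // Cycle counter is enabled here

      count1 = read_ccount();    // Cycle count before the call

      func();

      count2 = read_ccount();    // Cycle count after the call

      TotalCycles = count2 - count1;
}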

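     .global func

func:
     MOV     r4, #7
     MOV     r5, #2
     MOV     r6, #9
     MOV     r2, #60000        @ loop count

nextt1:
     MUL     r7, r5, r4
     MUL     r4, r6, r6
     MUL     r9, r5, r6
     MUL     r7, r6, r5
     SUBS    r2, r2, #1        @ decrement the loop count and set the flags
     MUL     r8, r8, r4        @ MUL without the S suffix leaves the flags alone
     MUL     r9, r9, r5
     BNE     nextt1            @ loop until r2 reaches zero

     MOV     r0, r8            @ return value

     BX      lr
     .end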
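
The post does not show how the counter is enabled or how read_ccount() is implemented. A minimal sketch of one way to do it, assuming GCC-style inline assembly, execution in a privileged mode, and the Cortex-A8 CP15 performance monitor encodings (c9, c12, 0 for the control register, c9, c12, 1 for the count enable set register, c9, c13, 0 for the cycle counter). The names enable_ccount, pmcr_for_cycle_count and ccount_elapsed are hypothetical helpers, not taken from the original code.

```c
#include <stdint.h>

/* Control register bits (PMNC in the Cortex-A8 TRM). */
#define PMCR_ENABLE      (1u << 0)   /* E: enable all counters */
#define PMCR_RESET_CCNT  (1u << 2)   /* C: reset the cycle counter to zero */
#define PMCR_DIV64       (1u << 3)   /* D: when set, count once every 64 cycles */
#define CNTENS_CCNT      (1u << 31)  /* C: enable the cycle counter itself */

/* Hypothetical helper: new control register value that enables the counters,
   resets the cycle counter, and counts every cycle (divider off). */
static inline uint32_t pmcr_for_cycle_count(uint32_t old_pmcr)
{
    return (old_pmcr | PMCR_ENABLE | PMCR_RESET_CCNT) & ~PMCR_DIV64;
}

/* Hypothetical helper: elapsed cycles between two readings. Unsigned
   arithmetic gives the right answer across a single 32-bit wrap. */
static inline uint32_t ccount_elapsed(uint32_t before, uint32_t after)
{
    return after - before;
}

#if defined(__arm__) && defined(__GNUC__)
/* Only built for 32-bit ARM targets; must run in a privileged mode. */
static inline void enable_ccount(void)
{
    uint32_t pmcr;
    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(pmcr));
    pmcr = pmcr_for_cycle_count(pmcr);
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(pmcr));
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(CNTENS_CCNT));
}

static inline uint32_t read_ccount(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(value));
    return value;
}
#endif
```

With helpers like these, main() would call enable_ccount() once and then compute ccount_elapsed(count1, count2) instead of subtracting signed ints.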
      
The cycle count I got from the cycle count register is 960745. But if I calculate it manually, it should take around 900000:
 
  6 MULs   * 2 cycles each * 60,000 loop count = 720000
  1 SUBS   * 1 cycle each  * 60,000 loop count =  60000
  1 branch * 2 cycles each * 60,000 loop count = 120000
                                                 ------
                                         Total = 900000 (approx.)
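
The hand calculation above can be written out as a small C sketch. The per-instruction costs (2 cycles per MUL, 1 per SUBS, 2 per taken branch) are the figures assumed in this post, not values checked against the TRM, and the function names are hypothetical.

```c
#include <stdint.h>

/* Per-instruction costs as assumed in the post's hand calculation. */
enum { MUL_CYCLES = 2, SUBS_CYCLES = 1, BRANCH_CYCLES = 2 };

/* Cycles for one pass through the loop body. */
static inline uint32_t loop_cycles_per_iteration(uint32_t muls,
                                                 uint32_t subs,
                                                 uint32_t branches)
{
    return muls * MUL_CYCLES + subs * SUBS_CYCLES + branches * BRANCH_CYCLES;
}

/* Total estimate for the whole loop, ignoring setup and return code. */
static inline uint32_t estimate_total_cycles(uint32_t per_iteration,
                                             uint32_t iterations)
{
    return per_iteration * iterations;
}
```

For the loop in func, loop_cycles_per_iteration(6, 1, 1) gives 15 cycles per iteration, and estimate_total_cycles(15, 60000) gives the 900000 figure.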

Is the count I'm getting from the cycle counter correct, or am I missing something in my calculation? Can anyone help with this?
                         
Thanks in advance,

with regards,
Raghavendra.M
  • Note: This was originally posted on 17th May 2010 at http://forums.arm.com

    Cycle count tables in the TRM often hide some of the detail for the sake of clarity. You have almost the right answer: you are only off by one cycle per loop iteration, and the extra 745 cycles are probably just down to cache misses and the initial overhead of calling the function and reading the performance counter.
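
To make the "one cycle per loop" reading concrete, the measured total can be split into whole cycles per iteration plus a remainder. A minimal sketch, assuming the 960745 and 60000 figures from the question; cycle_split and split_cycles are hypothetical names.

```c
#include <stdint.h>

/* Measured total expressed as per_iteration * iterations + remainder. */
typedef struct {
    uint32_t per_iteration;  /* whole cycles attributable to each iteration */
    uint32_t remainder;      /* leftover cycles: call overhead, misses, etc. */
} cycle_split;

static inline cycle_split split_cycles(uint32_t measured, uint32_t iterations)
{
    cycle_split result = { 0u, measured };
    if (iterations != 0u) {          /* avoid dividing by zero */
        result.per_iteration = measured / iterations;
        result.remainder = measured % iterations;
    }
    return result;
}
```

split_cycles(960745, 60000) yields 16 cycles per iteration with 745 left over, against the 15 cycles per iteration from the hand calculation.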

    It's pretty rare to get cycle counts on a modern core which are exactly right because in reality the hardware isn't as simple as the cycle timing tables in the manual make out.