AXD cycle count problems

Note: This was originally posted on 23rd September 2009 at http://forums.arm.com

Hello everyone,
I have recently been using AXD to study the instruction cycle counts of the ARM926EJ-S processor.
Since the ARM9EJ-S is said to be the core of the ARM926EJ-S,
I ran both models to compare their cycle counts, and I have some questions about the results.

Here is the information from AXD with ARM9EJ-S:
                   instruction  CoreCycle  IDCycle  IBus  Idle  DBus  Total
add r8, pc, #0xc4            1          4        0     3     1     0      4
ldmia r8, {r0,r1}            2          6        0     4     1     1      6

And the information with the ARM926EJ-S is shown below:
                   instruction  CoreCycle  SEQ  NONSEQ  Idle  Busy  Total
add r8, pc, #0xc4            1          4    2       2     4     0      8
ldmia r8, {r0,r1}            2          6    7       3     4     0     14

I suppose it is reasonable for the CoreCycle counts to match, since the core processor architecture is the same.
However, taking instruction 1 as an example,
it is quite strange that where the ARM9EJ-S issues 3 memory requests (I suppose these are the PC values sent out),
the ARM926EJ-S issues 2 SEQ and 2 NONSEQ requests for the same instruction...
Instruction 2 is stranger still: the ARM926EJ-S sends out 5 SEQ and 1 NONSEQ requests!

I would expect IDCycle+IBus+DBus from the ARM9EJ-S to roughly equal SEQ+NONSEQ from the ARM926EJ-S, but it does not!
Can anyone tell me whether I have misunderstood this information?
Thanks a lot!
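
The expectation above can be checked directly against the two listings. This is only a sketch: it assumes the first number on each row is the instruction index and the second is CoreCycle, that the counters are cumulative, and the dictionary layout and function names are made up for illustration.

```python
# Counter values copied from the two AXD listings in this post.
# Assumption: the first number on each row is the instruction index,
# so the remaining numbers line up with the column headers.
ARM9EJS = [
    {"instr": 1, "IDCycle": 0, "IBus": 3, "Idle": 1, "DBus": 0, "Total": 4},
    {"instr": 2, "IDCycle": 0, "IBus": 4, "Idle": 1, "DBus": 1, "Total": 6},
]
ARM926EJS = [
    {"instr": 1, "SEQ": 2, "NONSEQ": 2, "Idle": 4, "Busy": 0, "Total": 8},
    {"instr": 2, "SEQ": 7, "NONSEQ": 3, "Idle": 4, "Busy": 0, "Total": 14},
]


def core_bus_cycles(row):
    """IDCycle + IBus + DBus for one ARM9EJ-S row."""
    return row["IDCycle"] + row["IBus"] + row["DBus"]


def seq_plus_nonseq(row):
    """SEQ + NONSEQ for one ARM926EJ-S row."""
    return row["SEQ"] + row["NONSEQ"]


def compare(core_rows, soc_rows):
    """Pair up the two sums for each instruction index."""
    return [
        (c["instr"], core_bus_cycles(c), seq_plus_nonseq(s))
        for c, s in zip(core_rows, soc_rows)
    ]


for instr, core, soc in compare(ARM9EJS, ARM926EJS):
    print(f"instruction {instr}: ARM9EJ-S sum = {core}, ARM926EJ-S sum = {soc}")
```

With the posted figures the sums are 3 against 4 for instruction 1 and 5 against 10 for instruction 2, so the two models clearly do not count the same events.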
  • Note: This was originally posted on 24th September 2009 at http://forums.arm.com

    Is there any document explaining what these transactions actually are?
    I would like to understand where those numbers come from.

    Thanks
  • Note: This was originally posted on 24th September 2009 at http://forums.arm.com

    Actually, I want to know what the numbers shown in "debug internal" mean.
    Since I cannot find any document explaining how the "SEQ", "NONSEQ" and "IDLE" numbers are derived for the ARM926EJ-S,
    I supposed that they might be related to those of the ARM9EJ-S, but it seems that I am wrong.

    Besides, when I examine the numbers from the ARM926EJ-S alone, another problem bothers me.
    The scenario is listed below:
                       instruction  CoreCycle  SEQ  NONSEQ  Idle  Busy  Total
    add r8, pc, #0xc4            1          4    2       2     4     0      8
    ldmia r8, {r0,r1}            2          6    7       3     4     0     14
    add r0, r0, r8               3          7    8       3     4     0     15
    add r1, r1, r8               4          8    8       3     4     0     15
    sub r11, r0, #1              5          9    8       3     4     0     15

    First of all, the "NONSEQ" number for instruction 1 bothers me: why are there two NONSEQ accesses if the core is only prefetching instructions?
    Second, for instructions 3, 4 and 5, it seems strange to me that there is no bus activity during that time (no instruction prefetch?).
    Besides, the "Idle" count does not change over those instructions either. What is happening if SEQ, NONSEQ and Idle all stay the same?
    I also cannot understand what "Total" actually means, since its value simply equals SEQ+NONSEQ+Idle+Busy.

    PS: Some materials say that Total is used to profile how much the instructions of a function consume in total.
    However, I cannot see why from this scenario.

    Can someone give me some suggestions?
    Thanks a lot!
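
The observation about Total can be checked mechanically. This is a rough sketch only: the tuples are copied from the listing in this post, assuming the first number on each row is the instruction index and CoreCycle is left out; the function names are invented for illustration.

```python
# (instruction index, SEQ, NONSEQ, Idle, Busy, Total), copied from the listing.
ROWS = [
    (1, 2, 2, 4, 0, 8),
    (2, 7, 3, 4, 0, 14),
    (3, 8, 3, 4, 0, 15),
    (4, 8, 3, 4, 0, 15),
    (5, 8, 3, 4, 0, 15),
]


def total_matches(row):
    """True when Total equals SEQ + NONSEQ + Idle + Busy for this row."""
    _, seq, nonseq, idle, busy, total = row
    return seq + nonseq + idle + busy == total


def unchanged_rows(rows):
    """Instruction indices whose counters equal those of the previous row."""
    return [cur[0] for prev, cur in zip(rows, rows[1:]) if prev[1:] == cur[1:]]


print("Total is the sum on every row:", all(total_matches(r) for r in ROWS))
print("No counter changed at instructions:", unchanged_rows(ROWS))
```

On the figures as posted, Total matches the sum on every row, and only instructions 4 and 5 leave every counter unchanged; instruction 3 still adds one SEQ relative to instruction 2.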
  • Note: This was originally posted on 25th September 2009 at http://forums.arm.com

    Thanks for the reply, but I still cannot explain why the Idle value stays unchanged.
    Furthermore, the numbers still seem strange to me after I turn off the ICache, DCache and MMU.
    The cycle counts become:

                       instruction  CoreCycle  SEQ  NONSEQ  Idle  Busy  Total
    add r8, pc, #0xc4            1          4    2       1     2     0      5
    ldmia r8, {r0,r1}            2          6    6       3     2     0     11
    add r0, r0, r8               3          7    7       3     2     0     12
    add r1, r1, r8               4          8    7       3     2     0     12
    sub r11, r0, #1              5          9    7       3     2     0     12

    I suppose that the 1 NONSEQ and 2 SEQ for instruction 1 are instruction fetches.
    However, for instruction 2, I cannot figure out why there are 2 new NONSEQ accesses.
    I think that one of them is due to the ldmia, but where does the other come from? The instruction fetch should be a SEQ access.
    Besides, since the instruction cache is turned off, it is strange to attribute the 4 new SEQ accesses to instruction fetches, since there should be nowhere to hold the fetched instructions.
    And for instructions 3, 4 and 5, all of the numbers (SEQ, NONSEQ, Idle, Busy) stay unchanged.
    Where do the instructions come from during that time?

    Please give me some suggestions, thanks!
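
Because the counters appear to be cumulative, the per-instruction figures quoted above are differences between consecutive rows. A minimal sketch of that subtraction, assuming the counters were zero before instruction 1 (not confirmed anywhere in this thread) and using made-up names:

```python
# (instruction index, SEQ, NONSEQ, Idle, Busy, Total) with ICache, DCache
# and MMU off, copied from the listing in this post.
ROWS_NOCACHE = [
    (1, 2, 1, 2, 0, 5),
    (2, 6, 3, 2, 0, 11),
    (3, 7, 3, 2, 0, 12),
    (4, 7, 3, 2, 0, 12),
    (5, 7, 3, 2, 0, 12),
]


def per_instruction_deltas(rows):
    """Turn cumulative counters into per-instruction increments.

    Assumes every counter was zero before the first row.
    """
    previous = (0, 0, 0, 0, 0)
    deltas = []
    for instr, *counts in rows:
        deltas.append((instr, tuple(c - p for c, p in zip(counts, previous))))
        previous = counts
    return deltas


for instr, (seq, nonseq, idle, busy, total) in per_instruction_deltas(ROWS_NOCACHE):
    print(f"instruction {instr}: +{seq} SEQ, +{nonseq} NONSEQ, "
          f"+{idle} Idle, +{busy} Busy, +{total} Total")
```

For the posted figures this gives +4 SEQ and +2 NONSEQ for instruction 2, +1 SEQ for instruction 3, and no change at all for instructions 4 and 5.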
  • Note: This was originally posted on 27th September 2009 at http://forums.arm.com

    To Kalomama:

    I cannot understand what you mean. I am serious about finding out what those numbers mean, since I want to explain the behavior of the processor.

    Can anybody give me some suggestions on my last post? Thanks.
  • Note: This was originally posted on 27th September 2009 at http://forums.arm.com

    Hi setose,

    How are you? I have seen your AXD cycle count problem. I hope that it is OK, if you actually want to know what the numbers shown in "debug internal" mean.
    Since I cannot find any document explaining how the "SEQ", "NONSEQ" and "IDLE" numbers are derived for the ARM926EJ-S,
    I suppose that they might be related to those of the ARM9EJ-S, but it seems that I am wrong.



    Thanking you


    Kalomama
  • Note: This was originally posted on 23rd September 2009 at http://forums.arm.com

    ARM9E defines the common pipeline shared by many ARM processors, but not the memory system interface. Many ARM9 processors integrate the memory system differently, typically in the cache implementation, so I would expect to see some differences.
  • Note: This was originally posted on 24th September 2009 at http://forums.arm.com

    > First of all the number "NONSEQ" in instruction 1 bother me, since I'm wondering why there is two "NONSEQ" if it just prefetches the instructions.

    Are the MMU and caches enabled in your test case?

    If so, then you may be seeing accesses caused by the MMU page table walks rather than just the instruction fetch.

    > Second, for instruction 3, 4, 5, it seems strange to me that no bus activities during that time (no instruction prefetch?)

    Well, if the instruction cache line has already been loaded by the first access, then the core has no need to access external memory (with the cache enabled, up to 8 instructions will be in the loaded cache line, depending on the alignment of the code).
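
As a rough illustration of the alignment point, the sketch below counts how many ARM-state instructions lie between a given address and the end of its cache line. It assumes a 32-byte line (8 words, which is what "up to 8 instructions" implies) and 4-byte instructions; the addresses and the function name are hypothetical, not taken from the test case in this thread.

```python
LINE_BYTES = 32   # assumed cache line size: 8 words
INSTR_BYTES = 4   # ARM-state instructions are one word each


def instructions_left_in_line(addr):
    """Number of ARM instructions from addr up to the end of its cache line."""
    if addr % INSTR_BYTES:
        raise ValueError("ARM-state instructions are word aligned")
    return (LINE_BYTES - addr % LINE_BYTES) // INSTR_BYTES


# Hypothetical code addresses, chosen only to show the effect of alignment.
for addr in (0x8000, 0x8004, 0x801C):
    print(hex(addr), instructions_left_in_line(addr))
```

Code starting on a line boundary gets all 8 instructions from a single line fill, while code starting at the last word of a line gets only 1 before the next fill is needed.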
  • Note: This was originally posted on 24th September 2009 at http://forums.arm.com

    I'm not sure how useful looking at the 9EJ-S cycles will be to you.  The 9EJ-S is the integer core inside a 926EJ-S.  So you would never (?) see a raw 9EJ-S, and what the 9EJ-S sees is the L1 memory system provided by the 926.