Hello all!
I'm developing on a Cortex-M4 and there is something I don't understand. I was hoping someone could help clarify it.
This is the code I'm using:
(Assume R3 contains 1, R6/R8 hold the value and address needed to set PIN1, and R11/R9 hold the value and address needed to set PIN2.)
asm ( "CMP R3,#0 \n\t");
asm ( "BNE NCycles_CapDelay2 \n\t");
/*asm ( "NOP \n\t"
"NOP \n\t"
"NOP \n\t");*/ // <-- the NOPs in question
asm ( "STR R6,[R8] \n\t"); //PIN1 SET
asm ( "STR R11,[R9] \n\t"); //PIN2 SET
asm ( "B SOMEWHERE_ELSE \n\t");
asm ( "NCycles_CapDelay2: \n\t");
asm ( "STR R6,[R8] \n\t"); //PIN1 SET
asm ( "LOOP_NCycles_CapDelay2: \n\t");
asm ( "SUBS R3, #1 \n\t");
asm ( "BNE LOOP_NCycles_CapDelay2 \n\t");
asm ( "STR R11,[R9] \n\t"); //PIN2 SET (as seen at 1a00036a in the disassembly)
The thing is: if I leave the NOPs commented out, the time between PIN1 set and PIN2 set is 7 cycles, and if I uncomment the NOPs, the time is 1 cycle (measured externally with an oscilloscope).
And when R3=0, the time difference is 0 cycles (NOPs uncommented) versus 1 cycle (NOPs commented out).
Any ideas about what is happening with the pipeline and conditional branches here?
Thanks for any ideas.
BR
As you are running the asm through the C compiler rather than the assembler, it's possible that the compiler is "optimizing" your code and you are not executing the exact code sequence you think you are.
It might be worth running both builds through fromelf or objdump to check what code is /actually/ being run.
Hello Peter, thanks for the quick response. You're right about the compiler. In the code that actually runs, STR is replaced with str.w and BNE is replaced with bne.n (checked in the disassembly view in LPCXpresso). Those are the only differences. Either way, I still don't get why I get different performance with and without the NOP instructions. Isn't it related to the pipeline?
Thank you!!
Can you provide the disassembler's view of your code?
"""ASSUME R3=1"""1a000334: cpsid i1a000336: cmp r3, #01a000338: bne.n 0x1a00035c <NCycles_CapDelay2>1a00033a: nop <with these nops i've achieved 0 cycle betwwen PIN1 set & PIN2 Set - R3=0>1a00033c: nop <with these nops i've achieved 0 cycle betwwen PIN1 set & PIN2 Set - R3=0>1a00033e: nop <with these nops i've achieved 0 cycle betwwen PIN1 set & PIN2 Set - R3=0>1a000340: str.w r6, [r8] <PIN1 SET>1a000344: str.w r11, [r9] <PIN2 SET>1a000348: nop 1a00034a: nop 1a00034c: nop 1a00034e: nop 1a000350: nop 1a000352: str.w r11, [r9] <PIN2 CLR>1a000356: nop 1a000358: nop 1a00035a: b.n 0x1a000380 <ADC_CAPT2>1a00035c: nop <with these nops i've achieved 0 cycle betwwen PIN1 set & PIN2 Set - R3=1>1a00035e: nop <with these nops i've achieved 0 cycle betwwen PIN1 set & PIN2 Set - R3=1>1a000360: nop <with these nops i've achieved 0 cycle betwwen PIN1 set & PIN2 Set - R3=1>1a000362: str.w r6, [r8] <PIN1 SET>1a000366: subs r3, #11a000368: bne.n 0x1a000366 <LOOP_NCycles_CapDelay2>1a00036a: str.w r11, [r9] <PIN2 SET>1a00036e: nop 1a000370: nop 1a000372: nop 1a000374: nop 1a000376: nop 1a000378: str.w r11, [r9] <PIN2 CLR>1a00037c: nop 1a00037e: nop 1a000380: stmdb sp!, {r0, r2}1a000384: ldr r3, [pc, #64] ; (0x1a0003c8 <LOOP_NCycles_Period2+36>)1a000386: movw r2, #65535 ; 0xffff1a00038a: str r2, [r3, #8]1a00038c: nop 1a00038e: ldr r3, [pc, #56] ; (0x1a0003c8 <LOOP_NCycles_Period2+36>)1a000390: ldr r3, [r3, #12]1a000392: cmp r3, #01a000394: beq.n 0x1a00038e <ADC_CAPT2+14>1a000396: ldr r0, [pc, #48] ; (0x1a0003c8 <LOOP_NCycles_Period2+36>)1a000398: bl 0x1a000244 <Chip_SSP_ReceiveFrame>1a00039c: str.w r0, [r4], #21a0003a0: pop {r0, r2}.....
I believe that at line 1a00035c a pipeline break happens, and the instruction at 1a000362 does not execute until the pipeline is full again (3-cycle stall), so that the instruction at 1a00036a gets executed in the fourth cycle. Is that correct? But what if I comment out the NOPs? Why do I get a 7-cycle delay then?
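One thing that can be checked directly from the listing is where each 32-bit str.w sits relative to a 4-byte boundary, since every 16-bit NOP removed shifts the following code by 2 bytes. Below is a minimal sketch of that check. It assumes instruction fetches are word-wide, which is an assumption about the fetch unit and not something measured in this thread, and the helper names straddles_word and without_nops are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if an instruction of `size` bytes
 * starting at `addr` crosses a 4-byte boundary, i.e. it would need two
 * word-wide fetches under the word-fetch assumption stated above. */
bool straddles_word(uint32_t addr, unsigned size)
{
    return (addr & 3u) + size > 4u;
}

/* Hypothetical helper: address the same instruction would have if `n`
 * 16-bit (2-byte) NOPs placed before it were removed from the build. */
uint32_t without_nops(uint32_t addr, unsigned n)
{
    return addr - 2u * n;
}
```

With the three NOPs present, the str.w at 1a000340 is word-aligned; with them removed it would move to 1a00033a and cross a word boundary. Whether that alone explains the measured difference is not something this check can tell, because flash wait states and prefetch buffering also affect timing and are not described here.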
Thanks for your help
Any ideas on this?
Thanks