I tried a small program to see the code produced by the compiler.
This is the C code:
unsigned char loop;
while (1) {
P3 = 0xFF;
for(loop=0xff; loop>0; loop--) {}
P3 = 0x00;
for(loop=0xff; loop>0; loop--) {}
}
This is the disassembly:
25: unsigned char loop;
26:
27: while (1) {
28:
29: P3 = 0xFF;
C:0x0800 75B0FF MOV P3(0xB0),#0xFF
30: for(loop=0xff; loop>0; loop--) {}
C:0x0803 7FFF MOV R7,#0xFF
C:0x0805 DFFE DJNZ R7,C:0805
31: P3 = 0x00;
C:0x0807 E4 CLR A
C:0x0808 F5B0 MOV P3(0xB0),A
32: for(loop=0xff; loop>0; loop--) {}
C:0x080A 7FFF MOV R7,#0xFF
C:0x080C EF MOV A,R7
C:0x080D D3 SETB C
C:0x080E 9400 SUBB A,#0x00
C:0x0810 40EE JC main(C:0800)
C:0x0812 1F DEC R7
C:0x0813 80F7 SJMP C:080C
C:0x0815 787F MOV R0,#0x7F
C:0x0817 E4 CLR A
C:0x0818 F6 MOV @R0,A
C:0x0819 D8FD DJNZ R0,C:0818
C:0x081B 758107 MOV SP(0x81),#0x07
C:0x081E 020800 LJMP main(C:0800)
Why is the second timing loop so different from the first?
What can I do so that both are the same?
Thank you
Donald
I have a few questions too. Here is an example where the ARM compiler changes low-level assembly code. I was merging in some optimized assembler code generated by a Forth compiler. This went fine until I realized that it was running 20-30% slower than expected. I eventually realized that the ARM compiler was ignoring the assembly instructions as written and using its own versions of the code. Here is an example followed by the two resulting code outputs using different optimization settings. Can someone explain this? I would have thought any assembly code should just be assembled and not altered.
Also, the compiler seems to ignore the __inline keyword and still makes one code section and inserts call-returns. So the question is: is there a way to force __inline to really do it?
I found that I had to use __inline for any assembly code put in a .h (header) file. If I didn't, I got "error L6200E: symbol Delay multiply defined by main.o and xcee.o". Why is this? I think multiply means multiple?
#pragma arm
__inline void Delay(uint32_t loops)
{
  __asm
  {
  loop1: SUBS loops, loops, #1 // 1 cycle
         BNE loop1             // 3 cycles true, 1 cycle false
  }
}
#pragma thumb
#endif
optimization (level=3) correct.
116: loop1: SUBS loops, loops, #1 // 1 cycle
117: BNE loop1 // 3 cycles true, 1 cycle false
118: }
0x00000464 E2500001 SUBS R0,R0,#0x00000001
0x00000468 1AFFFFFD BNE Delay4(0x00000464)
119: }
0x0000046C E12FFF1E BX R14
0x00000470 4778 BX PC
optimization (level=0) adds two extra instructions
116: loop1: SUBS loops, loops, #1 // 1 cycle
0x00000464 E1A00000 NOP
0x00000468 E2500001 SUBS R0,R0,#0x00000001
0x0000046C 0A000000 BEQ 0x00000474
117: BNE loop1 // 3 cycles true, 1 cycle false
118: }
0x00000470 EAFFFFFC B 0x00000468
119: }
0x00000474 E12FFF1E BX R14
0x00000478 4778 BX PC
__inline doesn't force the compiler to inline your code. It just tells it that you think it's ok to inline it.
Once upon a time, people were using the 'register' keyword to specify which variables to put in registers. Most compilers now completely ignore it, since the 'register' keyword hampers their optimization.
It seems like the compiler doesn't emit assembler instructions verbatim, but instead adds them to its evaluation tree. Then it performs different rewrites of the instruction sequences before emitting. In your case, it seems like a reversed peep-hole optimization :)
If you define the body of a function in the .h file, then you get multiple copies of the function, one copy everywhere that #include's the file. Header files generally contain only declarations of names, not the actual definitions of those objects. __inline is a special case because the compiler has to be able to see the body of the function in order to inline it. You'd expect the inline function to be inserted many times, while you'd expect the regular function call to insert only a call to only one body of the function.
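A minimal sketch of the header pattern described above. The file name delay.h and the function name delay_loops are made up for illustration, and the busy loop is written in plain C rather than with the __asm block from the original post. Marking the function static gives every .c file that includes the header its own private copy, so the linker never sees two external symbols with the same name.

```c
/* delay.h (hypothetical header, for illustration only) */
#ifndef DELAY_H
#define DELAY_H

#include <stdint.h>

/* 'static' gives the function internal linkage: every .c file that
 * includes this header gets a private copy, so the linker does not
 * report a multiply defined symbol. 'inline' is only a hint. */
static inline uint32_t delay_loops(uint32_t loops)
{
    volatile uint32_t n = loops;   /* volatile keeps the loop from being optimized away */
    uint32_t iterations = 0;

    while (n != 0) {
        n--;
        iterations++;
    }
    return iterations;             /* number of iterations actually executed */
}

#endif /* DELAY_H */
```

Any source file can then include the header and call delay_loops(255) without a link error, whether or not the compiler chooses to inline the call.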
Thanks for the input. I understand what you are saying, but I wish the compiler would just do what it's asked to compile when input with assembly code. Fair enough with high level code.
Drew, thank you for the explanation, that makes it crystal clear. I was thinking that the #ifndef, #define in the .h files would stop the repetition and that the compiler built a list from the first definition it found. That's what happens with .c files, isn't it?
I lucked out using the __inline it seems, but I wanted inline code, sad the compiler didn't oblige.
#ifndef _delay
#define _delay
#pragma arm
__inline void Delay4(uint32_t loops)
{
  __asm
  {
  loop1: SUBS loops, loops, #1 // 1 cycle
         BNE loop1             // 3 cycles true, 1 cycle false
  }
}
#pragma thumb
#endif // _delay
As Per said earlier, "__inline doesn't force the compiler to inline your code. It just tells it that you think it's ok to inline it."
As always, if it needs to be in assembler, then write it properly as assembler - in its own assembler source module - and call it from 'C'.
"I was thinking that the #ifndef, #define in the .h files would stop the repetition"
No, they don't.
They stop the definition being repeated in the same source file if the same header is accidentally (or otherwise) included multiple times...
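To illustrate, here is a single-file sketch that simulates including the same guarded header twice in one source file. The guard name DELAY_GUARD_DEMO and the function which_copy are hypothetical. The preprocessor skips the second copy, and that is all the guard does. A different .c file starts with no macros defined, so it compiles the definition again, and the linker then sees two copies unless the function is static or inlined.

```c
/* First simulated #include of a (hypothetical) guarded header. */
#ifndef DELAY_GUARD_DEMO
#define DELAY_GUARD_DEMO
int which_copy(void)
{
    return 1;   /* this definition is compiled */
}
#endif

/* Second simulated #include in the SAME source file. The guard macro is
 * already defined, so the preprocessor discards this whole block. Without
 * the guard, this would be a redefinition error at compile time. */
#ifndef DELAY_GUARD_DEMO
#define DELAY_GUARD_DEMO
int which_copy(void)
{
    return 2;   /* never compiled */
}
#endif
```

Only the first definition survives preprocessing, so the file compiles cleanly and which_copy() comes from the first block.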
Andy, thanks for the hints it forced me to read the Keil __inline and __asm help in more detail.
On the __inline, I discovered that by combining __forceinline, static and -Otime I finally got the compiler to do the __inline thing.
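The combination described here might look like the sketch below. __forceinline is the armcc keyword named in the post. The FORCE_INLINE macro, the fallback branches for other compilers, and the function name delay_count are assumptions added so that the example also builds outside Keil. The -Otime option belongs on the compiler command line and is not shown.

```c
#include <stdint.h>

/* FORCE_INLINE is a hypothetical helper macro, not part of any toolchain. */
#if defined(__CC_ARM)                       /* Keil/ARM armcc */
#define FORCE_INLINE static __forceinline
#elif defined(__GNUC__)                     /* GCC-compatible compilers */
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline          /* plain hint only */
#endif

/* Busy-wait loop in plain C; returns the number of iterations executed. */
FORCE_INLINE uint32_t delay_count(uint32_t loops)
{
    volatile uint32_t n = loops;   /* volatile keeps the loop from being removed */
    uint32_t done = 0;

    while (n != 0) {
        n--;
        done++;
    }
    return done;
}
```

Because the function is static, it can live in a header without causing the L6200E link error, independent of whether inlining actually happens.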
On the __asm in the restrictions section it says "Do not use it to generate more efficient code than the compiler generates." I wonder, why not?
I need __forceAsmAsWritten !
But thanks, I will look into armasm.
Not really. What you need is to understand that assembly code that you don't want the C compiler to mess with shouldn't have been put in a C source file to begin with.
"On the __asm in the restrictions section it says 'Do not use it to generate more efficient code than the compiler generates'. I wonder, why not?"
Because the optimisation relies upon the compiler being in complete control - inserting random bits of arbitrary assembler over which the compiler has no control will (almost) certainly break the optimisation.
One more time: If the precise sequence of machine instructions really matters to you, then do not use a high-level language!
What would Hemingway have done if he had been forced to incorporate parts of a Shakespeare play somewhere in the middle? Their styles certainly don't match.
The compiler isn't smart. It follows specific rules specified by the developer of the compiler. These rules tend to produce reasonable code, but are only valid under specific circumstances. As soon as you add inline assembly, you are pushing the compiler outside the envelope its code generator and code optimizer are designed for. Don't ever assume the outcome of such situations.
Han's, Andy, Per.
A quick thanks for the latest input and happy Easter wishes!
Kindest regards.