Hello friends
I have encountered a very strange problem. I am using Keil MDK version 3.80a. When I compile the program and load it onto the target, the execution speed of the program becomes considerably slower (not in debug mode) compared with the previous version of Keil (I don't remember exactly, maybe 3.20).
Furthermore, when I change the position of a C file in the workspace, it affects the execution speed. When I compared the output hex files of the same project with the C files in a different order in the workspace, I found they were different.
Please help me out with this.
becomes considerably slower
How much, exactly, is being called "considerable" here?
Are you sure you set up the project as equal as you can make it, with the new tools? In particular, did you check the manuals for changed default states of compiler/optimization options?
When I compared the output hex files of the same project with the C files in a different order in the workspace, I found they were different.
Of course they were. Linkage order is (initially) controlled by the order of files in the IDE project, and linkage order will affect the actual program being generated. So the hex files should change.
Thanks for the reply, Hans-Bernhard Broeker.
By "considerably slower" I mean that a for-loop delay function takes more time to execute in 3.80a than in the older version.
Moreover, when I change the position of a C file in the workspace, the delay changes. Please try to understand. What you are asking I have already asked myself and verified. I just copied the whole project from the older version to the newer version, so how could the defaults have changed? Nevertheless, I have checked them.
But you must be able to quantify what you mean by "considerably slower". 1ppm? 1 percent? Twice the time? 5 times slower?
a for-loop delay function takes more time to execute
There should not be a "for loop delay function" in your program in the first place. Delay loops, in those rather few cases where they're not a plain bad idea, should not be written in C. Precisely for reasons like the one you just experienced, they have to be written in assembler.
Around 5 to 10 times slower, or 1.5 times faster, depending on the placement of the C file in the workspace.
Yes, you are right, I could have used a timer in place of for loops. But a for loop is not only used for delays; I may use one for other purposes too. Will those for loops not also be affected by the slower speed? Please try to understand and explain your reasoning. Please don't take this the wrong way.
a for-loop delay function takes more time to execute in 3.80a than in the older version
Is this delay function short enough to be easily analyzed with the disassembler? If so, just look at the generated code and see what exactly changed. And yes, delay 'for' loops are generally a bad idea.
But a for loop is not only used for delays; I may use one for other purposes too. Will those for loops not also be affected by the slower speed?
It's too early to worry about that. You have not yet demonstrated that the new compiler generates slower code. The delay 'for' loop doesn't count as proof, it should be clear by now.
See: www.8052.com/.../162556
But a for loop is not only used for delays; I may use one for other purposes too. Will those for loops not also be affected by the slower speed?
Let's delay worrying about that until you've demonstrated that it actually is a problem.
Hello friends, I wrote the following function:
    void os_dly_wait(unsigned int dly)
    {
        unsigned int i;
        for (i = 0; i <= (dly * 100); i++);
    }
In the previous version of Keil its assembly was:
    0x00000344 E3A01000 MOV R1,#0x00000000
    0x00000348 EA000000 B 0x00000350
    0x0000034C E2811001 ADD R1,R1,#0x00000001
    0x00000350 E3A02019 MOV R2,#0x00000019
    0x00000354 E0020290 MUL R2,R0,R2
    0x00000358 E1510102 CMP R1,R2,LSL #2
    0x0000035C 9AFFFFFA BLS 0x0000034C
while in the newer version of Keil it is:
    0x000003FC E3A01000 MOV R1,#0x00000000
    0x00000400 EA000000 B 0x00000408
    0x00000404 E2811001 ADD R1,R1,#0x00000001
    0x00000408 E0802180 ADD R2,R0,R0,LSL #3
    0x0000040C E0822200 ADD R2,R2,R0,LSL #4
    0x00000410 E1510102 CMP R1,R2,LSL #2
    0x00000414 9AFFFFFA BLS 0x00000404
When I tried to write it in inline assembly as follows:
    void non_interrupt_delay(unsigned int dly)
    {
        unsigned int i, R2;
        __asm
        {
            MOV i,#0x00000000
            B loop1
        loop2:
            ADD i,i,#0x00000001
        loop1:
            ADD R2,dly,dly,LSL #3
            ADD R2,R2,dly,LSL #4
            CMP i,R2,LSL #2
            BLS loop2
        }
    }
its assembly becomes:
    0x000002D0 E3A01000 MOV R1,#0x00000000
    0x000002D4 EA000002 B 0x000002E4
    0x000002D8 E1A00000 NOP
    0x000002DC E2811001 ADD R1,R1,#0x00000001
    0x000002E0 E1A00000 NOP
    0x000002E4 E0802180 ADD R2,R0,R0,LSL #3
    0x000002E8 E0822200 ADD R2,R2,R0,LSL #4
    0x000002EC E1510102 CMP R1,R2,LSL #2
    0x000002F0 8A000000 BHI 0x000002F8
    0x000002F4 EAFFFFF7 B 0x000002D8
Due to the unnecessary NOPs, my delay becomes slow again. Please share your ideas on this.
Please read the instructions on how to post source code - they are really quite clearly stated: www.danlhenry.com/.../keil_code.png
You seem to have compared only the old and new compilers.
It is expected that you will get big differences when switching compiler or compiler settings. That is why a delay using a C loop should not make use of a loop variable, but should repeat until a hw timer has ticked far enough.
But your claim seems to be that you can have a 10x speed difference for the same compiler by just changing the order of the source files.
Have you produced any disassembly of the loop when switching location too? And have you checked if the processor has multiple code regions, or if it has any execution cache that is only available for a limited range of the flash? Or are you running in RAM, and may get one of the loops in a RAM region where you also run heavy DMA transfers?
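A minimal sketch of the timer-based delay described above, assuming a free-running 32-bit up-counter. The names read_timer_ticks and delay_ticks are hypothetical, and the timer is simulated here with a plain counter so the example runs on a host PC; on a real target, read_timer_ticks would read the device's timer count register instead.

```c
#include <stdint.h>

/* Simulated free-running timer (assumption: stands in for a hardware
   timer count register so the sketch can run on a host PC). */
static uint32_t sim_timer = 0xFFFFFFF0u;   /* starts near wrap-around on purpose */

/* Hypothetical accessor: each read advances the simulated timer by one tick. */
static uint32_t read_timer_ticks(void)
{
    return sim_timer++;
}

/* Wait until at least `ticks` timer ticks have elapsed.
   Unsigned subtraction keeps the comparison correct across counter wrap-around. */
static void delay_ticks(uint32_t ticks)
{
    uint32_t start = read_timer_ticks();
    while ((uint32_t)(read_timer_ticks() - start) < ticks) {
        /* busy-wait: the duration is set by the timer, not by the generated code */
    }
}
```

With this shape, a change of compiler version, optimization level, or link order changes only how often the timer is polled, not how long the delay lasts.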
Again, the whole point of any High-Level Language (HLL) is that you do not have control of the generated machine code - you delegate that task to the compiler.
Since you do not have control of the generated machine code, you do not have control of its execution speed!
Since you do not have control of the execution speed, you must not rely upon the execution timing in any way!
If you really do need to rely upon the execution timing, then you really must write it in assembler; or use some means that does not rely upon the execution timing - such as a hardware timer.
It appears that you have not enabled speed optimization in the compiler: the compiler has not applied loop-invariant code motion optimization to move the multiplication operation (dly*100) out of the loop.
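To illustrate what loop-invariant code motion would do here, this is a sketch of the same loop with the bound computed once before the loop. The name delay_hoisted and the returned iteration count are additions for illustration only; the volatile qualifier is an assumption made to keep an optimizing compiler from removing the otherwise empty loop.

```c
/* Same iteration count as os_dly_wait(), but dly * 100 is evaluated once. */
unsigned int delay_hoisted(unsigned int dly)
{
    const unsigned int limit = dly * 100u;  /* loop-invariant, hoisted by hand */
    volatile unsigned int i;                /* volatile: keep the empty loop */
    unsigned int count = 0;

    for (i = 0; i <= limit; i++) {
        count++;                            /* counted only so the sketch can be checked */
    }
    return count;
}
```

Even in this form, the time per iteration still depends on the compiler and its settings, so it does not make the loop a reliable delay.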
It seems the compiler has become smarter: it has now applied a strength-reduction optimization and replaced the multiplication with shifts and adds. The code size hasn't changed, but the execution speed has likely improved. There is no loop-invariant code motion here either: it looks like you set the optimization to 'generate smaller code.'
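The shift-and-add sequence in the newer listing can be checked against the plain multiplication. The sketch below mirrors the three steps from that listing (ADD R2,R0,R0,LSL #3, then ADD R2,R2,R0,LSL #4, then the LSL #2 applied in the CMP); the function name times100_shift_add is made up for illustration.

```c
/* dly * 100 expressed the way the newer compiler's listing computes it. */
unsigned int times100_shift_add(unsigned int dly)
{
    unsigned int r2 = dly + (dly << 3);   /* dly * 9   : ADD R2,R0,R0,LSL #3 */
    r2 = r2 + (dly << 4);                 /* dly * 25  : ADD R2,R2,R0,LSL #4 */
    return r2 << 2;                       /* dly * 100 : LSL #2 in the CMP   */
}
```

Both forms give the same bound, so the two listings run the same number of iterations; only the cost of each iteration differs.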
But Andy, the whole program depends upon execution speed. For example, an I2C driver has to generate a clock on the order of microseconds, so how can you say not to rely on that? Moreover, if there is a difference in execution speed, it will not be on the order of 10x.
"...the whole program depends upon execution speed. For example, an I2C driver has to generate a clock on the order of microseconds..."
If you require timing with that order of accuracy, you should seriously consider writing it in assembler.