Hello,
1. When I use printf with no variable arguments, the compiler should call puts instead of printf.
ex1:
#include <REGX51.H>
#include <stdio.h>

void main(void)
{
    printf("This must call puts instead of printf");
}
Program Size: data=30.1 xdata=0 code=1103
ex2:
#include <REGX51.H>
#include <stdio.h>

void main(void)
{
    puts("This must call puts instead of printf");
}
Program Size: data=9.0 xdata=0 code=168
The above code links the printf function from the library, which is huge (it produces 1103 bytes). But the compiler could use puts when no variable arguments are given, which is much smaller than printf (it produces 168 bytes).
2. The compiler must find and remove duplicate constant strings.
ex3:
#include <REGX51.H>
#include <stdio.h>

void main(void)
{
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
    puts("This string gets duplicated as many time as i use it");
}
Program Size: data=9.0 xdata=0 code=334
ex4:
#include <REGX51.H>
#include <stdio.h>

void main(void)
{
    puts("This string gets duplicated as many time as i use it");
}
Program Size: data=9.0 xdata=0 code=183
3. Bit test instructions are not used when I actually test for a bit.
ex5:
#include <REGX51.H>
#include <stdio.h>

void main(void)
{
    if(P0^1)
    {
        P1 = 10;
    }
}

ASSEMBLY LISTING OF GENERATED OBJECT CODE

; FUNCTION main (BEGIN)
; SOURCE LINE # 6
; SOURCE LINE # 7
; SOURCE LINE # 8
0000 E580        MOV     A,P0
0002 6401        XRL     A,#01H
0004 6003        JZ      ?C0002
; SOURCE LINE # 9
; SOURCE LINE # 10
0006 75900A      MOV     P1,#0AH
; SOURCE LINE # 11
; SOURCE LINE # 13
0009         ?C0002:
0009 22          RET
; FUNCTION main (END)

In the above assembly output it should have used a single JNB instruction instead of the three instructions MOV, XRL and JZ. This is very basic; anybody would object to the assembly code produced.
I have not used the compiler much. But the compiler needs a look by the programmers at Keil.
The above programs were all compiled with the compiler optimisation level set to 9 and "favour speed".
About 5 years back I compiled a C51 source code using Keil. Now I recompiled the same source code with the latest compiler from Keil and compared the two output .hex files. Unfortunately it produced exactly the same output. Here I was expecting some code and data size reduction, as the compiler should be capable of optimising more.
It seems there was no improvement on the compiler side.
This is not a complaint; it is in the interest of improving the compiler.
regards,
S.Sheik mohamed
"But the compiler needs a look by the programmers at Keil"
No dear Sheik: you need to know what you're doing and what you're talking about, before making baseless accusations!
The above produces the following assembly code:
ASSEMBLY LISTING OF GENERATED OBJECT CODE

; FUNCTION main (BEGIN)
; SOURCE LINE # 4
; SOURCE LINE # 5
; SOURCE LINE # 6
0000 7BFF              MOV     R3,#0FFH
0002 7A00        R     MOV     R2,#HIGH ?SC_0
0004 7900        R     MOV     R1,#LOW ?SC_0
; SOURCE LINE # 7
; SOURCE LINE # 8
; SOURCE LINE # 9
; SOURCE LINE # 10
; SOURCE LINE # 11
; SOURCE LINE # 12
; SOURCE LINE # 13
; SOURCE LINE # 14
; SOURCE LINE # 15
; SOURCE LINE # 16
; SOURCE LINE # 17
; SOURCE LINE # 18
; SOURCE LINE # 19
; SOURCE LINE # 20
; SOURCE LINE # 21
; SOURCE LINE # 22
0006 120000      R     LCALL   L?0002
; SOURCE LINE # 23
; SOURCE LINE # 24
; SOURCE LINE # 25
; SOURCE LINE # 26
; SOURCE LINE # 27
; SOURCE LINE # 28
; SOURCE LINE # 29
; SOURCE LINE # 30
; SOURCE LINE # 31
; SOURCE LINE # 32
; SOURCE LINE # 33
; SOURCE LINE # 34
; SOURCE LINE # 35
; SOURCE LINE # 36
; SOURCE LINE # 37
; SOURCE LINE # 38
0009 120000      R     LCALL   L?0002
000C 020000      E     LJMP    _puts
; SOURCE LINE # 39
000F         L?0002:
000F 120000      E     LCALL   _puts
0012 7BFF              MOV     R3,#0FFH
0014 7A00        R     MOV     R2,#HIGH ?SC_0
0016 7900        R     MOV     R1,#LOW ?SC_0
0018 120000      E     LCALL   _puts
001B 7BFF              MOV     R3,#0FFH
001D 7A00        R     MOV     R2,#HIGH ?SC_0
001F 7900        R     MOV     R1,#LOW ?SC_0
0021 120000      E     LCALL   _puts
0024 7BFF              MOV     R3,#0FFH
0026 7A00        R     MOV     R2,#HIGH ?SC_0
0028 7900        R     MOV     R1,#LOW ?SC_0
002A 120000      E     LCALL   _puts
002D 7BFF              MOV     R3,#0FFH
002F 7A00        R     MOV     R2,#HIGH ?SC_0
C51 COMPILER V9.02 T 12/12/2010 06:18:02 PAGE 3
0031 7900        R     MOV     R1,#LOW ?SC_0
0033 120000      E     LCALL   _puts
0036 7BFF              MOV     R3,#0FFH
0038 7A00        R     MOV     R2,#HIGH ?SC_0
003A 7900        R     MOV     R1,#LOW ?SC_0
003C 120000      E     LCALL   _puts
003F 7BFF              MOV     R3,#0FFH
0041 7A00        R     MOV     R2,#HIGH ?SC_0
0043 7900        R     MOV     R1,#LOW ?SC_0
0045 120000      E     LCALL   _puts
0048 7BFF              MOV     R3,#0FFH
004A 7A00        R     MOV     R2,#HIGH ?SC_0
004C 7900        R     MOV     R1,#LOW ?SC_0
004E 120000      E     LCALL   _puts
0051 7BFF              MOV     R3,#0FFH
0053 7A00        R     MOV     R2,#HIGH ?SC_0
0055 7900        R     MOV     R1,#LOW ?SC_0
0057 120000      E     LCALL   _puts
005A 7BFF              MOV     R3,#0FFH
005C 7A00        R     MOV     R2,#HIGH ?SC_0
005E 7900        R     MOV     R1,#LOW ?SC_0
0060 120000      E     LCALL   _puts
0063 7BFF              MOV     R3,#0FFH
0065 7A00        R     MOV     R2,#HIGH ?SC_0
0067 7900        R     MOV     R1,#LOW ?SC_0
0069 120000      E     LCALL   _puts
006C 7BFF              MOV     R3,#0FFH
006E 7A00        R     MOV     R2,#HIGH ?SC_0
0070 7900        R     MOV     R1,#LOW ?SC_0
0072 120000      E     LCALL   _puts
0075 7BFF              MOV     R3,#0FFH
0077 7A00        R     MOV     R2,#HIGH ?SC_0
0079 7900        R     MOV     R1,#LOW ?SC_0
007B 120000      E     LCALL   _puts
007E 7BFF              MOV     R3,#0FFH
0080 7A00        R     MOV     R2,#HIGH ?SC_0
0082 7900        R     MOV     R1,#LOW ?SC_0
0084 120000      E     LCALL   _puts
0087 7BFF              MOV     R3,#0FFH
0089 7A00        R     MOV     R2,#HIGH ?SC_0
008B 7900        R     MOV     R1,#LOW ?SC_0
008D 120000      E     LCALL   _puts
0090 7BFF              MOV     R3,#0FFH
0092 7A00        R     MOV     R2,#HIGH ?SC_0
0094 7900        R     MOV     R1,#LOW ?SC_0
0096 120000      E     LCALL   _puts
0099 7BFF              MOV     R3,#0FFH
009B 7A00        R     MOV     R2,#HIGH ?SC_0
009D 7900        R     MOV     R1,#LOW ?SC_0
009F 22                RET
; FUNCTION main (END)
The optimiser should have used a counter and repeated the following block
MOV     R3,#0FFH
MOV     R2,#HIGH ?SC_0
MOV     R1,#LOW ?SC_0
LCALL   _puts

I am not accusing, complaining about or underestimating anyone, either at Keil or in this forum. I think we can only improve when we discuss.
If I have hurt anybody's feelings in any way, please forgive me!
No, it does not - because the code that you have written does not test for a bit!
As already explained to you, the code you have written contains an expression using the ANSI Standard exclusive-OR operator.
Perhaps you are confusing the meaning of "bitwise", as used by the ANSI Standard, with the individual bit operations of the 8051...?
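To make the point concrete, here is a minimal sketch in portable C, with the port modelled as a plain 8-bit value. The function names are made up for illustration and nothing here is Keil-specific. It shows that `port ^ 1` is false only when the whole byte equals 1, which matches the MOV / XRL / JZ sequence in the listing.

```c
#include <stdbool.h>
#include <stdint.h>

/* What the posted code computes: XOR of the whole byte with 1.
 * The result is zero (false) only when the byte is exactly 0x01. */
bool xor_with_one(uint8_t port)
{
    return (port ^ 1) != 0;
}

/* Byte-wide AND: masks bit 0 out of a full 8-bit value. */
bool and_with_one(uint8_t port)
{
    return (port & 1) != 0;
}

/* Mask for bit 1, the pin the original post apparently meant to test. */
bool bit1_is_set(uint8_t port)
{
    return (port & (1u << 1)) != 0;
}
```

With the port reading 0x02 (only bit 1 set), xor_with_one is true, and_with_one is false and bit1_is_set is true, so the three expressions are not interchangeable.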
1) printf() does not do the same thing as puts(), even if you send a constant not containing any % parameter expansions. Haven't you still read up on puts() and seen what it does, _besides_ emitting the text string?
Well? I'm waiting. Have you read the documentation for puts yet? Still waiting...
2) You want the fastest code? Replacing a sequence of calls to puts() with a loop isn't faster. Many optimizing compilers actually do the reverse. They do loop unrolling, where they intentionally duplicate the code inside the loop to reduce the number of loop iterations, sometimes totally unrolling the loop so no loop operation remains.
3) my_byte XOR 1 inverts one bit in a byte, and then tests if the byte result is non-zero. That is not a single-bit operation that may use any bit instructions in the 8051 processor. The ^ may _only_ be used when declaring bit variables, since it is an overload of a standard C operator for XOR. And the standard C operator performs bit operations on full bytes/shorts/ints/long ints and not on a single bit.
Somehow, you have to switch from output mode into input mode. You must pick up the feedback you get in the forum, and not just run further and further away on the wrong track.
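As an illustration of point 2, this is roughly what loop unrolling looks like when written out by hand. It is only a sketch in portable C with hypothetical function names; what a given compiler actually does depends on its version and settings.

```c
#include <stddef.h>

/* Straightforward loop: one compare-and-branch per element. */
int sum_loop(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* The same sum unrolled by four: the loop overhead is paid once per
 * four elements, at the cost of a larger loop body (more code space). */
int sum_unrolled4(const int *v, size_t n)
{
    int total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    }
    for (; i < n; i++) {   /* leftover elements when n is not a multiple of 4 */
        total += v[i];
    }
    return total;
}
```

Both functions return the same result; the unrolled one trades code size for speed, which is the opposite of folding repeated puts() calls into a counted loop.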
www.cplusplus.com/.../ www.cplusplus.com/.../
Spot the difference!
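The difference is the trailing newline: puts() writes the string followed by '\n', printf() writes only what the format string says. Some compilers (GCC, for example) do replace printf with puts, but only when the swap cannot change the output. The sketch below shows that kind of check in portable C; the function name is hypothetical and this is not a description of what C51 does.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when printf(fmt) could be rewritten as puts(fmt minus the
 * final newline) without changing the output: the string must end in
 * '\n' (which puts adds by itself) and must contain no '%' at all.
 * Rejecting every '%' is deliberately conservative: "%%" would also be
 * safe in principle but is not handled here. */
bool printf_can_become_puts(const char *fmt)
{
    size_t len = strlen(fmt);
    if (len == 0 || fmt[len - 1] != '\n') {
        return false;
    }
    return strchr(fmt, '%') == NULL;
}
```

By this rule the string in ex1, which has no trailing newline, does not qualify, so replacing that printf with puts would change what the program prints.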
Ok,
In that case the compiler could at least have different versions of printf and use the one which is appropriate for the current project.
That is, if I have never used float inside printf in my project, then the float-to-string part of the printf library is not necessary.
The compiler/linker can decide during optimization which printf library would be suitable for my project.
"That is, if I have never used float inside printf in my project, then the float-to-string part of the printf library is not necessary"
That is exactly what happens already!
Have you now understood why the code that you thought did a single-bit test does not actually do a single-bit test?
Note that it isn't trivial for the linker to analyze object files and try to figure out which of several printf() functions to use.
Remember that not all printf() calls need to look like:
printf("formatting string",param,param,...);
You can also have:
void function(const char* fmt)
{
    printf(fmt,int1,int2);
}
and you can have:
char fmt[100];
sprintf(fmt,"xxx",...);
printf(fmt,...);
The linker runs at link time. It doesn't know what happens at run time. A program could have multiple sets of strings, to allow it to print the same messages in English, Italian, German, ...
And remember that printf() and sprintf() share the same background "engine", so it isn't enough to look at all printf() calls.
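A small sketch of the multi-language case, in portable C. The message table and function name are invented for illustration. The format string is picked by a run-time index, so nothing that only inspects call sites at build time can tell which conversions will be needed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical message table: one format string per language. */
static const char *const count_fmt[] = {
    "%d files",    /* 0: English */
    "%d Dateien",  /* 1: German  */
};

/* Formats a count in the selected language into out.
 * Returns the number of characters written, or -1 for an unknown language.
 * The format string is chosen at run time, not at compile or link time. */
int format_count(int lang, int count, char *out, size_t size)
{
    if (lang < 0 || lang >= (int)(sizeof count_fmt / sizeof count_fmt[0])) {
        return -1;
    }
    return snprintf(out, size, count_fmt[lang], count);
}
```

Calling format_count(1, 3, buf, sizeof buf) fills buf with "3 Dateien"; the same call site with lang 0 produces "3 files".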
The only one who really knows everything about your program - or is expected to - is you.
Not to continue the discussion about the C language, but take a look at the 8051 architecture.
Please note that there are port access instructions that read the latch and others that read the port pin. You will find this in every description of the standard 8051.
What does that mean? It means that under certain circumstances (please think about that yourself) you will have
if (P0 ^ 1) // the if-condition is FALSE P0.0 = 0;
and
if (_testbit(P0.0)) // the if-condition is TRUE.
So, already on the hardware level, these two are NOT the same!
Hello All,
Thank you all for being patient with me.
I agree that all my allegations were completely wrong.
puts adds an extra linefeed to the string, so it cannot be used instead of printf. So I was wrong here.
But I think the compiler could be supplied with different versions of printf and let the user decide which printf version is best for him. This way the compiler and linker need not struggle to find the best printf.
No, the compiler allocates the string only once. It was again my mistake.
I must have used (P0 & 1) instead of (P0 ^ 1); again, my mistake.
But when I use (P0 & 1) the compiler understood my intention of bit testing, but it has assembled it in a different way.
#include <REGX51.H>
#include <stdio.h>

void main(void)
{
    if(P0 & 1)
    {
        if(P0_1)
        {
            P1 = 10;
        }
    }
}

; FUNCTION main (BEGIN)
; SOURCE LINE # 5
; SOURCE LINE # 6
; SOURCE LINE # 7
0000 E580        MOV     A,P0
0002 30E006      JNB     ACC.0,?C0003
; SOURCE LINE # 8
; SOURCE LINE # 9
0005 308103      JNB     P0_1,?C0003
; SOURCE LINE # 10
; SOURCE LINE # 11
0008 75900A      MOV     P1,#0AH
; SOURCE LINE # 12
; SOURCE LINE # 13
; SOURCE LINE # 15
000B         ?C0003:
000B 22          RET
; FUNCTION main (END)

where it could simply have put "JNB P0_0" instead of "mov a,P0" and "jnb ACC.0".
But overall, if you compile a source file using an old version of the compiler and again with the new version of the compiler, the produced hex file is byte-for-byte identical.
Why has the compiler or its optimizer not improved in reducing the code and data size for many years?
Once again, thanks for all your patience.
"The ^ may _only_ be used when declaring bit variables"
Here is where I got confused.
Because I had used something like P0^1, I thought that was the only way to reference a bit. I do not know why Keil chose to use P0^1 to declare bits instead of P0.1. I have just used Keil again after about 5 years; in fact I do not normally use MCS51.
Thanks
Sheik mohamed
Yes, it is a very common source of confusion - it caught me out when I first started with C51!
It does seem to be a rather poor choice on Keil's part, and it is certainly not well explained in the manual.
:-(
"I must have used (P0 & 1) instead of (P0 ^ 1)"
No - that is still a whole byte operation!
If you want to use the 8051's single-bit features, then you have to define a single-bit variable:
sbit P0_1 = P0 ^ 1;   // Define a single-bit variable
:
if( P0_1 )            // Test the single-bit variable
:
P0_1 = 1;             // Set the single-bit variable
:
P0_1 = 0;             // Clear the single-bit variable
http://www.keil.com/support/man/docs/c51/c51_le_sbit.htm
Yes, I got your point.
But when I use (P0 & 1) the compiler knew that I was testing for a bit, and it has assembled the right bit instruction for testing a bit and not an "AND" instruction for testing the whole byte. That is very nice and wise of the compiler. But the compiler did not check whether the address is bit-addressable or not. If it had found that the address is bit-addressable, then it could have assembled the more specific "jb" instruction.
No, you clearly didn't!
"when I use (P0 & 1) the compiler knew that I was testing for a bit"
No, it does not!!
P0 is an 8-bit value;
1 is an integral constant.
In strict ANSI 'C', the integral constant is considered an int, and the 8-bit value would be promoted to an int before doing a bitwise 'AND' of all bits and giving an int result. Effectively, the expression is:
( P0 & 0x0001 )
Keil C51 gives you the option to disable this promotion, so that the expression becomes just an 8-bit operation.
But the only way to get Keil C51 to operate on a single bit is to use the specific bit operations.
Again, ANSI 'C' bitwise operators have nothing to do with the 8051's single-bit operations!
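The promotion described above can be checked on any hosted C compiler. This is a minimal sketch with hypothetical function names; it says nothing about C51's option to disable the promotion, only about what the language itself specifies.

```c
#include <stddef.h>
#include <stdint.h>

/* In standard C the operand 'port' is promoted to int before the AND,
 * so the expression (port & 1) has type int, not an 8-bit type. */
size_t and_result_size(void)
{
    uint8_t port = 0;
    return sizeof(port & 1);
}

/* For every possible 8-bit value, the promoted test and an 8-bit-only
 * test agree: the upper bits of the promoted value are always zero. */
int narrowed_test_agrees(void)
{
    for (unsigned v = 0; v <= 0xFFu; v++) {
        uint8_t port = (uint8_t)v;
        int promoted = (port & 0x0001) != 0;
        int narrow = (uint8_t)(port & 1u) != 0;
        if (promoted != narrow) {
            return 0;
        }
    }
    return 1;
}
```

The second function shows why a compiler may legally shrink the operation to 8 bits: the result is the same for all inputs. It is still an operation on the whole byte that was read, not a single-bit access.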
In strict ANSI 'C', the integral constant is considered an int, and the 8-bit value would be promoted to an int before doing a bitwise 'AND' of all bits and giving an int result.
With the integer promotion, the semantics of the operation do not change: the code is still testing for a bit. In this particular case, if the compiler knew that the register is bit-addressable and chose not to ignore this information it could test for this bit directly.
Well, that's the point. A smarter compiler will use faster and more compact code constructs where appropriate. This is clearly one of those cases.
I'm not sure what you mean by that. I've seen the Green Hills compiler for Coldfire generate BSET and BCLR instructions when I was using bitwise OR and bitwise AND operators to set or clear bits. Why wouldn't C51 do the same? Especially since the 8051 core does a read-modify-write internally to set or clear bits, so semantics are the same.
Note that the compiler can be smart enough to notice that x & 1 can be seen as a bit operation. It would be a lousy compiler if it couldn't.
But P0 is a volatile 8-bit variable. And the language standard requires that the compiler does perform the volatile access. So the compiler can't do a single-bit access to a bit variable that just _happens_ to be aliased to one bit of the 8-bit P0 special function register. The compiler _must_ do the full 8-bit read of P0. Then it can use its optimization abilities to convert (x & 1) into a bit operation instead of a full byte-wise AND followed by a check of the zero flag.
When you do declare a bit variable P0_0, then the compiler isn't bound by the language standard into having to read the full P0. You have then explicitly told the compiler that you want an access to a single-bit variable, and the compiler may perform such a single-bit access without involving the accumulator.
Remember that it isn't obvious what special actions may be triggered by accesses to an SFR. In some situations, the byte read may acknowledge an interrupt mechanism. In some situations, the access may control the latching of a 16-bit timer value into two 8-bit SFRs.
Keil do not want to add a lot of special code intentionally violating the language standard just because they have considered it "safe" (an assumption since the world is full of 8051 variants with varying "special" features added) to violate the language standard in special situations. And there is no need to add such special code since they have explicitly given you the means to declare bit variables to explicitly tell the compiler to perform just a bit access.
"I've seen the Green Hills compiler for Coldfire generate BSET and BCLR instructions when I was using bitwise OR and bitwise AND operators to set or clear bits. Why wouldn't C51 do the same?"
Was that with SFR or volatile variables? Or was it with normal variables?
It really is important to separate SFR and volatile variables from standard char/int/... variables when discussing single-bit accesses.
But P0 is a volatile 8-bit variable. And the language standard requires that the compiler does perform the volatile access.
For what it's worth:
On a '51, a bitwise operation internally requires a read of the complete byte. For example, a CLR bit is actually documented as a read-modify-write of the byte. So JB, JNB, CLR and SETB would actually satisfy the volatile requirement.
I'd say that the volatile argument is a poor excuse for avoiding optimization here. One could argue that an SFR is not covered by the language standard: even the declaration of an SFR has a special non-standard syntax. Besides, it is known that the CPU will perform a byte read when you read a single bit.
In that case it would have had to convert P0 into an integer and do an 'AND' operation on that integer with the integer constant 0x0001.
That is, in strict ANSI-C:
mov  a,p0              ; LOW(P0): port value
anl  a,#LOW(0x0001)    ; integer constant 1
mov  r0,a              ; save the result of the low byte
mov  a,#0              ; HIGH(P0): port value promoted to integer as per ANSI C
anl  a,#HIGH(0x0001)   ; integer constant 1
orl  a,r0              ; is the integer result nonzero?
jz   xxx
The above code is not needed; we all know it, and the compiler knew it too. If it were to follow ANSI-C, it would not have chosen the jnb instruction but the big code above. But the compiler actually found the smart way to do the same thing with a bit test instruction. It only missed checking whether the port byte is bit-addressable or not.
"volatile" does not come into it here, as the port bit is being accessed directly every time.
The compiler should produce small, smart and fast code that gives the same output when executed.
I have seen a compiler demote an integer to a char.
ex:
void delay(void)
{
    int delay_val;
    for(delay_val = 0; delay_val < 250; delay_val++)
    {
        do_nothing();
    }
}
In the above code, one compiler used a single 8-bit register for 'delay_val'; that is, it demoted the int to a char. This may not be supported by your ANSI-C, but we all know that when the code executes it will produce the same result. In fact a better result: lower code and data memory usage and faster code execution.
It was nice and wise. Since delay_val is never assigned a value above 255, the compiler chose to demote the integer to a char.
There is no ANSI-C coming into it here. It is just an optimization by the compiler.
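A minimal sketch of why that demotion preserves the iteration count, in portable C with hypothetical function names. It says nothing about execution time, which does change when the counter shrinks to 8 bits.

```c
/* Loop counter as written in the source: a full int. */
unsigned iterations_with_int(void)
{
    unsigned n = 0;
    for (int i = 0; i < 250; i++) {
        n++;
    }
    return n;
}

/* The same loop with the counter narrowed to 8 bits. This is safe only
 * because the counter never needs a value above 255; with a bound of
 * 256 or more an 8-bit counter would wrap around and never terminate. */
unsigned iterations_with_u8(void)
{
    unsigned n = 0;
    for (unsigned char i = 0; i < 250; i++) {
        n++;
    }
    return n;
}
```

Both functions return 250, so the observable behaviour of the loop body is the same even though the counter type differs.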
It is irrelevant what a classic 8051 does. The classic 8051 implements the instructions based on a requirement to minimize the number of transistors. But a modern one-clocker or a soft core may introduce a lot of changes, since they have a completely different transistor budget available. I'm not convinced that the read-modify-write argument is true for every 8051 implementation in existence now or next year.
There shouldn't be anything stopping a manufacturer from introducing pin-change interrupts, where a read of a single port bit acknowledges interrupts for that pin while a read of the port acknowledges/clears potential interrupts for all eight pins.
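To illustrate the scenario, here is a purely hypothetical software model of such a port, in portable C. No real 8051 derivative is being described; the struct and function names are invented. It only shows that on hardware of this kind a byte read and a bit read would have different side effects, so one could not be substituted for the other.

```c
#include <stdint.h>

/* Hypothetical port with pin-change interrupt flags (not a real device). */
typedef struct {
    uint8_t pins;     /* current pin levels */
    uint8_t pending;  /* one pending pin-change flag per pin */
} sim_port;

/* Byte read: returns all eight pins and acknowledges every pending flag. */
uint8_t sim_read_byte(sim_port *p)
{
    p->pending = 0;
    return p->pins;
}

/* Bit read: returns one pin and acknowledges only that pin's flag. */
int sim_read_bit(sim_port *p, unsigned bit)
{
    p->pending &= (uint8_t)~(1u << bit);
    return (p->pins >> bit) & 1;
}
```

Starting from pending = 0xFF, a byte read leaves 0x00 while a read of bit 0 leaves 0xFE, even though both deliver the same value for bit 0.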
I'm not convinced that ...
You dare to doubt the so-called "bible"? Shame on you.
I don't doubt the bible, where it is applicable.
But I doubt that 25 year old documentation of the original 8051 is true for all 8051 variants in existence or in planning. It is enough that a single 8051 manufacturer (or programmer, if we think about soft cores) decides that bit operations can be done without a read/modify/write.
We are not debating any natural laws here. And even many of our natural laws are only applicable in their "normal" form for non-relativistic speeds.
But I doubt that 25 year old documentation of the original 8051 is true for all 8051 variants in existence or in planning.
But that shouldn't prevent the compiler writers from implementing optimizations for cases where they are applicable. That would be most if not all existing 8051 cores. A simple command line switch would take care of new/non-compliant cores. My understanding is that the OP is expressing frustration with the lack of progress on the optimization front. And I agree with that. It appears that C51 has been in bug-fix-only mode for many years.
Not if you were relying upon the loop execution time based on int arithmetic...!