unnecessary code generation

Hello,

I'm relatively new to the Keil C166 development system. My first tests on the Keil MCB167NET board have been successful so far. As a long-time C programmer I'm interested in the quality of the generated code. I noticed that the generated code sometimes contains unnecessary instructions that I thought an optimizer should be able to find. Here is an example:

#include <C167CS.H>
#include "comtype.h"
#include "timer.h"
void timer_init()
{  struct T01_CON {
      uint  T0L :3;
      uint  T0M :1;
      uint  un00:2;
      uint  T0R :1;
      uint  un01:1;
      uint  T1L :3;
      uint  T1M :1;
      uint  un10:2;
      uint  T1R :1;
      uint  un11:1;
   };
   union {
      struct T01_CON tcon;
      uint   reg;
   } u;
   u.reg = T01CON;
   u.tcon.T0L = 2;    // FCPU / 2^(3+T0L) = 2 µs at 16MHz
   u.tcon.T0M = 0;    // Timer Mode
   u.tcon.T0R = 1;    // run Timer
   T01CON = u.reg;
}//timer_init

The generated code, with the optimizer set to favor speed at level 7, is:

   ; FUNCTION timer_init (BEGIN  RMASK = @0x4030)
0000 2802          SUB       R0,#02H
0002 F2F550FF      MOV       R5,T01CON
0006 B850          MOV       [R0],R5       ; u
0008 A840          MOV       R4,[R0]       ; u
000A 0AF54F42      BFLDL     R5,#04FH,#042H
000E B850          MOV       [R0],R5       ; u
0010 A840          MOV       R4,[R0]       ; u
0012 F6F550FF      MOV       T01CON,R5
0016 0802          ADD       R0,#02H
0018 CB00          RET
   ; FUNCTION timer_init (END    RMASK = @0x4030)

First, I see that the moves to R4 at offsets 0008 and 0010 are unnecessary because R4 is never read again. With those removed, a second look shows that the stores at offsets 0006 and 000E are unnecessary as well, because the stack slot for u is then never read.
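As a rough host-side illustration of what the single BFLDL at offset 000A accomplishes, the sketch below models the three bit-field assignments as one masked update of the low byte. The macro and function names are hypothetical, and the bit positions assume the fields are allocated from bit 0 upward, which is what the mask 04FH and data 042H in the listing suggest. It does not touch any SFR; it only reproduces the arithmetic.

```c
#include <stdint.h>

/* Hypothetical names; bit positions follow the T01_CON struct in the post. */
#define T0L_MASK 0x0007u  /* bits 0..2: T0L */
#define T0M_MASK 0x0008u  /* bit 3:     T0M */
#define T0R_MASK 0x0040u  /* bit 6:     T0R */

/* Model of T0L = 2, T0M = 0, T0R = 1 applied to a copy of T01CON. */
uint16_t timer0_config(uint16_t t01con)
{
    const uint16_t mask  = T0L_MASK | T0M_MASK | T0R_MASK;  /* 0x004F */
    const uint16_t value = 2u | T0R_MASK;                   /* 0x0042 */

    /* Clear the three fields, then set the new values; all other bits survive. */
    return (uint16_t)((t01con & (uint16_t)~mask) | value);
}
```

Under these assumptions the whole function body needs only the load from T01CON, the bit-field update, and the store back, which is the code that remains once the four moves discussed above are removed.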

Should this kind of optimization be doable by a peephole optimizer or by data flow analysis?
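To make the data flow side of the question concrete, here is a minimal sketch of a backward liveness scan over one straight-line block. It is a toy model, not a description of how the Keil optimizer works: the `Insn` type, the location names, and `mark_dead` are all hypothetical, each instruction is reduced to one written and one read location, and the stack slot for u is assumed dead at function exit because the frame is released before RET.

```c
#include <stddef.h>

/* Toy locations: two registers, the stack slot for u, and the SFR. */
enum { R4 = 0, R5 = 1, SLOT_U = 2, SFR_T01CON = 3 };

typedef struct {
    int dst;   /* location written */
    int src;   /* location read */
    int dead;  /* set to 1 by the pass if the write is never used */
} Insn;

/* Walk the block backwards; live_out is a bitmask of locations needed afterwards. */
size_t mark_dead(Insn *code, size_t n, unsigned live_out)
{
    unsigned live = live_out;
    size_t removed = 0;

    for (size_t i = n; i-- > 0; ) {
        Insn *in = &code[i];
        if (!(live & (1u << in->dst))) {
            in->dead = 1;           /* result never read: its own read does not count */
            removed++;
            continue;
        }
        in->dead = 0;
        live &= ~(1u << in->dst);   /* the write satisfies later reads */
        live |= 1u << in->src;      /* the read makes its source live */
    }
    return removed;
}

/* The body of timer_init from the listing (SUB, ADD and RET left out). */
size_t timer_init_dead_count(void)
{
    Insn code[] = {
        { R5,         SFR_T01CON, 0 },  /* 0002 MOV   R5,T01CON */
        { SLOT_U,     R5,         0 },  /* 0006 MOV   [R0],R5   */
        { R4,         SLOT_U,     0 },  /* 0008 MOV   R4,[R0]   */
        { R5,         R5,         0 },  /* 000A BFLDL R5,...    */
        { SLOT_U,     R5,         0 },  /* 000E MOV   [R0],R5   */
        { R4,         SLOT_U,     0 },  /* 0010 MOV   R4,[R0]   */
        { SFR_T01CON, R5,         0 },  /* 0012 MOV   T01CON,R5 */
    };
    /* Only the write to T01CON is observable after the function returns. */
    return mark_dead(code, sizeof code / sizeof code[0], 1u << SFR_T01CON);
}
```

In this model a single backward pass flags the moves at 0008 and 0010 and the stores at 0006 and 000E, because removing a dead move also removes its read of the stack slot.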

NB: This is not critical, just as a hint for future improvements.
The most important thing with optimizers is still the correctness of generated code.

- Heinz (from Delmenhorst, Germany)

0 Mik Kleshov over 22 years ago in reply to Jon Ward

Well, it tells something about those who did the porting. We all know that the C166 port of GCC was not merged with the official GCC code tree so it didn't see as much development as it could have. When I mentioned GCC, I was mostly referring to GCC for x86.
I agree, Keil's C compiler for the C166 must be the best in the market. But it can be better.

0 Jon Ward over 22 years ago in reply to Mik Kleshov

Well, it tells something about those who did the porting. We all know that the C166 port of GCC was not merged with the official GCC code tree so it didn't see as much development as it could have. When I mentioned GCC, I was mostly referring to GCC for x86.

Actually, the port I used is a commercial GCC implementation that is supposedly better than the "official" (is there such a thing?) GCC 166 port.

The problem that you pose is interesting in that you generate a quotient and a remainder (in adjacent statements) for the same dividend and divisor. So, if the CPU's DIV instruction produces both a quotient and a remainder, only one DIV instruction is required. However, I'm not sure what kind of general optimization that would be (quotient/remainder coloring???).
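For readers who want to see the pattern, here is a small sketch in standard C. The `SplitTime` type and both function names are made up for illustration. The first function uses `/` and `%` in adjacent statements with the same operands, which is the case under discussion; the second uses the standard library's `div()`, which returns both results in one `div_t`. Whether either form ends up as a single DIV instruction depends on the compiler and library, so treat this as a way to express the intent, not a guarantee about generated code.

```c
#include <stdlib.h>

/* Hypothetical example type: whole seconds plus leftover ticks. */
typedef struct {
    int seconds;
    int ticks;
} SplitTime;

/* Adjacent / and % on the same operands: a candidate for one DIV. */
SplitTime split_with_operators(int total_ticks, int ticks_per_second)
{
    SplitTime t;
    t.seconds = total_ticks / ticks_per_second;
    t.ticks   = total_ticks % ticks_per_second;
    return t;
}

/* Same result via div(), which hands back quotient and remainder together. */
SplitTime split_with_div(int total_ticks, int ticks_per_second)
{
    div_t d = div(total_ticks, ticks_per_second);
    SplitTime t;
    t.seconds = d.quot;
    t.ticks   = d.rem;
    return t;
}
```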

I contacted one of the compiler developers at Microsoft to see if they perform this optimization, and it appears that they do. After a lengthy discussion, I think that they added this optimization to improve some kind of Windows or PC benchmarks.

A question that needs to be answered is: how often do developers need a quotient and remainder from the same dividend and divisor? If the answer is often, then perhaps we need to consider this optimization (we focus our efforts on improvements that benefit the greatest number of users).

Jon

0 Drew Davis over 22 years ago in reply to Jon Ward
A question that needs to be answered is: how often do developers need a quotient and remainder from the same dividend and divisor?

My two cents: pretty often!

For one example, it's very common for me to run into a situation where I have some sort of table in a series of registers, where the individual items are less than the full register width. Let's say you have 16 items, two bits wide, 4 items per byte, thus taking up 4 bytes of address space in total, thus:

addr  item num
----  --------
0000: 3 2 1 0
0001: 7 6 5 4
0002: b a 9 8
0003: f e d c

To access item n, you need code along the lines of:

Item GetItem (int n)
{
   index  = n / ItemsPerWord;
   offset = n % ItemsPerWord;
   return (*(base + index) >> offset) & mask;
}

With most designs, the div and mod should strength-reduce to shifts and masks because ItemsPerWord is a power of two.
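A compilable version of that idea, as a sketch under the assumptions of the table above (16 items, 2 bits each, 4 per 8-bit register). The constant names, `demo_regs`, and both functions are hypothetical. Note that the shift count here is scaled by the item width, which the outline above leaves implicit, and that n is unsigned so the division and modulo can reduce to a plain shift and mask.

```c
#include <stdint.h>

enum {
    BITS_PER_ITEM  = 2,
    ITEMS_PER_WORD = 4,    /* one "word" here is an 8-bit register */
    ITEM_MASK      = 0x3
};

/* Made-up register contents: items 0..3 are 0,1,2,3 and items 4..7 are 3,2,1,0. */
const uint8_t demo_regs[4] = { 0xE4, 0x1B, 0x00, 0xFF };

/* Div/mod form: one quotient and one remainder from the same operands. */
unsigned get_item(const uint8_t *base, unsigned n)
{
    unsigned index  = n / ITEMS_PER_WORD;   /* which register holds item n */
    unsigned offset = n % ITEMS_PER_WORD;   /* position of item n inside it */
    return (base[index] >> (offset * BITS_PER_ITEM)) & ITEM_MASK;
}

/* Hand-reduced form: what the div and mod become for a power-of-two divisor. */
unsigned get_item_shifts(const uint8_t *base, unsigned n)
{
    return (base[n >> 2] >> ((n & 3u) * BITS_PER_ITEM)) & ITEM_MASK;
}
```

With a signed n the compiler generally has to add a correction for negative values before it can use a shift, so an unsigned index is the friendlier choice when this reduction matters.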

I also find it moderately common any time I need a division to also need the remainder for any sort of fraction where I want to avoid floating point.

0 Heinz Saathoff over 22 years ago in reply to Mik Kleshov

I agree, Keil's C compiler for the C166 must be the best in the market. But it can be better.
That's what I wanted to say ;-)

- Heinz

0 Heinz Saathoff over 22 years ago in reply to Jon Ward

The problem that you pose is interesting in that you generate a quotient and a remainder (in adjacent statements) for the same dividend and divisor. So, if the CPU's DIV instruction produces both a quotient and a remainder, only one DIV instruction is required. However, I'm not sure what kind of general optimization that would be (quotient/remainder coloring???).
Maybe as an extension of common subexpression optimization? Or, as I remember from the compiler writing course, a peephole optimization on the generated code. Such an optimizer should be able to analyze code sequences with no jumps in or out. So the flow graph must still be used.

- Heinz