Convincing the optimizer (another fun bit of code) to optimize?

I'm a bit curious as to why this bit of code wasn't as optimized as it would normally be. I have written compilers, so I am not clueless; it's just strange that the optimizer didn't optimize out some of this code.
This code becomes

  ADCON1 = chan->constants.location_info.con1 & 0x70;
  ADMUX = chan->constants.location_info.mux;
  ADCON0 = chan->constants.type_info.con0;
  OCH = chan->a2d.och;
  OCM = chan->a2d.ocm;
  OCL = chan->a2d.ocl;

  GCH = chan->a2d.gch;
  GCM = chan->a2d.gcm;
  GCL = chan->a2d.gcl;

this code

0010         L?0038:
0010         L?0039:
0010 F582              MOV     DPL,A
0012 E4                CLR     A
0013 3E                ADDC    A,R6
0014 F583              MOV     DPH,A
0016 E0                MOVX    A,@DPTR
0017 22                RET

0000 8F82              MOV     DPL,R7
0002 8E83              MOV     DPH,R6
0004 A3                INC     DPTR
0005 A3                INC     DPTR
0006 E0                MOVX    A,@DPTR
0007 5470              ANL     A,#070H
0009 F5DD              MOV     ADCON1,A
                                           ; SOURCE LINE # 279
000B 8F82              MOV     DPL,R7
000D 8E83              MOV     DPH,R6
000F E0                MOVX    A,@DPTR
0010 F5D7              MOV     ADMUX,A
                                           ; SOURCE LINE # 282
0012 EF                MOV     A,R7
0013 2404              ADD     A,#04H
0015 120000      R     LCALL   L?0038
0018 F5DC              MOV     ADCON0,A
                                           ; SOURCE LINE # 286
001A EF                MOV     A,R7
001B 2417              ADD     A,#017H
001D 120000      R     LCALL   L?0038
0020 F5D3              MOV     OCH,A
                                           ; SOURCE LINE # 287
0022 EF                MOV     A,R7
0023 2418              ADD     A,#018H
0025 120000      R     LCALL   L?0039
0028 F5D2              MOV     OCM,A
                                           ; SOURCE LINE # 288
002A EF                MOV     A,R7
002B 2419              ADD     A,#019H
002D 120000      R     LCALL   L?0039
0030 F5D1              MOV     OCL,A
                                           ; SOURCE LINE # 290
0032 EF                MOV     A,R7
0033 241A              ADD     A,#01AH
0035 120000      R     LCALL   L?0039
0038 F5D6              MOV     GCH,A
                                           ; SOURCE LINE # 291
003A EF                MOV     A,R7
003B 241B              ADD     A,#01BH
003D 120000      R     LCALL   L?0039
0040 F5D5              MOV     GCM,A
                                           ; SOURCE LINE # 292
0042 EF                MOV     A,R7
0043 241C              ADD     A,#01CH
0045 120000      R     LCALL   L?0039
0048 F5D4              MOV     GCL,A


First, all the information is referenced from a pointer.
All variable data access is sequential from said pointer. Why isn't it optimizing the ADD A, #XXX into INC DPTR?
It has done this a number of other places in the code. Why not here?

Do I have the settings wrong or something?
Do I need to arrange the code differently?

This is embedded in an ISR. Could that be the reason? (That wouldn't make sense, however ...)

If anyone can let me know whether I need to do something different, I'd appreciate it.

Stephen
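
One rearrangement that is sometimes tried for the six sequential a2d bytes is to step a single local pointer through them, so the source itself expresses "next byte" rather than "base plus offset". The sketch below is only an illustration under assumptions: the struct a2d_cal layout and the load_cal name are hypothetical, the six fields are assumed to be consecutive unsigned char members, and the #ifndef __C51__ shims exist only so the fragment also builds on a hosted compiler. Whether C51 actually emits INC DPTR for it depends on the compiler version and optimization level.

```c
/* Hosted-build shims: Keil C51 predefines __C51__ and understands `xdata`.
   Any other compiler gets an empty macro and plain variables standing in
   for the SFRs. On C51 the real sfr declarations come from the device
   header (not shown here). */
#ifndef __C51__
#define xdata
static volatile unsigned char OCH, OCM, OCL, GCH, GCM, GCL;
#endif

/* Hypothetical layout: six calibration bytes stored back to back,
   mirroring offsets 0x17..0x1C in the listing. */
struct a2d_cal {
    unsigned char och, ocm, ocl;
    unsigned char gch, gcm, gcl;
};

/* Load the offset and gain registers by stepping one xdata byte pointer
   through the struct instead of re-deriving each field address. */
void load_cal(const struct a2d_cal xdata *cal)
{
    const unsigned char xdata *p = (const unsigned char xdata *)cal;

    OCH = *p++;
    OCM = *p++;
    OCL = *p++;

    GCH = *p++;
    GCM = *p++;
    GCL = *p;
}
```

With the original structure this would be called as something like load_cal(&chan->a2d), assuming the a2d member has this shape. Check the generated listing afterwards, since nothing here forces the compiler's hand.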

  • No 'optimizer' can optimize as well as a competent programmer.

    If you need optimized, then optimize, do not rely on some mechanical function to do your work for you.

    Yes, the optimizer will optimize, but if what you write is not optimum, what is the point?

    Someone posted in this thread about you using generic pointers and, if something is NOT optimum, that takes the prize.

    I have, in many cases, two 'identical' functions, the only difference being that one takes a pointer to 'code' and one takes a pointer to 'xdata'. Two

    Now, the smoked sardine is going to barge in and say something stupid, just ignore it.

    Erik
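
    The "two identical functions" idea above can be sketched as follows. This is a minimal illustration, not Erik's actual code: sum_code and sum_xdata are hypothetical names, and the #ifndef __C51__ shims are only there so the fragment also compiles on a hosted compiler. Under Keil C51, `code` and `xdata` make each pointer memory-specific, so the compiler knows at compile time which memory space is being read and does not have to treat the pointer as generic.

```c
/* Keil C51 predefines __C51__ and understands `code` and `xdata`.
   On any other compiler the qualifiers are defined away. */
#ifndef __C51__
#define code
#define xdata
#endif

/* Sum `len` bytes from a table in code (program) memory. */
unsigned char sum_code(const unsigned char code *p, unsigned char len)
{
    unsigned char sum = 0;

    while (len--) {
        sum += *p++;
    }
    return sum;
}

/* Same body, but the pointer is known to address external data memory. */
unsigned char sum_xdata(const unsigned char xdata *p, unsigned char len)
{
    unsigned char sum = 0;

    while (len--) {
        sum += *p++;
    }
    return sum;
}
```

    The cost is duplicated source. The gain, on C51, is that neither function needs a pointer that carries its memory type around at run time.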

  • A good optimizer can optimize better than a good developer can manage, unless the developer spends a huge amount of time, and writes down a large number of permutations with paper and pen.

    It's just a question of how hard the processor is to optimize for. The C51 processor is quite easy to optimize for (for a human). It has few registers, and very limited instructions so it is easy to keep all required state information in the head.

    Let's just look back at the Intel Pentium processor. It got an exchange instruction that could swap two fp registers without consuming any clock cycles. Suddenly, it became very hard to match the Watcom compiler on fp code, because a normal human does not like to constantly swap the locations of the fp registers. You can't even document the register contents in a good way, since the fp register stack doesn't behave like a stack anymore.

    A number of processors have extra bits in the instruction to inform the processor about future instructions, for example whether a jump is likely to happen within a short while, or whether the processor should be lazy or aggressive about writing back changed memory values. Some processors always process the first instruction after a jump, just to make sure that they have something to do while the pipeline and memory subsystem are busy retrieving the first instruction from the jump destination.

    High-end processors are superscalar, so they process multiple instructions in each clock cycle. Even if all execution blocks are as fast as possible and produce their results in a single clock cycle, the code must be rewritten so that the pipeline will always be able to issue the full set of integer or fp instructions. Remember, too, that the more advanced memory addressing modes may lock up one ALU for evaluation of scaled multiple-offset addresses.

    Even if you are very good at writing in assembler, these non-regular hardware tricks make it very, very hard to keep track of everything, and to manage to always produce a correct program.

    In theory, a human should always win over the optimizer in a compiler, but the optimizer has the advantage that it will _always_ remember all the tricks it knows about. It will not, now and then, forget to check if dirty trick "x" is applicable. It will not decide that it is too much work to perform a huge reorder/rewrite just to gain a single clock cycle - a normal developer will reach a stage where he feels: enough is enough.

    This is no different from chess. A chess master does not have it easy against the best chess computers, because the associative memory has to fight against the deep and brutally fast memory of the computer. A chess master (or a human developer) has to prune the alternatives much earlier than a computer. It is only experience combined with luck that keeps the chess master/developer from incorrectly pruning a winning tree.

  • A good optimizer can optimize better than a good developer can manage, unless the developer spends a huge amount of time, and writes down a large number of permutations with paper and pen.
    Divide and conquer. My estimate is that less than 5% of most programs will gain anything noticeable (to the user) from optimization. So instead of "spends a huge amount of time", spend a bit of time superoptimizing the critical function and leave the rest alone.

    The C51 processor is quite easy to optimize for (for a human). It has few registers, and very limited instructions so it is easy to keep all required state information in the head.
    And that is the processor we are discussing (see above, MCU: Cx51, 8051 or MCS51).

    Let's just look back at the Intel Pentium
    And that is not the processor we are discussing (see above, MCU: Cx51, 8051 or MCS51).

    Your insights into other processors, while valuable, do not apply to the origin of this thread.

    Erik

  • Now, the smoked sardine is going to barge in and say something stupid, just ignore it.

    I'm curious, what sort of a stupid thing do you think I might say?