Compiler produces inefficient assembly code?

A co-worker and I are programming an 8051 microcontroller in C. Last week we were struggling with the poor performance of our program, so I talked to the assembly guy and we figured out that our software timer handling was too slow: we were using 16-bit variables. I have since tweaked the timer software and it now runs about 4x faster, which is fast enough.

Digging a little deeper in the produced assembly code we found a 'persistent nuisance'. As a test I wrote these lines in code:

        uint8 j;
        for (j = 10; j--; ) {
                rightTorqueArray[j] = j;
        }

The array is an unsigned char array, but when we look at the assembly we see:

        MOV     R7,#0AH
?C0001:
        MOV     R6,AR7
        DEC     R7
        MOV     A,R6
        JZ      ?C0002
;               rightTorqueArray[j] = j; }
                        ; SOURCE LINE # 52
        MOV     A,#LOW (rightTorqueArray)
        ADD     A,R7
        MOV     DPL,A
        CLR     A
        ADDC    A,#HIGH (rightTorqueArray)
        MOV     DPH,A
        MOV     A,R7
        MOVX    @DPTR,A
        SJMP    ?C0001
?C0002:

We noticed that the array is addressed with LOW and HIGH, so apparently it is treated as a 16-bit variable. My assembly is not that good, though, so please correct me if I am wrong.
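For readers following the listing: LOW and HIGH select the two bytes of the array's 16-bit xdata address, not of the element. The single MOVX @DPTR,A still stores one byte. As a rough sketch, the address arithmetic in the listing can be modelled in plain C. The helper name element_address is hypothetical, and any base address passed to it is made up for illustration.

```c
typedef unsigned char uint8;  /* assumption: uint8 in the thread is unsigned char */

/* Hypothetical helper: mirrors the instruction sequence that builds
   DPTR = base + index one byte at a time. */
unsigned int element_address(unsigned int base, uint8 index)
{
    /* MOV A,#LOW(base) ; ADD A,R7 */
    unsigned int low_sum = (base & 0xFFu) + index;
    /* MOV DPL,A */
    uint8 dpl = (uint8)(low_sum & 0xFFu);
    /* carry flag left behind by the ADD */
    uint8 carry = (uint8)(low_sum >> 8);
    /* CLR A ; ADDC A,#HIGH(base) ; MOV DPH,A */
    uint8 dph = (uint8)(((base >> 8) & 0xFFu) + carry);

    return ((unsigned int)dph << 8) | dpl;
}
```

With a made-up base of 0x12F8 and index 9, the low byte overflows and the carry is added into the high byte, which is why the sequence needs both ADD and ADDC for every element.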

I set Code Optimization to level 8 (Reuse Common Entry Code) with the emphasis on Favor speed.

The assembly was produced as a .SRC file using #pragma SRC on top of the C-file.

0 P Heven over 7 years ago in reply to Andy Neil

"I have been told, and I have read on several websites, that decrementing variables is quicker than incrementing them. That's why some of my for-loops run backwards instead of forwards."

I remember reading that too, back when I was in "primary school" for programming. It is wrong to assume it always holds; it depends on the processor architecture and the situation.

If you write 8051 assembler, a simple 8-bit decrementing counter is very efficient thanks to the DJNZ instruction, but the compiler cannot always use it.

To write efficient code for the 8051 you must understand the processor well. It is old and has lots of limitations. The Keil C51 compiler is surprisingly good, but it is not magic and cannot make poorly understood code fast. You must help it to produce good code.
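One reason DJNZ does not appear in the listing at the top of the thread is the loop test itself. The condition j-- tests the value j had before the decrement, so the compiler keeps a copy (MOV R6,AR7), decrements, and then tests the copy, whereas DJNZ decrements and tests the new value. The sketch below puts the two loop shapes side by side; both fill the same ten elements. The function names are hypothetical, and whether a given C51 version and optimization level really emits DJNZ for the second shape is an assumption to verify in the .SRC output.

```c
typedef unsigned char uint8;  /* assumption: uint8 in the thread is unsigned char */

/* Loop as posted: the condition tests the value of j before the decrement. */
void fill_post_decrement(uint8 *array)
{
    uint8 j;
    for (j = 10; j--; ) {
        array[j] = j;
    }
}

/* Count-down shape: decrement first, then test the new value against zero. */
void fill_decrement_then_test(uint8 *array)
{
    uint8 j = 10;
    do {
        array[j - 1] = (uint8)(j - 1);
    } while (--j != 0);
}
```

The second shape pays for the j - 1 in the index, so it is not automatically faster; the generated code has to be compared case by case.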

0 sebastiaan knippels over 7 years ago in reply to P Heven

"To write efficient code for the 8051 you must understand the processor well. It is old and has lots of limitations. The Keil C51 compiler is surprisingly good, but it is not magic and cannot make poorly understood code fast. You must help it to produce good code."

@Kalib

I am asking for precisely this. I am specifically asking you guys to help me help Keil C51 produce better code.

Can I make addressing array elements faster? And if so, how?

0 Andy Neil over 7 years ago in reply to sebastiaan knippels

Fundamental to answering that question is understanding the meaning & significance of the xdata keyword - hence my question above.

Surely, your "assembly guy" can help you with this ... ?

Have you looked at Appendix C, "Writing Optimum Code", in the Compiler manual ?

0 ²erik malund over 7 years ago in reply to Andy Neil

Try incrementing 'j' (I HATE these nondescriptive variable names). There is, as stated above, an INC DPTR instruction, and I would hope the compiler uses it.

0 Andy Neil over 7 years ago in reply to ²erik malund

It is not, judging by the assembler posted above.

It is computing the DPTR value for each individual element access - rather than pointing the DPTR at the start of the array, and then incrementing it for each element.

This is where the "inefficiency" lies here.

Maybe incrementing a pointer - rather than using an index - would yield better results in this case ... ?
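A minimal sketch of that idea: walk a pointer through the array instead of recomputing base + index for every element. The XDATA macro and the function name are assumptions, introduced so that the same source builds with C51 (where __C51__ is predefined) and with a hosted compiler. Whether the compiler keeps the pointer in DPTR and advances it with INC DPTR has to be checked in the generated assembly.

```c
typedef unsigned char uint8;  /* assumption: uint8 in the thread is unsigned char */

/* Assumption: map XDATA to the C51 memory type when building with C51,
   and to nothing elsewhere so the sketch stays portable. */
#ifdef __C51__
#define XDATA xdata
#else
#define XDATA
#endif

/* Hypothetical helper: stores 0, 1, 2, ... into count consecutive bytes,
   advancing the destination pointer once per element. */
void fill_ascending(uint8 XDATA *dest, uint8 count)
{
    uint8 value = 0;
    while (count != 0) {
        *dest++ = value++;
        --count;
    }
}
```

A call such as fill_ascending(rightTorqueArray, 10) would fill the array from the thread with 0 to 9, in ascending order.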

0 HansBernhard Broeker over 7 years ago in reply to Andy Neil

The absolutely foremost cause of inefficiency is that it's an XDATA array. Once that decision has been made, worrying about execution speed is, to a large extent, just a waste of time.

0 sebastiaan knippels over 7 years ago in reply to HansBernhard Broeker

@Neil The "assembly guy" cannot help: he is 66 years old, has worked his entire life at this company and, most importantly, he does not know anything about C or Keil. The only programming language he ever learned is assembly.

Well, a funny thing: if I remove xdata there is no difference in the number of assembly lines. We (me and the co-worker, who isn't the assembly guy) already suspected as much.

We only have 64 bytes of non-xdata memory and over 65 kB of xdata, which I personally call "unlimited variable memory".

Co-worker already tried using pointers but his attempt produced the same problem. According to the link I sent (Optimal C Constructs for 8051 Microcontrollers by Nigel Jones), accessing xdata via a for-loop is supposed to be faster than using pointers. I will, however, examine Appendix C first and see what it has to say on the matter.

I have another, smaller question. For my software timers I removed most loops and used constants instead. It uses more code but should execute more quickly. Most of the arrays are still members of structs. Does it benefit if I were to remove the structs and use separate arrays instead?
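For reference, this is roughly what replacing a loop with constant indices looks like. With a constant index into a statically allocated array, the element address is fixed before the program runs, so no base + index addition is needed at run time. The timer array, its size, and the count-down-to-zero behaviour are all hypothetical, since the timer code itself was not posted.

```c
typedef unsigned char uint8;     /* assumption: uint8 in the thread is unsigned char */

#define TIMER_COUNT 4            /* hypothetical number of software timers */

uint8 softTimers[TIMER_COUNT];   /* hypothetical: each timer counts down to 0 */

/* Loop version: the element address is computed from the index on every pass. */
void tick_loop(void)
{
    uint8 i;
    for (i = 0; i < TIMER_COUNT; ++i) {
        if (softTimers[i] != 0) {
            --softTimers[i];
        }
    }
}

/* Unrolled version: every index is a constant, at the cost of more code. */
void tick_unrolled(void)
{
    if (softTimers[0] != 0) { --softTimers[0]; }
    if (softTimers[1] != 0) { --softTimers[1]; }
    if (softTimers[2] != 0) { --softTimers[2]; }
    if (softTimers[3] != 0) { --softTimers[3]; }
}
```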

0 Andy Neil over 7 years ago in reply to sebastiaan knippels

But, surely, he can explain the assembler to you? And explain why this is inherent in the 8051 architecture?

"Well, a funny thing: if I remove xdata there is no difference in the number of assembly lines"

That's not funny at all - it is to be expected if you are using a Memory Model which defaults to XDATA.

As already noted, XDATA memory access is inherently slow & "inefficient".
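To make the point concrete: in a memory model whose default is XDATA (the LARGE model in C51), a variable declared without a memory type already lives in xdata, so deleting the keyword changes nothing. To get the shorter indirect access, the array has to be placed in internal RAM explicitly. The sketch below hides the C51 memory types idata and xdata behind macros so that it also builds with a hosted compiler. The variable names and sizes are hypothetical, and with only 64 bytes of internal RAM available a 10-byte array may not fit.

```c
typedef unsigned char uint8;  /* assumption: uint8 in the thread is unsigned char */

/* Assumption: use the C51 memory types when building with C51 (which
   predefines __C51__), and plain storage elsewhere. */
#ifdef __C51__
#define IDATA idata
#define XDATA xdata
#else
#define IDATA
#define XDATA
#endif

uint8 XDATA logBuffer[64];   /* hypothetical bulk storage: external RAM, reached via DPTR */
uint8 IDATA hotTorque[10];   /* hypothetical hot array: internal RAM, reached via @R0/@R1 */

/* Fills the internal-RAM array with its own indices, counting down. */
void fill_hot_torque(void)
{
    uint8 j;
    for (j = 10; j != 0; --j) {
        hotTorque[j - 1] = (uint8)(j - 1);
    }
}
```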

"Co-worker already tried using pointers but his attempt produced the same problem"

So show what, exactly, he tried - and what, exactly, was the result.

As the "inefficiency" is inherent in the underlying hardware architecture, it is bound to happen!

"Accessing xdata via a for-loop is supposed to be faster than using pointers"

Says who?

"Does it benefit if I were to remove the structs and use separate arrays instead?"

Most likely not. More detail needed to answer specifically.

0 Andrey Shemet over 7 years ago in reply to Andy Neil


void test(void)
{
   uint8 rightTorqueArray[10], j;

   for(j = 10; j > 0; --j)
   {
      rightTorqueArray[j] = j;
   }
}
/////////////////////////////////////////////
void test(void)
{
   uint8 rightTorqueArray[10], j;

   for(j = 10; j > 0; --j)
C:0x199C    7F0A     MOV      R7,#0x0A
   {
      rightTorqueArray[j] = j;
C:0x199E    7453     MOV      A,#0x53
C:0x19A0    2F       ADD      A,R7
C:0x19A1    F8       MOV      R0,A
C:0x19A2    A607     MOV      @R0,0x07
   }
C:0x19A4    DFF8     DJNZ     R7,C:199E
C:0x19A6    22       RET
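Note that the loop as posted runs j from 10 down to 1, so it writes rightTorqueArray[10] (one past the end of a 10-element array) and never writes element 0. A variant that keeps the same count-down-to-zero loop but covers elements 9 to 0 could look like the sketch below. The function name is hypothetical, and the code it compiles to under C51 was not checked.

```c
typedef unsigned char uint8;  /* assumption: uint8 in the thread is unsigned char */

/* Same loop header as in the post; the index is shifted by one so that
   j = 10..1 maps to elements 9..0. */
void fill_torque(uint8 *rightTorqueArray)
{
    uint8 j;
    for (j = 10; j > 0; --j) {
        rightTorqueArray[j - 1] = (uint8)(j - 1);
    }
}
```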

0 Andy Neil over 7 years ago in reply to Andrey Shemet

But that example is not using XDATA - is it?

0 Andrey Shemet over 7 years ago in reply to Andy Neil

Yes, it is in data memory.
That was intentional, to demonstrate a different memory and access type on the x51.
There was no restriction on this at the beginning of the topic.

0 Andy Neil over 7 years ago in reply to Andrey Shemet

OK - That's true.

A fundamental problem seems to be that the OP hasn't understood (or didn't originally understand) the implications of using XDATA - hence my early question about the significance of the xdata keyword.