How to get C51 to generate a DJNZ loop?

I want a fast and simple delay routine:

#pragma OT(8,SPEED)    // Switch to speed optimization for this routine
void my_delay(BYTE count)
{
        register BYTE i;
        for (i=count; i; i--);
}


This is the best so far, and it compiles to:

B02:0xB1F1  EF       MOV      A,R7
B02:0xB1F2  6003     JZ       B02:B1F7
B02:0xB1F4  1F       DEC      R7
B02:0xB1F5  80FA     SJMP     I2C_delay(B02:B1F1)
B02:0xB1F7  22       RET


but I was hoping there was a way to get C51 to do just a DJNZ instead of 4 instructions for each loop.

Is there a magic coding style or optimization level that will generate the DJNZ?

  • So you want the fastest 1ms delay the world has ever seen? :)

    One thing to think about is that you should always try to use pre-increment/pre-decrement whenever possible.

    With post-increment, you are telling the compiler to keep a copy of the variable _before_ the increment, because you plan to use the old value after you have incremented.

  • At the risk of bragging, if you want specific assembler instruction sequences, use assembler.

    The job of C51 is to create correct code; its job is not to create the specific assembler code sequences you might want to have.

  • One thing to think about is that you should always try to use pre-increment/pre-decrement whenever possible.

    It's so surprising to see someone write that! When I first moved from exclusive assembly to C, that was one of the first discoveries I made and have followed that same rule since.

    For some reason the post increment is far more common. It looks like the language name C++ has had at least a partial influence there.

    But mention it to many people and it's generally dismissed. We even had one of the posters of this forum do a code review for us a few years ago and he actually insisted that the post increment was not just favourable, but actually better. He was not open to persuasion either. Go figure!

  • All 4 variants of the code result in the same 4 instruction loop.

    //      register BYTE i;
    //      for (i=count; i; --i); // this is a 4 instruction loop - MOV/JZ/DEC/SJMP which should simply be a DJNZ but I can't get Keil to generate it!
    //      for (i=count; i; i--); // this is a 4 instruction loop - MOV/JZ/DEC/SJMP which should simply be a DJNZ but I can't get Keil to generate it!
    //  while(count) count--; // this results in the exact same 4 instruction loop.
      while(count) --count; // this results in the exact same 4 instruction loop.
    


    So at least for C51 the pre vs post increment appears to make no difference.

    I can always code it in assembler. But in embedded programming, you often want to know the timing of instructions, at least roughly. In this case, four instructions are much slower than a single instruction, so I was hoping for a simple method to get the result I am looking for.
    I was hoping maybe there was an optimization switch that would get the DJNZ instruction. I tried a few but I always seem to get the same instructions.

    In the for loop, if you use "i>0" instead of just 'i' you get a 6 instruction loop which does a SETB C and a subtract against 0 even though i and i>0 are the same thing.
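To put a rough number on "four instructions versus one", here is a small sketch of the cycle arithmetic. It assumes the machine-cycle counts of a classic 12-clock 8051 core; many derivatives use faster cores with different timings, so treat the constants as assumptions. The function names are made up for this illustration.

```c
/* Machine-cycle counts per instruction, assuming a classic 12-clock 8051.
   These are assumptions for illustration; check the data sheet of the
   actual part. */
enum {
    CYC_MOV_A_RN = 1,   /* MOV  A,Rn   */
    CYC_JZ       = 2,   /* JZ   rel    */
    CYC_DEC_RN   = 1,   /* DEC  Rn     */
    CYC_SJMP     = 2,   /* SJMP rel    */
    CYC_DJNZ_RN  = 2    /* DJNZ Rn,rel */
};

/* Cycles per pass through the MOV/JZ/DEC/SJMP loop from the listing. */
unsigned current_loop_cycles(void)
{
    return CYC_MOV_A_RN + CYC_JZ + CYC_DEC_RN + CYC_SJMP;
}

/* Cycles per pass if each pass were a single DJNZ Rn,rel. */
unsigned djnz_loop_cycles(void)
{
    return CYC_DJNZ_RN;
}
```

Under those assumptions the current loop costs 6 machine cycles per pass against 2 for DJNZ, so the shortest achievable step of the delay would be about three times finer.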

  • So at least for C51 the pre vs post increment appears to make no difference.

    For that one very limited use, maybe. You're making a mistake if you think that will extend to other situations.

    If you want to control specific opcodes, you'd better look at assembler. Not many compilers allow the control you're looking for.

  • "even though i and i>0 are the same thing"

    Remember that "i" is the equivalent of "i != 0". And "i > 0" is not the same as "i != 0" except in the specific case of operating on unsigned integers. In this case, Keil seems to have missed an optimization. But then all compilers miss optimizations - it's just a question of how many optimizations they miss.

    I'm still not sure why you worry about saving instructions in the ms delay. Do you plan to create longer delays by doing multiple calls to this function so a 1% error in the 1ms delay will result in a 10ms error for a 1-second delay?
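A minimal sketch of that distinction, written as plain C that can be checked on a host compiler. The helper names are hypothetical and exist only for this illustration.

```c
/* The bare loop condition "i" means "i != 0". */
int nonzero_signed(signed char i)     { return i != 0; }

/* "i > 0" is a different test for signed types: negatives fail it. */
int positive_signed(signed char i)    { return i > 0; }

/* For unsigned types the two tests always agree, because no value
   is below zero. */
int nonzero_unsigned(unsigned char i)  { return i != 0; }
int positive_unsigned(unsigned char i) { return i > 0; }
```

With a signed value of -1 the two tests disagree; for an unsigned byte they give the same answer for every value, which is why a compiler is free to treat them alike in the unsigned case.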

    You should only do very short delays with instruction-counting in busy-loops. Whenever you need longer delays you should make use of timers or similar. Busy-loop while polling the timer for semi-long delays and make use of interrupts for longer delays.

    The problem with instruction-counting is that it fails to take into account time lost in interrupt handlers - busy-loops with disabled interrupts aren't fun. And any device with some form of interface wants to respond to interface actions without being locked in long delays.

    Instruction-counting delays are best when you need very, very short delays for some setup or hold times. So five "nop" might be enough to have a signal stabilize before performing the next step of some hardware manipulation.

    The difference between pre- and post-increment relates to the need to make use of the variable value. Post-increment regularly creates a need for a temporary variable holding the value before the increment/decrement. In cases where the value isn't used, you would normally get the same performance with both constructs.
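The point about the temporary can be sketched in plain C as follows. `BYTE` is assumed here to be a typedef for `unsigned char` (the thread does not show its definition), and the function names are invented for the example.

```c
typedef unsigned char BYTE;  /* assumption: BYTE is an unsigned 8-bit type */

/* Result is used: the caller receives the OLD value, so the old and the
   new value both have to exist for a moment. */
BYTE post_dec_used(BYTE *p) { return (*p)--; }

/* Result is used: the caller receives the NEW value, no copy of the old
   value is needed. */
BYTE pre_dec_used(BYTE *p)  { return --(*p); }

/* Result is discarded: both forms have the same observable effect, so a
   compiler has no reason to treat them differently. */
void post_dec_unused(BYTE *p) { (*p)--; }
void pre_dec_unused(BYTE *p)  { --(*p); }
```

In a loop like `for (i=count; i; i--)` the value of `i--` is discarded, which matches the observation that both spellings produced the same code here.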

  • this thread is missing one very important fact
    delay routines in C are fraught with error opportunities
    the next version of the compiler may do it differently (use DJNZ ?)
    the optimizer may change the code
    ....

    erik

  • Changes to the delay caused by a changed compiler version or optimization level aren't so much of a problem - code that relies on an instruction-counting busy loop should have some form of benchmark function so normal regression testing can validate the delay.

    What is worse is that global optimization can change the delay after code changes in completely different parts of the code. Because of the problems with supporting high-level languages on the 8051 architecture, parts of the optimization can actually happen in the linker. Variable overlaying cannot be performed until link time, and only then is the actual address distance between different symbols known.

    Making use of a timer means only a changed clock frequency will require code adaptations. So every new release build doesn't require the delay function to be explicitly checked.
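A minimal host-side sketch of a timer-polled delay, to make the idea concrete. Everything named here is hypothetical: `read_ticks_fn` stands in for whatever reads a free-running 16-bit counter, and `fake_read_ticks` only simulates one so the logic can be checked on a PC. On the target you would more likely read the timer registers directly.

```c
#include <stdint.h>

/* Hypothetical hook: returns the current value of a free-running
   16-bit tick counter. */
typedef uint16_t (*read_ticks_fn)(void);

/* Busy-wait until at least `ticks` counter ticks have elapsed.
   The unsigned 16-bit subtraction keeps the comparison correct when the
   counter wraps from 0xFFFF to 0 during the wait. */
void delay_ticks(read_ticks_fn read_ticks, uint16_t ticks)
{
    uint16_t start = read_ticks();

    while ((uint16_t)(read_ticks() - start) < ticks) {
        /* keep polling */
    }
}

/* Host-side stand-in for a hardware timer: advances one tick per read.
   Starts close to the wrap point to exercise the wrap-around case. */
uint16_t fake_now = 65530u;

uint16_t fake_read_ticks(void)
{
    return fake_now++;
}
```

The delay length now depends on the tick rate rather than on which instructions the compiler happened to emit, so only a clock change requires touching the tick count.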

  • code that relies on an instruction-counting busy loop should have some form of benchmark function so normal regression testing can validate the delay.
    and you think that will happen?

  • No, I don't. Most people who write busy-loop delays based on execution speed don't even realize there are issues with that concept. Nor do they know how to make code testable.

    I just hope that the hardware I buy has code developed by someone who does care and has a reasonable amount of knowledge.

    The big problem is that too much bad code exists on the net - and when people see bad code enough times they tend to assume that it is representative of best practices. People tend to think that if they Google and find an answer, it has to be a good answer - why else would the information show up...

  • Two things:

    1) it is possible to write the loop such that C51 will use a djnz instruction
    2) no, I won't show how, because that would be an exercise in futility

    Writing a busy-loop delay in C is totally silly anyway, so I will not encourage it in any way.

  • This is not a religious war. PLEASE STOP the discussion of the reasons not to use instruction delay loops! Please just answer the question!

    I have very low requirements for the accuracy of the delay loop.
    We all know instruction delays are very inaccurate.
    But in this case, I don't care about accuracy of the timing!

    I just want the top end to be a little faster by using a DJNZ instead of 4 instructions.

    I can always code it in assembler if I really felt it was necessary. But it isn't. It would just be handy not to waste quite so much time.

    So PROVE to me you can get Keil C51 to produce a DJNZ and I will be very thankful and praise you on high.

  • Remember that you need not separate comparison and decrement.
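One way to read that hint is a `do ... while` loop whose condition decrements and tests in a single expression, which has the same shape as a decrement-and-jump-if-not-zero instruction. The sketch below is not a guarantee: whether C51 actually emits DJNZ for it depends on the compiler version and optimization settings, so check the generated listing. `BYTE` is assumed to be `unsigned char`, and `my_delay_counted` is a hypothetical instrumented copy added only so the pass count can be verified on a host compiler.

```c
typedef unsigned char BYTE;  /* assumption: BYTE is an unsigned 8-bit type */

/* Delay loop with decrement and test combined in the loop condition. */
void my_delay(BYTE count)
{
    do {
        /* empty body */
    } while (--count);
}

/* Instrumented copy of the same loop: returns how many passes it made. */
unsigned my_delay_counted(BYTE count)
{
    unsigned passes = 0;

    do {
        ++passes;
    } while (--count);

    return passes;
}
```

Note the change in behaviour compared with the original `for` loop: the body runs before the first test, so a `count` of 0 wraps to 255 and gives 256 passes instead of none. Callers that may pass 0 would need a guard.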

  • Writing a busy-loop delay in C is totally silly anyway, so I will not encourage it in any way.

    That's got to be one of the stupidest responses seen here in ages. Why say that after taunting the OP with the suggestion the very thing you don't want to encourage is possible?

    My suggestion for this is simple: Use a small piece of assembler and stop this useless thread.

  • Why say that after taunting the OP with the suggestion the very thing you don't want to encourage is possible?

    Because whether it's possible is an entirely different issue from whether it's advisable.