We are running a survey to help us improve the experience for all of our members. If you see the survey appear, please take the time to tell us about your experience if you can.
Hello everyone, I implemented a Moving average routine in Keil C51. And as I read else where that whenever I divide the number with 2^n the compiler uses shift operation, so that code is small and fast. And Compiler did optimize it in this condition:
average =BINHI/64; //BINHI, average is unsigned int.
However when BINHI is unsigned long instead of unsigned int the compiler uses actual division which is very costly in 8 bit. So is there any way to divide unsigned long using shift operations only, if I make sure divisor is 2^n.
You can do "value >> 6" to perform a 6-bit shift right, which represents a division by 64.
You can even do "(value+32) >> 6" if you want to round the result.
Just that a 6-step 32-bit shift on an 8-bit processor isn't as elegant as you might think.
One way it might happen (remember that a 32-bit value is represented by 4 bytes):
repeat six times: shift byte 3 right; shift byte 2 right with carry; shift byte 1 right with carry; shift byte 0 right with carry;
The other way it might happen:
byte0 = (byte0 >> 6) | ((byte1&3f) << 2); byte1 = (byte1 >> 6) | ((byte2&3f) << 2); byte2 = (byte2 >> 6) | ((byte3&3f) << 2); byte3 >>= 6;
And in that second example, "byte >> 6" is a single instruction for a procesor with barrel shifter. But for a tiny 8051 it isn't, so it's 6 one-step shifts.
So the compiler decided that it was no fun to do the 6-step shift of the 32-bit number and instead decided to divide.
So in the end, you can write short C code:
x = y / 64;
or
x = y >> 6;
but it will still end up as a quite costly task for the processor when x and y are 32 bits wide and the processor registers are 8 bits wide.
There is a reason why todays hobbyists, starting with cheap 32-bit ARM chips, can do magic at home compared to the projects most people implemented 20-30 years ago with 8051 chips.
Sorry - I did press send a bit early.
Notice that if you have enough numeric range, then it can be cheaper to not divide by 64 (or shift right by six) but instead multiply by 4 (or shift left by two).
That means you get the answer in the upper three bytes of a 32-bit variable.
And the 8051 processor - being an 8-bit processor - is excellent at picking up these three bytes and copy them one step down and then zero the high byte of the result.
In some situations, you can even drop these two shifts, by performing your running average on input values that have been scaled 4 times larger. Which means that your running total will also be four times larger. And all you need to do is to throw away the contents of the low byte.
Mann you are a life saviour. Shifting instead of division saved me like 200 bytes, (I am using silabs F350 MCU with nice peripherals but only 8K flash) it means a lot in my case. And especially your rounding technique, there's no way in hell I would have known that. A Really Big thanks to you.