<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="https://community.arm.com/utility/feedstylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Division optimization problem</title><link>https://community.arm.com/developer/tools-software/tools/f/keil-forum/31752/division-optimization-problem</link><description> 
Hello everyone, 
I implemented a moving average routine in Keil C51. 
I read elsewhere that whenever a number is divided by 2^n,
the compiler uses a shift operation, so that the code is small and
fast. 
And Compiler did optimize it in this condition</description><dc:language>en-US</dc:language><generator>Telligent Community 10</generator><item><title>RE: Division optimization problem</title><link>https://community.arm.com/thread/82137?ContentTypeID=1</link><pubDate>Wed, 17 Sep 2014 23:25:16 GMT</pubDate><guid isPermaLink="false">dd9e70c8-6d3c-4c71-b136-2456382a7b5c:b629fbd5-dbcc-486b-ba92-995483bda322</guid><dc:creator>sajid shaikh</dc:creator><description>&lt;p&gt;&lt;p&gt;
Man, you are a lifesaver.&lt;br /&gt;
Shifting instead of dividing saved me about 200 bytes (I am using a
Silabs F350 MCU with nice peripherals but only 8K flash), which means a
lot in my case.&lt;br /&gt;
And especially your rounding technique, there&amp;#39;s no way in hell I
would have known that.&lt;br /&gt;
A really big thanks to you.&lt;/p&gt;
&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Division optimization problem</title><link>https://community.arm.com/thread/93265?ContentTypeID=1</link><pubDate>Wed, 17 Sep 2014 04:21:49 GMT</pubDate><guid isPermaLink="false">dd9e70c8-6d3c-4c71-b136-2456382a7b5c:66114b60-4bab-443a-a455-ffc419b583d4</guid><dc:creator>ImPer Westermark</dc:creator><description>&lt;p&gt;&lt;p&gt;
Sorry - I pressed send a bit early.&lt;/p&gt;

&lt;p&gt;
Notice that if you have enough numeric range, then it can be
cheaper to not divide by 64 (or shift right by six) but instead
multiply by 4 (or shift left by two).&lt;/p&gt;

&lt;p&gt;
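That means you get the answer in the upper three bytes of a 32-bit
variable, because multiplying by 4 and then dropping the low byte
(a division by 256) is the same as dividing by 64.&lt;/p&gt;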

&lt;p&gt;
And the 8051 processor - being an 8-bit processor - is excellent
at picking up these three bytes, copying them one step down, and then
zeroing the high byte of the result.&lt;/p&gt;
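
&lt;p&gt;
A minimal sketch of this idea in portable C. The function name is
hypothetical, and the total is assumed to be below 2^30 so that
multiplying by 4 cannot overflow a 32-bit value. The shift by 8 stands
in for the byte move described above; whether a particular compiler
turns it into plain byte copies is not guaranteed:&lt;/p&gt;

```c
/* Divide by 64 as "multiply by 4, then drop the low byte".
   (total * 4) / 256 equals total / 64 as long as total * 4 does not
   overflow, i.e. total is below 2^30 for a 32-bit type. */
unsigned long div64_via_scale(unsigned long total)
{
    unsigned long scaled = total * 4;   /* same as two one-bit left shifts */
    return scaled >> 8;                 /* discard the low byte */
}
```

&lt;p&gt;
For example, a total of 6400 is scaled to 25600, and dropping the low
byte gives 100, the same as 6400 / 64.&lt;/p&gt;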

&lt;p&gt;
In some situations, you can even drop these two shifts by
performing your running average on input values that have already been
scaled up by a factor of four, which means that your running total will
also be four times larger. All you then need to do is throw away the
contents of the low byte.&lt;/p&gt;
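
&lt;p&gt;
A sketch of that variant, assuming a 64-sample window of 8-bit
samples. All names are hypothetical, and the scaling by four is done
in software here only to keep the example self-contained:&lt;/p&gt;

```c
#define WINDOW 64   /* number of samples in the running average */

static unsigned char history[WINDOW];  /* last WINDOW raw samples */
static unsigned char pos;              /* next slot to overwrite */
static unsigned long total_x4;         /* 4 * sum of the samples in history */

/* Add a new sample and return the running average.
   Because the total is kept 4 times too large, dividing by 64 becomes
   dividing by 256, i.e. discarding the low byte. */
unsigned char running_average(unsigned char sample)
{
    total_x4 -= 4UL * history[pos];  /* remove the oldest sample, scaled */
    total_x4 += 4UL * sample;        /* add the newest sample, scaled */
    history[pos] = sample;
    pos = (unsigned char)((pos + 1) % WINDOW);
    return (unsigned char)(total_x4 >> 8);
}
```

&lt;p&gt;
With 8-bit samples the scaled total never exceeds 4 * 64 * 255 = 65280,
so the returned average always fits in one byte.&lt;/p&gt;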
&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: Division optimization problem</title><link>https://community.arm.com/thread/68896?ContentTypeID=1</link><pubDate>Wed, 17 Sep 2014 04:10:43 GMT</pubDate><guid isPermaLink="false">dd9e70c8-6d3c-4c71-b136-2456382a7b5c:e4348abf-2455-494c-a19e-07f15b43515e</guid><dc:creator>ImPer Westermark</dc:creator><description>&lt;p&gt;&lt;p&gt;
You can do &amp;quot;value &amp;gt;&amp;gt; 6&amp;quot; to perform a 6-bit shift right,
which represents a division by 64.&lt;/p&gt;

&lt;p&gt;
You can even do &amp;quot;(value+32) &amp;gt;&amp;gt; 6&amp;quot; if you want to round the
result to the nearest integer, since 32 is half of 64.&lt;/p&gt;
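
&lt;p&gt;
A minimal sketch of the difference between the truncating and the
rounding shift. The function names are made up for illustration, and
unsigned long is assumed to be at least 32 bits wide (it is 32 bits in
Keil C51):&lt;/p&gt;

```c
/* Truncating division by 64: the remainder is simply dropped. */
unsigned long div64_trunc(unsigned long value)
{
    return value >> 6;
}

/* Rounding division by 64: adding 32 (half of 64) before the shift
   rounds to the nearest integer, with halves rounded up.
   Assumes value + 32 does not overflow the type. */
unsigned long div64_round(unsigned long value)
{
    return (value + 32) >> 6;
}
```

&lt;p&gt;
For example, 95 gives 1 with both versions (95/64 is about 1.48),
whereas 96 gives 1 with the truncating version and 2 with the rounding
one (96/64 is exactly 1.5).&lt;/p&gt;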

&lt;p&gt;
It&amp;#39;s just that a 6-step 32-bit shift on an 8-bit processor isn&amp;#39;t as
elegant as you might think.&lt;/p&gt;

&lt;p&gt;
One way it might happen (remember that a 32-bit value is
represented by 4 bytes):&lt;/p&gt;

&lt;pre&gt;
repeat six times:
    shift byte 3 right;
    shift byte 2 right with carry;
    shift byte 1 right with carry;
    shift byte 0 right with carry;
&lt;/pre&gt;

&lt;p&gt;
The other way it might happen:&lt;/p&gt;

&lt;pre&gt;
byte0 = (byte0 &amp;gt;&amp;gt; 6) | ((byte1&amp;amp;0x3f) &amp;lt;&amp;lt; 2);
byte1 = (byte1 &amp;gt;&amp;gt; 6) | ((byte2&amp;amp;0x3f) &amp;lt;&amp;lt; 2);
byte2 = (byte2 &amp;gt;&amp;gt; 6) | ((byte3&amp;amp;0x3f) &amp;lt;&amp;lt; 2);
byte3 &amp;gt;&amp;gt;= 6;
&lt;/pre&gt;

&lt;p&gt;
And in that second example, &amp;quot;byte &amp;gt;&amp;gt; 6&amp;quot; is a single
instruction for a processor with a barrel shifter. But for a tiny 8051
it isn&amp;#39;t, so it&amp;#39;s 6 one-step shifts.&lt;/p&gt;

&lt;p&gt;
So the compiler decided that it was no fun to do the 6-step shift
of the 32-bit number and instead decided to divide.&lt;/p&gt;

&lt;p&gt;
So in the end, you can write short C code:&lt;/p&gt;

&lt;pre&gt;
x = y / 64;
&lt;/pre&gt;

&lt;p&gt;
&lt;br /&gt;
or&lt;/p&gt;

&lt;pre&gt;
x = y &amp;gt;&amp;gt; 6;
&lt;/pre&gt;

&lt;p&gt;
&lt;br /&gt;
but it will still end up as quite a costly task for the processor
when x and y are 32 bits wide and the processor registers are 8 bits
wide.&lt;/p&gt;
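
&lt;p&gt;
For unsigned operands the two statements always produce the same
value, so the choice only affects which code the compiler generates.
A small check, with a hypothetical function name:&lt;/p&gt;

```c
/* Returns 1 if division by 64 and a 6-bit right shift agree
   for the given unsigned value, 0 otherwise. */
int div_and_shift_agree(unsigned long y)
{
    return (y / 64) == (y >> 6);
}
```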

&lt;p&gt;
There is a reason why today&amp;#39;s hobbyists, starting with cheap 32-bit
ARM chips, can do magic at home compared to the projects most people
implemented 20-30 years ago with 8051 chips.&lt;/p&gt;
&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item></channel></rss>