We are running a survey to help us improve the experience for all of our members. If you see the survey appear, please take the time to tell us about your experience if you can.
Hello, I have to compute the square root of a floating point number. A function sqrt() already exists in the math.h, but this function is too slow and the computation time depends on the input value (because of an iterative method used to compute the square root). The accuracy requirements in my case are not very high, so 2 digits after decimal point would be enough. So I seached for an alternative method to compute a square root and found an interessting method, that uses an approximation to compute the square root (see below). But this only works on 32 bit CPUs (little endian), so I tried to port it to a ST10. I thought this is not very difficult because the ST10 also uses little endian and is IEEE-754 compatible. The function to approximate the square root: float fastsqrt(float val) { int tmp = *(int *)&val; tmp -= 1<<23; /* Remove IEEE bias from exponent (-2^23) */ /* tmp is now an appoximation to logbase2(val) */ tmp = tmp >> 1; /* divide by 2 */ tmp += 1<<23; /* restore the IEEE bias from the exponent (+2^23) */ return *(float *)&tmp; } I replaced all int with long (because long is 32bit on ST10), but this doesn't work anyway. It seems that the behavior of the shifts is different on the ST10 in combination with long numbers than on a 32-bit CPU. Does anybody has same hints for me to make this function working? Or has somebody an alternative function for approximate or compute a square root in a quick way on a ST10? Thanks a lot... Alexander
Hi Mike, thanks for your answer. Yout hint doesn't work, I'm afraid. I don't know why this function work, but it does work. It is a special case of newton's method in combination with the bit-format of an IEEE-floating-point-number. Please referr to http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Approximations_that_depend_on_IEEE_representation for more information. regards... Alexander
The algorithm assumes that 'int' is a 32-bit number. Replace 'int' with 'long'.
Jack Gannsle has had a few articles about fast approximations in embedded magazine (I then to remember feb-apr) go to http://www.embedded.com and hunt them up Erik
Hi Alexander, I read the Wikipedia article you mentioned. Seems like this should work. The function will return a denormalized result, but this should not be a problem. Please confirm that this works (or doesn't work):
float fastsqrt(float val) { long tmp = *(long *)&val; tmp -= 1L<<23; /* Remove IEEE bias from exponent (-2^23) */ /* tmp is now an appoximation to logbase2(val) */ tmp = tmp >> 1; /* divide by 2 */ tmp += 1L<<23; /* restore the IEEE bias from the exponent (+2^23) */ return *(float *)&tmp; }
Hello together, I have tested the modified function Mike posted above. This function doesn't work. On my uC-board is a display (16x2 chars) so I'm able to see the result of the fastsqrt-computation on the display and the result is always zero (independed of input-value). I used this quick&dirty hacked program: So there are same variables and #include I have to use to control the display. But this has no effect on the fastsqrt-function.
#include "stdio.h" #include "math.h" #include "Intrins.h" // intrinsic commands (nop..) #include "regst10F269.h" // Register Set of ST10F268 controller // variables for the display unsigned int far display_count1,display_count2,x,disp_pos,disp_data; unsigned char far c; bit far disp_busy, disp_tmp1, disp_tmp2; #include "display_2zeilig_4bit.h" // display control float fastsqrt(float val) { long tmp = *(long *)&val; tmp -= 1L<<23; /* Remove IEEE bias from exponent (-2^23) */ /* tmp is now an appoximation to logbase2(val) */ tmp = tmp >> 1; /* divide by 2 */ tmp += 1L<<23; /* restore the IEEE bias from the exponent (+2^23) */ return *(float *)&tmp; } void main(void) { float sqrtval = 10.0; float result = fastsqrt(sqrtval); float result2 = sqrt(sqrtval); init_display(); printf("%f, %f", result, result2); }
fastsqrt PROC FAR PUBLIC fastsqrt ; FUNCTION fastsqrt (BEGIN RMASK = @0x0330) MOV [-R0],R9 MOV [-R0],R8 SUB R0,#4 MOV R4,[R0+#4] ; val MOV R5,[R0+#6] ; val+2 MOV [R0],R4 ; tmp MOV [R0+#2],R5 ; tmp+2 MOV R8,R4 MOV R9,R5 SUB R9,#128 MOV [R0],R4 ; tmp MOV [R0+#2],R9 ; tmp+2 MOV R5,R9 SHR R4,#1 BMOV R4.15,R5.0 ASHR R5,#1 MOV [R0],R4 ; tmp MOV [R0+#2],R5 ; tmp+2 ADD R5,#128 MOV [R0],R4 ; tmp MOV [R0+#2],R5 ; tmp+2 MOV R4,[R0] ; tmp MOV R5,[R0+#2] ; tmp+2 ADD R0,#8 RETS ; FUNCTION fastsqrt (END RMASK = @0x0330) fastsqrt ENDP
MOV R8,#0 MOV R9,#16672 CALL fastsqrt
Hi Alexander, I've played around with the code a bit. I believe that you need to replace 1L<<23 with 127L<<23. That's the correct way of removing and restoring IEEE exponent bias, if I am not mistaken, based on the discription of float representation in memory from Keil's docs. I tried it and it gave sensible results. Googling around I found this link, which seemed interesting: http://www.mactech.com/articles/mactech/Vol.14/14.01/FastSquareRootCalc/ Regards, - mike