Cortex M4 FPU against fixed point math
michele corradin
over 12 years ago
Note: This was originally posted on 1st March 2012 at
http://forums.arm.com
Hi,
I'm working with three Cortex devices: an STM32F027 (Cortex-M3), a TI Cortex-M4 and an Infineon Cortex-M4.
I would like to move from a TI C2000 TMS320F2810 (32-bit fixed-point core) to an M4 to control a 3-phase power bridge.
My algorithms currently use fixed-point math (IQ22) and consist 98% of simple multiplications plus some sine/cosine calculations: PI, PID, PLL, low-pass filter, notch, ...
I ported the algorithms to the Cortex mainly by redefining IQmpy (multiplication) and IQsin (sine calculation), first in fixed point and then in floating point.
I was expecting a speed improvement in floating point, because every fixed-point multiplication requires a shift while a floating-point one does not, but instead I'm experiencing a dramatic slowdown of the algorithm when it runs in floating point.
I'm doing my tests in IAR.
I checked the assembler output and verified that the compiler is using the FPU.
My only explanation is that the FPU doesn't have, as far as I know, direct access to the CPU registers, so every FPU multiplication requires two loads into the FPU registers and another move to bring the result back to a CPU register.
Can anybody confirm that?
Thank you very much
michele
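For readers following along, the difference between the two multiplications described above can be sketched in C. This is only an illustration under stated assumptions: `iq22_t`, `iq22_mul` and `f32_mul` are hypothetical names, not the TI IQmath or IAR definitions, and the fixed-point version assumes the usual approach of widening to 64 bits before shifting.

```c
#include <stdint.h>

/* Hypothetical type for a fixed-point value with 22 fractional bits (IQ22). */
typedef int32_t iq22_t;

#define IQ22_FRAC_BITS 22

/* Illustrative stand-in for an IQ22 multiply (not the TI IQmath routine).
 * The 32x32 product has 44 fractional bits, so it is shifted right by 22
 * to get back to IQ22. Right-shifting a negative value is
 * implementation-defined in C; common compilers use an arithmetic shift. */
static inline iq22_t iq22_mul(iq22_t a, iq22_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (iq22_t)(wide >> IQ22_FRAC_BITS);
}

/* The floating-point counterpart needs no rescaling step. */
static inline float f32_mul(float a, float b)
{
    return a * b;
}
```

With 1.5 encoded as 6291456 and 2.0 as 8388608, `iq22_mul` returns 12582912, which is 3.0 in IQ22. The float version only pays off if its operands already live in FPU registers; if each call converts from and back to IQ22, the saved shift is replaced by conversion work.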
michele corradin
over 12 years ago
Note: This was originally posted on 1st March 2012 at
http://forums.arm.com
Dear Joseph,
I'll post a couple of examples with the assembler output.
In my case I'm switching all my code completely from IQ22 to float, and I verified that it is using the FPU.
Let me ask a "simple" question: is it true that the FPU doesn't have direct access to the core registers, so that to perform an operation in the FPU I have to move the data from the CPU registers to the FPU and back?
I checked the manual but I'm not sure I understood it correctly: in the assembler output I seem to see some load operations.
Thank you very much for your help
michele
michele corradin
over 12 years ago
Note: This was originally posted on 2nd March 2012 at
http://forums.arm.com
Dear Joseph
I found a cast error in my algorithms that was introduced when moving from IQ math to float: now the floating-point code runs 10-20% slower than the IQ code.
I notice some vmov and vstr operations; I guess that explains what I said before.
I downloaded the ARM DSP library: is there a speed report comparing IQ and float operations?
Thanks a lot for your support
Michele
michele corradin
over 12 years ago
Note: This was originally posted on 5th March 2012 at
http://forums.arm.com
Yes, I know ARM provides Q15, Q31 and single-precision floating-point libraries. I meant to ask whether there is any comparison of the execution times of those libraries in Q15, Q31 and floating point, for example for sine-wave calculation or PID.
Thanks
Michele
Joseph Yiu
over 12 years ago
Note: This was originally posted on 1st March 2012 at
http://forums.arm.com
Hi Michele,
Without looking at the code I cannot be sure what is happening, but switching between IQ22 and single precision is possibly the main issue. It is not just a matter of copying the data from an integer register to a floating-point register and back: you also need to add in the exponent and sign bit, and the IEEE 754 single-precision format uses 23 fraction bits rather than 22, so you might have additional shift operations there.
Can you change all the operations to single precision floating point?
regards,
Joseph
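The conversion cost mentioned above can be made concrete with a minimal C sketch. The helper names `iq22_to_float` and `float_to_iq22` are hypothetical, and scaling by 2^22 is an assumed (though common) way to do the conversion; how a given compiler lowers it to instructions is not shown here.

```c
#include <stdint.h>

/* Hypothetical type for a fixed-point value with 22 fractional bits (IQ22). */
typedef int32_t iq22_t;

/* 2^22 as a float; both this constant and its reciprocal are exact. */
#define IQ22_ONE_F 4194304.0f

/* IQ22 -> float: integer-to-float conversion followed by a rescale.
 * A float has a 24-bit significand, so raw values whose magnitude
 * exceeds 2^24 can lose their lowest bits here. */
static inline float iq22_to_float(iq22_t x)
{
    return (float)x * (1.0f / IQ22_ONE_F);
}

/* float -> IQ22: rescale, then truncate toward zero.
 * The caller must keep x inside the IQ22 range (roughly -512 to +512);
 * converting an out-of-range float to int32_t is undefined behaviour. */
static inline iq22_t float_to_iq22(float x)
{
    return (iq22_t)(x * IQ22_ONE_F);
}
```

A round trip through these helpers is lossless for small values such as 1.5, but the raw value 16777217 (just above 4.0) comes back as 16777216. Doing this conversion around every multiply, instead of once at the edges of the control loop, is the kind of overhead that can cancel any gain from the FPU.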
Joseph Yiu
over 12 years ago
Note: This was originally posted on 1st March 2012 at
http://forums.arm.com
Hi Michele,
Yes, you are correct.
The floating-point instructions operate on the floating-point register bank, and there are instructions to transfer floating-point data directly to and from memory. So in theory floating-point data does not have to go through the integer register bank at all. But when floating point is mixed with IQ22 or other fixed-point data, which (I assume) is processed in the integer registers, the data has to be transferred and converted between the two register banks. Instructions to convert between floating point and fixed point are available, so even when the conversion is needed it shouldn't be much worse.
The instruction set of the Cortex-M4 floating point unit can be found in this pdf document:
http://infocenter.arm.com/help/topic/com.arm.doc.dui0553a/DUI0553A_cortex_m4_dgug.pdf
or from ARM Infocenter:
http://infocenter.arm.com/help/index.jsp
-> Developer Guides and Articles
-> Software Development
-> Cortex-M4 Devices Generic User Guide
Potentially there are other areas that can make the performance worse:
- accidental use of double-precision data/functions
- compiler/run-time library settings (e.g. hard VFP vs soft VFP)
regards,
Joseph
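The first pitfall in the list above is easy to hit in C, because an unsuffixed literal such as `0.5` has type `double`. The sketch below uses hypothetical function names to show the difference; on an FPU that only implements single precision, the promoted version would normally be handled by software library routines rather than by FPU instructions.

```c
#include <stddef.h>

/* 0.5 is a double literal, so x is promoted, the multiply is done in
 * double precision, and the result is converted back to float. */
float scale_promoted(float x)
{
    return x * 0.5;
}

/* 0.5f keeps the whole expression in single precision. */
float scale_single(float x)
{
    return x * 0.5f;
}

/* The type of the expression shows the promotion without any timing. */
size_t promoted_expr_size(void)
{
    float x = 1.0f;
    return sizeof(x * 0.5);   /* size of a double */
}

size_t single_expr_size(void)
{
    float x = 1.0f;
    return sizeof(x * 0.5f);  /* size of a float */
}
```

Both scaling functions return the same value for inputs like 3.0f, so the problem does not show up in results, only in speed. The same applies to library calls: C99 provides `sinf` and `cosf` alongside the double-precision `sin` and `cos`. With GCC, `-Wdouble-promotion` reports such implicit promotions; whether IAR has an equivalent diagnostic is not covered in this thread.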
Joseph Yiu
over 12 years ago
Note: This was originally posted on 9th March 2012 at
http://forums.arm.com
Hi Michele,
The information available in the public domain is limited, but there is some. For example:
http://www.emcu.it/STM32/STM32_Journal/stm32_journal_1_1.pdf
http://www.embedded-world.eu/fileadmin/user_upload/pdf/arm_entwicklerkonferenz_2011/Session_3/01%20-%20Developing%20Advanced%20Signal%20Processing%20_Johnson_ARM.pdf
I know this might not be exactly what you want, but if needed you can generate the data yourself using the instruction set simulator in Keil MDK.
regards,
Joseph