How can I tell the linker/compiler not to use memset/memcpy functions which use FPU registers?
For example: SCIOPTA allows limiting the use of the FPU to certain tasks (to improve task switching). Tasks without FPU may not use FPU registers. But the compiler/linker uses optimized versions of memset(), which results in an exception.
I tried to compile C files with --fpu none, but this produces link-time errors.
For ARM compiler you could try --no_allow_fpreg_for_nonfpdata
See ARM Information Center
Hi Joseph,
that'll be the option I was looking for. Only it does not work. At least not with 5.04.
My compile options:
CPUFLAGS = --cpu cortex-a8 --apcs=/interwork --thumb
CFLAGS += $(CPUFLAGS) --debug --diag_suppress=550 --c99 --no_inline
CFLAGS += --diag_suppress=9933
CFLAGS += --no_allow_fpreg_for_nonfpdata
My code uses memset(...,0,...). _memset() in the library checks whether the destination address is divisible by 4 and then jumps to memset_w, which uses FPU registers.
Hi,
I guess the FP instructions are inside the C runtime library, not the compiler-generated code. Is that correct?
I guess the option --no_allow_fpreg_for_nonfpdata only affects the compiler and there isn’t a switch to force C runtime code to use a different behavior.
Sorry that I don’t have a solution for you right now.
Regards,
Joseph
Yes, it is the memset in the standard C library. I guess there should be a no_allow_fpreg_for_nonfpdata version of the library.
(Btw. GCC does the same and tries to be "smart" )
THX for your time.
Then the solution will be to use -no_allow_fpreg_for_nonfpdata while compiling the runtime library (if using GCC, that is).
I use GCC myself, and when building my toolchain, one of the steps is to generate multiple libraries; one for each CPU type; for instance, there's no Cortex-M0 with FPU, thus it's not necessary to generate a runtime library with FPU support.
Could be a solution. But isn't.
For one, GCC does not have that or a similar option (at least not 4.9.3 2014 Q4 from launchpad), nor does DS-5 (professional) come with library sources.
And the problem is not no-FPU versus FPU. The point is that the user decides where to use the FPU and where not.
The armcc option says it all: FPU registers only for FPU operation.
If I compile a C program with float operations, the compiler should of course use the FPU if it is enabled on the command line.
And of course mixing soft-FPU code and hard-FPU code is difficult and rather rare. But why not? At least as long as no higher-level functions like sin() are used.
In my opinion, either the linker should be able to link FPU compiled files against a non-FPU library _or_ provide FPU clean standard libraries.
The linker should never be allowed to modify the library, which is linked. Thus you must link with the correct library; this is your responsibility.
That means you will need a library, which contains code that does not use the FPU registers, if you don't want your code to use FPU registers.
I've used GCC 4.9 for more than a year now, and I have multiple libraries for each Cortex-M architecture and ARM7TDMI.
So I can choose whether or not I want to use the FPU registers on Cortex-M4 by supplying a switch on the command-line for GCC and for the linker.
The linker will choose the correct library, depending on my command-line switches; e.g. if I built a library using -mcpu=cortex-m4 and -mtune=cortex-m3, and the switches I provide to the linker are exactly those, then this library will be linked.
I'm not sure whether launchpad's GCC is built this way these days; I think it was earlier.
If you want to solve the problem here and now, you could write your own substitution for memset and memcpy. Make sure the link-order is correct, so your library will be searched first. One problem with using this solution is that you might not catch all the library functions, which are using FPU-registers right away.
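A minimal sketch of such a substitution, for illustration only. The names nofpu_memset and nofpu_memcpy are hypothetical; to actually replace the library versions they would have to be named memset/memcpy and be found before the C library at link time. The volatile byte accesses are an assumption on my side to keep the compiler from turning the loops back into library calls or wide register moves; a real version would work word-wise for speed, and the disassembly should still be checked for FPU registers.

```c
#include <stddef.h>

/* Hypothetical integer-only replacement for memset().
   Byte-wise and slow on purpose: the volatile accesses stop the
   compiler from replacing the loop with a call to the library memset. */
void *nofpu_memset(void *dst, int value, size_t n)
{
    volatile unsigned char *d = dst;
    size_t i;

    for (i = 0; i < n; i++) {
        d[i] = (unsigned char)value;
    }
    return dst;
}

/* Hypothetical integer-only replacement for memcpy().
   As with memcpy(), source and destination must not overlap. */
void *nofpu_memcpy(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;
    size_t i;

    for (i = 0; i < n; i++) {
        d[i] = s[i];
    }
    return dst;
}
```

Both functions keep the standard signatures and return the destination pointer, so existing callers would not need to change.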
Jens,
I think you are missing the point.
Of course, if I use the FPU I must expect that for example printf() uses FPU registers.
But I do not expect that the compiler uses FPU registers for things like memset().
Since parts of the software are allowed (=> RTOS configuration) to use the FPU and some not, I need the choice.
Clearly, I cannot use printf() in a task which is not allowed to use the FPU. If so, it is _my_ fault.
So what is needed is either a linker option to choose the right library, or the ability to build my own runtime library with the -no_allow ... option set. But as I wrote, there are no library sources in the DS-5 installation (maybe I need the ultra license).
Neither works in GCC and DS-5/MDK. I think IAR does a better job here.
I probably misunderstood the problem as you wrote.
Is the memset code you're speaking about generated on the fly?
E.g. GCC generates both memcpy and memset code inline if the block to set/copy is smaller than four 32-bit words.
I know GCC does optimize short memset/memcpy and I had to fight this on PowerPC. But so far I had no problem with GCC and ARM.
The problem is with DS-5/MDK and memset() is in the library.
PowerPC, Atari ... we have a lot in common.
Unfortunately my only option is running GCC, so I know nothing about DS-5/MDK.
If we're lucky, maybe sellis or johannesbauer will put in a word or two about this.
Of course, if I use the FPU I must expect that for example printf() uses FPU registers. But I do not expect that the compiler uses FPU registers for things like memset(). Since parts of the software are allowed (=> RTOS configuration) to use the FPU and some not, I need the choice. Clearly, I cannot use printf() in a task which is not allowed to use the FPU. If so, it is _my_ fault.
Based on that comment, I'm not entirely sure I understand your use case.
I've seen things like Android builds where every application has been built (with gcc..) for VFPv3 floating point, and the only floating point usage for hundreds of application binaries like file and filesystem utilities, system daemons, really boring code that has no reason to use floating point and doesn't do any heavy lifting or data movement or processing, is for register spilling (freeing up integer registers by using the floating point register file for temporary storage instead of the stack). This is obviously where the context switch overhead -- and power consumption of course -- would be adversely affected, and in the vast majority of cases the register spilling is pointless because the functions that end up using it don't even use enough of the volatile register file to make it worthwhile. It makes a lot of sense, there, to identify which applications really need floating point and just turn it off completely in the build for those that don't.
Per Stephen Theobald's advice, the ARM compiler works on two things to decide whether it will generate FP instructions -- the "no_allow_fpreg_for_nonfpdata" option affects code generation by the compiler, including things like register spilling. The "cpu" and "fpu" options affect which ARM C Library is linked in.
So by default you're seeing a sort of mix of both worlds -- an application that only needs integer code, for which the compiler refrains from using the FP register file, but which is linked against an optimized ARM C library that uses floating point registers.
Now, in your description, you're saying you want to build an application which does not have any FP instructions in it at all, to save on context switch storage space and context switch time -- the solution there is to build the application in question with --fpu=none or similar, as Stephen suggested. If you use printf() with a floating point format string argument, it will use a softfloat variant of the library to get that working and never use a floating point register, so your context switch time and stack space would be preserved. However, how much benefit do you really get here? The context switch time is usually minuscule compared to a large memcpy or memset. You would rather these workhorse functions execute quickly.
Back to your comment, in terms of building an application that can use floating point instructions (printf %f or a math library function like cosf() for example) but that you do not want to use the optimized ARM C library functions, which would otherwise have no need for the floating point unit, this is pointless -- you'd end up with the expensive context switch anyway the moment one of those FP registers was used.
I think your question has been answered (as above, and previously by Stephen), so is there a particular issue here with building your application, still, if you disable floating point completely (cpu, fpu)?
Ta,
Matt
Matt,
the point is our RTOS allows users to disable FPU usage for certain processes (mostly for all but a few) to save context-switch time and stack space. The user is aware that they cannot use printf() or floating point mathematics in these tasks.
_But_ he is certainly not aware that the compiler uses FPU registers for a memcpy(). So the -no_allow.. option is surely something he needs to apply when compiling those processes. But since 99% of our customers have a single binary, they must link in the FPU version of the library, since they use math functions in a few tasks. So the only way I see is to remove the optimized function from the library.
Cheers,
42Bastian
Perhaps a possible solution would be to write tiny wrappers for memcpy, memmove and memset.
Each wrapper checks whether or not the current task is allowed to use FPU registers.
It could then call the right memcpy / memmove / memset, depending on the flag.
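A rough sketch of such a wrapper for memset, under a few assumptions: current_task_may_use_fpu stands in for whatever per-task flag the RTOS really provides, and task_memset and memset_integer_only are made-up names. The library memset() is only reached from tasks that are allowed to touch the FPU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-task flag. A real RTOS would read this from the
   control block of the currently running task instead of a global. */
bool current_task_may_use_fpu = false;

/* Hypothetical integer-only fallback. The volatile byte stores keep the
   compiler from turning the loop back into a call to memset(). */
static void *memset_integer_only(void *dst, int value, size_t n)
{
    volatile unsigned char *d = dst;
    size_t i;

    for (i = 0; i < n; i++) {
        d[i] = (unsigned char)value;
    }
    return dst;
}

/* Wrapper: pick the implementation based on the task's FPU permission. */
void *task_memset(void *dst, int value, size_t n)
{
    if (current_task_may_use_fpu) {
        return memset(dst, value, n);          /* optimized library version */
    }
    return memset_integer_only(dst, value, n); /* no FPU registers intended */
}
```

The flag check costs a load and a branch per call. Code that already calls memset() directly would still have to be redirected to the wrapper; with GNU ld the --wrap=memset option is one way to do that, but I have not checked what the DS-5 linker offers here.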