How can I tell the linker/compiler not to use memset/memcpy functions which use FPU registers?
For example: SCIOPTA allows FPU use to be limited to certain tasks (to improve task switching). Tasks without FPU permission must not use FPU registers, but the compiler/linker uses optimized versions of memset(), which results in an exception.
I tried to compile C files with --fpu none, but this produces link-time errors.
Perhaps a possible solution would be to write tiny wrappers for memcpy, memmove and memset.
Each wrapper checks whether or not the current task is allowed to use FPU registers.
It could then call the right memcpy, memmove or memset, depending on the flag.
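A minimal sketch of the wrapper idea above, shown for memset only. The flag `current_task_fpu_allowed` and the names `task_memset` and `memset_nofpu` are hypothetical and not part of any RTOS or of the ARM C library; a real system would read the permission from its task control block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-task flag. A real RTOS would keep this in the task
 * control block; it is a plain global here only to keep the sketch small. */
static bool current_task_fpu_allowed = false;

/* Byte-wise fill. The volatile pointer keeps the compiler from turning
 * the loop back into a call to the library memset(). */
static void *memset_nofpu(void *dst, int value, size_t n)
{
    volatile unsigned char *p = dst;
    while (n--)
        *p++ = (unsigned char)value;
    return dst;
}

/* Wrapper: pick the implementation from the task's FPU permission. */
void *task_memset(void *dst, int value, size_t n)
{
    assert(dst != NULL || n == 0);
    if (current_task_fpu_allowed)
        return memset(dst, value, n);   /* library version, may use FPU registers */
    return memset_nofpu(dst, value, n); /* stays in core registers */
}
```

Note that a wrapper like this only covers explicit calls. Calls to memset or memcpy that the compiler inserts on its own still go straight to the library version.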
Mat,
the point is our RTOS allows users to disable FPU usage for certain processes (mostly for all but a few) to save context-switch time and stack space. The user is aware that they cannot use printf() or floating point mathematics in these tasks.
_But_ he is certainly not aware that the compiler uses FPU registers for a memcpy(). So the -no_allow.. option is surely something he needs to apply when compiling those processes. But since 99% of our customers have a single binary, they must link in the FPU version of the library since they use math functions in a few tasks. So I see the only way is to remove the optimized function from the library.
Cheers,
42Bastian
Ok, thanks. I will give it a try.
Of course, if I use the FPU I must expect that for example printf() uses FPU registers. But I do not expect that the compiler uses FPU registers for things like memset(). Since parts of the software are allowed (=> RTOS configuration) to use the FPU and some not, I need the choice. Clearly, I cannot use printf() in a task which is not allowed to use the FPU. If so, it is _my_ fault.
Based on that comment, I'm not entirely sure I understand your use case.
I've seen things like Android builds where every application has been built (with gcc..) for VFPv3 floating point, and the only floating point usage for hundreds of application binaries like file and filesystem utilities, system daemons, really boring code with no reason to use floating point and doesn't do any heavy lifting or data movement or processing, is for register spilling (freeing up integer registers by using the floating point register file for temporary storage instead of the stack). This is obviously where the context switch overhead -- and power consumption of course -- would be adversely affected, and in a vast majority of the cases the register spilling is pointless because the functions that end up using it don't even use enough of the volatile register file to make it worthwhile. It makes a lot of sense, there, to identify which applications really need floating point and just turn it off completely in the build for those individual applications.
Per Stephen Theobald's advice, the ARM compiler works on two things to decide whether it will generate FP instructions -- the "no_allow_fpreg_for_nonfpdata" option affects code generation by the compiler, including things like register spilling. The "cpu" and "fpu" options affect which ARM C Library is linked in.
So by default you're seeing a sort of mix of both worlds -- a totally only-integer-requiring application which the compiler refrains from using FP register file, but has an optimized ARM C library which uses floating point registers.
Now, in your description, you're saying you want to build an application which does not have any FP instructions in it at all to save on context switch storage space and time to context switch -- the solution there is to build the application in question with --fpu=none or similar as Stephen suggested. If you use printf() and you use a floating point format string argument, it will use a softfloat variant of the library to get that working and never use a floating point register, so your context switch time and stack space would be preserved. However, how much benefit do you really get here? The context switch time is minuscule, usually, compared to a large memcpy or memset. You would rather these workhorse functions execute quickly.
Back to your comment: building an application that can use floating point instructions (printf %f or a math library function like cosf(), for example) but must not use the optimized ARM C library functions, which touch the floating point registers for work that otherwise needs no floating point, is pointless -- you'd end up with the expensive context switch anyway the moment one of those FP registers was used.
I think your question has been answered (as above, and previously by Stephen), so is there a particular issue here with building your application, still, if you disable floating point completely (cpu, fpu)?
Ta,
Matt
The C libraries provided in ARM Compiler 5.04 contain several variants of the memmove/memset routines, and the linker selects which variant to use according to the build attributes in the files being linked. If a file included in the link has an attribute indicating that it uses FPU instructions then the linker takes that to mean it is OK to use the variants of memmove/memset which use FPU instructions.
Obviously this does not work in your special case, where your application may use FPU instructions but only under special circumstances.
One way around this is to create your own copy of the C library and remove the variants of memmove/memset that use FPU instructions from the library. Then the linker will have to use the non-FPU variants of memmove/memset.
Copy the version of the ARM C library that you are linking with. Here I am assuming it is c_2.l; the library naming convention is documented at http://infocenter.arm.com/help/topic/com.arm.doc.dui0475j/chr1358938936497.html. Use the --info=libraries option to the linker to list the libraries your application is actually linking with:
cp $ARMLIB/armlib/c_2.l c_2.l
List all files in the library that contain the NEON variants of memmove/memset/etc (this step is just informative):
armar -t c_2.l | grep rt_neon_mem
Remove these files from your copy of the library:
armar -d c_2.l `armar -t c_2.l | grep rt_neon_mem`
Now use this modified c_2.l library to link with your application; the application should no longer be using versions of memmove/memset which contain FPU instructions.
Hi,
I'm afraid there's no easy way to avoid this with the ARM Compilation tools. At link time, if any of the user code was built for VFP then by default armlink will choose a library that is allowed to use the VFP. It is possible to override the linker's default choice, and force it to use a non-VFP library (see later), but that choice applies to the entire image, so that both your non-VFP and VFP-using functions will end up calling an inefficient non-VFP library.
When compiling with "--cpu=Cortex-A8", the compiler assumes VFP and NEON are allowed. To have better control over this, one solution is to compile your non-VFP-using functions with "--fpu=SoftVFP", and the VFP-using functions with "--fpu=SoftVFP+VFPv3". These are link-time compatible. The linker will select the library functions that use floating point instructions but pass floating point parameters in core registers. As floating point instructions are still used by the library, that might not be sufficient to achieve your goal of running some tasks entirely in soft-floating point and some in hard-floating point.
If all your code is compiled with either "--fpu=SoftVFP" or "--fpu=SoftVFP+VFPv3", the linker’s default choice of an "--fpu=SoftVFP+VFPv3" library can be overridden by adding the name of the soft floating point only library on the linker command line. In this case no library functions will use hardware floating point, so this is context switch safe, although it will not be high performance if your VFP-using code makes heavy use of library functions. To find the software floating point library location, compile all files with "--fpu=SoftVFP" and use the linker option "--info=libraries" to list the libraries that the linker actually selected.
The only other solution would be to build _two_ executables, one containing the VFP-using functions and VFP libraries, and the other containing the non-VFP-using functions and non-VFP libraries, then join the two executables together somehow.
I use "memset", then the library function checks size and alignment and then calls a function memset_w which uses FPU registers to write 8bytes at once.
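For illustration, the size-and-alignment dispatch described above might look roughly like the sketch below. It is not the ARM library code: `sketch_memset` and `fill_words` are hypothetical names, and the wide path writes 32-bit words through core registers rather than 8 bytes at a time through FPU registers. It assumes the file is built so that the compiler itself does not place integer data in FPU registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wide path: dst must be 4-byte aligned. The volatile
 * accesses stop the compiler from replacing the loop with a library call. */
static void fill_words(volatile uint32_t *dst, uint32_t pattern, size_t words)
{
    while (words--)
        *dst++ = pattern;
}

void *sketch_memset(void *dst, int value, size_t n)
{
    volatile unsigned char *p = dst;
    unsigned char byte = (unsigned char)value;

    assert(dst != NULL || n == 0);

    /* Head: single bytes until the pointer is 4-byte aligned. */
    while (n > 0 && ((uintptr_t)p & 3u) != 0) {
        *p++ = byte;
        n--;
    }

    /* Body: whole words, with the byte replicated into all four lanes. */
    if (n >= 4) {
        size_t words = n / 4;
        fill_words((volatile uint32_t *)p, byte * 0x01010101u, words);
        p += words * 4;
        n -= words * 4;
    }

    /* Tail: the remaining 0 to 3 bytes. */
    while (n--)
        *p++ = byte;
    return dst;
}
```

Like most hand-written fill routines, this stores words into memory of any type, so it is best treated as library-style code and compiled accordingly.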
Are you explicitly calling the memset function, or is the compiler invoking it internally for code that copies memory from one address to another?
Regards,
Techguyz
PowerPC, Atari ... we have a lot in common.
Unfortunately my only option is running GCC, so I know nothing about DS-5/MDK.
If we're lucky, maybe sellis or johannesbauer will put in a word or two about this.
I know GCC does optimize short memset/memcpy and I had to fight this on PowerPC. But so far I had no problem with GCC and ARM.
The problem is with DS-5/MDK and memset() is in the library.
I probably misunderstood the problem as you wrote.
Is the memset code you're speaking about generated on the fly?
E.g. GCC generates both memcpy and memset code if the block to set/copy is less than four 32-bit words.
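To illustrate the implicit case: neither function below mentions memcpy or memset, yet a compiler is free to implement the structure assignment and the zero-initialisation either with inline stores or with calls to the library routines. Which one it picks depends on the compiler, the size of the object and the optimisation level. The type `struct config` and both function names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct config {
    int  values[16];
    char name[32];
};

/* Structure assignment: may be compiled into a call to memcpy(). */
void copy_config(struct config *dst, const struct config *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}

/* Zero-initialising an aggregate: may be compiled into a call to memset(). */
int sum_cleared(void)
{
    int scratch[64] = {0};
    int sum = 0;
    for (int i = 0; i < 64; i++)
        sum += scratch[i];
    return sum;
}
```

Inspecting the generated assembly or the linker map for references to memcpy and memset is one way to see which form was chosen for a given build.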
Jens,
I think you are missing the point.
So what is needed is that the linker has an option to choose the right library or I need the choice to build my own runtime library and setting the -no_allow ... option. But as I wrote, no library sources in DS-5 installation (maybe I need the ultra license).
Neither works in GCC and DS-5/MDK. I think IAR does a better job here.
The linker should never be allowed to modify the library that is linked. Thus you must link with the correct library; this is your responsibility.
That means you will need a library containing code that does not use the FPU registers, if you don't want your code to use FPU registers.
I've used GCC 4.9 for more than a year now, and I have multiple libraries for each Cortex-M architecture and ARM7TDMI.
So I can choose whether or not I want to use the FPU registers on Cortex-M4 by supplying a switch on the command-line for GCC and for the linker.
The linker will choose the correct library, depending on my command-line switches; e.g. if I built a library using -mcpu=cortex-m4 and -mtune=cortex-m3, and the switches I provide for the linker are exactly that, then this library will be linked.
I'm not sure whether launchpad's GCC is built this way these days; I think it was earlier.
If you want to solve the problem here and now, you could write your own substitution for memset and memcpy. Make sure the link order is correct, so your library will be searched first. One problem with this solution is that you might not catch all the library functions that use FPU registers right away.
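A rough sketch of such a substitution for memcpy, using only integer loads and stores. The name `nofpu_memcpy` is illustrative; an actual replacement would have to be named memcpy and sit in an object or library that the linker searches before the C library. The sketch assumes the file is compiled so that the compiler does not use FPU registers for integer data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative replacement for memcpy. The volatile accesses keep the
 * compiler from recognising the loops and calling the library memcpy(). */
void *nofpu_memcpy(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    assert((dst != NULL && src != NULL) || n == 0);

    /* Copy whole words only when both pointers are 4-byte aligned. */
    if ((((uintptr_t)d | (uintptr_t)s) & 3u) == 0) {
        volatile uint32_t *dw = (volatile uint32_t *)d;
        const volatile uint32_t *sw = (const volatile uint32_t *)s;
        for (; n >= 4; n -= 4)
            *dw++ = *sw++;
        d = (volatile unsigned char *)dw;
        s = (const volatile unsigned char *)sw;
    }

    /* Remaining bytes, or the whole block if it was unaligned. */
    while (n--)
        *d++ = *s++;
    return dst;
}
```

Usage is the same as for memcpy, for example `nofpu_memcpy(&copy, &original, sizeof original);`. This is slower than the optimized library routine for large blocks, which is the trade-off discussed elsewhere in this thread.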
Could be a solution. But isn't.
For one, GCC does not have that or similar option (at least 4.9.3 2014 Q4 from launchpad), nor does DS-5 (professional) come with library sources.
And the problem is not no-FPU or FPU. The problem is, that the user decides where to use the FPU and where not.
The armcc option says it all: FPU registers only for FPU operation.
If I compile a C program with float operations, the compiler shall use of course the FPU if activated per command line.
And of course mixing soft-FPU code and hard-FPU code is difficult and rather seldom. But why not? At least as long as no higher-level functions like sin() are used.
In my opinion, either the linker should be able to link FPU compiled files against a non-FPU library _or_ provide FPU clean standard libraries.
I guess the FP instructions are inside the C runtime library, not the compiler-generated code. Is that correct?
Then the solution will be to use -no_allow_fpreg_for_nonfpdata while compiling the runtime library (if using GCC, that is).
I use GCC myself, and when building my toolchain, one of the steps is to generate multiple libraries; one for each CPU type; for instance, there's no Cortex-M0 with FPU, thus it's not necessary to generate a runtime library with FPU support.