Hello to all,
I am trying to port an old Arm project, written in C and partly in assembly, that was compiled with armcc version 4.1. I should mention that I am not an expert on this; I am still learning and reading the Arm documentation.
The new compiler selected is armclang v6.16. For the assembly code I am still using armasm.
I have read the migration document describing the differences between the old Arm Compiler 5 and the new Arm Compiler 6, and adapted the compiler/linker flags accordingly.
The build itself is OK, but linking fails with the error:
Error: L6291E: Cannot assign Fixed Execution Region ... Load Address:0x000179b4. Load Address must be greater than or equal to next available Load Address:0x00017f7c.
From the map file, and from running fromelf on individual object files, I can see that the armclang output is bigger than the output of the older compiler. I ran some basic tests compiling empty functions and found the following:
- Old compiler: an empty function, void or non-void, is 2 bytes in size according to both fromelf on the object file and the map file (Memory Map of the image).
- New compiler:
  - an empty void function is 2 bytes
  - an empty non-void function is 4 bytes (it is strange that there are 2 additional bytes).
For code optimization I am using the flag -Oz, as we are not using LTO in the project.
For the moment I am still trying to find the reason, and also checking whether I missed any specific compiler flags.
Thank you for any useful information that could help me solve this mystery, Iknerf
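One rough way to read the L6291E message above is to subtract the two addresses it reports. The sketch below assumes the message means that the content placed before the fixed region already extends to the "next available" load address; the function name loadRegionOvershoot is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many bytes the preceding content runs past the
 * load address requested for the fixed execution region.
 */
uint32_t loadRegionOvershoot(uint32_t requestedLoadAddr, uint32_t nextAvailableLoadAddr)
{
    /* L6291E is only reported when the next available address is higher. */
    assert(nextAvailableLoadAddr >= requestedLoadAddr);
    return nextAvailableLoadAddr - requestedLoadAddr;
}
```

With the addresses from the error, loadRegionOvershoot(0x000179b4, 0x00017f7c) gives 0x5C8, that is 1480 bytes. Under the assumption above, that is about how much code would have to be saved, or how far the fixed region would have to move, for the link to succeed.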
Hello Ronan,
It is clear that the memory sections are defined in the scatter file, but thanks for mentioning it; I will check it again. I don't have much experience with linker files. Is there an option to manually set the code section size, or something similar?
I forgot to mention that I am using an Arm Cortex-M0+ and compiling for the Thumb instruction set with the following cflags:
-v -Oz -g -Wno=1581 -Wno=1296 -Wno=66 -diag_style=gnu --target=arm-arm-none-eabi -mcpu=cortex-m0plus -fshort-enums -fshort-wchar
What is puzzling me is that the code built with the new compiler is bigger and does not fit into the specified memory sections.
For example, I have an empty non-void function:
uint8_t getHwVersion(void)
{
}
Old compiler armcc creates this assembly (2 bytes):
0x00000000: 4770 pG BX lr
The new armclang compiler creates this (4 bytes):
0x00000000: 2000 . MOVS r0,#0
0x00000002: 4770 pG BX lr
It is not yet clear to me why the new armclang compiler adds this MOVS r0, #0, as I don't pass any parameters. I assume this is something related to the stack. I am now trying to understand this and, if possible, to use a compiler flag to remove it, as the image is getting much bigger because of these additional 2 bytes per function.
I also saw that the new compiler places each function in a separate section, which the old armcc did not do. If I use the flags -fno-data-sections -std=c99 -fno-function-sections, the image is even bigger.
Best regards, Iknerf
I can replicate the MOV r0, #0 addition with Arm Compiler 6, even with the latest 6.20 version. I see that armcc (I have 5.06 installed) does not generate it. I don't know the C specification well enough to say whether what the compiler is doing is valid, but the function as defined is expected to return a value, which is passed in r0. The instruction seems to be generated at all optimization levels.
Note that the compiler does warn about this code:
foo.c:1:25: warning: non-void function does not return a value [-Wreturn-type]
Is such code present in your application? Perhaps if you can explain what the code does, there may be a better way to do it? Is changing to a void return not viable?
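For a size test that is well defined on both compilers, the stub can return an explicit value. In C, reaching the closing brace of a non-void function leads to undefined behavior if the caller uses the result, so a compiler may either leave r0 untouched or load something into it. A minimal sketch, assuming a placeholder value of 0; the names getHwVersionStub, getHwVersionNoResult, checkStub and HW_VERSION_PLACEHOLDER are hypothetical and not from the project:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholder; the real function reads the version from hardware. */
#define HW_VERSION_PLACEHOLDER 0u

/* Non-void stub with an explicit return: the result must be placed in r0. */
uint8_t getHwVersionStub(void)
{
    return (uint8_t)HW_VERSION_PLACEHOLDER;
}

/* Void variant: no result, so nothing has to be loaded before returning. */
void getHwVersionNoResult(void)
{
}

/* Host-side sanity check that the stub returns the placeholder value. */
void checkStub(void)
{
    assert(getHwVersionStub() == HW_VERSION_PLACEHOLDER);
}
```

With the explicit return, any compiler has to load r0 before BX lr, so these two stubs should give a fairer size comparison than a function whose body is commented out.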
Thanks for replying so fast, really appreciate it.
To answer your questions: no, this code is not present in the application. The function getHwVersion() is not empty; I just commented out the body to test the compiler. The main goal is to migrate the existing project from the old armcc to the new armclang, but armclang makes the image bigger, so it no longer fits into the current memory layout, and I want to understand what is happening.
I am still checking the options, maybe to increase memory a little if possible. At the moment it seems I am stuck with the current configuration.
If I uncomment the body of getHwVersion(), I get a size of 36 bytes with the old compiler and 48 bytes with the new one, so there are 12 additional bytes that are not yet clear to me.
The body contains 3 macros that write/read registers (set clock enable, read value from register, clear clock enable). I am trying to decipher the macros so that I can post some example code here.
Regarding the "MOV r0, #0 addition": as far as I can see, this happens for all non-void functions.
Here is an example of the compiled function getHwVersion where I left only the one line that disables the HW clocks:
#define REG_SYS_CLK (0x40025044UL)
if (DisableHwClocks & DISABLE_CLK_MASK)
REG_SYS_CLK |= (1UL << 6);
OLD:
getHwVersion
0x0000000a: 4905 .I LDR r1,[pc,#20] ; [disableHwClocks = 0x20] = 0
0x0000000c: 6809 .h LDR r1,[r1,#0]
0x0000000e: 0589 .. LSLS r1,r1,#22
0x00000010: d504 .. BPL {pc}+0xc ; 0x1c
0x00000012: 4904 .I LDR r1,[pc,#16] ; [0x24] = 0x40025040
0x00000014: 684a Jh LDR r2,[r1,#4]
0x00000016: 154b K. ASRS r3,r1,#21
0x00000018: 431a .C ORRS r2,r2,r3
0x0000001a: 604a J` STR r2,[r1,#4]
0x0000001c: 4770 pG BX lr
$d
0x00000020: 00000000 .... DCD 0 ; disableHwClocks
0x00000024: 40025040 @P.@ DCD 1073893440
NEW:
getHwVersion
0x00000000: 48ff .H LDR r0,__arm_cp.5_0 ; = 0
0x00000002: 7840 @x LDRB r0,[r0,#1]
0x00000004: 0780 .. LSLS r0,r0,#30
0x00000006: d505 .. BPL {pc}+0xe ; 0x14
0x00000008: 2001 . MOVS r0,#1
0x0000000a: 0240 @. LSLS r0,r0,#9
0x0000000c: 49ff .I LDR r1,__arm_cp.5_1 ; = 0x40025044
0x0000000e: 680a .h LDR r2,[r1,#0]
0x00000010: 4302 .C ORRS r2,r2,r0
0x00000012: 600a .` STR r2,[r1,#0]
0x00000014: 2000 . MOVS r0,#0
0x00000016: 4770 pG BX lr
$d.6
__arm_cp.5_0
0x00000018: 00000000 .... DCD 0 ; disableHwClocks
As you can see, the new compiler produces bigger code. I thought the new compiler would produce smaller code. I really hope I am missing some compiler flag.
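Counting instructions in the two listings above, the old output has 10 Thumb instructions (20 bytes) and the new one has 12 (24 bytes), not counting the literal pools. The two extra instructions appear to be the MOVS r0,#0 before the return and the MOVS/LSLS pair that builds the bit mask, where armcc derives the mask from the base address with a single ASRS. For experimenting with the source form, here is a host-testable sketch of the conditional read-modify-write. The function name disableSysClk and the value of DISABLE_CLK_MASK are assumptions, since the thread does not show the real macros; the register is passed as a pointer so the code can also run off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG_SYS_CLK_ADDR  0x40025044UL          /* register address from the thread */
#define CLK_DISABLE_BIT   (UINT32_C(1) << 6)    /* bit written in the snippet */
#define DISABLE_CLK_MASK  (UINT32_C(1) << 9)    /* hypothetical flag bit, value assumed */

/*
 * Sets the clock-disable bit in *reg when the flag word requests it.
 * On target the call would look like:
 *     disableSysClk((volatile uint32_t *)REG_SYS_CLK_ADDR, flags);
 */
void disableSysClk(volatile uint32_t *reg, uint32_t disableHwClocks)
{
    assert(reg != NULL);
    if (disableHwClocks & DISABLE_CLK_MASK) {
        /* volatile access: one load, one OR, one store */
        *reg |= CLK_DISABLE_BIT;
    }
}
```

Because the function returns void, there is no result to place in r0, which is one way to check how much of the size difference comes from the return value alone.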
Thanks for any clue on this, Frenk
Hi Frenk,
I am not sure what advice to give here. Both implementations look functionally correct, but the optimizers in the two compilers transform the code in subtly different ways.
Let me see if I can find anything to help.
Hi again,
A colleague suggested this article in case you haven't seen it already: https://developer.arm.com/documentation/ka002170/
Regards, Ronan
Hi,
Thanks for this link. I had already read it and applied all the optimizations it lists for C, but they don't help. The only things I have not done are using -Omax and LTO, as we don't support LTO.
I will now focus on the scatter file to see whether I can free up more memory for the code.
I also have an idea to take this disassembly and pass it to armasm; maybe it can do further optimizations.
Best regards and thanks again for all the support, Iknerf
Hello,
From my side, I could not achieve any additional optimization. As far as I can see, armclang produces around 20% bigger code than the old armcc compiler. So I must now think about how to solve this issue; the last resort is to stick with the good old compiler.
Best regards, Frenk
In addition, I also see other errors that did not occur before. For example, in the project one variable is defined in 2 different sections and in 2 different files:
file1.c:
The scatter file has both sections defined.
If I compile with the old compiler everything is fine, but with the new compiler I get the error:
Error: L6200E: Symbol gTableLen multiply defined (by SysTest.o and PatchTable.o).
Should I maybe create a new post for this?
With regards, Frenk
That seems like a reasonable linker error. I'm surprised the older tools do not catch it?
So this is actually not possible? Or is there any way to do it like that?
I thought that because they are in different sections, it's something like having:
void func(void) {
    int dummy = 0;
    if (<condition>) {
        // and here we are now using this local dummy and there is no problem with build.
    }
}
No, the symbol will exist across the entire image.
Does it link if you declare them as static?
Hi Ronan,
Sadly it's not working, but I still have a few ideas to try.
Did you get any additional news regarding code size optimization with armclang?
Thanks for your time and support, Iknerf
I removed some unused code, just to at least build and prepare the image. I hope that later we will find a way to decrease the code size with armclang.
I did not change the scatter file, and when the build completes, a post-processing step runs to verify that everything is OK. But apparently the linker puts the XO section in the wrong place. As far as I can see, the first 2 sections are placed correctly and then the 3rd one is placed too far away. I tried to set the address manually using ABSOLUTE in the scatter file and then got the following error:
Error: L6244E: Exec region ER_FLASH_LIB address (0x00200228) not aligned on a 256 byte boundary.
I then added the ALIGN 4 parameter, but the error is still there. Sadly, as I found out, I can't use the --no_legacyalign parameter, as it seems to be deprecated.
I am now going through the armlink user guide in case I missed something. I really hope that the new armlink does not have this built in for the XO section. The scatter file is as follows:
FLASH 0x00200000 0x30000
{
    FLASH_FUNC_TABLE +0 FIXED
    {
        *(FLASH_FUNC_TABLE_LENGTH, +First)
        *(FLASH_FUNC_TABLE_PAYLOAD)
        *(FLASH_FUNC_TABLE_CRC, +Last)
    }
    ER_FLASH_PATCH +0 FIXED
    {
        *(FLASH_PATCH_LENGTH, +First)
        *(FLASH_PATCH_PAYLOAD)
        *(FLASH_PATCH_CRC, +Last)
    }
    FLASH_CODE +0 FIXED
    {
        *(FLASH_CODE_LENGTH, +First)
        .ANY (+RO)
        *(FLASH_CODE_CRC, +Last)
    }
...
Thanks for any advice, Iknerf
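The L6244E message above asks for 256-byte alignment, and 0x00200228 is already a multiple of 4, which may explain why adding ALIGN 4 changes nothing. A small C sketch of the arithmetic; the helper names isAligned and alignUp are made up for illustration and are not part of any Arm tool:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when addr is a multiple of alignment (alignment must be a power of two). */
int isAligned(uint32_t addr, uint32_t alignment)
{
    assert(alignment != 0u && (alignment & (alignment - 1u)) == 0u);
    return (addr & (alignment - 1u)) == 0u;
}

/* Rounds addr up to the next multiple of alignment (a power of two). */
uint32_t alignUp(uint32_t addr, uint32_t alignment)
{
    assert(alignment != 0u && (alignment & (alignment - 1u)) == 0u);
    return (addr + alignment - 1u) & ~(alignment - 1u);
}
```

For the address in the error, isAligned(0x00200228, 4) is true and isAligned(0x00200228, 256) is false; alignUp(0x00200228, 256) gives 0x00200300, the first address that would satisfy the 256-byte requirement.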
Hi Frenk
This is probably beyond the scope of what I can provide here - we would likely need reproducible examples to advise further.
Can you raise an official support case from the support menu above?
At the moment the armlink alignment issue is solved by using the deprecated flag --legacyalign together with this change in the scatter file: ER_FLASH_LIB AlignExpr(+0, 4) FIXED