Hi
I am working with a Microsemi Cortex-M3 ARM with uVision and ULINK2 (MDK 5.12), and I am getting a PRECISERR bus fault from a memory read during initialization by the scatterloader decompress routine. It is definitely trying to read a bad address. This code is created by the linker to initialize my memory/variables, so it's not something I have direct control over.
Here is the goofy thing: I found that I could get past the hard fault one time if, after hitting it, I got out of the debugger and right back in. Then I could run as long as I didn't try to restart by clicking the reset button in uVision.
I don't understand why this would make any difference, but it does, and it allowed me to work for the past few weeks, although it's slow and awkward.
This weekend my luck ran out and now I get that hard fault every single time I try to run. My trick no longer works. All I did just before this started was add 1 line of code, recompile and reload.
Here is an interesting clue. Out of desperation I recompiled my entire project and reflashed, and I was able to get to main! All I did was recompile. I did some testing, had to make one minor change to an index, recompiled, and now I am back to getting that hard fault every time. None of my tricks work either.
It's still coming from the same loop in the decompress routine, where it reads an address that is just below my DDR memory: 0x9FFFFEFF
It really sounds like some kind of memory alignment issue or something like that. But the clues do not seem to be related.
Why would adding or removing a few bytes from my code cause the decompress routine to hard fault? Why did recompiling the project allow me to get past the decompress problem once? Why does clicking reset and trying to run the same code cause the decompress to suddenly not work? Why would getting out the debugger and back in allow it to do the decompression correctly one time?
Arrrgg.
I am totally stopped from doing any more development because the decompression code always causes a hard fault from the same address and memory access.
Has anyone experienced anything like this before?
I have flashed it into memory and let it run, and it still gets a hard fault.
I have been doing some testing/investigating and I think I was off in thinking it was the compiler setting. But there is something interesting going on. I wish I could post a screenshot because that would make clear what I have found out.
When the debugger starts it does write to DDR. ESRAM also contains the exact same data in the same order. I believe decompress is copying the data in esram to the DDR. It writes what will be my initialized data. It writes to 0xA0000000 (mixed hex and ascii)
0x0 0xD 0x1A SD_NO_ERROR . .. .
My memory map says that at this address should be my error array, which has a first value of SD_NO_ERROR (ascii). The 0 is the offset to the current address to write, the 0xD is the length of the string and the 0x1A is ???. There are 3 hex bytes in front of every string.
I can run to main and this is what the code initialized the DDR memory to:
SD_NO_ERROR .....
This is exactly correct. What the debugger wrote was ignored and overwritten, as it should be.
Now, if I click reset and go to decompress, I see that the 3 bytes before the string are missing from ESRAM, so when it loads the offset to write it gets 0x53 ("S") instead of 0 and a count of 3 bytes instead of 0xD:
D_NO_ERROR
If I let it continue with these bad values it will eventually cause hard fault because it tried to write to address 0x9FFFFB5 which would be below my DDR.
If I reload with the debugger and run to the start of decompress, the esram looks good, with those 3 bytes before each string. I can reset over and over and every time it looks the same and correct.
But if I allow it to write just the first string SD_NO_ERROR to DDR and then do a reset and run to the start of decompress, those 3 bytes are missing from ESRAM and SD_NO_ERROR is at 0x2000000 instead of those 3 bytes and then the string.
Why writing the string to DDR would cause whatever initializes the esram with the string values to be off by 3 bytes is a mystery. The code is in nvm memory, and so should be these initial values that are being put into esram, so writing them out to DDR should have no effect.
Getting out of the debugger and back in fixes this data in esram somehow, because after doing that and running to decompress the esram is correct again.
The debugger is doing something to get that data into the esram correctly, although that all should be done by the code itself.
This is why I thought it might be a compiler switch issue, but now it looks like the act of writing the data to DDR changes what is loaded upon the next reset by 3 bytes.
I'll look back farther into the scatter loader to see how that esram data is being put into esram memory.
OK, I found where the data that was being put into esram is coming from. The scatter loader is called first and it calls scatter loader_null, which believe it or not copies the values in the DDR into the esram!!!!! Holy cow. The DDR has either just been written by the debugger with the correct init values or was overwritten with the initialization values.
On the second run (after reset) and all later runs, the code will continue to copy the final initialized data from DDR into esram and then copy the esram data BACK INTO DDR. The 3 bytes at the beginning of each string are long gone. They are only there on the first run, when the debugger puts them into DDR. All runs of the code after that pick up whatever the DDR was initialized to.
There is no offset, string length or any other information to tell the decompress routine how much to copy and to where. It's just chaos at this point.
So the code is definitely depending upon the debugger to put the proper data into DDR so it can copy that to esram, so that it can be decompressed back into DDR memory minus the 3 bytes of string information.
This cannot possibly be how this ARM processor and code will run operationally. It cannot depend upon a debugger to put that information in DDR memory.
There is some code that should be created by the linker to do this, and not the debugger. Obviously this code is not being generated by the linker, so the proper initialization information is not made available to the decompress routine.
How does one turn that on? I need some help here because I have no idea why this missing code is not being created.
So exactly what does your scatter file look like?
Sounds a bit like some data that in a normal build should be stored in flash is in your build stored in a RAM region (downloaded by the debugger) that will be overwritten when the program starts to run - so a second run will basically have "parts of the 'nonvolatile flash' destroyed".
Exactly - show us your scatter file. The problem is there, or in your hardware.
OK that would make total sense. I am sure I have accidentally put that initial data into the DDR memory space through the use of a wildcard * .
I didn't even think about that. I assumed that the data would be an integrated part of the code and not a separate data item. I just put every RW section into DDR. I assumed that this static data would be RO only. It should be RO data and not RW. I think that would be a misclassification of that type of data.
Do you know what it would be called? It's the .data section, right? The .bss is my read/write variables?
0xA0000000 is where I found the scatterloader loading the initialization data, and that is where the .data sections are grouped together.
FLASH_LOAD 0x00000000 0x00080000 ; load region size_region
{
    ER_RO 0x00000000 0x80000 ; load address = execution address
    {
        *.o (RESET, +First)
        *(InRoot$$Sections)
        ; startup_m2sxxx.o (.text)
        system_m2sxxx.o (.text)
        ; sys_config.o (.text)
        low_level_init.o (.text)
        retarget.o (.text)
        * (+RO)
    }
    ER_RW 0x20000000 UNINIT 0x10000
    {
        startup_m2sxxx.o (STACK)
    }
}

MDDR_RAM 0xA0000000 0x10000000
{
    ER_DDR 0xA0000000 UNINIT 0x10000000 ; RW data DDR
    {
        * (+RW +ZI)
        * (HEAP)
    }
}
Base Addr    Size         Type   Attr   Idx    E Section Name   Object
0xa0000000   0x00001c8e   Data   RW     7        .data          accumulationnode.o
0xa0001c8e   0x00000002   Data   RW     535      .data          statemachine.o
0xa0001c90   0x00000004   Data   RW     777      .data          processidle.o
0xa0001c94   0x00000001   Data   RW     1008     .data          timerirq.o
0xa0001c95   0x00000001   PAD
0xa0001c96   0x00000002   Data   RW     1069     .data          dummy.o
0xa0001c98   0x000001e4   Data   RW     1204     .data          diskio.o
0xa0001e7c   0x00000006   Data   RW     1365     .data          ff.o
0xa0001e82   0x00000002   PAD
0xa0001e84   0x00000055   Data   RW     1435     .data          uart.o
0xa0001ed9   0x00000003   PAD
0xa0001edc   0x0000001c   Data   RW     1603     .data          system_m2sxxx.o
0xa0001ef8   0x00000004   Data   RW     1672     .data          retarget.o
0xa0001efc   0x00000008   Data   RW     1888     .data          mss_can.o
0xa0001f04   0x00000018   Data   RW     2047     .data          mss_pdma.o
0xa0001f1c   0x0000002c   Data   RW     2121     .data          mss_comblk.o
0xa0001f48   0x00004d34   Zero   RW     5        .bss           accumulationnode.o
0xa0006c7c   0x0000000c   Zero   RW     1203     .bss           diskio.o
0xa0006c88   0x000002d0   Zero   RW     1504     .bss           fault_handler.o
0xa0006f58   0x00000018   Zero   RW     1950     .bss           mss_hpdma.o
Moving the .data sections to nvm solved the problem. I can run to main every time now. Decompress does not even get called now. I could not see the forest because of the trees.
Thank you very much for your help in leading me to the answer.
Steve
Isn't 0xA0000000 the base of your DDR? Why the heck are you copying stuff there, or allowing the areas to conflict?
Stuff that gets unpacked by the linker/loader needs to reside INSIDE the ROM/FLASH LOAD REGION braces.
If you are remapping 0 <-> 0xA0000000 you need to carve that space out of the linker's view so it doesn't put data over the top of it.
For this build I am not remapping. We needed to get this out by our deadline this Friday, so I created a smaller, non-remapped nvm version of my code. So I am only using the DDR for my variables and heap right now. Just a huge hunk of RAM.
Once that delivery is done, I will go back and work on the remapped version, and you're right, my code and the variables will both be in DDR memory. I have a different scatter file for that mapping.
In fixing this initialization problem by moving the .data sections to nvm my mallocs will no longer allocate memory. They were working ok before making this change.
Do you know if there is anything in the .data section that would affect malloc?
A LOAD REGION is like a box of furniture from IKEA, all the parts need to fit inside the box for it to be shipped to you, this would be the "linkers" job. The "loaders" job is then to unpack the parts and assemble the pieces where you want the final constructed piece of furniture to end up. The separate parts cannot fit in the same time/space as each other without distorting the fabric of the universe.
FLASH_LOAD 0x00000000 0x00080000 ; load region size_region
{
    ER_RO 0x00000000 0x80000 ; load address = execution address
    {
        *.o (RESET, +First)
        *(InRoot$$Sections)
        ; startup_m2sxxx.o (.text)
        system_m2sxxx.o (.text)
        ; sys_config.o (.text)
        low_level_init.o (.text)
        retarget.o (.text)
        * (+RO)
    }
    ER_RW 0x20000000 UNINIT 0x10000
    {
        startup_m2sxxx.o (STACK)
    }
    ; ER_DDR 0xA0080000 0xFF80000 ; RW data DDR (if FLASH gets copied into DDR space)
    ER_DDR 0xA0000000 0x10000000 ; RW data DDR (if DDR doesn't clash with code space)
    {
        * (+RW +ZI)
        * (HEAP)
    }
}
Thanks. This is what my remap scatter file looks like.
This problem with initialized variables is still a problem, unfortunately. I know that moving the .data sections into nvm solves the problem of having the initial values in nonvolatile memory. It does work, and it does not depend upon the debugger to load the values because they are part of the nvm now.
But it seems that the linker is also placing the variables' addresses in nvm, when they must be in a RW area like esram or DDR. The linker is moving all variable addresses to nvm, even if the variables are not initialized at boot time.
I found this out when, very early on, one of my routines attempted to write to a variable and got a hard fault; when I looked at the map, the variable was located in the flash memory. It was part of the .data section. In fact all variables are part of the .data section.
But if I move the .data section back to DDR, I am back to the original problem, because the initialized data is put back into DDR as well. I cannot see a way to separate the address of a variable from the values used to initialize it.
The values should be in nvm, to be accessed by the scatter loader and copied into the DDR where the variable is located. The .data section seems to apply to both, which is a problem and does not make sense either.
There is a .constdata section, but it does not include the initialized variables, which are in .data.
How am I to get the initial data into nvm and the variables into RW DDR?
This is my current scatter file
FLASH_LOAD 0x00000000 0x00080000 ; load region size_region
{
    ER_RO 0x00000000 0x40000 ; load address = execution address
    {
        *.o (RESET, +First)
        *(InRoot$$Sections)
        * (+RO)
        * (.data)
    }
    ER_RW 0x20000000 UNINIT 0x10000
    {
        startup_m2sxxx.o (STACK)
    }
}

MDDR_RAM 0xA0000000 0x10000000
{
    ER_DDR 0xA0000000 UNINIT 0x10000000 ; RW data DDR
    {
        * (+RW +ZI)
        * (HEAP)
    }
}
I'll have to ponder, but if your DDR is properly initialized prior to calling __main, the C runtime / scatter loader should be able to initialize the data there.
I don't understand why your platform is so broken that this isn't being done already. Or how getting this to work properly isn't going to solve both your Friday issue, and the shadow/remap issue.
Why do you still have MDDR_RAM? You need to package ALL of the released image components in the FLASH.
In case the IKEA concept isn't working for you: www.keil.com/.../armlink_pge1362075661087.htm
Yes, I do understand this already. Thank you.
The problem that I still have is that my initialized const char* strings and my initialized global variables are all in the .data section.
If I locate the .data section in the nvm, then my global variables are not writable and I get a hard fault. If I put the .data section into DDR memory, then the initial values for my const char* strings are read from DDR (or at least that is attempted), but the initial data will not be there unless I run the debugger, which loads it there. That will not work operationally. The initial data must be in nvm, but that drags all of my global variables along with it.
I do not know how to get that initial data into nvm without having it take the initialized globals too.
from wiki: The data area contains global and static variables used by the program that are explicitly initialized with a non-zero (or non-NULL) value. This segment can be further classified into a read-only area and read-write area. For instance, the string defined by char s[] = "hello world" in C and a C statement like int debug=1 outside the "main" would be stored in initialized read-write area. And a C statement like const char* string = "hello world" makes the string literal "hello world" to be stored in initialized read-only area and the character pointer variable string in initialized read-write area.
This is an example of one of the structures that is causing the problem. It should be in nvm as a constant value. Instead it is being located in the DDR along with its initialization information if I put the .data section in DDR so that I can use my global variables.
If I put .data in nvm then this works fine, but none of my globals are writable.
If I can get these into a read-only section, then I can locate them in nvm. I would think there is some directive or something I can add to the declaration to force it to be in a specific section. Maybe.
const char* ProcessProfile_errors_Tran[END_PROCESSPROFILE_ERROR][40] =
{
    {"MESSAGE_SUCCESS"},
    {"FAILED_TO_PROCESS"},
    {"FAILED_TO_SEND_MESSAGE"},
    {"FAILED_TO_RECEIVE_MESSAGE"},
    {"FAILED_INVALID_MESSAGE"},
    {"FAILED_TO_SYNC"},
    {"FAILED_NO_ACTUATOR"},
    {"FAILED_ACTUATOR_SETUP"},
    {"TEST_USER_TERMINATED"},
    {"FAILED_TO_FIND_NODE_INDEX"}
};
This will force it
__attribute__ ((section ("INITDATA")))