Hello,
I run a baremetal program on one core of the Cortex-A9. The First Stage Bootloader (FSBL, Pre-Loader) and the Second Stage Bootloader (SSBL) run without problems. When the baremetal application (axf file) is started in the debugger, it executes without a problem. However, when the baremetal application (bin file) is executed directly on the machine, it cannot be started and the following error is thrown:
data abort
MAYBE you should read doc/README.arm-unaligned-accesses
However, when the MMU is deactivated, both the bin file and the axf file execute perfectly.
This leads me to the following questions:
Why can the program be started in the debugger, yet cause a data abort when the binary is executed on the machine?
Why could the MMU cause a data abort when the bin file is started on the machine?
I would be grateful for any hints and tips. Many thanks in advance.
Deas
Hello Deas,
Is the MMU of your bare metal application initialized only by the alt_pt_init function?
You should check the symbol in your linker script (usually mmu_tbl) and check if there are other assembly files (such as translation_table.S) which also initialize the MMU table.
The alt_pt_init seems to change access permissions to privileged... which is mostly reserved for OS kernels, and trusted applications...
You get an alignment exception, which, when the MMU is enabled, occurs only for accesses to device regions. So you should check that your accesses to the region starting at 0x40000000 are all 32-bit word aligned.
But I did not see the piece of assembly code where the MMU is enabled... it should be in the code called right away from the base address of the vector table (entry point for ARMv7). For example, you may have a routine called _boot, which includes enabling the MMU and caches in the CPU.
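A minimal sketch of such an alignment check in C. The base address is the one mentioned above; the region size and all helper names are assumptions for illustration and must be adapted to the actual memory map.

```c
#include <stdbool.h>
#include <stdint.h>

/* Base address taken from the post above. The size is an ASSUMED
 * placeholder and must be replaced with the real window size. */
#define DEVICE_REGION_BASE 0x40000000u
#define DEVICE_REGION_SIZE 0x40000000u /* assumption, not from the thread */

/* True if addr lies inside the assumed device window.
 * Unsigned wrap-around makes addresses below the base fail the test. */
static inline bool in_device_region(uint32_t addr)
{
    return (uint32_t)(addr - DEVICE_REGION_BASE) < DEVICE_REGION_SIZE;
}

/* True if addr is a multiple of 4, i.e. 32-bit word aligned. */
static inline bool is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0u;
}

/* True if a 32-bit access at addr cannot trip the alignment rule
 * described above: it is either outside the device window or aligned. */
static inline bool word_access_ok(uint32_t addr)
{
    return !in_device_region(addr) || is_word_aligned(addr);
}
```

Such a helper could be called in a debug build before each register access, for example on the address of a packed struct member that is mapped into the device window.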
Good luck.
Florian
Hi Florian,
Many thanks for your detailed answer. I use the ARM DS-Ultimate toolchain, so a linker script is not needed. I can only invoke armlink using:
armlink --cpu=Cortex-A9 --scatter="...\scatter_SDRAM.scat" --info=sizes --entry=alt_interrupt_vector
Some further context:
The reason I need to use the MMU is so that I can make use of the cache. I am quite sure that the MMU is activated and populated correctly, and that the cache works, because both work fine in the debugger. Furthermore, the above-mentioned function alt_pt_init() is provided by Intel, so I presume it is correct.
The problem with the unaligned access only occurs when the application.bin file is started instead of the axf file. The bin file is created with the following command:
fromelf --bincombined application.axf --output=application.bin
When the MMU is not used, i.e. when the above-mentioned function alt_pt_init() is not called, no alignment error occurs.
Question:
Why is it possible that I get an alignment error when the program is started directly on the machine, but no alignment error when the program is run in the debugger?
I would be very grateful for further tips. All the best
I recommend setting up your own vector table before anything else. It seems that the bootloader is still active.
Or again: Who outputs the error? Your code? The bootloader? If you also print the DFAR/IFAR registers, you can get closer to the real cause. And again: The registers seem to be weird. FP is in the lower memory, SP in the higher. LR in the lower, but PC in the higher.
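A rough sketch of how the fault registers could be read from an abort handler. It assumes a compiler that accepts GCC-style inline assembly (for example armclang or arm-none-eabi-gcc) and code running in a privileged mode. The struct, the macro, and the ON_ARMV7A_TARGET switch are hypothetical names introduced here; only the CP15 encodings come from the ARMv7-A architecture.

```c
#include <stdint.h>

/* Snapshot of the ARMv7-A fault registers. */
struct fault_regs {
    uint32_t dfsr; /* Data Fault Status Register         */
    uint32_t dfar; /* Data Fault Address Register        */
    uint32_t ifsr; /* Instruction Fault Status Register  */
    uint32_t ifar; /* Instruction Fault Address Register */
};

/* HYPOTHETICAL build switch: define ON_ARMV7A_TARGET only when compiling
 * for the target. Without it, a stub that yields 0 is compiled so the
 * file also builds on a host machine. */
#if defined(ON_ARMV7A_TARGET)
#define READ_CP15(crn, crm, op2, out) \
    __asm__ volatile("mrc p15, 0, %0, " #crn ", " #crm ", " #op2 : "=r"(out))
#else
#define READ_CP15(crn, crm, op2, out) ((out) = 0u) /* host stub */
#endif

/* Read DFSR, DFAR, IFSR and IFAR through coprocessor 15. */
static inline struct fault_regs read_fault_regs(void)
{
    struct fault_regs r;
    READ_CP15(c5, c0, 0, r.dfsr); /* DFSR: c5, c0, 0 */
    READ_CP15(c6, c0, 0, r.dfar); /* DFAR: c6, c0, 0 */
    READ_CP15(c5, c0, 1, r.ifsr); /* IFSR: c5, c0, 1 */
    READ_CP15(c6, c0, 2, r.ifar); /* IFAR: c6, c0, 2 */
    return r;
}
```

Calling read_fault_regs() as the first thing in your own data abort handler and printing the four values over whatever UART routine is available would show which address faulted, independent of the bootloader's message.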
Hello Bastian,
many thanks for your answer.
The error (Data abort, maybe you should read doc/README.arm-unaligned-accesses) is thrown by the SSBL. I wrote a U-Boot script; in its last step, the application.bin is loaded and started. This command causes the above-mentioned error message and causes the SSBL to restart. Therefore the application.bin is never started.
I am sorry for asking something that must be obvious. SP = Stack Pointer? PC = Program Counter? But what do LR and FP denote?
The addresses of the memory registers should be OK; they are based on the Intel Cyclone V memory map.
I would like to reiterate, that the program runs in the debugger (with or without the function alt_pt_init(void)).
However if the function alt_pt_init(void) is active the SSBL cannot start the compiled application.bin. With the function alt_pt_init(void) not being active the SSBL can start the compiled application.bin.
Thanks again for your help, and I would appreciate further tips.
All the best
You must pay attention to the fact that execution under the debugger is different from execution without the debugger. You cannot assume that, because execution under the debugger is OK, your code is 100% OK under all conditions.
Debugger execution is an invasive debug method. The ARMv7-A Architecture Reference Manual (ARM ARM) says:
<quote>
Invasive debug authentication controls whether a debug event:
- causes the processor to enter Debug state
- generates a debug exception
- is ignored
- becomes pending.
</quote>
So the debugger execution perhaps hides the unaligned access failure, or the debug event adds an explicit barrier for you. You still need to figure out where the possible unaligned data access is in your application.bin file. Check the offending data abort address further.
I agree with Zhifei and Bastian, you should check the Data Abort detailed information:
- Data Fault Status Register (DFSR) gives you the type of exception. Cortex-A9 exceptions are encoded with the short descriptor format. Refer to the ARMv7 architecture reference manual for the codes of all data abort exceptions.
- Data Fault Address Register (DFAR) gives you the address of the memory access which generated the abort, if this is a synchronous abort type (refer to DFSR).
If it is an MMU fault, it is synchronous and DFAR is valid. If it is a cache parity or DRAM ECC error, it is generally asynchronous, and DFAR is not valid.
With this information, you may be able to find the root cause.
Dear Flongnos,
thanks for your tip. The value that is stored in the DFSR is 0x0000_1C97, resulting in a fault status of 0x17. However, this value is not shown in the table of the short descriptor format encodings (ARM ARMv7-A, Table B3-23); according to the specification, all encodings not shown in the table are reserved.
I checked the different registers in debug mode in the ARM DS-5 Debugger. The DFAR does not seem to be valid.
The fault status of the DFSR does not point to a specific error (as defined in ARM ARMv7-A, Table B3-23). Does this imply that no data fault occurred?
Many thanks for your reply.
Hi,
I think you did not decode it properly. For the short descriptor format, we have the fault status as the concatenation of bit 10 and bits 3~0.
So we have FS = ((DFSR & 0x400) >> 6) | (DFSR & 0xF)
As a result I found that you have FS = 0x7, that is SDTFMT_MMU_L2_TRANSLATION_FAULT
And the DFAR should be valid
Hello Florian,
thanks for your answer! Perhaps I am missing something too obvious here ;), but as far as I can tell, I followed the exact same procedure as you. i.e.:
The content in DFSR is: 0x0000_1C97
*DFSR = 0x0000_1C97 = 0b0000..._0001_1100_1001_0111;
stringing together the bit numbers [10,3,2,1,0] leads to: 0b10111=0x17.
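As a cross-check, the same bit extraction can be sketched in C. The function names are hypothetical, and the name table is deliberately partial: it lists only a handful of short-descriptor encodings from memory of the ARMv7-A manual and should be verified against Table B3-23 before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Short-descriptor format: FS[4] is DFSR bit 10, FS[3:0] are DFSR bits 3..0. */
static inline uint32_t dfsr_fault_status(uint32_t dfsr)
{
    return (((dfsr >> 10) & 0x1u) << 4) | (dfsr & 0xFu);
}

/* WnR (bit 11): 1 if the aborted access was a write. */
static inline uint32_t dfsr_is_write(uint32_t dfsr)
{
    return (dfsr >> 11) & 0x1u;
}

/* PARTIAL lookup of fault status names. Returns NULL for any code that
 * is not in this short list, which includes reserved encodings. */
static inline const char *dfsr_fault_name(uint32_t fs)
{
    switch (fs) {
    case 0x01: return "alignment fault";
    case 0x05: return "translation fault, first level";
    case 0x07: return "translation fault, second level";
    case 0x08: return "synchronous external abort";
    case 0x09: return "domain fault, first level";
    case 0x0B: return "domain fault, second level";
    case 0x0D: return "permission fault, first level";
    case 0x0F: return "permission fault, second level";
    case 0x16: return "asynchronous external abort";
    case 0x18: return "asynchronous parity error on memory access";
    default:   return NULL; /* not in this partial list */
    }
}
```

With the value above, dfsr_fault_status(0x00001C97) yields 0x17, and dfsr_fault_name(0x17) returns NULL, which matches the observation that 0x17 does not appear in the table.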
Could you give me a hint how you got to the fault status of 0x7 instead of 0x17?
Many thanks again.
Sorry Deas, I meant 0x17 indeed. But that code does not make sense, unless the Large Physical Address Extension and the Long Descriptor Format are supported and in use.
Maybe check something related to privilege level...
thanks for the clarification and the further input, I will look into it.
cheers