Hello,
I run a bare-metal program on one core of the Cortex-A9. The First Stage Bootloader (FSBL, Pre-Loader) and the Second Stage Bootloader (SSBL) run without problems. When the bare-metal application (axf file) is started in the debugger, it executes without a problem. However, when the bare-metal application (bin file) is executed directly on the machine, it cannot be started and the following error is thrown:
data abort
MAYBE you should read doc/README.arm-unligned-accesses
However, when the MMU is deactivated, both the bin file and the axf file execute perfectly.
This leads me to the following questions:
Why can the program be started in the debugger, yet cause a data abort when the binary is executed directly on the machine?
Why could the MMU cause a data abort when the bin file is started on the machine?
I would be grateful for any hints and tips. Many thanks in advance.
Deas
Maybe the SSBL also enables the MMU, and this clashes with your setup.
Who outputs the message? It seems to be the SSBL, so the vector table apparently still points to it.
For Cortex-A9, when MMU is disabled, the data accesses are treated as Strongly Ordered. When MMU is enabled, it will check the memory address permission.
If SCTLR.U == 1, unaligned accesses are supported for loads and stores of 16-bit halfwords and 32-bit words. This support only applies to Normal memory; unaligned accesses to Strongly-ordered or Device memory are UNPREDICTABLE.
Setting SCTLR.A = 1 forces an abort on any unaligned access.
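To make these bits concrete, here is a minimal sketch in C that decodes a raw SCTLR value according to the rules above. The helper name unaligned_normal_access_ok is made up for illustration, and the function only models what is stated in this post. On the target, the value would be read with MRC p15, 0, <Rt>, c1, c0, 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* SCTLR bit positions (ARMv7-A). */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_A (1u << 1)   /* alignment check enable */
#define SCTLR_U (1u << 22)  /* unaligned access support */

/* Hypothetical helper: returns true if an unaligned halfword/word access
 * to Normal memory is expected to work for this SCTLR value.
 * It says nothing about Device or Strongly-ordered memory. */
static bool unaligned_normal_access_ok(uint32_t sctlr)
{
    if (sctlr & SCTLR_A) {
        return false;   /* every unaligned access aborts */
    }
    if (!(sctlr & SCTLR_M)) {
        return false;   /* MMU off: data accesses are Strongly-ordered */
    }
    return (sctlr & SCTLR_U) != 0;
}
```

Printing this result right before the first access that faults can show whether the abort is even plausible as an alignment problem in Normal memory.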
Many thanks for your replies. I went through the specs. Enabling the MMU and populating the page table has to happen after the SSBL, so I presume the SSBL is not part of the problem.
After the SSBL has finished, the bare-metal application is started, and it initializes the system. The function alt_pt_init(void) enables the MMU and populates the page table according to the required specifications.
ALT_STATUS_CODE alt_pt_init (void)
{
    /* Populate the page table with sections (1 MiB regions). */
    ALT_MMU_MEM_REGION_t regions[] = {
        /* Memory area: 1 GiB */
        { .va         = (void *) 0x00000000,
          .pa         = (void *) 0x00000000,
          .size       = 0x40000000,
          .access     = ALT_MMU_AP_PRIV_ACCESS,      /* 1: Privileged access only */
          .attributes = ALT_MMU_ATTR_WBA,            /* 0x13: Inner/Outer Write-Back, Write Allocate, Shareability determined by [S] bit */
          .shareable  = ALT_MMU_TTB_S_NON_SHAREABLE, /* 0: Non-Shareable address map */
          .execute    = ALT_MMU_TTB_XN_DISABLE,      /* 0 */
          .security   = ALT_MMU_TTB_NS_SECURE },     /* 0 */
        /* Device area: Everything else */
        { .va         = (void *) 0x40000000,
          .pa         = (void *) 0x40000000,
          .size       = 0xc0000000,
          .access     = ALT_MMU_AP_PRIV_ACCESS,      /* 1: Privileged access only */
          .attributes = ALT_MMU_ATTR_DEVICE_NS,      /* 0x20: Device Non-Shareable */
          .shareable  = ALT_MMU_TTB_S_NON_SHAREABLE, /* 0: Non-Shareable address map */
          .execute    = ALT_MMU_TTB_XN_ENABLE,       /* 1 */
          .security   = ALT_MMU_TTB_NS_SECURE }      /* 0 */
    };
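As a sanity check on the two regions, here is a small sketch in C of the section arithmetic. It assumes a first-level table of 1 MiB sections covering the full 4 GiB address space; the names SECTION_SIZE, section_count and section_index are made up for illustration and are not part of the Intel HWLIB API.

```c
#include <stdint.h>

#define SECTION_SIZE 0x00100000u  /* 1 MiB per section entry */

/* Number of 1 MiB section entries needed to cover `size` bytes. */
static uint32_t section_count(uint32_t size)
{
    return size / SECTION_SIZE;
}

/* Index of the section entry that translates virtual address `va`. */
static uint32_t section_index(uint32_t va)
{
    return va >> 20;
}
```

With the sizes from the table, the memory area needs 1024 entries and the device area 3072, so together they fill all 4096 entries. The load address 0x00100000 falls into entry 1, which belongs to the memory area.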
When the compiled application.axf is run in the debugger, everything works as it should. However, when the binary (application.bin) is started, I get the above message (data abort ...) via the UART interface.
The application.bin is started via UART with the command:
fatload mmc 0:1 0x00100000 application.bin; go 0x00100040;
This procedure works fine as long as the function alt_pt_init() is not executed.
In summary:
Question:
Why does the function alt_pt_init(void) prevent the application.bin from getting executed? Why can the application.axf still be executed in the debugger?
I am quite sure that I am missing something rather substantial, and I would be really glad if somebody could provide me with some more tips.
Many thanks and all the best
(For context: the application is supposed to run on an Altera Cyclone V containing a Cortex-A9.)
Where does your code run? LR points to lower memory, PC to upper.
Hello Bastian,
I wrote a scatter file with these declarations:
SDRAM 0x100000 0x03F00000
{
    VECTORS +0
    {
        *(VECTORS, +FIRST)
    }
    APP_CODE +0
    {
        *(+RO, +RW, +ZI)
    }
    ARM_LIB_STACKHEAP +0 EMPTY (0x3F000000 - ImageLimit(APP_CODE)) ; Application Heap and Stack
    {}
}
The entry point of the application.bin is 0x00100000, which I checked with the command readelf -h application.axf.
Thanks for your support.
I take the liberty of posting the information about the ELF file.
readelf -h application.axf
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x100000
  Start of program headers:          381792 (bytes into file)
  Start of section headers:          381824 (bytes into file)
  Flags:                             0x5000402, Version5 EABI, hard-float ABI, <unknown>
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         1
  Size of section headers:           40 (bytes)
  Number of section headers:         18
  Section header string table index: 17
Hello Deas,
Is the MMU of your bare-metal application initialized only by the alt_pt_init function?
You should check the symbol in your linker script (usually mmu_tbl) and check if there are other assembly files (such as translation_table.S) which also initialize the MMU table.
The alt_pt_init seems to change access permissions to privileged... which is mostly reserved for OS kernels, and trusted applications...
You get an alignment exception. With the MMU enabled, this occurs only for accesses to device regions, so you should check that all your accesses to the region starting at 0x40000000 are 32-bit word aligned.
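A check like this can be sketched in C as follows. It is only an illustration: DEVICE_BASE is taken from the region table posted earlier, the helper names are made up, and the sketch only covers 32-bit word accesses.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEVICE_BASE 0x40000000u  /* start of the device region in the posted table */

/* True if `addr` is aligned to a 32-bit word boundary. */
static bool is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0;
}

/* True if `addr` lies in the region mapped as Device memory. */
static bool is_device_addr(uint32_t addr)
{
    return addr >= DEVICE_BASE;
}

/* True if a 32-bit access to `addr` cannot trip the alignment rule
 * for Device memory: either it is not a device address, or it is aligned. */
static bool word_access_ok(uint32_t addr)
{
    return !is_device_addr(addr) || is_word_aligned(addr);
}
```

Wrapping the register accessors with such a check in a debug build is one way to catch a stray packed-struct or byte-offset access before it reaches the hardware.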
But I did not see the piece of assembly code where it is enabled... it shall be in the piece of code called right away from the base address of the vector table (entry point for ARMv7). For example, you may have a routine called _boot, which includes enabling MMU and caches in the CPU.
Good luck.
Florian
Hi Florian,
Many thanks for your detailed answer. I use the ARM DS-5 Ultimate toolchain, so a linker script is not needed. I can only invoke armlink using:
armlink --cpu=Cortex-A9 --scatter="...\scatter_SDRAM.scat" --info=sizes --entry=alt_interrupt_vector
Some further context:
The reason I need the MMU is so that I can make use of the cache. I am quite sure that the MMU is activated and populated correctly and that the cache works, because both work fine in the debugger. Furthermore, the above-mentioned function alt_pt_init() is provided by Intel, so I presume it is correct.
The problem with the unaligned access only occurs when the application.bin file is started instead of the axf file. The bin file is created with the following command:
fromelf --bincombined application.axf --output=application.bin
When the MMU is not used, i.e. when the above-mentioned function alt_pt_init() is not called, no alignment error occurs.
How is it possible that I get an alignment error when the program is started directly on the machine, but no alignment error when the program is run in the debugger?
I would be very grateful for further tips. All the best
I recommend setting up your own vector table before anything else. It seems that the bootloader is still active.
Or again: who outputs the error? Your code? The bootloader? If you also print the DFAR/IFAR registers, you can get closer to the real cause. And again: the registers look weird. FP is in lower memory, SP in higher; LR is in lower memory, but PC in higher.
Many thanks for your answer.
The Error (Data abort maybe you should read doc/README.arm-unligned-accesses) is thrown by the SSBL. I wrote a U-Boot Script, in the last step, the application.bin is loaded and started - this command causes the above mentioned error-message and causes the SSBL to restart. Therefore the application.bin is never started.
I am sorry for asking something that must be obvious. SP = Stack Pointer? PC = Program Counter? But what do LR and FP stand for?
The addresses of the memory regions should be OK; they are based on the Intel Cyclone V memory map.
I would like to reiterate, that the program runs in the debugger (with or without the function alt_pt_init(void)).
However if the function alt_pt_init(void) is active the SSBL cannot start the compiled application.bin. With the function alt_pt_init(void) not being active the SSBL can start the compiled application.bin.
Thanks again for your help, and I would appreciate further tips.
All the best
You must pay attention to the fact that execution under the debugger differs from execution without it. You cannot assume that because the code runs under the debugger, it is 100% OK under all conditions.
Debugger execution is an invasive debug method. The ARMv7-A Architecture Reference Manual (ARM ARM) says:
<quote>
Invasive debug authentication controls whether a debug event:
• causes the processor to enter Debug state
• generates a debug exception
• is ignored
• becomes pending.
</quote>
So debugger execution perhaps hides the unaligned access failure, or the debug event adds an explicit barrier for you. You still need to figure out where the possible unaligned data access is in your application.bin file. Check the offending data abort address further.
I agree with Zhifei and Bastian, you should check the Data Abort detailed information:
- Data Fault Status Register (DFSR) gives you the type of exception. Cortex A9 exceptions are encoded with the short descriptor format. Refer to ARMv7 architecture reference manual for the codes of all data abort exceptions.
- Data Fault Address Register (DFAR) gives you the address of the memory access which generated the abort, if this is a synchronous abort type (refer to DFSR)
If it is an MMU fault, it is synchronous and DFAR is valid. If it is a cache parity or DRAM ECC error, it is asynchronous basically, and DFAR is not valid.
With this information, you may be able to find the root cause.
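To illustrate the two points above, here is a rough sketch in C that splits a DFSR value into its short-descriptor fields and tells whether the abort is asynchronous. The function names are made up for illustration. The two asynchronous FS encodings (0b10110 asynchronous external abort, 0b11000 asynchronous parity error) are quoted from the short-descriptor table as I remember it, so please verify them against the ARM ARM. On the target, DFSR is read with MRC p15, 0, <Rt>, c5, c0, 0 and DFAR with MRC p15, 0, <Rt>, c6, c0, 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fault status: bit 10 is FS[4], bits 3..0 are FS[3:0]. */
static uint32_t dfsr_fs(uint32_t dfsr)
{
    return (((dfsr >> 10) & 0x1u) << 4) | (dfsr & 0xFu);
}

/* Domain of the faulting access, bits 7..4. */
static uint32_t dfsr_domain(uint32_t dfsr)
{
    return (dfsr >> 4) & 0xFu;
}

/* WnR, bit 11: true if the abort was caused by a write. */
static bool dfsr_is_write(uint32_t dfsr)
{
    return ((dfsr >> 11) & 0x1u) != 0;
}

/* True for the asynchronous encodings; DFAR is then not meaningful. */
static bool dfsr_is_async(uint32_t dfsr)
{
    uint32_t fs = dfsr_fs(dfsr);
    return fs == 0x16u || fs == 0x18u;
}
```

If dfsr_is_async() returns false for a known FS encoding, the DFAR value should point at the access that faulted.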
Dear Flongnos,
Thanks for your tip. The value stored in the DFSR is 0x0000_1C97, which results in a fault status of 0x17. However, this value is not listed in the table of short-descriptor format encodings (ARM ARMv7-A, Table B3-23); according to the specification, all encodings not shown in the table are reserved.
I checked the different registers in debug mode in the ARM DS-5 Debugger. The DFAR does not seem to be valid.
The fault status in the DFSR does not point to a specific error (as defined in ARM ARMv7-A, Table B3-23). Does this imply that no data fault occurred?
Many thanks for your reply.
Hi,
I think you did not decode it properly. For the short-descriptor format, the fault status is the concatenation of bit 10 and bits 3 to 0.
So we have FS = ((DFSR & 0x400) >> 6) | (DFSR & 0xF)
As a result I found that you have FS = 0x7, that is SDTFMT_MMU_L2_TRANSLATION_FAULT
And the DFAR should be valid
Hello Florian,
Thanks for your answer! Perhaps I am missing something too obvious here ;), but as far as I can tell, I followed exactly the same procedure as you, i.e.:
The content in DFSR is: 0x0000_1C97
*DFSR = 0x0000_1C97 = 0b0000..._0001_1100_1001_0111;
Stringing together bits [10, 3, 2, 1, 0] gives 0b10111 = 0x17.
Could you give me a hint how you got a fault status of 0x7 instead of 0x17?
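The same stitching written as a small C check (just a sketch; the function name short_desc_fs is made up for illustration):

```c
#include <stdint.h>

/* Fault status for the short-descriptor format:
 * FS[4] comes from DFSR bit 10, FS[3:0] from DFSR bits 3..0. */
static uint32_t short_desc_fs(uint32_t dfsr)
{
    uint32_t fs4  = (dfsr >> 10) & 0x1u;  /* 0x1C97: bit 10 is 1 */
    uint32_t fs30 = dfsr & 0xFu;          /* 0x1C97: low nibble is 0x7 */
    return (fs4 << 4) | fs30;             /* 0x1C97: 0x10 | 0x7 = 0x17 */
}
```

Calling short_desc_fs(0x00001C97u) returns 0x17, and the mask-and-shift form ((DFSR & 0x400) >> 6) | (DFSR & 0xF) gives the same value.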
Many thanks again.