We ran some simple tests on the STM32F100 Value Line Eval Board:
//------------------------------------------------------------------------------
// Variables
static unsigned char sDstBuf[1024]; // 1 KiB
static unsigned char sSrcBuf[sizeof(sDstBuf)];

printf("Copying words from misaligned src to aligned dst buffer... ");
memset(sDstBuf, 0xcd, sizeof(sDstBuf));
With optimization Level 3 and "Optimize for Time", this takes 120 µs.
With optimization Level 0 it takes 155 µs.
The result is almost the same if memcpy is used: memcpy(sDstBuf, (const void *)0xcd, sizeof(sDstBuf));
It runs into a hard fault if the optimization level is >= 1 and "Optimize for Time" is not set.
I think this is a compiler error.
We ran into this before with MDK 4.60; we now use 4.70A.
Werner
>> You should set the compiler switch "--no_unaligned_access" in Keil for Cortex-M3/M4. (In fact, it would be better if it were set by default already ...) <<
No; that's not what --no_unaligned_access means.
When you use --no_unaligned_access, it tells armcc that it must not access unaligned data with LDR/STR (and so the processor can be set to disallow unaligned access). This means that other, less efficient code sequences will be used to access unaligned data. Accesses to data that is guaranteed to be aligned, such as through an (int *), will still use LDR/STR (or even LDM/STM).
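To make the distinction concrete, here is a rough sketch in plain C of the two kinds of access. It is only an illustration of the idea, not actual armcc output; the function names `load_u32_bytewise` and `load_u32_aligned` are made up for this example, and little-endian byte order is assumed (the usual configuration on Cortex-M).

```c
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian value from an address
 * that may be unaligned. Only byte loads are needed, which is roughly the
 * kind of slower sequence a compiler has to fall back on when a single
 * word load is not allowed for unaligned data. */
uint32_t load_u32_bytewise(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: p has type (uint32_t *), so the compiler may
 * assume it is correctly aligned and use a single word load. */
uint32_t load_u32_aligned(const uint32_t *p)
{
    return *p;
}
```

The second function is only correct if the caller really passes an aligned pointer; the type is a promise to the compiler, not something the flag can relax.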
Using --no_unaligned_access does *not* allow you to cast unaligned values to (int *). Doing that is *undefined behavior*, and the compiler can cause anything it wants to happen, up to and including, but not limited to, causing you to waste a lot of effort tracking down the problem in the hope that you'll learn never to lie to the compiler again.
Hi Scott, thanks for your warning "Using --no_unaligned_access does *not* allow you to cast unaligned values to (int *).". I did not know this before.
But in fact I am not sure whether I understand your warning correctly. To make it clearer, could you perhaps give a short C code snippet where this would typically happen?
I started using "--no_unaligned_access" when I ran into a problem with the call
memcpy( &ac1, &ac2, 6)
which copies 6 bytes from ac2 to ac1. In this case (at Opt Level 1) the compiler used the halfword-aligned LDR/STR instructions described in ARMv7 TRM A3.2.1, which require that the bit SCB->CCR.UNALIGN_TRP is set to 0. I thought this bit was set to 1 by default in the ST4Discovery and Blinky examples, but on checking more thoroughly I found that at some point I had set it to 1 myself in my InitTraps function (that was a bit over-ambitious, I think).
I have now recompiled my complete code without the "--no_unaligned_access" flag, and there really is quite a difference: my code size shrinks from 40192 to 40032 bytes (Opt Level 1). This means the flag affects many parts of my program (I use memcpy with 6 bytes only once or twice in my code, which cannot explain the 160-byte reduction). So I think I will now remove the "--no_unaligned_access" flag and also the setting of SCB->CCR.UNALIGN_TRP.
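For reference, the trap bit mentioned above is bit 3 of SCB->CCR. A rough sketch of setting and clearing it follows. To keep it testable off-target, the hypothetical helpers take a pointer to the register instead of touching SCB->CCR directly, and `CCR_UNALIGN_TRP` is a local stand-in for the CMSIS mask `SCB_CCR_UNALIGN_TRP_Msk`.

```c
#include <stdint.h>

/* Local stand-in for the CMSIS mask SCB_CCR_UNALIGN_TRP_Msk (bit 3 of CCR). */
#define CCR_UNALIGN_TRP (1UL << 3)

/* Hypothetical helper: make unaligned word/halfword accesses fault. */
void unaligned_trap_enable(volatile uint32_t *ccr)
{
    *ccr |= CCR_UNALIGN_TRP;
}

/* Hypothetical helper: allow unaligned LDR/STR again (the reset default). */
void unaligned_trap_disable(volatile uint32_t *ccr)
{
    *ccr &= ~CCR_UNALIGN_TRP;
}
```

On the target this would presumably be called as `unaligned_trap_disable(&SCB->CCR);`, or simply left out, since the bit is clear after reset.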
Still, to complete my understanding of the "int*" topic you touched on: a short, typical code example would be very helpful. In the ARM Cortex-M4 TRM, "3.3.2 Load/store timings", the sentence "Unaligned word or halfword loads or stores add penalty cycles..." sounds to me a bit like a warning to the reader NOT to use such unaligned LDR/STR. (In fact I was a bit bewildered why they would allow such unaligned LDR/STR at all. I thought it might be some sort of historical compatibility matter, as it was introduced in ARMv6 and they somehow just wanted to keep it in ARMv7.)
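A minimal sketch of the kind of cast discussed above, assuming a byte buffer that holds a 32-bit field at an odd offset (the names `read_field` and `write_field` and the buffer layout are made up for illustration). The cast shown in the comment is the undefined-behavior case, with or without --no_unaligned_access; the memcpy versions tell the compiler the truth and let it pick a suitable access sequence.

```c
#include <stdint.h>
#include <string.h>

/* Undefined behavior if p is not suitably aligned for uint32_t:
 *
 *     uint32_t v = *(const uint32_t *)p;    // do not do this
 *
 * The cast promises the compiler an alignment that p does not have. */

/* Well-defined for any alignment: copy the bytes into a real uint32_t. */
uint32_t read_field(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Well-defined for any alignment: copy the bytes of v into the buffer. */
void write_field(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

For example, `read_field(buf + 1)` is fine even though `buf + 1` is not word-aligned. With armcc, the `__packed` qualifier is another way to declare that a pointer may be unaligned.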