We ran some simple tests on the STM32F100 Value Line Eval Board:
//------------------------------------------------------------------------------
// Variables
static unsigned char sDstBuf[1024]; // 1KiB
static unsigned char sSrcBuf[sizeof(sDstBuf)];

printf("Copying words from misaligned src to aligned dst buffer... ");
memset(sDstBuf, 0xcd, sizeof(sDstBuf));
With optimization level 3 and "optimize for time" set, this takes 120 µs.
With optimization level 0, it takes 155 µs.
The result is almost the same if memcpy is used:
memcpy(sDstBuf, (const void *)0xcd, sizeof(sDstBuf));
It runs into a hard fault if the optimization level is >= 1 and "optimize for time" is not set.
I think this is a compiler error.
We ran into this before with MDK 4.60; now we use 4.70A.
Werner
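For comparison, here is a minimal sketch of a word copy that does not rely on the compiler's alignment assumptions. The helper names `load_u32_unaligned` and `copy_words_from_unaligned` are hypothetical and do not come from the test code above. The sketch assumes, without confirmation in this thread, that the fault comes from the optimizer choosing an instruction that requires an aligned address once the source has been cast to a word pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from an address with no alignment guarantee.
   memcpy takes byte pointers, so the compiler cannot assume that
   src is word aligned. */
static uint32_t load_u32_unaligned(const void *src)
{
    uint32_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}

/* Copy word_count 32-bit words from a possibly misaligned source
   into an aligned destination buffer. */
static void copy_words_from_unaligned(uint32_t *dst, const void *src,
                                      size_t word_count)
{
    const unsigned char *s = src;
    size_t i;
    for (i = 0; i < word_count; ++i) {
        dst[i] = load_u32_unaligned(s + i * sizeof(uint32_t));
    }
}
```

A call such as `copy_words_from_unaligned(dst, sSrcBuf + 1, n)` would read from an odd address without ever forming a misaligned `uint32_t *`. Whether this is faster or slower than the original loop depends on the compiler and the optimization settings.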
Note that some memory controllers can hide unaligned accesses - they simply stall the core for extra wait states while the memory controller performs multiple memory accesses and then glues together the partial reads.
I hope no chip gets a memory controller that performs such unaligned hiding for any peripheral device, or really bad things can happen - for peripherals, it isn't always safe to do an extra read. An unaligned memory access can also trigger special hardware logic for the neighboring word - potentially signaling that a UART status register has been read and is now "cleared".
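The hazard can be illustrated with a small host-side simulation. Everything here is hypothetical: `sim_periph_t`, the two-register layout, and the read-to-clear status word are invented for illustration and do not describe any real chip. The simulation assumes little-endian byte order when gluing the partial reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical peripheral: word 0 is a data register, word 1 is a
   status register that is cleared whenever it is read. */
typedef struct {
    uint32_t regs[2];
    unsigned status_reads;  /* number of reads that hit the status word */
} sim_periph_t;

/* Aligned 32-bit read, including the read-to-clear side effect. */
static uint32_t sim_read_word(sim_periph_t *p, size_t word_index)
{
    uint32_t value = p->regs[word_index];
    if (word_index == 1) {
        p->status_reads++;
        p->regs[1] = 0;  /* reading the status register clears it */
    }
    return value;
}

/* What a controller that "hides" unaligned accesses would do: split
   the access into aligned reads and glue the bytes together. */
static uint32_t sim_read_unaligned(sim_periph_t *p, size_t byte_offset)
{
    size_t word = byte_offset / 4;
    unsigned shift = (unsigned)(byte_offset % 4) * 8;
    uint32_t lo, hi;
    assert(byte_offset <= 4);  /* stay inside the two simulated words */
    lo = sim_read_word(p, word);
    if (shift == 0) {
        return lo;
    }
    hi = sim_read_word(p, word + 1);  /* extra read of the neighbor word */
    return (lo >> shift) | (hi << (32 - shift));
}
```

In this model, a read at byte offset 2 that is aimed at the upper half of the data register also touches the status word, so its flags are gone before the driver ever looks at them.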
In almost all situations, code should make sure that no unaligned accesses are performed - the main exception is when storing a big array of "data records" where a significant amount of memory can be saved by packing the data.
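As a sketch of that exception, records can be packed into a plain byte array and accessed through memcpy, so the stored data is unaligned but the code never performs a misaligned word access itself. The `record_t` layout and the helper names are hypothetical. A compiler-specific packed attribute would be an alternative, but it is not used here to keep the sketch portable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical data record. With natural alignment a compiler will
   typically pad this struct to 12 bytes. */
typedef struct {
    uint8_t  id;
    uint32_t timestamp;
    uint16_t value;
} record_t;

/* Size of one record in the packed table: 1 + 4 + 2 bytes, no padding. */
enum { PACKED_RECORD_SIZE = 7 };

/* Write one record into the packed byte table at the given index. */
static void record_store(unsigned char *table, size_t index,
                         const record_t *rec)
{
    unsigned char *p;
    assert(table != NULL && rec != NULL);
    p = table + index * PACKED_RECORD_SIZE;
    p[0] = rec->id;
    memcpy(p + 1, &rec->timestamp, sizeof rec->timestamp);
    memcpy(p + 5, &rec->value, sizeof rec->value);
}

/* Read one record back into a naturally aligned struct. */
static record_t record_load(const unsigned char *table, size_t index)
{
    const unsigned char *p;
    record_t rec;
    assert(table != NULL);
    p = table + index * PACKED_RECORD_SIZE;
    rec.id = p[0];
    memcpy(&rec.timestamp, p + 1, sizeof rec.timestamp);
    memcpy(&rec.value, p + 5, sizeof rec.value);
    return rec;
}
```

With this layout, 1000 records need 7000 bytes instead of the roughly 12000 bytes a padded struct array would take, at the cost of a copy on every access.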