We made some simple tests with STM32F100 Value Line Eval
static unsigned char sDstBuf[1024]; // 1 KiB
static unsigned char sSrcBuf[sizeof(sDstBuf)];
printf("Copying words from misaligned src to aligned dst buffer...
memset(sDstBuf, 0xcd, sizeof(sDstBuf));
with optimize Level 3, optimize for time this takes
with optimize Level 0
almost the same if memcpy is used:
memcpy(sDstBuf, (const void *)0xcd, sizeof(sDstBuf));
It runs into a hard fault if the optimize level is >= 1 and
"optimize for time" is not set.
I think this is a compiler error..
We ran into this before with MDK 4.60, now we use 4.70A
You should set the compiler switch "--no_unaligned_access" in
Keil for Cortex-M3/M4. (In fact, it would be better if it were set
by default already ...)
No, that's not what he should do.
What he should do is stop expecting a compiler to magically turn
badly flawed source code into working machine code. The problem here
is not in the compiler ... it's in that source code, because that
code wilfully disregards both the programming language's and the
target architecture's properties.
Who do you think you are directing your late response to?
Thread create time: 15-Apr-2013 07:41 GMT
Last post before yours: 16-Apr-2013 18:37 GMT
(Note that the Keil servers normally never manage to show the correct
time - they have incorrect code and/or configuration for the time
zone in use. They can't even show the same time stamp in the thread list as
for the individual posts.)
But anyway: not a single post in this thread can be called "late
response" unless you consider the time of day when the poster made
the post. And that would be hard to do if you don't know exactly
where in the world they actually live.
And that would be hard to do if you don't know exactly where in
the world they actually live.
Err, no. A typical method of determining elapsed time is
End-Start. If you remembered the local time of day when the original
post appeared and you also know the local time of when a
condescending response appeared, you can then easily calculate the
delay in that condescending response.
What he should do is stop expecting a compiler to magically
turn badly flawed source code into working machine code. The problem
here is not in the compiler ... it's in that source code, because
that code wilfully disregards both the programming language's and
the target architecture's properties.
oh, how often do we see the "what the #$#$ is that, I do not need
to know about stupid hardware I am a software person"
If that is your attitude STAY OUT OF EMBEDDED.
If you did read my post, and pondered a bit, you wouldn't "err,
A large percent of threads on this forum runs for a number of
Look at the time stamps of first/last post here - this is a young
thread so _no_ answer here can be considered a late response.
Which was why I noted that the only way you could debate 'late'
was if someone posted it late at night. But that requires that you
know what time zone the person lives in.
You realize this is a web forum? Not a chat program where you get
a 'beep' or something in the mobile phone, and then instantly writes
Who do you think you are directing your late response
And who do you think you are? Let's see: so far all you've
exhibited here is
a) a knack for totally missing the topic of discussion in every
single one of your three, well, "contributions",
b) a total lack of understanding of the medium you're using, and
c) a strangely narrow selection of targets for your insinuations:
Here's some free advice for you: the next time you decide to
launch a random campaign of throwing stuff at someone, you might want
to step out of that glass house of yours, first.
And who do you think you are?
I'm one who knows how to use a search facility effectively. It's
interesting going back to see historical posts.
thanks for your warning "Using --no_unaligned_access does *not*
allow you to cast aligned values to (int *).". I did not know
But in fact I am not sure whether I understand your warning
correctly. To make it more clear, could you perhaps give an example
of a short C code snippet, where this would typically happen?
I started using "--no_unaligned_access" when I ran into a problem
with the function
memcpy( &ac1, &ac2, 6)
to copy 6 bytes from ac2 to ac1. In this case (at Opt Level 1) the
compiler used the halfword-aligned LDR/STR instructions described in
ARMv7 TRM A3.2.1, which require that the bit SCB->CCR.UNALIGN_TRP
is set to 0. I thought that this bit was set to 1 by default in the
ST4Discovery and Blinky examples, but on checking more
thoroughly, I realized that I had set it to 1 myself at some point in my
InitTraps function - this was a bit over-ambitious then, I
I have now recompiled my complete code without this
"--no_unaligned_access" flag, and I noticed that there really seems to be
quite a difference: my code size shrinks from 40192 to 40032
bytes (Opt Level 1), which means that this flag seems to
touch very many parts of my program (I use memcpy with 6 bytes
only once or twice in my code, so that cannot explain the 160-byte
reduction in code size). So now I think I will remove the flag
"--no_unaligned_access" and also I will remove the setting of
... just anyway to complete my understanding of this topic with
the "int*"s you touched - if you could give a short typical code
example for this, this would be very helpful for me (as from the
ARM-Cortex M4 TRM, "3.3.2 Load/store timings", "Unaligned word or
halfword loads or stores add penalty cycles..." sounds to me a bit
like a warning to the reader, better NOT to use such unaligned
LDR/STR) (In fact I was a bit bewildered, why they would allow such
unaligned LDR/STR at all - I thought it might be some sort of
historical compatibility stuff, as it was introduced in ARMv6 and
they somehow just wanted to keep it in ARMv7).
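For what it's worth, here is a rough sketch of how the trap bit discussed above could be checked and changed. It assumes the usual Cortex-M3/M4 layout where UNALIGN_TRP is bit 3 of the CCR. The helper names and the pointer parameter are invented for this illustration (so it can be tried off-target); on real hardware with CMSIS one would presumably pass &SCB->CCR.

```c
#include <stdint.h>

/* Bit 3 of the Cortex-M3/M4 Configuration and Control Register (CCR). */
#define CCR_UNALIGN_TRP (1u << 3)

/* Hypothetical helper: returns non-zero if unaligned word/halfword
 * accesses are configured to trap. */
static int unaligned_trap_enabled(const volatile uint32_t *ccr)
{
    return (*ccr & CCR_UNALIGN_TRP) != 0;
}

/* Hypothetical helper: sets or clears only the UNALIGN_TRP bit and
 * leaves every other CCR bit as it was. */
static void set_unaligned_trap(volatile uint32_t *ccr, int enable)
{
    if (enable)
        *ccr |= CCR_UNALIGN_TRP;
    else
        *ccr &= ~CCR_UNALIGN_TRP;
}
```

With the bit cleared, the core handles unaligned LDR/STR itself (at the cost of the penalty cycles mentioned in the TRM); with it set, the same instructions fault.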
Thank you for the interesting link to the discussion.
But I must admit, after reading all this, I am still unsure
whether it would not perhaps be better to specify the compiler switch
Even in your answer, you said that the memcpy functions might be
slower if I skip the "--no_unaligned_access".
You did not further elaborate the problem with the "(int*)" usage?
Is it difficult to give some basic code example for this?
Thank you, this is instructive (in my next life I will learn
To clarify what I meant about (int *), consider:
char a;
char *cp = &a;
void *vp = &a;
int *ip1, *ip2;
ip1 = (int *)cp; // may be undefined behavior
ip2 = vp; // C++ requires a cast here, C does not; still may be undefined behavior
The two statements both invoke undefined behavior if '&a'
does not have the alignment required for 'int' (and it probably
doesn't). Probably, nothing bad will happen until '*ip1' or '*ip2' is
used. Of course, it might also be the case that nothing bad happens
right away or ever -- there are no requirements at all on what
happens after undefined behavior.
In my opinion, casts (and converting from 'void *') are something
to be avoided when at all possible because they can easily hide
errors. There are cases where they are necessary, but I prefer to
avoid using them.
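One cast-free alternative, as a minimal sketch: if the goal is to read or write a 32-bit value at an arbitrary byte address, copy it through memcpy instead of converting the pointer. The names load_u32 and store_u32 are made up for this example and do not come from any library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from any byte address.
 * memcpy has no alignment requirement on its arguments, so no
 * (uint32_t *) conversion of p is needed. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: write a 32-bit value to any byte address. */
static void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

This is well defined as far as the language is concerned. What instructions the compiler emits for the small fixed-size memcpy is a separate question, and it depends on whether unaligned accesses are permitted for the target.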
[In my weak attempt at lawyer-style humor in a post above, I, of
course, meant "..., up to and including, but not limited to,
You could consider using a __packed pointer.
Understanding the underlying hardware, along with what the processor
does, would help you make a sensible decision about what is best.
My advice would be to not just accept the typical "you are wrong",
"the best way to do it is to" and the frequent "the only way to
do it is to" that appear from beginners right through to
experts and professionals. Just remember that there is more than one
way to skin a cat.
If you listen carefully, you might just be able to hear the cries
Instead of using packed pointers, the code should do its best to
align the data manually if trying to store big objects in arrays of
And packed is just too easy to use. So people don't take 30
seconds to consider alternatives when all they need, with zero thought,
is to activate the magic packed keyword.
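A minimal sketch of what manual alignment might look like, assuming the objects are kept in a byte buffer and need 32-bit alignment. The names aligned_pool_t, sPool and align_up4 are invented for this illustration.

```c
#include <stdint.h>

#define POOL_BYTES 64u

/* Hypothetical byte pool: the uint32_t member forces the whole union,
 * and therefore bytes[0], onto a uint32_t boundary. */
typedef union {
    uint32_t      force_align;
    unsigned char bytes[POOL_BYTES];
} aligned_pool_t;

static aligned_pool_t sPool;

/* Hypothetical helper: round an offset up to the next multiple of 4, so
 * that an object placed at sPool.bytes + offset starts word-aligned. */
static uint32_t align_up4(uint32_t offset)
{
    return (offset + 3u) & ~3u;
}
```

The cost is a few padding bytes between objects; in return, ordinary aligned loads and stores can be used without any packed qualifier.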