unsigned char buf[100];
. . .
unsigned int val;
. . .
val = *((unsigned int *)(&buf[1]));
. . .
comments?
Yes, it is non-portable code. It can cause an alignment fault on some platforms.
been there, done that, and, even worse
Some compilers for 'multibyte word' processors that require alignment will not raise a fault but will access the previous byte and the pointed-to byte instead of the pointed-to byte and the next (i.e. a 16-bit processor that ignores the LSB of the address for a word fetch). When I complained about it to one compiler maker, I was informed "there is nothing in ANSI C about this". I never checked, what good does it do to state "there is something in ANSI C about this", if your program does not work because of such.
Then, of course, there is the endianness which, if the array is accessed both as char and int, will make a real pow-wow.
Erik (sorry about "multibyte word" what else would you call it)
Misalignment detection is not really up to the compiler. Some hardware -- the ARM7/9 core for example -- doesn't even detect alignment faults, and will just silently do interesting things to your byte lanes.
There are some cases where compilers might be able to suspect or even prove alignment problems, but I'm not sure that's possible in general, especially since actual positioning of the data is theoretically the job of the linker. Without inserting run-time checks on the actual value of a pointer just before it is used, which would have a huge effect on the code, I don't think a compiler can solve that problem in general. (Might make for an interesting debug option, just like an optional null pointer check that no compiler vendor offers as a debug option, either.)
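The run-time check on a pointer's value described above could look roughly like the following. This is only a sketch: the helper name is_aligned is made up for illustration, and it assumes a platform where converting a pointer to uintptr_t yields an address whose low bits reflect alignment (the conversion is implementation-defined).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of `align` (which must
   be a power of two), 0 otherwise. Assumes a flat address space where the
   low bits of the converted pointer reflect alignment. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (uintptr_t)(align - 1)) == 0;
}
```

A debug build could call is_aligned(&buf[1], sizeof(unsigned int)) just before the cast and report a failure instead of performing the access.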
sorry about "multibyte word"
Sounds good to me.
When I complained about it to one compiler maker, I was informed "there is nothing in ANSI C about this".
That information is correct. Code like the OP is as wrong as C code can possibly be, while still apparently working sometimes: it causes undefined behaviour. Literally anything can happen, because the language makes no promises whatsoever what such a program may do. "Anything" of course includes "what the coder expected", which makes this kind of error so nasty --- it'll just work for quite a while, but unexpectedly break when you use the same code on a slightly different platform.
I never checked, what good does it do to state "there is something in ANSI C about this", if your program does not work because of such.
The "because of such" part is incorrect. The program fails for a much simpler reason: it's wrong. The code makes assumptions about misaligned access via a maltreated pointer, that aren't backed up by any applicable rule. The code will only work if those assumptions happen to be true.
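For comparison, one way to read the same bytes without the misaligned pointer cast is to copy them into a properly aligned object. A minimal sketch, with load_uint as a hypothetical helper name; the value obtained still depends on the byte order of the host.

```c
#include <string.h>

/* Hypothetical helper: copies sizeof(unsigned int) bytes starting at p
   into an aligned local object. p may have any alignment, but must point
   to at least that many readable bytes. */
static unsigned int load_uint(const unsigned char *p)
{
    unsigned int val;
    memcpy(&val, p, sizeof val);
    return val;
}
```

With this, the original line would become val = load_uint(&buf[1]); and many compilers reduce the memcpy to a single load on targets that allow unaligned access.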
what I meant by "because of such" was that if the tool (is supposed to) behave in such a way under such circumstances, well, the thing to do is to avoid "such circumstances", there is no other way to get the product out.
BTW the 'error' was not mine, it occurred when I was the contact person to the compiler manufacturer while we were using a beta of a 16 bit compiler and one of my coworkers (one of those 'coders' that make it a point of pride to be ignorant of the hardware) asked "why does it not work". I realized rather quickly what was going on and, in the hope that a compiler 'catch' existed, reported the problem. I would claim that ANY multiword byte compiler should, at least, warn when a memory location is "typecasted up"
Erik
Thanks for your comments.
My view is that it is obvious non-portable code; but since the coder wrote it explicitly, it is intentional and is not wrong.
It may rely on assumptions concerning compiler and platform, but if those assumptions are constant for said compiler and platform then the assumptions are relatively safe.
When I get out of bed in the morning, I do not put on my wellington boots. I assume that there has been no flood while I've been asleep. It's not wrong to assume that, but it's a fairly safe bet.
If I were to go to (say) the Pacific island of Tuvalu, I may well change the assumption; just like I would if I were to change compiler and/or platform.
"since the coder wrote it explicitly, it is intentional and is not wrong."
No, that does not follow at all! You may happen to be right in this case, but you can't generally make that assumption!
"It may rely on assumptions concerning compiler and platform, but if those assumptions are constant for said compiler and platform then the assumptions are relatively safe."
Unfortunately, it may rely upon false assumptions - and it might only work by pure luck!
Also, there is no guarantee that the assumptions will not become false with a compiler update - or possibly even if some options are changed...
That's why all such assumptions should be clearly and fully documented!
And, if the coder didn't bother to provide such documentation, you at least have to wonder if that's because she/he didn't understand the issues...
No, a program using implementation-specific behaviours in C is not considered invalid. Just less portable or in some cases buggy. It's all a question about what assumptions the developer made, and if these assumptions are valid for the intended platform(s).
C did intentionally allow a number of specifics to be left to the architecture or language implementation. If it would have been invalid, the language standard would instead have specified that the compiler should (or must) flag the code as invalid (an error) if it is able to detect the problem.
If I know that the processor is little-endian, the size of an integer, and that the data is aligned or that the target architecture can work with unaligned data, then I am completely allowed to typecast that position of the received byte array into an int pointer, allowing me to read the value in a single instruction, instead of reading it as two (or more) bytes, and shifting the individual bytes into the correct position.
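The byte-by-byte alternative mentioned above (read the individual bytes and shift them into position) can be sketched as follows. The name read_le32 and the choice of a 32-bit little-endian field are assumptions for illustration only; the thread does not say how wide the value is.

```c
#include <stdint.h>

/* Hypothetical helper: assembles a 32-bit value from four bytes stored
   least significant byte first. Works regardless of host byte order or
   the alignment of p. */
static uint32_t read_le32(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

This costs a few more instructions than the cast, which is exactly the trade-off between risk and gain being argued here.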
It isn't wrong to do it. It is just a question of calculated risk versus potential gain. Yes, people can get their foot shot off, but people who don't think about the possibility of division by zero or stack size or numeric overflow can also get their feet shot off.
The compiler isn't allowed to issue an error (unless explicitly allowed by the user) for non-portable usage, because the compiler is required to produce a runnable program. It is then up to the specific hardware if the application will generate an exception or strange results.
To give an explicit example: 6.1.3.4 in the language standard notes that it is undefined behaviour to make a conversion from an integer data type to a different data type that can't handle the full numeric range. Assigning an int to a char for example. We may now and then see warnings "Conversion might lose significant digits" etc, but the programs are still valid. Some long definitions are twice the size of int. Some are not. Even if I add a typecast intvar = (int)longvar, it still represents a conversion from long to int. If this represents an invalid program, then most applications larger than "hello world" (and quite a number of them too) would be invalid.
Most source lines we write are based on assumptions. For example the assumption that other code somewhere has verified the input range of all parameters read - if not, every single + or * could overflow. And since the behaviour then can be undefined (note that not all machines are two's-complement) then every + or * would require code to establish that they can not fail. But even that code would probably contain code that - depending on situation - may need to make assumptions. How about machines that can have +0 and -0? They exist, and we just have to make assumptions - some math function code is allowed to treat +0 and -0 differently depending on architecture. How many have specifically tried to tell the compiler that their intention is +0?
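To give an idea of what "code to establish that they can not fail" would mean for a single +, here is a sketch of a checked signed addition. The helper name checked_add is made up; the point is only that the range test must happen before the addition, since the overflow itself is what is undefined.

```c
#include <limits.h>

/* Hypothetical helper: stores a + b in *out and returns 1 when the sum is
   representable in int; returns 0 and leaves *out untouched otherwise.
   The test is done before adding, so no signed overflow ever occurs. */
static int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```

Wrapping every arithmetic operation this way is clearly impractical, which is why most code relies on the inputs having been range-checked elsewhere.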
A different example: How many significant characters do you use in externally linked symbols? Different linkers support different lengths of external symbols. Should all programs that don't have unique symbol names within the first 6 characters be invalid?
Anyone using memset() to clear a large struct or array? But what is the internal representation of 0 by the hardware?
Is a program invalid if it writes a 100 character text string without any newline characters in it? It is considered undefined behaviour to write past the end of the terminal width - but since there exist handhelds with puny displays, the assumption would then be that anything that writes more than a single character before a newline may trigger 5.2.2.
That comes as no surprise.
When I complained about it to one compiler maker, I was informed "there is nothing in ANSI C about this". I never checked,
Neither does that.
what good does it do to state "there is something in ANSI C about this", if your program does not work because of such.
Well, you see, the standard defines the 'C' language. If you write code that makes assumptions that are not guaranteed by the standard then you cannot reasonably expect your program to work.
I know this will fall on deaf ears, but I'll say it again anyway: if you want to become proficient with a tool you really do need to read and understand the manual.
"It may rely on assumptions"
some of the longest debugging sessions ... have been a result of relying on assumptions.
Mr smoked sardine,
the references to ANSI C had absolutely nothing to do with the point
Let me try to translate my statement to something a smoked sardine can understand: "it makes no sense to discontinue development because of a bug (perceived or real makes no difference) in the tools if there is a workaround"
Word alignment is NOT a C issue but an architecture issue.
Yes, but life isn't expected to be simple. Anything non-trivial has to be based on a number of assumptions.
We can't avoid assumptions. Just try to make good ones, and to qualify them. We can make risk assessments for a project - what if our assumptions about used hardware, used tools, available time, stability of customer requirements etc are wrong? We can document our code, specifying what assumptions we have made (or at least realize that we have made). We can - if the hardware permits - perform checked builds that contain extra integrity-testing code. We can make use of the preprocessor. We can use regression testing...
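As one illustration of using the preprocessor to document assumptions, the build can be made to fail when an assumption does not hold. The two assumptions below (8-bit bytes, unsigned int of at least 32 bits) and the helper high_byte_of_u32 are examples chosen for this sketch, not requirements stated in the thread.

```c
#include <limits.h>

/* Refuse to build if the assumptions this file relies on are not met. */
#if CHAR_BIT != 8
#error "this code assumes 8-bit bytes"
#endif

#if UINT_MAX < 0xFFFFFFFF
#error "this code assumes unsigned int has at least 32 bits"
#endif

/* Hypothetical helper that depends on the assumptions checked above:
   returns bits 24..31 of v. */
static unsigned int high_byte_of_u32(unsigned int v)
{
    return (v >> 24) & 0xFFu;
}
```

On a target with a 16-bit int the second check stops the build with a readable message instead of leaving a shift that silently misbehaves.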
While our job is to produce working - and economical - solutions, we can't ignore assumptions. If we think that there are no assumptions involved, then we have just made a very big, and very wrong assumption.
In short: it is almost impossible to write any non-trivial applications that are guaranteed to work on any existing platform that has an ANSI/ISO-conformant compiler.
It is nice that you see that I have a lot of experience.
If, on the other hand, you were trying to belittle me, you must have missed "BTW the 'error' was not mine, it occurred ..." in my post.
Erik
True - But someone who builds up experience of such things can learn to more reliably determine the risk.
Just to follow on from my previous - Through experience I have determined that I don't have to put on my wellington boots before getting out of bed.
I would prefer to consider it a calculated risk.
I would not consider it wrong - I might, possibly, change my mind if I were to get my feet wet one morning ;)
True - But someone who builds up experience of such things can learn to more reliably determine the risk.
There is nothing wrong with experience; if there was, I would be up the creek re the '51 :)
Of course, were I to 'verify' my assumption that a char is 8 bits every time I type char, I would never get anywhere.
I can state my point in another way, which may be better: "when you see a bug, before anything else, verify the correctness of your assumptions"
The references to ANSI C had absolutely everything to do with my point, however.
No, that isn't a translation of what you said. It may be what you wish you had said, but it certainly doesn't reflect what you did say.
Word alignment is NOT a C issue
So, why did you complain to the compiler vendor about your problem?
Why do you not read it all?
When I complained about it to one compiler maker, I was informed "there is nothing in ANSI C about this". I never checked, what good does it do to state "there is something in ANSI C about this", if your program does not work because of such.
I would claim that ANY multiword byte compiler should, at least, warn when a memory location is "typecasted up"
I really do not give a hoot if it is ANSI C or not, it is dead easy for a compiler to 'see' that a memory location is "typecasted up" and an (optional) warning would make the tool much more useful.
I care about good tools, if the toolmaker decides to hide behind ANSI C that is just too bad.
I really do not give a hoot if it is ANSI C or not
if the toolmaker decides to hide behind ANSI C that is just too bad.
ANSI 'C', now ISO 'C', is the 'C' language. In a recent thread you demonstrated that you didn't even know how a 'for' statement worked, so I guess it should come as no surprise that you can't grasp that simple fact.
Whatever does your snide reply have to do with the fact that I posted "I care about good tools", which you left out in your quote?
What I said, and you decided to overlook, was that if it is possible to put out a warning for risky code (in this case "risky on the particular platform") the tool should do so whether it has anything to do with the C standard or not.
Your other snide remark re a for. For your information I actually posted that I knew how a for loop worked, just wanted to make sure of one particular fact where the text in the standard was slightly unclear.
Why do you not just go back to the sardine can somebody let you out of
Nothing at all. That's why I left it out of the quote. Why do you ask?
You should only rely on a compiler to behave as specified by the standard. It is unreasonable for you to expect a compiler writer to try and anticipate every mistake someone who refuses to read the standard might make and issue a warning. Think about it: if you rely on a non-standard warning issued by a compiler you will have to read the documentation of every compiler you use to determine whether your incorrect code will work, generate a warning or simply fail. Why not just read the standard to determine whether your code is guaranteed to work and be done with it?
For your information I actually posted that I knew how a for loop worked, just wanted to make sure of one particular fact
Either you know something or you don't know something. Why would you need to 'make sure' of something you know?
where the text in the standard was slightly unclear.
Please quote this 'unclear text' from the standard. By the way, I fully expect you to be unable to do so.
I'm sure your ego would prefer that, but sorry, you're out of luck.
How very amazing. You babble about "the standard" and what we are discussing is "a warning if some architecture created easy to detect mistake". I have never had alignment problems, be that from typecasting or something else, but have in at least 5 cases solved somebody else's problem by fixing alignment issues. There are legions of people that believe that a REAL C person should NEVER have even an inkling about what the hardware does, and for such (I have met many) a warning when an easily detectable architecturally dependent mistake is made, what is your opposition to asking for a warning there.
Please quote this 'unclear text' from the standard. By the way, I fully expect you to be unable to do so.
sorry not to meet your expectations. the 'description' of a for states "3 is done as long as 2 is valid".
There is nothing said about "updated 3 is written back whatever 2 is"
In addition, re an excerpt from your babble above:
whether your incorrect code will work
who on earth cares if incorrect code will work, incorrect code IS incorrect code.
Think about it: if you rely on a non-standard warning issued by a compiler you will have to read the documentation of every compiler you use to determine whether your incorrect code will work, generate a warning or simply fail.
The question isn't if people rely on warnings. Some people do. But rely or not: We do know that they catch a number of problems. Is a compiler that helps catching problems good or bad?
People don't buy lint programs just because they are stupid or lazy. They know that they make mistakes. They know that projects can have multiple developers, and one developer may make assumptions that are not consistent with the views of the rest of the team. Not too many developers like to scan every line checked in by all other developers, just in case they have done something stupid.
I try not to rely on my car warning if a door isn't fully closed. However, I still like that extra information. If I chose to distrust everyone, then I would always have to take a full walk around the car whenever someone has touched a door or has thrown in some extra luggage.
How very amazing. You babble about "the standard" and what we are discussing is "a warning if some architecture created easy to detect mistake".
You misunderstand. If you write code that is guaranteed by the standard to work then you don't have to worry about or rely on the compiler to issue a warning. Simple.
I have never had alignment problems,
Ok, ok. I'll pretend to believe that all these mistakes you seem to have encountered have been made by other people. I'm sure everyone else will as well. Let's face it, you demonstrate your exemplary knowledge from the very basics of 'C' all the way up to complicated concepts like a 'for' statement with refreshing regularity.
There are legions of people that believe that a REAL C person should NEVER have even an inkling about what the hardware does,
A real 'C' person writes code that is guaranteed to work by the standard ie *code that will work on all architectures* wherever possible. Unnecessary use of knowledge of the architecture makes code non-portable, and depending on precisely what non-standard behaviour you are relying on potentially unreliable.
a warning when an easily detectable architecturally dependent mistake is made, what is your opposition to asking for a warning there.
I have no objection to the warning, and the compiler is entitled to issue one if it wishes. What I object to is your reliance on such warnings, or complaints about lack of them, rather than just doing the sensible thing and actually reading the language documentation to find out whether the code is guaranteed to work.
sorry not to meet your expectations. the 'description' of a for states "3 is done as long as 2 is valid".
Please quote the paragraph number, I'm having trouble finding that particular bit of text.
who on earth cares if incorrect code will work, incorrect code IS incorrect code.
That is precisely my point. You need to write correct code rather than relying on your 'suck it and see' approach.
Ok, ok. I'll pretend to believe that all these mistakes you seem to have encountered have been made by other people.
Of course not. This is in response to your snide remark about "all the mistakes I have made"
A real 'C' person writes code that is guaranteed to work by the standard ie *code that will work on all architectures* wherever possible. Unnecessary use of knowledge of the architecture makes code non-portable, and depending on precisely what non-standard behaviour you are relying on potentially unreliable.
The point you are totally missing is that using 'nonstandard features' e.g. DATA, XDATA is a requirement for writing efficient code, which YOU quite clearly are incapable of, since to adhere to the standard and the standard only you are forced into the LARGE memory model. I am quite certain that Keil added these things just to annoy you.
I have no objection to the warning,
Then why all the words about it?
What I object to is your reliance on such warnings, or complaints about lack of them
You state that I personally rely on such warnings after I have posted that I have never had alignment problems (I know what alignment is) but have found them in other code.
I'm having trouble finding that particular bit of text.
look for 'for' in the index, if you can not understand what I wrote, you do not know about it.
That is precisely my point. You need to write correct code rather than relying on your 'suck it and see' approach
Mr preacher, you are preaching to the choir, nothing is farther from my coding than "it and see".
Ok, I'm going to try a different method of getting sense out of you. We'll deal with your points one at a time in separate posts:
I'm having trouble finding that particular bit of text.
look for 'for' in the index, if you can not understand what I wrote, you do not know about it.
You originally stated that there was some text in the standard that was unclear. I asked you to quote the text in question, your response was some text that does not appear in the standard. You now suggest I look up 'for' in the index - well, that takes me to the section describing the for statement, but not to any unclear text. Now please, quote the actual text that you felt was unclear and provide the paragraph number that will allow me to find it.
I may have abbreviated a bit too much, but thought that you with your self proclaimed brilliance could figure out the abbreviation of expression_2 to '2' etc.
One place for reference would be section 8.5 in ANSI C language summary in Kochan: "programming in ANSI C", the reference to expression_3 speaks of 'evaluation'. Thus reading this with a very suspicious eye you could get that it would be within the standard to 'evaluate' expression_3 vs expression_2 without updating the content of expression_3.
Now, of course, you are going to refer to K&R, but every comparison I have made between it and the "ANSI C Language Summary" in Kochan's book has shown them the same and thus I refer to one book instead of running around in the library to find more sources.
Most of us refer directly to the standard, instead of any books, since this world is full of authors that only "think" that they know what they are writing about, and that want to rewrite the information in a clearer way - while at the same time missing a number of important details of the original, unabbreviated, language.
Section from 6.8.5.3, #1: "The expression expression-3 is evaluated as a void expression after each execution of the loop body."
Since void, it never returns a value to be used. Hence, I must expect that the sole goal of evaluating expression-3 is to produce some form of side effect, such as updating zero or more loop variables.
6.8.6.2 shows what happens when you have a (non-nested) continue within the loop body - i.e. the loop body will be considered processed, so evaluation of expression-3 will follow.
6.8.6.3 shows what happens when you have a (non-nested) break within the loop body, i.e. the loop body will not be finished and expression-3 will not be evaluated.
Does 'evaluated' mean that the result is 'permanent' even if the 'evaluation' results in the 'evaluated' entity not being used any more?
I am not a compiler writer, but as I read the text the resulting code of "for (x=0; x !=8 ; x++)" could 'legally' be e.g. either of these
enter:
      .....
loop:
      .......
      inc x
      cjne x,#8,loop
      sjmp out
or
enter:
      mov r4,x
      ........
loop:
      mov x,r4
      .....
      inc r4
      cjne r4,#8,loop
      sjmp out
In the first case x would be 8 when the for exits, in the second case x would be 7 when the for exits.
This, Mr Smoked Sardine, is not about understanding C but about understanding English in the K&R usage.
With Keil C51 x is 8 when the for exits and I am relying on Keil not to change that, but since this is about portability, the discussion should be "could this be different in another compiler"
The result of the evaluation is always thrown away, since it is evaluated as a void expression.
However, whenever evaluated, any side effects will always be permanent.
Speculative evaluation is outside the scope of any language specification. It is part of compiler optimizations or (more commonly) internal processor optimizations. Speculative evaluation may then be done to process several branches at the same time - but with the requirement that only the side effects of the actually taken branch should be visible to an end user.
In short: Unless you add a break or return statement in the body of the loop, your expr3 side effects will always - for any C/C++ compiler - result in expr3 side effects (the loop variable being updated) until expr2 doesn't allow any further iterations.
Too thick fingers, or too dim a mind...
"your expr3 side effects will always - for any C/C++ compiler - result in expr3 side effects (the loop variable being updated) until expr2 doesn't allow any further iterations."
Should say "your expr3 evaluations will always - for any C/C++ compiler - result in expr3 side effects (the loop variable being updated) until expr2 doesn't allow any further iterations."
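The behaviour described above can be checked with a small sketch. The function names are made up for illustration; the loop header is the one from the earlier post.

```c
/* Runs the loop from the earlier post to completion and returns the
   final value of x. */
static int count_to_eight(void)
{
    int x;
    for (x = 0; x != 8; x++) {
        /* empty body */
    }
    return x;   /* 8: x++ ran after the pass with x == 7, then x != 8 failed */
}

/* Leaves the loop with break when x reaches stop_at. Expression-3 is not
   evaluated for that final pass, so x keeps the value it had in the body. */
static int count_with_break(int stop_at)
{
    int x;
    for (x = 0; x != 8; x++) {
        if (x == stop_at)
            break;
    }
    return x;
}
```

count_to_eight() returns 8 on any conforming compiler, while count_with_break(3) returns 3 because the break skips the final x++.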
I am not a compiler writer, but as I read the text the resulting code of "for (x=0; x !=8 ; x++)" could 'legally' be e.g. either of these
No, it couldn't.
No, it isn't.
I'm glad to see that you've finally come clean about the supposed 'unclear text' in the standard - you didn't actually read the standard at all.
When will you realise that what you happen to believe based on your 'experience' is, with monotonous regularity, wrong?