unsigned char buf[100]; . . . unsigned int val; . . . val = *((unsigned int *)(&buf[1])); . . .
comments?
Yes, it is non-portable code. It can cause an alignment fault on some platforms.
been there, done that, and, even worse
Some compilers for 'multibyte word' processors that require alignment will not declare a fault, but will access the previous byte and the pointed-to byte instead of the pointed-to byte and the next one (i.e. a 16-bit processor that ignores the LSB of the address for word fetches). When I complained about it to one compiler maker, I was informed "there is nothing in ANSI C about this". I never checked; what good does it do to state "there is something in ANSI C about this" if your program does not work because of such?
Then, of course, there is endianness, which, if the array is accessed both as char and as int, will make a real pow-wow.
Erik (sorry about "multibyte word", what else would you call it?)
Misalignment detection is not really up to the compiler. Some hardware -- the ARM7/9 core for example -- doesn't even detect alignment faults, and will just silently do interesting things to your byte lanes.
There are some cases where compilers might be able to suspect or even prove alignment problems, but I'm not sure that's possible in general, especially since actual positioning of the data is theoretically the job of the linker. Without inserting run-time checks on the actual value of a pointer just before it is used, which would have a huge effect on the code, I don't think a compiler can solve that problem in general. (Might make for an interesting debug option, just like an optional null pointer check that no compiler vendor offers as a debug option, either.)
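The run-time check alluded to might look like this minimal sketch (the helper name is invented; `_Alignof` is C11):

```c
#include <stdint.h>

/* Hypothetical debug helper: nonzero if p is suitably aligned to be
   read through an unsigned int pointer on this implementation. */
static int is_aligned_for_uint(const void *p)
{
    return ((uintptr_t)p % _Alignof(unsigned int)) == 0;
}
```

A debug build could then `assert(is_aligned_for_uint(p))` immediately before each suspect cast, which is exactly the kind of per-dereference overhead that makes this unattractive as an always-on feature.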
sorry about "multibyte word"
Sounds good to me.
When I complained about it to one compiler maker, I was informed "there is nothing in ANSI C about this".
That information is correct. Code like the OP's is as wrong as C code can possibly be, while still apparently working sometimes: it causes undefined behaviour. Literally anything can happen, because the language makes no promises whatsoever about what such a program may do. "Anything" of course includes "what the coder expected", which makes this kind of error so nasty --- it'll just work for quite a while, but unexpectedly break when you use the same code on a slightly different platform.
I never checked; what good does it do to state "there is something in ANSI C about this" if your program does not work because of such.
The "because of such" part is incorrect. The program fails for a much simpler reason: it's wrong. The code makes assumptions about misaligned access via a maltreated pointer that aren't backed up by any applicable rule. The code will only work if those assumptions happen to be true.
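For reference, a portable way to get the effect the original snippet was after, without the misaligned dereference (the function name is illustrative):

```c
#include <string.h>

/* Portable replacement for  val = *((unsigned int *)(&buf[1]));
   Copying the bytes avoids both the misaligned dereference and the
   aliasing violation; compilers for targets that tolerate unaligned
   loads typically reduce this to a single load instruction anyway. */
static unsigned int get_uint(const unsigned char *src)
{
    unsigned int v;
    memcpy(&v, src, sizeof v);
    return v;
}
```

Usage would be `val = get_uint(&buf[1]);`. Note that this fixes only the alignment and aliasing issues; the result is still in the host's byte order, so endianness remains the programmer's problem.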
What I meant by "because of such" was that if the tool is (supposed to) behave in such a way under such circumstances, then the thing to do is to avoid "such circumstances"; there is no other way to get the product out.
BTW the 'error' was not mine. It occurred when I was the contact person to the compiler manufacturer while we were using a beta of a 16-bit compiler, and one of my coworkers (one of those 'coders' who make it a point of pride to be ignorant of the hardware) asked "why does it not work?". I realized rather quickly what was going on and, in the hope that a compiler 'catch' existed, reported the problem. I would claim that ANY multibyte-word compiler should, at least, warn when a memory location is "typecast up".
Erik
Thanks for your comments.
My view is that it is obviously non-portable code; but since the coder wrote it explicitly, it is intentional and is not wrong.
It may rely on assumptions concerning compiler and platform, but if those assumptions are constant for said compiler and platform then the assumptions are relatively safe.
When I get out of bed in the morning, I do not put on my wellington boots. I assume that there has been no flood while I've been asleep. It's not wrong to assume that, but it's a fairly safe bet.
If I were to go to (say) the Pacific island of Tuvalu, I may well change that assumption; just as I would if I were to change compiler and/or platform.
"since the coder wrote it explicitly, it is intentional and is not wrong."
No, that does not follow at all! You may happen to be right in this case, but you can't generally make that assumption!
"It may rely on assumptions concerning compiler and platform, but if those assumptions are constant for said compiler and platform then the assumptions are relatively safe."
Unfortunately, it may rely upon false assumptions - and it might only work by pure luck!
Also, there is no guarantee that the assumptions will not become false with a compiler update - or possibly even if some options are changed...
That's why all such assumptions should be clearly and fully documented!
And, if the coder didn't bother to provide such documentation, you at least have to wonder if that's because she/he didn't understand the issues...
No, a program using implementation-specific behaviours in C is not considered invalid. Just less portable, or in some cases buggy. It's all a question of what assumptions the developer made, and whether those assumptions are valid for the intended platform(s).
C intentionally left a number of specifics to the architecture or the language implementation. If such code were invalid, the language standard would instead have specified that the compiler should (or must) flag the code as invalid (an error) whenever it is able to detect the problem.
If I know that the processor is little-endian, the size of an integer, and that the data is aligned or that the target architecture can work with unaligned data, then I am completely allowed to typecast that position of the received byte array into an int pointer, allowing me to read the value in a single instruction, instead of reading it as two (or more) bytes, and shifting the individual bytes into the correct position.
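Those assumptions can at least be checked once at start-up rather than merely trusted. A sketch, with an invented helper name, of verifying "little-endian, 32-bit unsigned int" at run time:

```c
#include <string.h>

/* One-time self-check of the assumptions the cast relies on:
   a 32-bit unsigned int on a little-endian target. Alignment of
   the incoming buffer still has to be ensured separately. */
static int platform_assumptions_hold(void)
{
    unsigned int one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);          /* low-order byte first? */
    return sizeof(unsigned int) == 4 && first == 1;
}
```

Calling `assert(platform_assumptions_hold())` early in main turns a silent port to a big-endian or 16-bit-int target into a loud, immediate failure.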
It isn't wrong to do it. It is just a question of calculated risk versus potential gain. Yes, people can get their foot shot off, but people who don't think about the possibility of division by zero or stack size or numeric overflow can also get their feet shot off.
The compiler isn't allowed to issue an error (unless explicitly told to by the user) for non-portable usage, because the compiler is required to produce a runnable program. It is then up to the specific hardware whether the application will generate an exception or strange results.
To give an explicit example: 6.1.3.4 in the language standard notes that the result is implementation-defined when converting an integer value to a different integer type that can't hold the full numeric range -- assigning an int to a char, for example. We may now and then see warnings like "Conversion might lose significant digits", but the programs are still valid. Some long definitions are twice the size of int. Some are not. Even if I add a typecast, intvar = (int)longvar, it still represents a conversion from long to int. If this represented an invalid program, then most applications larger than "hello world" (and quite a number of those too) would be invalid.
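A minimal illustration of that conversion (hypothetical helper; no particular result is shown for out-of-range values, since that part is left to the implementation):

```c
/* Valid C even where long is wider than int: the cast documents
   the narrowing but does not change what happens when the value
   does not fit in an int -- that remains implementation-defined. */
static int narrow(long v)
{
    return (int)v;
}
```

On an architecture where long and int are both 32 bits nothing is ever lost; on one where long is 64 bits, `narrow` of a large value quietly produces whatever the implementation defines. Either way, the program is valid.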
Most source lines we write are based on assumptions -- for example, the assumption that other code somewhere has verified the input range of all parameters read; if not, every single + or * could overflow. And since the behaviour then can be undefined (note that not all machines are two's-complement), every + or * would require code to establish that it cannot fail. But even that code would probably contain code that, depending on the situation, may need to make assumptions. How about machines that can have +0 and -0? They exist, and we just have to make assumptions -- some math function code is allowed to treat +0 and -0 differently depending on architecture. How many have specifically tried to tell the compiler that their intention is +0?
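Where such an assumption cannot be made, the guard really does have to be written out. A sketch for signed addition, which avoids relying on two's-complement wraparound since the standard does not guarantee it:

```c
#include <limits.h>

/* Returns 1 and stores a+b in *sum if the result fits in an int;
   otherwise returns 0, without ever performing the overflowing
   (undefined) addition. */
static int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}
```

Imagining every + and * in a real program wrapped this way makes the point above concrete: in practice we assume ranges have been validated elsewhere.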
A different example: how many significant characters do you use in externally linked symbols? Different linkers support different lengths of external symbols. Should all programs that don't have symbol names unique within the first 6 characters be invalid?
Anyone using memset() to clear a large struct or array? But what is the internal representation of 0 on the hardware?
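The memset point, sketched with an illustrative struct: all-bits-zero is guaranteed to read back as zero for the integer member, but the standard does not promise the same for pointer or floating-point members, however universal it is on real hardware.

```c
#include <string.h>

struct sample {
    int     count;   /* all-bits-zero is 0: guaranteed */
    int    *link;    /* all-bits-zero need not be a null pointer */
    double  scale;   /* nor need it be 0.0 */
};

/* Works as expected on virtually every real target, but strictly
   speaking relies on the platform's representation of 0. */
static void clear_sample(struct sample *s)
{
    memset(s, 0, sizeof *s);
}
```

The fully portable alternative is member-by-member assignment (or `struct sample s = {0};`), which is exactly the kind of trade-off between assumptions and convenience being discussed.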
Is a program invalid if it writes a 100-character text string without any newline characters in it? It is considered undefined behaviour to write past the end of the terminal width -- but since there exist handhelds with puny displays, the assumption would then be that anything that writes more than a single character before a newline may trigger 5.2.2.
That comes as no surprise.
When I complained about it to one compiler maker, I was informed "there is nothing in ANSI C about this". I never checked,
Neither does that.
what good does it do to state "there is something in ANSI C about this" if your program does not work because of such.
Well, you see, the standard defines the 'C' language. If you write code that makes assumptions that are not guaranteed by the standard then you cannot reasonably expect your program to work.
I know this will fall on deaf ears, but I'll say it again anyway: if you want to become proficient with a tool you really do need to read and understand the manual.
Mr smoked sardine,
the references to ANSI C had absolutely nothing to do with the point
Let me try to translate my statement into something a smoked sardine can understand: "it makes no sense to discontinue development because of a bug (perceived or real makes no difference) in the tools if there is a workaround"
Word alignment is NOT a C issue but an architecture issue.
It is nice that you see that I have a lot of experience.
If, on the other hand, you were trying to belittle me, you must have missed "BTW the 'error' was not mine, it occurred ..." in my post.
Erik
The references to ANSI C had absolutely everything to do with my point, however.
No, that isn't a translation of what you said. It may be what you wish you had said, but it certainly doesn't reflect what you did say.
Word alignment is NOT a C issue
So, why did you complain to the compiler vendor about your problem?
Sadly you do seem to have a lot of experience of bugs which you should never have introduced into the code in the first place. Your assumption when something doesn't work as you expect is that the problem lies with the tools - this is quite typical of those who (as you admit openly, in fact you seem proud of it) haven't read the appropriate documentation.
Note that in this case the 'tool' is the 'C' language, and the 'appropriate documentation' is the definition of the language.
you must have missed "BTW the 'error' was not mine, it occurred ..." in my post.
Yes, I noticed you try to salvage some credibility in a followup post.
Of course, were I to 'verify' my assumption that a char is 8 bits every time I type char, I would never get anywhere.
In reality, you have to. But you don't have to do it exactly when you write the 'char' keyword. But you - at least - have to verify it when you select a new processor or a new compiler. I don't expect one of my existing compilers to change that behaviour - at least not without adding a very visible note in the release notes :)
Luckily, no chip manufacturer dares to step away från n*8-sized integer data types, because of the very bad feedback they would get from all the people who made the incorrect assumption that their programs will not fail if run on a machine with a 7-bit or 9-bit char data type. If they do add unusual data types, they add them as complements to more standard data types.
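That "verify when you change compiler" step can even be made automatic, so it costs nothing to re-check. A translation-time guard, as a sketch:

```c
#include <limits.h>

/* Refuse to compile at all on a target with an unusual char width,
   instead of failing mysteriously at run time on a 7- or 9-bit-char
   machine. */
#if CHAR_BIT != 8
#error "this code assumes 8-bit chars"
#endif
```

The same pattern works for any assumption expressible to the preprocessor, and C11 `_Static_assert` extends it to things like `sizeof(int)`.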
One of the first things you should do when looking for a new compiler, is to get your hands on the documentation about their take on the implementation-specific parts of the standard. You want to know if there are significant limits in # of nested if statements, number of case statements in a switch, if you will be able to nest multiple include files, if they have the usual include files (complete with expected contents) etc.
In the same way, you have to find out whether a selected processor has a general stack implementation, or whether it has a hard-coded return stack with a limited # of return entries. You want to know if it can index data - and to what degree. You want to know if it will glow like Chernobyl or if it has reasonable power-save modes. Some details you will look up. Some details you will forget to look up. Some details you will assume to be OK, and will ignore checking up on. Checking up on everything requires perfect documentation, perfect memory and almost infinite amounts of time.
Why do you not read it all? "When I complained about it to one compiler maker, I was informed 'there is nothing in ANSI C about this'. I never checked, what good does it do to state 'there is something in ANSI C about this', if your program does not work because of such."
"I would claim that ANY multibyte-word compiler should, at least, warn when a memory location is 'typecast up'." I really do not give a hoot if it is ANSI C or not; it is dead easy for a compiler to 'see' that a memory location is "typecast up", and an (optional) warning would make the tool much more useful.
I care about good tools; if the toolmaker decides to hide behind ANSI C, that is just too bad.
"Sadly you do seem to have a lot of experience of bugs which you should never have introduced into the code in the first place." When a member of groups, I have most often been "the debugger"; thus your utterly stupid assumption that the bugs I have seen were those I "have introduced into the code in the first place" is completely false.
"Your assumption when something doesn't work as you expect is that the problem lies with the tools." Where on earth did you get that impression? Of course, in a toolmaker's forum tools are the main subject of discussion; it evidently takes a smoked sardine to ASS U ME that limiting a discussion to the subject means that there are no other subjects.
"But you - at least - have to verify it when you select a new processor or a new compiler." Of course; however, I wrote "every time".
Luckily, no chip manufacturer dares to step away från n*8-sized integer data types, because of the very bad feedback they would get from all the people who made the incorrect assumption that their programs will not fail if run on a machine with a 7-bit or 9-bit char data type. If they do add unusual data types, they add them as complements to more standard data types. I would not know about chips, but Univac and (some) DEC machines use 6 bit bytes.
PS "från" slipped into your post, do you write in Swedish first or did a word just slip in ?
No two-step translation involved. However, I sometimes get parity bit errors - especially when someone comes in and wants to talk to me in the middle of writing :)
Sw: från = Eng: from. They start the same, have the same number of letters and similar pronunciation.
IBM had 36-bit big iron, so they implemented 4x 9-bit characters or 5x 7-bit characters.
I really do not give ahoot if it is ANSI C or not
if the toolmalker decides to hide behind ANSI C that is just too bad.
ANSI 'C', now ISO 'C', is the 'C' language. In a recent thread you demonstrated that you didn't even know how a 'for' statement worked, so I guess it should come as no surprise that you can't grasp that simple fact.
Whatever does your snide reply have to do with the fact that I posted "I care about good tools", which you left out of your quote?
What I said, and you decided to overlook, was that if it is possible to put out a warning for risky code (in this case "risky on the particular platform"), the tool should do so, whether it has anything to do with the C standard or not.
Your other snide remark, re a 'for': for your information, I actually posted that I knew how a for loop worked; I just wanted to make sure of one particular fact where the text in the standard was slightly unclear.
Why do you not just go back to the sardine can somebody let you out of?
Nothing at all. That's why I left it out of the quote. Why do you ask?
You should only rely on a compiler to behave as specified by the standard. It is unreasonable for you to expect a compiler writer to try and anticipate every mistake someone who refuses to read the standard might make and issue a warning. Think about it: if you rely on a non-standard warning issued by a compiler you will have to read the documentation of every compiler you use to determine whether your incorrect code will work, generate a warning or simply fail. Why not just read the standard to determine whether your code is guaranteed to work and be done with it?
For your information I actually posted that I knew how a for loop worked, just wanted to make sure of one particular fact
Either you know something or you don't know something. Why would you need to 'make sure' of something you know?
where the text in the standard was slightly unclear.
Please quote this 'unclear text' from the standard. By the way, I fully expect you to be unable to do so.
I'm sure your ego would prefer that, but sorry, you're out of luck.
How very amazing. You babble about "the standard" when what we are discussing is "a warning when some architecture-created, easy-to-detect mistake is made". I have never had alignment problems myself, be that from typecasting or anything else, but I have in at least 5 cases solved somebody else's problem by fixing alignment issues. There are legions of people who believe that a REAL C person should NEVER have even an inkling about what the hardware does, and for such people (I have met many) a warning when an easily detectable, architecture-dependent mistake is made would be valuable. What is your opposition to asking for a warning there?
"Please quote this 'unclear text' from the standard. By the way, I fully expect you to be unable to do so." Sorry not to meet your expectations. The 'description' of a for states "3 is done as long as 2 is valid".
There is nothing said about "updated 3 is written back whatever 2 is"
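For what it is worth, the point in question can be pinned down with a two-line experiment: `for (A; B; C) body` behaves like `A; while (B) { body; C; }`, so expression 3 runs after every iteration whose body ran, and never again once expression 2 is false.

```c
/* After the loop, i is 3: it was incremented to 3 at the end of
   the last iteration, then 3 < 3 failed and no further update
   expression was executed. */
static int final_index(void)
{
    int i;
    for (i = 0; i < 3; i++)
        ;
    return i;
}
```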
In addition, re an excerpt from your babble above, "whether your incorrect code will work": who on earth cares if incorrect code will work? Incorrect code IS incorrect code.
Think about it: if you rely on a non-standard warning issued by a compiler you will have to read the documentation of every compiler you use to determine whether your incorrect code will work, generate a warning or simply fail.
The question isn't whether people rely on warnings. Some people do. But rely or not: we do know that they catch a number of problems. Is a compiler that helps catch problems good or bad?
People don't buy lint programs because they are stupid or lazy. They know that they make mistakes. They know that projects can have multiple developers, and one developer may make assumptions that are not consistent with the views of the rest of the team. Not many developers like to scan every line checked in by all the other developers, just in case they have done something stupid.
I try not to rely on my car warning me if a door isn't fully closed. However, I still like that extra information. If I chose to distrust everyone, then I would always have to take a full walk around the car whenever someone has touched a door or thrown in some extra luggage.
How very amazing. You babble about "the standard" and what we are discussing is "a warning if some architecthure created easy to detect mistake".
You misunderstand. If you write code that is guaranteed by the standard to work then you don't have to worry about or rely on the compiler to issue a warning. Simple.
I have never had alignment problems,
Ok, ok. I'll pretend to believe that all these mistakes you seem to have encountered have been made by other people. I'm sure everyone else will as well. Let's face it, you demonstrate your exemplary knowledge from the very basics of 'C' all the way up to complicated concepts like a 'for' statement with refreshing regularity.
There are legions of people that believe that a REAL C person should NEVER have even an inkling about what the hardware does,
A real 'C' person writes code that is guaranteed to work by the standard, i.e. *code that will work on all architectures*, wherever possible. Unnecessary use of knowledge of the architecture makes code non-portable and, depending on precisely what non-standard behaviour you are relying on, potentially unreliable.
a warning when an easily detectable architecturally dependent mistake is made, what is your opposition to asking for a warning there.
I have no objection to the warning, and the compiler is entitled to issue one if it wishes. What I object to is your reliance on such warnings, or complaints about lack of them, rather than just doing the sensible thing and actually reading the language documentation to find out whether the code is guaranteed to work.
sorry not to meet your expectations. the 'description' of a for states "3 is done as long as 2 is valid".
Please quote the paragraph number, I'm having trouble finding that particular bit of text.
who on earth cares if incorrect code will work? Incorrect code IS incorrect code.
That is precisely my point. You need to write correct code rather than relying on your 'suck it and see' approach.