This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question, you can start a new discussion.

Is this non-portable code?


unsigned char buf[100];
...
unsigned int val;
...
val = *((unsigned int *)(&buf[1]));

comments?

Parents
  • When I complained about it to one compiler maker, I was informed "there is nothing in ANSI C about this".

    That information is correct. Code like the OP is as wrong as C code can possibly be, while still apparently working sometimes: it causes undefined behaviour. Literally anything can happen, because the language makes no promises whatsoever what such a program may do. "Anything" of course includes "what the coder expected", which makes this kind of error so nasty --- it'll just work for quite a while, but unexpectedly break when you use the same code on a slightly different platform.

    I never checked, what good does it do to state "there is something in ANSI C about this", if your program does not work because of such.

    The "because of such" part is incorrect. The program fails for a much simpler reason: it's wrong. The code makes assumptions about misaligned access via a maltreated pointer that aren't backed up by any applicable rule. The code will only work if those assumptions happen to be true.

Children
  • I never checked, what good does it do to state "there is something in ANSI C about this", if your program does not work because of such.

    The "because of such" part is incorrect. The program fails for a much simpler reason: it's wrong. The code makes assumptions about misaligned access via a maltreated pointer that aren't backed up by any applicable rule. The code will only work if those assumptions happen to be true.

    What I meant by "because of such" was that if the tool behaves (or is supposed to behave) in such a way under such circumstances, well, the thing to do is to avoid "such circumstances"; there is no other way to get the product out.

    BTW, the 'error' was not mine. It occurred when I was the contact person to the compiler manufacturer while we were using a beta of a 16-bit compiler, and one of my coworkers (one of those 'coders' who make it a point of pride to be ignorant of the hardware) asked "why does it not work?". I realized rather quickly what was going on and, in the hope that a compiler 'catch' existed, reported the problem. I would claim that ANY compiler for a target with multi-byte words should, at least, warn when a memory location is "typecast up".

    Erik

  • Thanks for your comments.

    My view is that it is obviously non-portable code; but since the coder wrote it explicitly, it is intentional and not wrong.

    It may rely on assumptions concerning compiler and platform, but if those assumptions are constant for said compiler and platform then the assumptions are relatively safe.

    When I get out of bed in the morning, I do not put on my wellington boots. I assume that there has been no flood while I've been asleep. It's not wrong to assume that; it's a fairly safe bet.

    If I were to go to (say) the Pacific island of Tuvalu, I might well change that assumption; just as I would if I were to change compiler and/or platform.

  • "since the coder wrote it explicitly, it is intentional and is not wrong."

    No, that does not follow at all!
    You may happen to be right in this case, but you can't generally make that assumption!

    "It may rely on assumptions concerning compiler and platform, but if those assumptions are constant for said compiler and platform then the assumptions are relatively safe."

    Unfortunately, it may rely upon false assumptions - and it might only work by pure luck!

    Also, there is no guarantee that the assumptions will not become false with a compiler update - or possibly even if some options are changed...

    That's why all such assumptions should be clearly and fully documented!

    And, if the coder didn't bother to provide such documentation, you at least have to wonder if that's because she/he didn't understand the issues...

  • No, a program using implementation-specific behaviours in C is not considered invalid, just less portable or, in some cases, buggy. It's all a question of what assumptions the developer made, and whether those assumptions are valid for the intended platform(s).

    C intentionally left a number of specifics to the architecture or language implementation. If such code were invalid, the language standard would instead have specified that the compiler should (or must) flag the code as an error when it is able to detect the problem.

    If I know that the processor is little-endian, the size of an integer, and that the data is aligned or that the target architecture can work with unaligned data, then I am completely allowed to typecast that position of the received byte array into an int pointer, allowing me to read the value in a single instruction, instead of reading it as two (or more) bytes, and shifting the individual bytes into the correct position.

    It isn't wrong to do it. It is just a question of calculated risk versus potential gain. Yes, people can get their foot shot off, but people who don't think about the possibility of division by zero or stack size or numeric overflow can also get their feet shot off.

    The compiler isn't allowed to issue an error (unless explicitly enabled by the user) for non-portable usage, because the compiler is required to produce a runnable program. It is then up to the specific hardware whether the application will generate an exception or strange results.

    To give an explicit example: 6.1.3.4 in the language standard notes that it is undefined behaviour to make a conversion from an integer data type to a different data type that can't handle the full numeric range - assigning an int to a char, for example. We may now and then see warnings like "Conversion might lose significant digits", but the programs are still valid. Some long definitions are twice the size of int. Some are not. Even if I add a typecast intvar = (int)longvar, it still represents a conversion from long to int. If this represented an invalid program, then most applications larger than "hello world" (and quite a number of those too) would be invalid.

    Most source lines we write are based on assumptions. For example, the assumption that other code somewhere has verified the input range of all parameters read - if not, every single + or * could overflow. And since the behaviour then can be undefined (note that not all machines are two's complement), every + or * would require code to establish that it cannot fail. But even that code would probably contain code that - depending on the situation - may need to make assumptions. How about machines that can have +0 and -0? They exist, and we just have to make assumptions - some math function code is allowed to treat +0 and -0 differently depending on the architecture. How many of us have specifically tried to tell the compiler that our intention is +0?

    A different example: how many significant characters do you use in externally linked symbols? Different linkers support different lengths of external symbols. Should all programs that don't have symbol names unique within the first 6 characters be invalid?

    Anyone using memset() to clear a large struct or array? But what is the internal representation of 0 on the hardware?

    Is a program invalid if it writes a 100-character text string without any newline characters in it? It is considered undefined behaviour to write past the end of the terminal width - but since there exist handhelds with puny displays, the assumption would then be that anything that writes more than a single character before a newline may trigger 5.2.2.

  • "It may rely on assumptions"

    some of the longest debugging sessions ... have been a result of relying on assumptions.

    Erik

  • Yes, but life isn't expected to be simple. Anything non-trivial has to be based on a number of assumptions.

    We can't avoid assumptions; we can just try to make good ones, and to qualify them. We can make risk assessments for a project - what if our assumptions about the hardware used, the tools used, the available time, the stability of customer requirements etc. are wrong? We can document our code, specifying what assumptions we have made (or at least realize that we have made them). We can - if the hardware permits - perform checked builds that contain extra integrity-testing code. We can make use of the preprocessor. We can use regression testing...

    While our job is to produce working - and economical - solutions, we can't ignore assumptions. If we think that there are no assumptions involved, then we have just made a very big, and very wrong assumption.

    In short: it is almost impossible to write any non-trivial applications that are guaranteed to work on any existing platform that has an ANSI/ISO-conformant compiler.

  • Unfortunately, it may rely upon false assumptions - and it might only work by pure luck!

    True - But someone who builds up experience of such things can learn to more reliably determine the risk.

    Just to follow on from my previous post - through experience I have determined that I don't have to put on my wellington boots before getting out of bed.

    I would prefer to consider it a calculated risk.

    I would not consider it wrong - I might, possibly, change my mind if I were to get my feet wet one morning ;)

  • True - But someone who builds up experience of such things can learn to more reliably determine the risk.
    There is nothing wrong with experience; if there was, I would be up the creek re the '51 :)

    Of course, were I to 'verify' my assumption that a char is 8 bits every time I type char, I would never get anywhere.

    I can state my point in another way, which may be better: "when you see a bug, before anything else, verify the correctness of your assumptions"

    Erik

  • My view is that it is obvious non-portable code; but since the coder wrote it explicitly, it is intentional and is not wrong.

    Intention doesn't imply correctness. You're missing the possibility that the coder may just as easily not have the slightest idea what he was doing.

    Code like the OP, particularly without a comment clearly stating the assumptions it relies on and why those assumptions should hold in the case at hand, is wrong.

  • No, a program using implementation-specific behaviours in C is not considered invalid.

    That's quite beside the point --- the code we're talking about is considerably worse than that. It doesn't just rely on implementation-defined behaviour, but rather causes undefined behaviour.

    To give an explicit example: 6.1.3.4 in the language standard notes that it is undefined behaviour to make a conversion from an integer data type to a different data type that can't handle the full numeric range - assigning an int to a char, for example.

    That example is seriously flawed --- conversion from int to char is not covered by 6.1.3.4. Nor does it cause undefined behaviour. It's covered by 6.1.3.3, and the behaviour is at worst implementation-defined.

  • "You're missing the possibility that the coder may just as easily not have the slightest idea what he was doing."

    There again, you're missing the possibility that I (as the coder) had a very good idea what I was doing!

    The reason I started this thread is that a number of projects I have previously worked on had lines of code very similar to the one I posted, and they relied on the behaviour I expected (on a number of 8-bit and 16-bit processors). Apart from a quirk on an 80x86 core when going beyond the 64K boundary, this assumption has served me well.

    Now, however, I am porting the code to an ARM platform and the issue of alignment has to be faced. I'm glad to see that the Keil compiler makes specific allowances for this type of situation (re: __packed), thus making my initial job of porting more predictable - at least using the same assumptions I have faced before.

    "Code like the OP, particularly without a comment clearly stating the assumptions it relies on and why those assumptions should hold in the case at hand, is wrong."

    I eliminated the comments from the original code precisely because I wanted to gauge the thoughts others would have about the code.

    For those comments, I thank you all; and I apologise to Erik and Jack for re-igniting their (please insert the most appropriate term).

    My conclusion is that I believe that I have followed a pragmatic approach and that I am prepared to accept the fact that I am not a C purist.

  • I'm glad to see that the Keil compiler makes specific allowances for this type of situation (re: __packed)
    Do that at your own risk; Jack Sprat (whoever it is that is hiding behind that moniker) will come down on you hard for going outside standard C.

    Erik