This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

Is this a C51 / uvision bug?

Hi all
Below is the result of some debugging. I have isolated some code from a bigger project and have put it into a stand-alone project.

I basically don't understand why I can do right bit-shifting and AND'ing on single line, when I can't do left bit-shifting and AND'ing on single line.

If it isn't a bug, then what have I missed?

I have included many comments to describe my problem

#include <ADUC832.H>
#include <string.h>

void main(void) {


char ascii[] = "MC";
unsigned char pdu[3];
int w=0, r=0, len;
char ch1, ch2, rr, rl;


 /*  This is what I want to do:

  while-loop run 1:
     1: Assign to var 'ch1':   ch1 = 'M' (= 0x4D = 0100 1101)
     2: Assign to var 'ch2':   ch2 = 'C' (= 0x43 = 0100 0011)
     3: Assign to var 'w'  :     w = 0
     4: OR together the following:
         ((ch1 >>(w%7))&0x7F) | ((ch2 <<(7-(w%7)))&0xFF);
     <=>     0100 1101        |       1000 0000
     <=>                  1100 1101
     <=>                     0xCD

  while-loop run 2:
     1: Assign to var 'ch1':   ch1 = 'C' (= 0x43 = 0100 0011)
     2: Assign to var 'ch2':   ch2 = 0x00
     3: Assign to var 'w'  :     w = 1
     4: OR together the following:
         ((ch1 >>(w%7))&0x7F) | ((ch2 <<(7-(w%7)))&0xFF);
     <=>     0010 0001        |       0000 0000
     <=>                  0010 0001
     <=>                     0x21

 */

len=strlen(ascii);

while (r<len) {

 // ------ First OR-part  -----------------------
 // -------Both versions below are OK  ----------

      // -- VER 1: OK
      //  ch1 = ascii[r];
      //  rr  = (w%7);
      //  ch1 = (ch1 >> rr) & 0x7F;

      // -- VER 2: OK
        ch1 = (ascii[r] >> (w%7)) & 0x7F;    // Bit-shifting and AND'ing
                                             // may be done in one line

 // ------  Second OR-part  -----------------------------
 //-------  Both versions below are NOT OK ??  ----------

      // -- VER 1: OK
        ch2 = ascii[r+1];
        rl = (7-(w%7));
        ch2 = (ch2 << rl) & ((char)0xFF);    // Bit shift and AND'ing can be
                                             // done in one line, IF type cast
                                             // is used - why?
      //  ch2 = ch2 & 0xFF;                  // If splitting into new line
                                             // type cast is not required?

      // -- VER 2: NOT OK
      //  ch2 = (ascii[r+1] << (7-(w%7))) & 0xFF;  // type cast doesn't help
      //  ch2 = ch2 & 0xFF;  // AND'ing must be on seperate line ?


    //----------------------------------------------------------------
    // IS THIS A BUG ??
    //----------------------------------------------------------------
    // Why can we bit-shift and do the AND'ing in a single line
    // for the first OR-part above, but cannot do it for the second
    // OR-part where bit-shifting and AND'ing must be on two seperate
    // lines ???
    //----------------------------------------------------------------

// ------ Do the actual OR'ing -------
        pdu[w]= (ch1 | ch2) ;

        if ((w%7)==6)
           r++;
        r++;
        w++;
    }
    pdu[w]=0; // terminator

    //----------------------------------------------------------------
    // Run to here in debugger and look at content of
    // local variable 'pdu'.
    // When using 'NOT OK' versions from above
    // pdu will contain {0x4D, 0x21, 0x00}
    // and not {0xCD, 0x21, 0x00} as the 'OK' versions
    // produce.
    //----------------------------------------------------------------

   while(1);

}

Parents Reply Children
  • AND'ing a char with 0xFF should leave the char unmodified, right? Then why isn't that the case?
    Try run the code in the keil debugger.

    Your right, a type cast shouldn't really be included.
    But I just experience that this produce correct results

     ch2 = ascii[r+1];  // ch2 is a char
     rl = (7-(w%7));    //  rl is a char
     ch2 = (ch2 << rl) & ((char)0xFF);  // Use type cast
    


    and this doesn't (i.e. without type cast)

     ch2 = ascii[r+1];  // ch2 is a char
     rl = (7-(w%7));    //  rl is a char
     ch2 = (ch2 << rl) & (0xFF);  // don't use type cast
    

  • Ok, I can do everything in a single line:

     pdu[w] = ((ascii[r] >> (w%7)) & 0x7F) | (ascii[r+1] << (7-(w%7)));
    

    But still, why do I get incorrect results when I AND a char with 0xFF?

  • But still, why do I get incorrect results when I AND a char with 0xFF?<p>

    Have you tried different optimization levels ?

    Can you post the resulting assembly ?

  • I sent an 'inconsistency report' to support re the following which may be what this is about (all are U8 except GSloadCnt which is U16

              GCselect = SEL_S882C ;
              GSloadCnt = GClwdt * GClhgt; // momma
              GSloadCnt = GSloadCnt  / GX_ATT.FSLlin; // momma
              GSloadCnt = GSloadCnt  / 2; // momma
    works
    
    this (which is the same concatenated)
             GSloadCnt = (((GClwdt * GClhgt) /  GX_ATT.FSLlin) / 2);
    gives the wrong result (zero)
    
    but this gives the right result
              GSloadCnt = (((GClwdt * GClhgt) / (U16) GX_ATT.FSLlin) / 2);
    

    I did not go for an extended study of the C standard to see what was right and what was wrong, but I do believe that whether the typecast is required or not should be the same in both cases.

    Erik

  • You don't show your data declarations. Are you using signed or unsigned data?

    Note that the compiler normally performs operations on integers. If the compiler upgrades your signed characters to 16-bit integers, the 0x7f constant will be converted to 0x007f (which gives expected result) but the 0xff constant will become 0xffff when sign-extended.

    Hence, if sign-extends are performed, the right-shift will work, but the left-shift will fail.

    Try to write (unsigned char)0xff or 0xffu.

  • I try to avoid signed :) in the above all are unsigned.

    anyhow, whether signed or onsigned, the need or none for typecast should be the same in both cases.

    Erik

    PS what 'constants'

  • ages ago, someone commented something like 'momma help' at some code that was temporary/to be implemented and somehow it stuck. Thus you see //momma for the above temporary code.

    Erik

  • The problem is (GClwdt * GClhgt) ... if you don't typecast the result, it is U8. When you divide the value by a U16, the compiler uses a U16 result.

    By chance are you using NoIntPromote? For those of you in the IDE that means not checking Enable ANSI integer promotion rules.

  • By chance are you using NoIntPromote?

    since I have never been concerned about it, I do not know.

    I do not use the IDE (it can't do what I need) so "NoIntPromote" is (re)set to whatever is the default for commandline compile.

    Still, my query is not what should it be, but why does it work 'split' but not 'combined', the requirements should be the same.

    Erik

  • I think I have narrowed it bit more down.

    This code: ('uchar' and AND'ing )

    void main(void) {
            unsigned char ascii = 'C';
            char ch1, ch2;
    
      ch1 = (ascii <

    and this ('char' and no AND'ing )

    void main(void) {
            char ascii = 'C';
            char ch1, ch2;
    
      ch1 = (ascii <

    produce the same assembly:

    line level    source
    
       1          #include 
       2          #include 
       3
       4          void main(void) {
       5   1        char ascii = 'C';
       6   1        char ch1, ch2;
       7   1
       8   1        ch1 = (ascii <

    However, if defining 'ascii' above as 'char' and include AND'ing, I get

    
    line level    source
    
       1          #include 
       2          #include 
       3
       4          void main(void) {
       5   1        char ascii = 'C';
       6   1        char ch1, ch2;
       7   1
       8   1        ch1 = (ascii <

    Can someone please translate the difference between these assemblies into english

    Note
    This message was edited to reduce width.

  • Still, my query is not what should it be, but why does it work 'split' but not 'combined', the requirements should be the same.

    Not necessarily. What you're overlooking is that by splitting up the computation, you've introduced a couple of implicit casts (if you run a MISRA rules checker on the example, it'll warn you about them not being made explicit). The actual equivalent of the computation statements in this sample

    u8 a, b, c;
    u16 result;
    
    result = a * b;
    result = result / c;
    result = result / 2;
    

    combined into a single line is not, as your example suggested

    result = ((a * b) / c / 2);
    

    but rather something like

    result = ((u16)((u16)(a * b) / (u16)c) / (u16)2);
    

    By assigning the intermediate results to the u16 variable "result", you effectively introduce a cast of these values to type u16. Which in turn requires implicit casts of the latter two operands, c and 2, to the same type. All these casts are missing from your one-liner rendition of the computation, and that's what rightfully breaks the equivalence you expected.

  • By assigning the intermediate results to the u16 variable "result", you effectively introduce a cast of these values to type u16. Which in turn requires implicit casts of the latter two operands, c and 2, to the same type. All these casts are missing from your one-liner rendition of the computation, and that's what rightfully breaks the equivalence you expected.

    Oh, well, I just wondered, after all it works.

    Erik

  • Having played a bit with this funny left bit-shifting issue the conclusion seems to be:
    When left bit-shifting a char variable and AND'ing with 0xFF the MSB (sign bit) is not allowed to change.

    This seems to be a specific char issue. Signed int variables left-shiftet and AND'ed with 0xFFFF are allowed to change MSB.

    Below are some code examples to illustrate.
    In all examples ch1 is a signed char.

    With MSB initially cleared we cannot set it:

      ch1 = 0x01;               // MSB initially NOT set
      ch1 = (ch1 << 7) & 0xFF;  // This produces 0x00
    

    No AND'ing

      ch1 = 0x01;
      ch1 = (ch1 << 7);         // This produces 0x80
    

    When not using a variable, but a casted constant

      ch1 = ((char)0x01 << 7) & 0xFF; // This produces 0x80
    

    MSB can be set when AND'ing with something other than 0xFF

      ch1 = 0x01;
      ch1 = (ch1 << 7) & 0xF0;   // This produce 0x80
    

    Now try the opposite. With MSB initially set,
    left bit-shift will not be able to clear it

      ch1 = 0x80;
      ch1 = (ch1 << 7) & 0xFF;  // This produce 0x80
    
    

    Is this really intentionally? If so, why only for chars? And why only when AND'ing with 0xFF?

  • The central clues remain the same: char is a signed type, and operations in C programs are defined to take place in a type no smaller than 'int'. This line

    ch1 = (ch1 << 7) & 0xff
    

    is implicitly transformed to the following:

    ch1 = (char)(((int)ch1 << 7) & 0xff);
    

    The result of this operation is clearly 0x80. Now, these implied casts are called "ANSI integer promotions", and in C51 they can be turned off (option NOINTPROMOTE). Technically, the moment you use this option, all bets are off as to what the program may do.

    But for the sake of the argument, let's say this option is in action, so the statement gets turned into:

    ch1 = ((ch1 << 7) & (char)0xff);
    

    You've thus shifted a signed value into overflow: 1 << 7 is 128, which is too large for a char to hold. It's anybody's guess what might happen in this case. Well, don't do that then.

    Lesson learned: don't ever shift values of signed types. And while at it, better don't use bitwise operators on them, either.

  • I did not go for an extended study of the C standard to see what was right and what was wrong

    Still refusing to read the manual, then?

    but I do believe that whether the typecast is required or not should be the same in both cases

    Facts, man, facts. This is engineering, not religion.

    I sent an 'inconsistency report' to support

    Oh dear.