This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

incorrect behavior with [somewhat] complex struct...

I have a somewhat complex structure in my C code, and driver functions to read or write data with it.

The simpler parts of the struct (such as ioflags2 below) and their driver functions work just fine. However, the more complex parts of the struct (such as ioflags3) do not.

The symptom is that only the drivers for the first part of the more complex parts of the structure work right, and the rest don't appear to change the "word" part of the structure.

However, if I step through the code with the simulator, all of the individual parts of the more complex parts of the struct work just fine with the driver functions, but the "word" part doesn't change (well, except for the 1st part of the more complex parts, i.e., module1 works, but module2 - module5 don't).

Finally, I cut/paste the whole struct and drivers to a Visual C++ program and stepped through the code with their debugger. *ALL* of the driver functions had the expected effect on *ALL* of the parts of the whole struct.

Can anyone tell me what, if anything, is wrong with this code, and how to make it work with KEIL's ARM tools?

  union{
    u16 word;
    struct{
      unsigned sensor1:1;
      unsigned sensor2:1;
      unsigned sensor3:1;
      unsigned sensor4:1;
      unsigned sensor5:1;
      unsigned sensor6:1;
      unsigned sensor7:1;
      unsigned sensor8:1;
      unsigned sensor9:1;
      unsigned sensor10:1;
      unsigned reserved_bit_10:1;
      unsigned reserved_bit_11:1;
      unsigned reserved_bit_12:1;
      unsigned reserved_bit_13:1;
      unsigned reserved_bit_14:1;
      unsigned reserved_bit_15:1;
    }bit;
  }ioflags2;

  union{
    u16 word;
    struct{
      struct{
        struct{
          unsigned run:1;
          unsigned direction:1;
        }stepper_motor;
      }module1;

      union{
        struct{
          unsigned extend:1;
          unsigned retract:1;
        }ioctl;
        struct{
          unsigned run:1;
          unsigned direction:1;
        }dc_motor;
      }module2;

      union{
        struct{
          unsigned open:1;
          unsigned close:1;
        }ioctl;
        struct{
          unsigned run:1;
          unsigned direction:1;
        }dc_motor;
      }module3;

      union{
        struct{
          unsigned open:1;
          unsigned close:1;
        }ioctl;
        struct{
          unsigned run:1;
          unsigned direction:1;
        }dc_motor;
      }module4;

      union{
        struct{
          unsigned reserved_bit_0:1;
          unsigned fan:1;
        }ioctl;
        struct{
          unsigned direction:1;
          unsigned run:1;
        }dc_motor;
      }module5;

      unsigned reserved_bit_10:1;
      unsigned reserved_bit_11:1;
      unsigned reserved_bit_12:1;
      unsigned reserved_bit_13:1;
      unsigned reserved_bit_14:1;
      unsigned reserved_bit_15:1;
    }bit;
  }ioflags3;

And here's how my drivers manipulate the structure.

int set_module2_retract(int state){
  int previous_state;

  ASSERT(module2_type() == TYPE_IOCTL);

  if(module2_type() != TYPE_IOCTL){
    return(-1);
  }

  if(state){
    set_module1_extend(0); // protect the hardware!
  }

  previous_state = variable.ioflags3.bit.module2.ioctl.retract;

  variable.ioflags3.bit.module2.ioctl.retract = state ? 1u : 0u;

  return(previous_state);
}

Parents
  • "If you modify all bits in the field, you still have to mask the other bits in the container, no? Could you give an example, please?"

    Normally yes, but there are two special cases.

    When I want to set all bits in one field to one, the compiler can directly do the "or" operation, without first masking any other fields.

    When I want to clear all bits in one field to zero, the compiler don't need to perform any "or" operation - it is enough to just do the "and not" to keep everything but this field.

    I have seen compilers that have missed this optimization for fields bigger than one bit. The ARM compiler managing to do a clear+set for a one-bit field was a first. But it was only with zero optimization so it isn't something I would see as a problem.

    But as you have already noted, such optimization failures can be corrected in the next version of the compiler.

    Another thing that can sometimes be funny with some compilers is if you use signed fields. I have seen compilers generating very interesting code then, because of the need of sign-extend when you assign. This is normally a problem with microcontrollers, since you would normally only map unsigned fields to SFR.

    About the example I showed earlier. Yes, any competent programmer should read out the register value into a temporary variable before processing the bits. And I do expect people who writes drivers/ISR to have read the relevant parts of the datasheet. But that doesn't always happens when a software gets old and the responsibility moved to some new guy/gal just out from school and where just some very minor changes are expected.

    The previous example whas a bit "odd" in that few processors perform auto-acknowledge for a interrupt or status register. Auto-acknowledge is normally reserved for the RX register of a communication port. But what is quite common is that you have an interrupt register where each bit flags an interrupt source, and you then write a one to that bit to acknowledge. An example is the timer interrupt register in a NXP LPC17xx (Cortex-M3) or LPC23xx (ARM7) chip.

    It is so very easy to write code like the following:

    void handle_timer0_irq() {
        if (t0ir.mr0) {
            // Handle match 0
            t0ir.mr0 = 1; // Acknowledge MR0 (###)
        }
        if (t0ir.mr1) {
            // Handle match 1
            t0ir.mr1 = 1; // Acknowledge MR1 (###)
        }
        //...
        if (t0ir.cr0) {
            // Handle capture channel 0
            t0ir.cr0 = 1; // Acknowledge CR0 (###)
        }
        if (t0ir.cr1) {
            // Handle capture channel 1
            t0ir.cr1 = 1; // Acknowledge CR0 (###)
        }
    }
    


    The application may normally expect only one interrupt flag at a time, but it isn't impossible that you may have multiple timer events so close together that several is trigged before you enter the ISR, or that one more event happens while in the above code.

    Each of the ### represents an acknowledge of one (1) timer event, but since we are talking about bit fields, the ARM ABI requires a read and a write of the full container for every field assign.

    So when the developer thinks that he acknowledges one timer event, the ARM will read the T0IR register and potentially pick up a number of set fields that will be or:ed with the bit the programmer intended to acknowledge. The net result is a potential loss of some timer events.

    If using the more visible alternative, using standard or/and code, the above would have looked like:

    enum {
        TxIR_MR0 = 1u << 0,
        TxIR_MR1 = 1u << 1,
        TxIR_MR2 = 1u << 2,
        TxIR_MR3 = 1u << 3,
        TxIR_CR0 = 1u << 4,
        TxIR_CR1 = 1u << 5,
        ...
    };
    void handle_timer0_irq() {
        if (t0ir & TxIR_MR0) {
            // Handle match 0
            t0ir |= TxIR_MR0; // Safe acknowledge of MR0
        }
        if (t0ir & TxIR_MR1) {
            // Handle match 1
            t0ir |= TxIR_MR1; // Safe acknowledge of MR1
        }
        //...
        if (t0ir & TxIR_CR0) {
            // Handle capture channel 0
            t0ir |= TxIR_CR0; // Safe acknowledge of CR0
        }
        if (t0ir.cr1) {
            // Handle capture channel 1
            t0ir |= TxIR_CR1; // Safe acknowledge of CR1
        }
    }
    


    In the end, there is nothing wrong with bit fields. It is just that you always have to think about the hidden "bonus" you may get.

    If you are using bit fields, you may find that you need to spend more time adding extra comments than what you save from the neater code.

Reply
  • "If you modify all bits in the field, you still have to mask the other bits in the container, no? Could you give an example, please?"

    Normally yes, but there are two special cases.

    When I want to set all bits in one field to one, the compiler can directly do the "or" operation, without first masking any other fields.

    When I want to clear all bits in one field to zero, the compiler don't need to perform any "or" operation - it is enough to just do the "and not" to keep everything but this field.

    I have seen compilers that have missed this optimization for fields bigger than one bit. The ARM compiler managing to do a clear+set for a one-bit field was a first. But it was only with zero optimization so it isn't something I would see as a problem.

    But as you have already noted, such optimization failures can be corrected in the next version of the compiler.

    Another thing that can sometimes be funny with some compilers is if you use signed fields. I have seen compilers generating very interesting code then, because of the need of sign-extend when you assign. This is normally a problem with microcontrollers, since you would normally only map unsigned fields to SFR.

    About the example I showed earlier. Yes, any competent programmer should read out the register value into a temporary variable before processing the bits. And I do expect people who writes drivers/ISR to have read the relevant parts of the datasheet. But that doesn't always happens when a software gets old and the responsibility moved to some new guy/gal just out from school and where just some very minor changes are expected.

    The previous example whas a bit "odd" in that few processors perform auto-acknowledge for a interrupt or status register. Auto-acknowledge is normally reserved for the RX register of a communication port. But what is quite common is that you have an interrupt register where each bit flags an interrupt source, and you then write a one to that bit to acknowledge. An example is the timer interrupt register in a NXP LPC17xx (Cortex-M3) or LPC23xx (ARM7) chip.

    It is so very easy to write code like the following:

    void handle_timer0_irq() {
        if (t0ir.mr0) {
            // Handle match 0
            t0ir.mr0 = 1; // Acknowledge MR0 (###)
        }
        if (t0ir.mr1) {
            // Handle match 1
            t0ir.mr1 = 1; // Acknowledge MR1 (###)
        }
        //...
        if (t0ir.cr0) {
            // Handle capture channel 0
            t0ir.cr0 = 1; // Acknowledge CR0 (###)
        }
        if (t0ir.cr1) {
            // Handle capture channel 1
            t0ir.cr1 = 1; // Acknowledge CR0 (###)
        }
    }
    


    The application may normally expect only one interrupt flag at a time, but it isn't impossible that you may have multiple timer events so close together that several is trigged before you enter the ISR, or that one more event happens while in the above code.

    Each of the ### represents an acknowledge of one (1) timer event, but since we are talking about bit fields, the ARM ABI requires a read and a write of the full container for every field assign.

    So when the developer thinks that he acknowledges one timer event, the ARM will read the T0IR register and potentially pick up a number of set fields that will be or:ed with the bit the programmer intended to acknowledge. The net result is a potential loss of some timer events.

    If using the more visible alternative, using standard or/and code, the above would have looked like:

    enum {
        TxIR_MR0 = 1u << 0,
        TxIR_MR1 = 1u << 1,
        TxIR_MR2 = 1u << 2,
        TxIR_MR3 = 1u << 3,
        TxIR_CR0 = 1u << 4,
        TxIR_CR1 = 1u << 5,
        ...
    };
    void handle_timer0_irq() {
        if (t0ir & TxIR_MR0) {
            // Handle match 0
            t0ir |= TxIR_MR0; // Safe acknowledge of MR0
        }
        if (t0ir & TxIR_MR1) {
            // Handle match 1
            t0ir |= TxIR_MR1; // Safe acknowledge of MR1
        }
        //...
        if (t0ir & TxIR_CR0) {
            // Handle capture channel 0
            t0ir |= TxIR_CR0; // Safe acknowledge of CR0
        }
        if (t0ir.cr1) {
            // Handle capture channel 1
            t0ir |= TxIR_CR1; // Safe acknowledge of CR1
        }
    }
    


    In the end, there is nothing wrong with bit fields. It is just that you always have to think about the hidden "bonus" you may get.

    If you are using bit fields, you may find that you need to spend more time adding extra comments than what you save from the neater code.

Children
  • Hi Per

    Of course we have moved far off topic by now. The OP asked about a
    data structure and not HW registers.

    > It is so very easy to write code like the following:
    >
    > [...]
    >
    > So when the developer thinks that he acknowledges one timer event,
    > the ARM will read the T0IR register and potentially pick up a number
    > of set fields that will be or:ed with the bit the programmer
    > intended to acknowledge. The net result is a potential loss of some
    > timer events.

    Yes that is true.

    > If using the more visible alternative, using standard or/and code,
    > the above would have looked like:
    >
    > [...]

    Hmm, and in which way does the second example differ from the first
    one? Both ISR read the control register multiple times and may loose
    timer events. In fact I compiled both versions and with the exception
    of a single instruction the code is identical as expected. I fail to
    see the advantage of the second solution.

    Please provide a full minimal example that demonstrates what you were
    thinking of.

    Thanks
    Marcus
    http://www.doulos.com/arm/

  • And... getting back to the original question, could I have used the __packed attribute to make the original code work like I expected it to (i.e., to shrink the number of bytes required to implement the ioflags variables to the min size possible)?

  • I believe what Per was saying (although if you get lost in the discussion it has the distinct possibility of being lost). Do not rely on the compiler to deal with the hardware in a certain manner. This is VERY important especially if you change processors (which I have done numerous tmies) or compilers (same there).

    What I garnered is that he was suggesting using the compiler for what is known to work with normal C/C++ conventions in ALL compilers. That is use the 'ugly' standard C '&' '|' operators so that it will work irregardless of the compilers arbitrary support methods for bit fields.

    The assumption that the compiler you have available (IE Turbo C++ Borland C++ GCC {in it's numerous incarnations} ) will work the same irregardless of what target or version you are using is a bad one.

    Summary make your code blatantly obvious what it is doing, and comment it heavily. So that no one can guess what is being modified. Do not rely on any feature of a compiler (again bit fields is the example) to make the code work. Readability also includes understanding what is being done and why, not just having nice looking code (that hides lots of potential problems, and sins, EX one small change is made such as recompiling the project with a different version of the compiler).

    A good resource is http://www.ganssle.com/ and http://www.embedded.com for this, and numerous other problems you might encounter.

    Stephen

  • No. If you pack the data each union will shrink to 1 byte and you will need a 6th byte for the 6 reserved bits you have last.

    That means that the 2 bytes of the u16 will be overlaid with 6 bytes.

    Your unions module1, module2, module3, module4 and module5 will not magically merge into a single 16-bit container.

    Paragraph 6.7.2.1 of the C standard says in section 13 that "Within a structure object, the non-bit-field members and the units in which bit-fields
    reside have addresses that increase in the order in which they are declared."

    In 14, the standard specifies that "A pointer to a
    union object, suitably converted, points to each of its members (or if a member is a bitfield,
    then to the unit in which it resides), and vice versa."

    The language standard will not merge your bit fields since that would mean that the struct directly following the u16 would have multiple members having the same address. But only a union may have multiple members with the same address.

  • The problem example in my first post was for a processor that auto-acknowledges status bits as they are read. That means that the software must do one (1) read and then process all bits.

    The solution to that (if we ignore other problems with bit fields) is to copy the register value once to a union of an unsigned and the bit fields and write:

    com1_status_t tmp_status;
    
    // Sample and acknowledge  status bits.
    tmp_status.raw = com1_status;
    // Process any detected errors
    if (tmp_status.b.rx_overrun) ...
    if (tmp_status.b.rx_parity) ...
    

    The problem example in the later post was for a processor where you write a '1' to a bit position to a write-only register overlapping the status register to acknowledge a specific event.

    Such a processor does not have a problem with multiple read operations - a later read may possibly pick up one or more extra bits, but that isn't dangerous. But such a processor does not allow a read-modify-write for acknowledging a single bit since the design has a read-only register overlapping a write-only register.

    Alas, I managed to **** up the example of the old-style solution. There shouldn't have been any |= in the assign, but just an assign. Sorry for the confusion.

    t0ir = TxIR_MR1; // Safe acknowledge of MR1 since no read-modify-write
    

    The solution to the second problem, if using bit-fields (and ignoring all the other problems with bit-fields) is to use a temporary union of an unsigned and bit fields for the acknowledge part:

    timer_ir_t ack;
    ack.raw = 0; // Don't know anything to acknowledge yet
    if (t0ir.b.tm0) {
        // handle event.
        ack.b.tm0 = 1; // should ack tm0 - read-modify-write in RAM is safe
    }
    if (t0ir.b.tm1) {
        // handle tm1 event.
        ack.b.tm1 = 1; // should ack tm1 - read-modify-write in RAM is safe
    }
    ...
    t0ir.raw = ack.raw; // acknowledge processed events - no read-modify-write, just a write
    

    One thing here is that when moving between different ARM compilers, the ARM ABI locks down the behaviour of bit-fields by being way stricter than the C language standard. But when leaving the ARM world, it isn't even known if the bits will be allocated low-to-high or high-to-low, so switching between two different compilers but staying with the same processor can completely break everything.

  • > The problem example in my first post was for a processor that
    > auto-acknowledges status bits as they are read. That means that the
    > software must do one (1) read and then process all bits.

    I think I got that part.

    > The solution to that (if we ignore other problems with bit fields) is
    > to copy the register value once to a union of an unsigned and the bit
    > fields and write:

    Sure, as I said. But the requirement of having to read the register
    just once in this case is not specific to bit fields, is it.

    > Alas, I managed to **** up the example of the old-style
    > solution. There shouldn't have been any |= in the assign, but just an
    > assign. Sorry for the confusion.

    No problem, but to be sure that we are on the same page I would still
    like to see a self contained minimal example, and preferably an
    example of a device (ideally ARM based) which exibits the difference
    behavior that you are trying to show. And I am not talking about a
    handful of perhaps redundant instructions, but code that actually
    behaves incorrectly.

    > But when leaving the ARM world, it isn't even known if the bits will
    > be allocated low-to-high or high-to-low, so switching between two
    > different compilers but staying with the same processor can
    > completely break everything.

    Possibly. It all depends on what degree of protability I need. I don't
    expect to find e.g. ARMv6/v7 page table descriptors in other
    architectures. It is perhaps a choice whether I prefer using the
    "traditional" approach with macros/enums/bit-ops everywhere or
    whether I would rather have an architecture specific header file with
    appropriate bit-field definitions.

    I don't claim that bit fields are always the better approach, but it
    seems that many people show some knee-jerk reactions against bit
    fields wherever the question comes up.

    There are valid arguments against using bit fields in certain
    situations. But most arguments thrown out by people can be disproved
    easily.

    Regards
    Marcus
    http://www.doulos.com/arm/

  • "I would still like to see a self contained minimal example, and preferably an example of a device (ideally ARM based) which exibits the difference behavior that you are trying to show. And I am not talking about a handful of perhaps redundant instructions, but code that actually
    behaves incorrectly."

    you will find, as many others have, that Per is an expert on irrelevancy and s/he will talk about anything but.

    Put my words down: Per will NOT produce such an example in the next 10 posts of his.