We are running a survey to help us improve the experience for all of our members. If you see the survey appear, please take the time to tell us about your experience if you can.
I have a somewhat complex structure in my C code, and driver functions to read or write data with it.
The simpler parts of the struct (such as ioflags2 below) and their driver functions work just fine. However, the more complex parts of the struct (such as ioflags3) do not.
The symptom is that only the drivers for the first part of the more complex parts of the structure work right, and the rest don't appear to change the "word" part of the structure.
However, if I step through the code with the simulator, all of the individual parts of the more complex parts of the struct work just fine with the driver functions, but the "word" part doesn't change (well, except for the 1st part of the more complex parts, i.e., module1 works, but module2 - module5 don't).
Finally, I cut/paste the whole struct and drivers to a Visual C++ program and stepped through the code with their debugger. *ALL* of the driver functions had the expected effect on *ALL* of the parts of the whole struct.
Can anyone tell me what, if anything, is wrong with this code, and how to make it work with KEIL's ARM tools?
union{ u16 word; struct{ unsigned sensor1:1; unsigned sensor2:1; unsigned sensor3:1; unsigned sensor4:1; unsigned sensor5:1; unsigned sensor6:1; unsigned sensor7:1; unsigned sensor8:1; unsigned sensor9:1; unsigned sensor10:1; unsigned reserved_bit_10:1; unsigned reserved_bit_11:1; unsigned reserved_bit_12:1; unsigned reserved_bit_13:1; unsigned reserved_bit_14:1; unsigned reserved_bit_15:1; }bit; }ioflags2; union{ u16 word; struct{ struct{ struct{ unsigned run:1; unsigned direction:1; }stepper_motor; }module1; union{ struct{ unsigned extend:1; unsigned retract:1; }ioctl; struct{ unsigned run:1; unsigned direction:1; }dc_motor; }module2; union{ struct{ unsigned open:1; unsigned close:1; }ioctl; struct{ unsigned run:1; unsigned direction:1; }dc_motor; }module3; union{ struct{ unsigned open:1; unsigned close:1; }ioctl; struct{ unsigned run:1; unsigned direction:1; }dc_motor; }module4; union{ struct{ unsigned reserved_bit_0:1; unsigned fan:1; }ioctl; struct{ unsigned direction:1; unsigned run:1; }dc_motor; }module5; unsigned reserved_bit_10:1; unsigned reserved_bit_11:1; unsigned reserved_bit_12:1; unsigned reserved_bit_13:1; unsigned reserved_bit_14:1; unsigned reserved_bit_15:1; }bit; }ioflags3;
And here's how my drivers manipulate the structure.
int set_module2_retract(int state){ int previous_state; ASSERT(module2_type() == TYPE_IOCTL); if(module2_type() != TYPE_IOCTL){ return(-1); } if(state){ set_module1_extend(0); // protect the hardware! } previous_state = variable.ioflags3.bit.module2.ioctl.retract; variable.ioflags3.bit.module2.ioctl.retract = state ? 1u : 0u; return(previous_state); }
"Sorry, but this is a made up example. Possible in theory, but in real life HW registers just won't consist of multiple containers."
There is a saying that if sometihng seems to be too good to be true, it usually is.
If you find it very unlikely that a hardware has a HW register consisting of multiple containers, why are you then convinced that I was discussing such a "made up example"?
I was talking about transparency.
I was thinking about that junior developer who gets the support of the software a couple of years from now. He sees assigns to <container>.<field> and decides to make some modification, adding one more such assign. But what isn't visible when reading the *.c files, is that the assign to the field did not just write to the named field, but did perform one read and one write of the full container.
And quite a number of processors have SFR where a read or a write has some form of side effect. And that side effect is not always immediately visible, so an introduced bug can survive a long time before people start to notice the result of that extra side effect.
One processor I worked with in a project had a status register where a read reported all error cases seen since the previous read. No need to write to any register to acknowledge the error flags - just reading them was enough to also clear them.
If mapping such a regsiter to bit fields, the expression
if (com1.status.rx_overrun) ...
would automagically also clear any reported framing, parity, ... errors that may have been detected.
void set_bit(void) { FIO0PIN_BITS.b2 = 1; }
When compiled for a Cortex-M3 processor (Thumb2 instruction set) with -O0, the following code is generated (b2 is a one-bit field):
120: void set_bit(void) { 0x000001CA 4770 BX lr 121: FIO0PIN_BITS.b2 = 1; 0x000001CC 480D LDR r0,[pc,#52] ; @0x00000204 0x000001CE 6940 LDR r0,[r0,#0x14] 0x000001D0 F0200004 BIC r0,r0,#0x04 0x000001D4 1D00 ADDS r0,r0,#4 0x000001D6 490B LDR r1,[pc,#44] ; @0x00000204 0x000001D8 6148 STR r0,[r1,#0x14] 122: }
When compiled with -O1, the problem is solved:
120: void set_bit(void) { 0x000001C4 4770 BX lr 121: FIO0PIN_BITS.b2 = 1; 0x000001C6 4809 LDR r0,[pc,#36] ; @0x000001EC 0x000001C8 6941 LDR r1,[r0,#0x14] 0x000001CA F0410104 ORR r1,r1,#0x04 0x000001CE 6141 STR r1,[r0,#0x14] 122: }
Switching from a Cortex-M3 processor to a processor with full 32-bit ARM instructions, the compiler manages without the clear even ast -O0:
124: void set_bit(void) { 0x00000130 E12FFF1E BX R14 125: FIO0PIN_BITS.b2 = 1; 0x00000134 E59F0034 LDR R0,[PC,#0x0034] 0x00000138 E5901014 LDR R1,[R0,#0x0014] 0x0000013C E3811004 ORR R1,R1,#0x00000004 0x00000140 E5801014 STR R1,[R0,#0x0014] 126: }
Since I'm on vacation right now, I tested the above with the evaluation version of MDK-ARM 3.70.
"True, but compilers have improved over the years and every once in a while we should revisit our findings from the past."
Correct. But note that I did not write about a specific compiler. I did write "But with some compilers, the compiler will produce worse code than if you manually mask the bits." That should be read as: Do not assume that the compiler you have will produce good code for bit fields - check first!
Compound literals helps assign all field values in a single assign. Just note that you may find C++ compilers that will not support it. ARMCC and gnuarm does support it also for C++ sources.
But you seem to have brought up compound literals as a response to my sentence "when your assign clears or sets all bits of the bit field". I wasn't talking about the assign of all fields in a container, but a set or clear of all bits in a single field. When setting all bits, no mask operation is needed. When clearing all fields, no or is needed.
"I was talking about transparency."
Per, I think people will benefit a lot from your presence here if you stop talking.
ciao.
Hi Per
> I was thinking about that junior developer who gets the support of the > software a couple of years from now. He sees assigns to > <container>.<field> and decides to make some modification, adding one > more such assign. But what isn't visible when reading the *.c files, > is that the assign to the field did not just write to the named field, > but did perform one read and one write of the full container.
We were talking about different things then. Sorry. I was referring to the rules of access to volatile bit field structs with multiple, perhaps overlapping (yuck!) containers.
> If mapping such a regsiter to bit fields, the expression > > if (com1.status.rx_overrun) ... > > > would automagically also clear any reported framing, parity, > ... errors that may have been detected.
But since you carefully read the data sheet before starting to write code I am sure that you copied the entire register into a local variable before accessing its contents :^)
Seriously, I see the issue, but I fail to see in which way any other approach would be safer. Does something like
if (com1.status & COM_STATUX_RX_OVERRUN) ...
make a difference here?
> When compiled for a Cortex-M3 processor (Thumb2 instruction set) with > -O0, the following code is generated (b2 is a one-bit field): > [...] > > Switching from a Cortex-M3 processor to a processor with full 32-bit > ARM instructions, the compiler manages without the clear even ast > -O0:
With a CM3 target I can reproduce this. Strange indeed. Thanks for the additional info.
> I wasn't talking about the assign of all fields in a container, but > a set or clear of all bits in a single field. When setting all bits, > no mask operation is needed. When clearing all fields, no or is > needed.
Understand now what you meant but I don't seem to get it. If you modify all bits in the field, you still have to mask the other bits in the container, no? Could you give an example, please?
Thanks Marcus http://www.doulos.com/arm/
"If you modify all bits in the field, you still have to mask the other bits in the container, no? Could you give an example, please?"
Normally yes, but there are two special cases.
When I want to set all bits in one field to one, the compiler can directly do the "or" operation, without first masking any other fields.
When I want to clear all bits in one field to zero, the compiler don't need to perform any "or" operation - it is enough to just do the "and not" to keep everything but this field.
I have seen compilers that have missed this optimization for fields bigger than one bit. The ARM compiler managing to do a clear+set for a one-bit field was a first. But it was only with zero optimization so it isn't something I would see as a problem.
But as you have already noted, such optimization failures can be corrected in the next version of the compiler.
Another thing that can sometimes be funny with some compilers is if you use signed fields. I have seen compilers generating very interesting code then, because of the need of sign-extend when you assign. This is normally a problem with microcontrollers, since you would normally only map unsigned fields to SFR.
About the example I showed earlier. Yes, any competent programmer should read out the register value into a temporary variable before processing the bits. And I do expect people who writes drivers/ISR to have read the relevant parts of the datasheet. But that doesn't always happens when a software gets old and the responsibility moved to some new guy/gal just out from school and where just some very minor changes are expected.
The previous example whas a bit "odd" in that few processors perform auto-acknowledge for a interrupt or status register. Auto-acknowledge is normally reserved for the RX register of a communication port. But what is quite common is that you have an interrupt register where each bit flags an interrupt source, and you then write a one to that bit to acknowledge. An example is the timer interrupt register in a NXP LPC17xx (Cortex-M3) or LPC23xx (ARM7) chip.
It is so very easy to write code like the following:
void handle_timer0_irq() { if (t0ir.mr0) { // Handle match 0 t0ir.mr0 = 1; // Acknowledge MR0 (###) } if (t0ir.mr1) { // Handle match 1 t0ir.mr1 = 1; // Acknowledge MR1 (###) } //... if (t0ir.cr0) { // Handle capture channel 0 t0ir.cr0 = 1; // Acknowledge CR0 (###) } if (t0ir.cr1) { // Handle capture channel 1 t0ir.cr1 = 1; // Acknowledge CR0 (###) } }
The application may normally expect only one interrupt flag at a time, but it isn't impossible that you may have multiple timer events so close together that several is trigged before you enter the ISR, or that one more event happens while in the above code.
Each of the ### represents an acknowledge of one (1) timer event, but since we are talking about bit fields, the ARM ABI requires a read and a write of the full container for every field assign.
So when the developer thinks that he acknowledges one timer event, the ARM will read the T0IR register and potentially pick up a number of set fields that will be or:ed with the bit the programmer intended to acknowledge. The net result is a potential loss of some timer events.
If using the more visible alternative, using standard or/and code, the above would have looked like:
enum { TxIR_MR0 = 1u << 0, TxIR_MR1 = 1u << 1, TxIR_MR2 = 1u << 2, TxIR_MR3 = 1u << 3, TxIR_CR0 = 1u << 4, TxIR_CR1 = 1u << 5, ... }; void handle_timer0_irq() { if (t0ir & TxIR_MR0) { // Handle match 0 t0ir |= TxIR_MR0; // Safe acknowledge of MR0 } if (t0ir & TxIR_MR1) { // Handle match 1 t0ir |= TxIR_MR1; // Safe acknowledge of MR1 } //... if (t0ir & TxIR_CR0) { // Handle capture channel 0 t0ir |= TxIR_CR0; // Safe acknowledge of CR0 } if (t0ir.cr1) { // Handle capture channel 1 t0ir |= TxIR_CR1; // Safe acknowledge of CR1 } }
In the end, there is nothing wrong with bit fields. It is just that you always have to think about the hidden "bonus" you may get.
If you are using bit fields, you may find that you need to spend more time adding extra comments than what you save from the neater code.
Of course we have moved far off topic by now. The OP asked about a data structure and not HW registers.
> It is so very easy to write code like the following: > > [...] > > So when the developer thinks that he acknowledges one timer event, > the ARM will read the T0IR register and potentially pick up a number > of set fields that will be or:ed with the bit the programmer > intended to acknowledge. The net result is a potential loss of some > timer events.
Yes that is true.
> If using the more visible alternative, using standard or/and code, > the above would have looked like: > > [...]
Hmm, and in which way does the second example differ from the first one? Both ISR read the control register multiple times and may loose timer events. In fact I compiled both versions and with the exception of a single instruction the code is identical as expected. I fail to see the advantage of the second solution.
Please provide a full minimal example that demonstrates what you were thinking of.
And... getting back to the original question, could I have used the __packed attribute to make the original code work like I expected it to (i.e., to shrink the number of bytes required to implement the ioflags variables to the min size possible)?
I believe what Per was saying (although if you get lost in the discussion it has the distinct possibility of being lost). Do not rely on the compiler to deal with the hardware in a certain manner. This is VERY important especially if you change processors (which I have done numerous tmies) or compilers (same there).
What I garnered is that he was suggesting using the compiler for what is known to work with normal C/C++ conventions in ALL compilers. That is use the 'ugly' standard C '&' '|' operators so that it will work irregardless of the compilers arbitrary support methods for bit fields.
The assumption that the compiler you have available (IE Turbo C++ Borland C++ GCC {in it's numerous incarnations} ) will work the same irregardless of what target or version you are using is a bad one.
Summary make your code blatantly obvious what it is doing, and comment it heavily. So that no one can guess what is being modified. Do not rely on any feature of a compiler (again bit fields is the example) to make the code work. Readability also includes understanding what is being done and why, not just having nice looking code (that hides lots of potential problems, and sins, EX one small change is made such as recompiling the project with a different version of the compiler).
A good resource is http://www.ganssle.com/ and http://www.embedded.com for this, and numerous other problems you might encounter.
Stephen
No. If you pack the data each union will shrink to 1 byte and you will need a 6th byte for the 6 reserved bits you have last.
That means that the 2 bytes of the u16 will be overlaid with 6 bytes.
Your unions module1, module2, module3, module4 and module5 will not magically merge into a single 16-bit container.
Paragraph 6.7.2.1 of the C standard says in section 13 that "Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared."
In 14, the standard specifies that "A pointer to a union object, suitably converted, points to each of its members (or if a member is a bitfield, then to the unit in which it resides), and vice versa."
The language standard will not merge your bit fields since that would mean that the struct directly following the u16 would have multiple members having the same address. But only a union may have multiple members with the same address.
The problem example in my first post was for a processor that auto-acknowledges status bits as they are read. That means that the software must do one (1) read and then process all bits.
The solution to that (if we ignore other problems with bit fields) is to copy the register value once to a union of an unsigned and the bit fields and write:
com1_status_t tmp_status; // Sample and acknowledge status bits. tmp_status.raw = com1_status; // Process any detected errors if (tmp_status.b.rx_overrun) ... if (tmp_status.b.rx_parity) ...
The problem example in the later post was for a processor where you write a '1' to a bit position to a write-only register overlapping the status register to acknowledge a specific event.
Such a processor does not have a problem with multiple read operations - a later read may possibly pick up one or more extra bits, but that isn't dangerous. But such a processor does not allow a read-modify-write for acknowledging a single bit since the design has a read-only register overlapping a write-only register.
Alas, I managed to **** up the example of the old-style solution. There shouldn't have been any |= in the assign, but just an assign. Sorry for the confusion.
t0ir = TxIR_MR1; // Safe acknowledge of MR1 since no read-modify-write
The solution to the second problem, if using bit-fields (and ignoring all the other problems with bit-fields) is to use a temporary union of an unsigned and bit fields for the acknowledge part:
timer_ir_t ack; ack.raw = 0; // Don't know anything to acknowledge yet if (t0ir.b.tm0) { // handle event. ack.b.tm0 = 1; // should ack tm0 - read-modify-write in RAM is safe } if (t0ir.b.tm1) { // handle tm1 event. ack.b.tm1 = 1; // should ack tm1 - read-modify-write in RAM is safe } ... t0ir.raw = ack.raw; // acknowledge processed events - no read-modify-write, just a write
One thing here is that when moving between different ARM compilers, the ARM ABI locks down the behaviour of bit-fields by being way stricter than the C language standard. But when leaving the ARM world, it isn't even known if the bits will be allocated low-to-high or high-to-low, so switching between two different compilers but staying with the same processor can completely break everything.
> The problem example in my first post was for a processor that > auto-acknowledges status bits as they are read. That means that the > software must do one (1) read and then process all bits.
I think I got that part.
> The solution to that (if we ignore other problems with bit fields) is > to copy the register value once to a union of an unsigned and the bit > fields and write:
Sure, as I said. But the requirement of having to read the register just once in this case is not specific to bit fields, is it.
> Alas, I managed to **** up the example of the old-style > solution. There shouldn't have been any |= in the assign, but just an > assign. Sorry for the confusion.
No problem, but to be sure that we are on the same page I would still like to see a self contained minimal example, and preferably an example of a device (ideally ARM based) which exibits the difference behavior that you are trying to show. And I am not talking about a handful of perhaps redundant instructions, but code that actually behaves incorrectly.
> But when leaving the ARM world, it isn't even known if the bits will > be allocated low-to-high or high-to-low, so switching between two > different compilers but staying with the same processor can > completely break everything.
Possibly. It all depends on what degree of protability I need. I don't expect to find e.g. ARMv6/v7 page table descriptors in other architectures. It is perhaps a choice whether I prefer using the "traditional" approach with macros/enums/bit-ops everywhere or whether I would rather have an architecture specific header file with appropriate bit-field definitions.
I don't claim that bit fields are always the better approach, but it seems that many people show some knee-jerk reactions against bit fields wherever the question comes up.
There are valid arguments against using bit fields in certain situations. But most arguments thrown out by people can be disproved easily.
Regards Marcus http://www.doulos.com/arm/
"I would still like to see a self contained minimal example, and preferably an example of a device (ideally ARM based) which exibits the difference behavior that you are trying to show. And I am not talking about a handful of perhaps redundant instructions, but code that actually behaves incorrectly."
you will find, as many others have, that Per is an expert on irrelevancy and s/he will talk about anything but.
Put my words down: Per will NOT produce such an example in the next 10 posts of his.