First, I am using 7.5 so it is an old compiler, but that is the way it is.
I have a number of bit variables that I am trying to pack into a byte. they are scattered all across the bit area
    volatile bit var1;
    volatile bit var2;
    volatile bit var3;
    char result;
The compiler won't let me do
    result = (var1 << 1) | (var2 << 2) | (var3 << 4);

The only way I have been able to get this to work is:

    result = var1;
    result = ((((result << 1) | var2) << 1) | var3) << 1;

This results in very inefficient code:

    mov  a,result
    mov  r7,a
    mov  c,v2
    clr  a
    rlc  a
    orl  a,r7
    add  a,acc
    mov  a,r7
    mov  c,var3
    rlc  a
    orl  a,r7
    add  a,acc
    mov  a,r7
    ; etc.

as compared to:

    mov  a,result
    mov  c,v2
    rlc  a
    mov  c,v3
    rlc  a

which would be much better. Does anyone know how this might be forced in C, without resorting to assembly? I'm wondering whether, because the ACC and PSW addresses are evenly divisible by 8 (and therefore bit-addressable), you could use those and explicitly use the _crol_ intrinsic to accomplish this. I have not had any luck so far. In this case memory efficiency is more important than speed.
What happens if you rewrite:
result= (var1 <<1) | (var2 <<2) | (var3 <<4);
into:
result = ((char)var1 <<1) | ((char)var2 <<2) | ((char)var3 <<4);
or:
    result = 0;
    if (var1) result |= 2;
    if (var2) result |= 4;
    if (var3) result |= 8;

or:

    result = (var1 ? 2 : 0) | (var2 ? 4 : 0) | (var3 ? 8 : 0);
Since the variables are actually bit flags in the bit-mapped data area, I think converting them to chars will be messy and cost a lot of code space.
However, this resulted in a reduction of 17 bytes.
WELL.... different code, but because it used a jnb jump chain, it compiled to exactly the same size of code as the original. There is an OR of 0 on each test.
I have several of these and need to conserve code space.
Basically, it is a state dump of the medical device up to a secondary controller, so since I am doing both ends, I can do it pretty much however I want, but I just have about 1K of free flash at this point, so the more memory I can save the better.
The existing 8051 is a SiLabs F040 with 64K flash (except that they burn pages 0xFD00-0xFFFF for manufacturing test, protection, etc.).
Thanks once again for the excellent suggestions.
Type casting to char might be messy. But ((char)bitx << 4) doesn't mean the compiler actually has to convert bitx to a char; it means the compiler must produce a result as if it had. The language standard isn't about expected processor instructions but about expected end results.
A compiler that happens to have matching optimization rules could still perform single-bit operations on a char-sized temporary variable that also happens to be bit-addressable. So, something similar to the following, where the program has a byte-addressable variable in the bit-addressable region to allow bit operations to set individual bits: http://www.keil.com/support/docs/1877.htm
    bdata unsigned char sample_result = 0;
    sbit var1 = sample_result ^ 0;
    sbit var2 = sample_result ^ 1;
    sbit var3 = sample_result ^ 2;

    main()
    {
        var1 = 0;
        var2 = 1;
        var3 = 1;
        sample_result = 0xFFu;
    }
Generated code:
    79: var1 = 0;
        C:0x070D    C200      CLR      var1(0x20.0)
    80: var2 = 1;
        C:0x070F    D201      SETB     var2(0x20.1)
    81: var3 = 1;
    82:
        C:0x0711    D202      SETB     var3(0x20.2)
    83: sample_result = 0xFFu;
    84:
        C:0x0713    7520FF    MOV      sample_result(0x20),#0xFF
    bdata unsigned char sample_result = 0;
    sbit var1 = sample_result ^ 0;
    sbit var2 = sample_result ^ 1;
    sbit var3 = sample_result ^ 2;
    volatile bit source_bit;

    main()
    {
        var1 = 0;
        var2 = 1;
        var3 = 1;
        sample_result = 0xFF;
This is not what I am trying to accomplish.
The desired result is to PACK the 3 bits into positions in the result, which will be a bit-mapped representation of the various bits scattered throughout bit (bdata) memory. So if V1 = 1, V2 = 0, V3 = 0, then result would be 0b00000100, or 0x04. Assigning sample_result = 0xFF would defeat the purpose.
I have about 17 bits free, so I tried it and it....
WORKS LIKE A BOSS! VERY nice, thank you. Each Var1 = source_bit; assignment generates:

    mov  c,source_bit
    mov  (sample_result.0),c

It cut the generated code down from 0x1B6 to 0x153 bytes: a very significant saving.
"That's what bdata was invented for"
Yes, bdata is nice. But the question is whether any C sequence will make the compiler itself use the concept with an intermediate variable, instead of forcing the developer to explicitly create a bdata temp variable plus 8 overlapping sbit variables.
For example, whether any code sequence could get the compiler to make use of the accumulator, which just happens to be both bit- and byte-addressable, and which wouldn't require the consumption of 8 dummy bits.
But maybe the Keil compiler has no such optimization rule, in which case it isn't possible to find any "magic" C sequence that will make the compiler, behind the scenes, do a byte clear, multiple bit assigns, and then a byte store to the final target address. Such an optimization rule would allow efficient bit packing even for bytes that don't fit in bdata, without #define macros to hide all the sub-steps of the computation and assignment.
By the way, it's rather brilliant how newer ARM Cortex chips emulate the bdata concept outside of the core in a language-neutral way.
"cut generated code down from x1B6 to x153 bytes. very significant savings."
How much of that sentence would be appreciated by one of the multitude of C# fraternity?
Luv it :)
Well there was a wrinkle.
    bdata unsigned char zz;
    sbit Bit7 = zz ^ 7;
    ...

The compiler optimized things away. The source was:

    zz = (state << 5);  // take the state variable and put it in the upper 3 bits of the byte
    Bit4 = 1;
    Bit5 = 0;
    sputchar(zz);

The compiler started by setting R7 to (state << 5), which is fine. Then it issued the correct bit-set instructions. THEN it called sputchar passing R7 to it, so only the state was transmitted. So I changed the declaration to volatile bdata unsigned char zz; and got a syntax error near unsigned, and Bit4 = 1; now generates an "invalid base address" error. WTF? Trying to find a way to force the compiler to reload zz into R7 before calling sputchar. In another spot everything worked fine: zz = 0; followed by 7 Bitx = y; statements. There it did reload zz into R7 before calling sputchar. So something in the compiler is a bit too smart; I've got to figure out how to force the load. I may have to try zz += 0; and see if that makes it reload.
No, the compiler isn't too smart. It just fails to understand the aliasing between bit and byte: that there is an overlapping union of byte and bits. So the optimizer doesn't realize that something has changed the byte variable, thereby invalidating the value already in the register.
Will the compiler accept the volatile keyword if you move it after the bdata keyword?
I'll try that. What I did in the meantime is to follow the last Bitx = line with one that does nothing and consumes 2 or 3 bytes:

    sfrsave = SFRPAGE;

Basically a data move from an SFR. That forced the compiler to do explicit loads and stores. I'll try bdata volatile, but I've always put volatile first...
Well, it swallowed that just fine, and I was able to take out the sfrsave = SFRPAGE; line.
So why does volatile bit variable; work, while volatile bdata unsigned char variable; does not work, yet bdata volatile unsigned char variable; DOES work?
That looks like a bug in the compiler to me.
Pretty easy to demonstrate:
    volatile bdata unsigned char variable;
    sbit Bit7 = variable ^ 7;
    sbit Bit6 = variable ^ 6;
Hope someone at Keil will take note.
I'd actually like someone to verify whether or not this works on the current version of the compiler.
Might well be worth considering a small bit of assembler instead of trying to persuade the compiler to do something like this.
The order of attributes shouldn't matter, as long as we don't bring pointers into it, where an attribute can bind to either the pointer or the pointed-at value.
So volatile bdata xx and bdata volatile xx should both work.
But the optimizer really should also know about the aliasing between sbit and bdata, since there is a hard-coded relation between them. The whole idea of bdata is to get the aliasing effect, allowing the processor-specific one-bit instructions to be used. volatile isn't intended to solve aliasing issues, but to tell the compiler about asynchronous updates, i.e. that a task, ISR, or actual hardware may change the variable state between two accesses.
I don't know if Keil has addressed this in later compilers, but
Per has it right.
The reason I tagged it volatile was to force the compiler to do the load after the bit manipulations.
It worked in one order, but not in a different order. I think this is a bug.
For it to be a bug, it has to be doing something unintended. Unless you can find a document stating exactly what is intended, you might find it very difficult to convince anyone else that it is one.