Efficiently splitting a long int into bytes

Hi All,

This is related to my previous post, "Efficiently combining bytes into a long int".

I'd like to find an efficient way to do this. My current implementation (below) results in (unnecessary) calls to the library function for shifting a long int.

void sendLong(unsigned long out_data)
{
    sendByte((unsigned char)(out_data >> 24));
    sendByte((unsigned char)(out_data >> 16));
    sendByte((unsigned char)(out_data >> 8));
    sendByte((unsigned char)(out_data));
}

I have tried to use unions (which worked well for my previous problem), but I can't seem to figure out the syntax for putting the argument into the "long" of the union, and sending the "bytes" of the union.

I tried creating a temporary union and copying the argument (out_data) to it:

union   LONGBYTES
{
    uchar b[4];
    ulong l;
};

void sendLong(unsigned long out_data)
{
    union LONGBYTES temp;
    temp.l = out_data;
    sendByte(temp.b[0]);
    sendByte(temp.b[1]);
    sendByte(temp.b[2]);
    sendByte(temp.b[3]);
}

And that produced this code:

             ; FUNCTION _sendLong (BEGIN)
                                           ; SOURCE LINE # 349
;---- Variable 'out_data' assigned to Register 'R4/R5/R6/R7' ----
                                           ; SOURCE LINE # 350
                                           ; SOURCE LINE # 352
                 R     MOV     temp+03H,R7
                 R     MOV     temp+02H,R6
                 R     MOV     temp+01H,R5
                 R     MOV     temp,R4
                                           ; SOURCE LINE # 353
                 R     MOV     R7,temp
                 E     CALL    _sendByte
                                           ; SOURCE LINE # 354
                 R     MOV     R7,temp+01H
                 E     CALL    _sendByte
                                           ; SOURCE LINE # 355
                 R     MOV     R7,temp+02H
                 E     CALL    _sendByte
                                           ; SOURCE LINE # 356
                 R     MOV     R7,temp+03H
                 E     CALL    _sendByte
             ; FUNCTION _sendLong (END)

But this allocates 4 unnecessary bytes for the temporary union, and it needlessly copies the data from registers (register parameter passing) into the RAM allocated for the union and then back into registers to pass it to sendByte(). Any ideas?

Thanks,
Bob

  • and the same answers apply!

    Actually, if you read the post, you'll see that I already acknowledged the answer you gave in the previous post. My *new* question was whether it is possible to use the long "alter ego" of the union in the argument list and the byte "alter ego" of the union in the calls to sendByte(), and thus eliminate the unnecessary allocation of RAM for the temp variable and the associated copying. I tried a couple of different permutations, but wasn't able to come up with something the compiler would accept. The closest thing I found in K&R's C bible is initialization of a union.

    As you said in your previous post, this may be more a generic C question than a Keil-specific one, but I would argue that since I am looking for efficient assembly output, and not necessarily for the best C implementation, it's an appropriate question here. Besides, the Keil compiler isn't 100% ANSI C compatible, so there's a chance that a "generic" C solution might not produce the results I'm looking for. Nonetheless, if you could suggest a "generic C" forum where this question might be more appropriate, I'd be more than happy to move this thread.

    Thanks again,
    Bob

  • The question here is efficiency, not correctness. Robert's already got a correct solution (for either ANSI C or Keil). He just wants a way to get the Keil code generator to generate tighter code.

    Given a U32 in R4-R7, you need to move each individual byte to R7 and then call a function that takes a 1-byte parameter. You need no extra storage to solve this problem -- unless you start with R4 (the MSB). In that case, R4 has to be moved into R7 for the first call, which means you need to save the original R7 somewhere first.

    In general, the U32/array union is a good way to hint to the compiler that it can do byte accesses rather than shifts, and can get you smaller code requiring many fewer clocks. In this case, using one forces a copy of all four bytes to temporary storage.

    You can't pass an array/U32 union instead of a U32 to the function, as that will get assigned to the "stack" (memory) instead of registers. So, assume we're passing a U32 into the function.
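
    For reference, this is roughly what passing the union itself looks like in C. It is a sketch of the syntax only and says nothing about the code C51 generates. The logging sendByte() stub and the name sendLongUnion are made up for illustration, and uint32_t stands in for C51's 32-bit unsigned long so the fragment can be built on a host.

```c
#include <stdint.h>

/* Hypothetical stand-in for the thread's sendByte(): it only records the
   bytes it is given so the sketch can be exercised off-target. */
static unsigned char sent[16];
static unsigned int  sent_count = 0;

static void sendByte(unsigned char b)
{
    if (sent_count < sizeof sent)
        sent[sent_count++] = b;
}

/* Same layout as the LONGBYTES union from the original post.
   Assumption: uint32_t replaces C51's 32-bit unsigned long. */
union LONGBYTES
{
    unsigned char b[4];
    uint32_t      l;
};

/* The union is the formal parameter, so no temporary is declared here.
   As noted above, the argument is expected to travel in memory rather
   than in R4-R7, so the copy is moved to the caller, not removed. */
void sendLongUnion(union LONGBYTES out_data)
{
    /* b[0] is the most significant byte only on a big-endian layout
       such as the one C51 uses for long. */
    sendByte(out_data.b[0]);
    sendByte(out_data.b[1]);
    sendByte(out_data.b[2]);
    sendByte(out_data.b[3]);
}
```

    The caller then has to fill in the .l member of a union LONGBYTES variable before the call, which is where the four-byte store ends up.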

    You can't take the address of the formal parameter inside the function (say, to cast the U32* to a UnionType*), because that will require it to _have_ an address, which seems to force the formal out of registers into memory. (In the 8051, unlike most processors, registers do have an address, so it would be theoretically possible to generate code for this case. But in my experiment I just wound up with another memory variable.)
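
    A minimal sketch of that address-of attempt, for anyone who wants to repeat the experiment. It uses an unsigned char pointer rather than a cast to the union type, which is a slightly different variant from the one described above. The logging sendByte() stub and the name sendLongPtr are hypothetical, and uint32_t stands in for C51's 32-bit unsigned long.

```c
#include <stdint.h>

/* Hypothetical stand-in for the thread's sendByte(): records bytes only. */
static unsigned char sent[16];
static unsigned int  sent_count = 0;

static void sendByte(unsigned char b)
{
    if (sent_count < sizeof sent)
        sent[sent_count++] = b;
}

void sendLongPtr(uint32_t out_data)
{
    /* Taking &out_data requires the parameter to have an address, which
       is the step that appears to push it out of registers into memory. */
    const unsigned char *p = (const unsigned char *)&out_data;

    /* Bytes go out in storage order: MSB first on a big-endian layout
       such as C51's, LSB first on a little-endian host. */
    sendByte(p[0]);
    sendByte(p[1]);
    sendByte(p[2]);
    sendByte(p[3]);
}
```

    The source is shorter than the union version, but the parameter still needs somewhere to live in memory, so it is not expected to save the copy.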

    You can't use shifts, as the compiler will take those very literally, and happily ripple a single bit 24 times down the length of a U32 rather than just grabbing the high byte.

    Robert's example was the obvious use of a temporary, but that doesn't seem to get optimized away, so you're still stuck with an extra copy in memory. (Incidentally, when you get obsessive over the generated code, it's sometimes worthwhile to take a look at the .cod file produced by the linker as well as the assembly output from the compiler. The linker does some optimization of its own in some cases.)

    I fooled around a bit, but didn't come up with anything but negative answers, alas. This is the point where I would just hand-code a sendLong() routine in assembler if it were important, since that won't take long compared to conducting experiments on the code generator. Alternatively, you push the burden to the caller, and make sure their data type is byte-oriented, getting rid of sendLong() entirely and replacing it with some sort of loop calling sendByte().
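
    The "push it to the caller" option might look something like the sketch below, assuming the caller can keep its data as a byte array already in transmit order. The names sendBytes and the logging sendByte() stub are made up for illustration; this has not been checked against C51's output.

```c
/* Hypothetical stand-in for the thread's sendByte(): records bytes only. */
static unsigned char sent[16];
static unsigned int  sent_count = 0;

static void sendByte(unsigned char b)
{
    if (sent_count < sizeof sent)
        sent[sent_count++] = b;
}

/* Sends len bytes in the order they are stored. No long is involved,
   so there is nothing to shift and no union to copy into. */
void sendBytes(const unsigned char *buf, unsigned char len)
{
    unsigned char i;

    for (i = 0; i < len; i++)
        sendByte(buf[i]);
}
```

    A caller holding unsigned char msg[4] = { 0x11, 0x22, 0x33, 0x44 }; would call sendBytes(msg, sizeof msg); and the bytes go out in array order, MSB first if that is how the caller stored them.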