How to generate little endian code

Hi guys,

Does anyone know how to make the Keil tool generate little endian code? Is that possible? I am using uVision 1.32

Can the newest version of the Keil tools generate little endian code?

Thanks,

Anh

Parents
  • It's not difficult, but it can be horribly inefficient if you have to do this frequently (as we do in our hardware design).

    I'm currently working on an 8051 with a custom register interface that was designed to be little endian (before your compiler was selected by the software team).

    As a result, 16-bit pointers cannot be used or byte swapping has to be done on every register access, which is extremely inefficient. The software team chose to do two byte accesses for each register access in the C code because of this.

    Now this code must be ported to a 16-bit processor that is little endian and it should run in both places efficiently. The lack of a compiler option to store integers in little endian format is making this task more difficult than it seems it should be. Isn't the endianness of your compiler an arbitrary decision on your part? Why not support both?

    Anyone have a suggestion? Thanks!

Children
  • Isn't the endianness of your compiler an arbitrary decision on your part? Why not support both?

    No. It isn't and it wasn't. Having that option REQUIRES that we:

    1. Create double the number of libraries (since a big endian and a little endian library are required).

    2. Add features and correct problems in 2 places.

    3. Double efforts on ANSI compliance and regression testing.

    4. Make massive changes (that may introduce bugs) to a very stable development tool chain.

    I don't think that this is justified. Anyone using an 8051 because of its blazing data processing performance has probably chosen the wrong microcontroller.

    Jon

  • "I'm currently working on an 8051 with a custom register interface that was designed to be little endian (before your compiler was selected by the software team). As a result, 16-bit pointers cannot be used or byte swapping has to be done on every register access, which is extremely inefficient."

    I'm asking this not to be an 'a**hole' here, but just to set the stage for anybody making assumptions about endianness on a natively 8-bit architecture --

    What 8051 16-bit data memory access instructions are you attempting to use?

    The only one that comes to mind is "LCALL", which causes a 16-bit return address to be pushed on the stack (in [i]data memory space). That said, I would suggest that in that context (data memory-wise) and other than picking return addresses off of the stack (which is only occasionally useful), the 8051 is mostly endian-agnostic.

    "Now this code must be ported to a 16-bit processor that is little endian and it should run in both places efficiently."

    I don't post often, but when I do, readers have come to know that on these issues, I am posting from a multi-platform standpoint.

    Portability between architectures "is a ***" isn't it? ;-)

    Your problem is that someone on the hardware team had a notion that an 8-bit processor had an endian bias.

    "The lack of a compiler option to store integers in little endian format is making this task more difficult than it seems it should be. Isn't the endianness of your compiler an arbitrary decision on your part?"

    An architecture's endianness is (mostly) defined by the interface between memory and its internal ALU/accumulator/register(s) and since the 8051 can only transfer a single byte at a time, it has no endianness other than what is achieved through means that are for the most part inaccessible to you. If you are uncomfortable with that notion and really want to call it one way or the other, you could say that the 8051 is little-endian only because you could defend that pick on the basis of the byte-order (direct or causal) of those few 16-bit operations that the 8051 does perform.

    Some (> 8-bit) architectures and toolchains do support selecting endianness, but none that I am aware of are truly, to-the-core endian-neutral. For example, the PowerPC toolchains and CPU "support" little-endian although they are big-endian by nature. "support" is quoted because there are some gotchas.

    The bottom line is that it is (IMHO) unrealistic to expect endian allegiance from an architecture that does not support anything more than a native (atomic) 8-bit memory interface.
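Since the byte order on an 8-bit core is largely a convention of the toolchain rather than of the CPU, one way to see which convention a given compiler uses is to inspect how it lays out a 16-bit value in memory. The following is only a minimal sketch; the function name `stores_low_byte_first` is made up for illustration and does not come from any Keil library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (name is an assumption, not a toolchain API).
 * Returns 1 if this compiler stores the low byte of a 16-bit value at the
 * lower address (little endian), 0 if it stores the high byte there. */
static int stores_low_byte_first(void)
{
    uint16_t probe = 0x0102;
    uint8_t bytes[2];

    /* Copy the object representation so we can look at each byte. */
    memcpy(bytes, &probe, sizeof probe);

    return bytes[0] == 0x02;
}
```

A compiler that stores integers big endian, as discussed in this thread, would be expected to return 0 here; a typical PC compiler would return 1.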

  • I'm currently working on an 8051 with a custom register interface that was designed to be little endian

    I have a similar situation, except that my device has 32-bit little endian registers.

    My solution was simply to define functions along the lines of

    RegRead (addr)
    RegWrite (addr, val)
    RegOrEq (addr, val)
    RegAndEqNot (addr, val)
    etc

    When the endianness of your device matches the endianness of your processor/toolchain, these functions can become macros with straightforward assignments, and the compiler handles the details.

    #define RegRead(addr)           (*(addr))
    #define RegWrite(addr, val)     (*(addr) = (val))
    #define RegOrEq(addr, val)      (*(addr) |= (val))
    #define RegAndEqNot(addr, val)  (*(addr) &= ~(val))

    When the endianness of your device does not match the toolchain, you write a function.

    Note that you do not need to explicitly byte swap as a separate operation from access. These functions know they're moving to swapped registers, and can do both jobs at the same time. Let's say you read a 16-bit value, returned in R6/R7. Your device is in xdata space. You'll have a routine that looks something like:

    init DPTR
    mov @DPTR to R7
    inc DPTR
    mov @DPTR to R6

    Notice that while the DPTR is moving from lower addresses to higher, the store into the bytes of the result, as determined by the sequence of the code, is moving from higher to lower. This order costs no more than reading the registers in the "correct" order, and results in swapping the bytes while they're read. If the device were big endian, the routine would be the same, only with the registers used in the opposite order:

    init DPTR
    mov @DPTR to R6
    inc DPTR
    mov @DPTR to R7

    Either way, the execution cost is the same.

    Software can internally deal with the values always big-endian, and the Reg* routines flip them only on actual hardware access.
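For the mismatched case, the same idea can be sketched in portable C. This is only an illustration under assumptions: the names `reg_read16_le` and `reg_write16_le` are hypothetical, the device is taken to be 16-bit little endian, and on a Keil C51 target the pointers would presumably also carry a memory-space qualifier such as `xdata`, which is left out here so the code compiles anywhere.

```c
#include <stdint.h>

/* Hypothetical helpers (names are assumptions, not from the post above).
 * They access a 16-bit little-endian device register one byte at a time,
 * so the result is the same whatever byte order the compiler uses. */

static uint16_t reg_read16_le(const volatile uint8_t *addr)
{
    uint8_t lo = addr[0];   /* device keeps the low byte at the lower address */
    uint8_t hi = addr[1];   /* high byte follows */

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

static void reg_write16_le(volatile uint8_t *addr, uint16_t val)
{
    addr[0] = (uint8_t)(val & 0xFFu);   /* low byte first */
    addr[1] = (uint8_t)(val >> 8);      /* then high byte */
}
```

Because the bytes are placed while they are moved, there is no separate swap step, and the same source should work unchanged on a little-endian 16-bit target.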

  • "The only one that comes to mind is "LCALL", which causes a 16-bit return address to be pushed on the stack (in [i]data memory space)."

    ...as a little endian value. As does ACALL.

    And then there's DPTR - it's little endian as well.

    I wonder what percentage of 8051 projects actually interface to some other architecture? Mine usually do and it's always an Intel PC. I don't really care that I have to swap bytes - with a 50 watt CPU under the hood it's no trouble. It's handy for microwaving my lunch, as well.

  • "... always an Intel PC. I don't really care that I have to swap bytes"

    The conversion routines come free: htons(), ntohs(), etc are, I believe, standard & portable on both Windoze and *NIX.
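On the PC side that might look like the sketch below. It assumes the 8051 firmware exchanges 16-bit values in big-endian order (which is what network byte order is), and the helper names `put_be16` and `get_be16` are made up for illustration. The header shown is the POSIX one; on Windows the same functions are declared in `<winsock2.h>`.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htons(), ntohs() on POSIX systems */

/* Hypothetical helper: store a host-order 16-bit value into a message
 * buffer in big-endian (network) byte order. */
static void put_be16(uint8_t out[2], uint16_t host_val)
{
    uint16_t net = htons(host_val);     /* swaps on a little-endian host */
    memcpy(out, &net, sizeof net);
}

/* Hypothetical helper: read a big-endian 16-bit value from a message
 * buffer and return it in host byte order. */
static uint16_t get_be16(const uint8_t in[2])
{
    uint16_t net;
    memcpy(&net, in, sizeof net);
    return ntohs(net);
}
```

On a big-endian host both calls are effectively no-ops, so the same source works on either kind of machine.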

  • Loading a short and then byte swapping it is quite inefficient; however, loading a short value as two 8-bit values produces much more efficient code.

    Erik