It's time to do some work on your compiler. It generates the same code it did 20 years ago. I have to resort to tricks, or just use inline assembler, instead of writing normal portable code, to get it to generate good code. This is nothing new, of course; I've been telling you this for almost a decade. Keep in mind that all Intel processors are little endian, yet your compiler generates big-endian code. Why? You should at least offer a compiler switch to select endianness. Why don't you support C++ and MISRA? Oh, and by the way, IAR does all of the above TODAY, so clearly they haven't been standing still. (Notice who wrote the paper.)
www.eetimes.com/.../The-Inefficiency-of-C--Fact-or-Fiction-
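As an aside, this is the kind of "normal portable code" I mean: a plain 32-bit byte swap written with shifts and masks. A decent optimizer turns this idiom into a single byte-reverse instruction; a weak one emits every shift literally. This is only a sketch in standard C, and the function name is mine, not any vendor's API:

#include <stdint.h>

/* Portable 32-bit byte swap. Good optimizers recognize this
 * shift-and-mask idiom and emit one byte-reverse instruction
 * (e.g. bswap on x86); poor ones emit all four shifts. */
uint32_t swap32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) <<  8) |
           ((x & 0x00FF0000u) >>  8) |
           ((x & 0xFF000000u) >> 24);
}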
My point is: why pay for maintenance?
Oh, for Pete's sake. So don't pay and do us all a favour and get lost, why don't you?
A byte suddenly being little- or big-endian? Impossible.
The bits in a byte are just that - bits. Don't spend too much time figuring out the bit addressing, thinking it has anything to do with the little/big-endian issue.
Almost all processors define bytes/words/... with bit 0 as the least significant bit. When creating a bit-addressing instruction, it would be very stupid to suddenly number the bits 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 ... just to make a statement about big/little byte order.
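A little sketch (plain C, nothing vendor-specific assumed) of why the two things are independent: shifting by a bit number never exposes byte order; only inspecting the bytes in memory does.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t v = 0x12345678u;

    /* Bit numbering: bit 0 is the least significant bit of the value.
     * This gives the same answer on every machine, regardless of how
     * the four bytes are laid out in memory. */
    unsigned bit3 = (v >> 3) & 1u;

    /* Byte order: only visible when the value is viewed as raw bytes. */
    const unsigned char *p = (const unsigned char *)&v;
    printf("bit 3 = %u, first byte in memory = 0x%02X\n",
           bit3, (unsigned)p[0]);
    /* Little-endian prints 0x78, big-endian prints 0x12,
     * yet bit 3 is 1 in both cases. */
    return 0;
}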
But if you do want to have fun, look at some Freescale processors where the bits are numbered in reverse order. It is truly magnificent to read a schematic and figure out that A30 is not really a "big" address line but one of the smallest. These bit issues probably resulted from some people stupidly trying to extrapolate word endianness into the naming of individual bits too.
I don't think I have ever seen a little-endian processor number bit 0 as the most significant bit. But big-endian processors sometimes name the most significant bit 0, and sometimes the least significant bit.
Hence, even if you do try to bring bit endianness into the discussion, the use of bit 0 as the least significant bit can't serve as proof of anything, since it is used by both little-endian and big-endian processors.
Next thing: if you have a 3-byte instruction with a one-byte opcode and a two-byte address, you shouldn't try to merge the opcode with the first byte of the address and use that to imply big or little endian. That is just an ugly extrapolation.
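To make that concrete, here is a sketch with a made-up 3-byte instruction format (the function name and the encoder itself are hypothetical, not any real assembler): the opcode byte is emitted on its own, and only the two address bytes involve an ordering decision.

#include <stdint.h>

/* Hypothetical 3-byte instruction: one opcode byte plus a 16-bit
 * address. The opcode byte is just an opcode; only the two address
 * bytes carry an ordering choice. */
void emit_op16(uint8_t *out, uint8_t opcode, uint16_t addr, int big_endian)
{
    out[0] = opcode;                     /* opcode stands on its own */
    if (big_endian) {
        out[1] = (uint8_t)(addr >> 8);   /* high address byte first */
        out[2] = (uint8_t)(addr & 0xFF);
    } else {
        out[1] = (uint8_t)(addr & 0xFF); /* low address byte first */
        out[2] = (uint8_t)(addr >> 8);
    }
}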