For those of ARM's customers who design microcontrollers, I would like to recommend a Big Endian implementation (or at least an option to select Big Endian).
This is because I have designs that need to work with Big Endian data.
Big Endian is the network native endianness, thus it would be a good optimization for IoT.
In addition, Big Endian is less confusing. Using Big Endian results in fewer programming bugs (bugs that are introduced due to confusion, not necessarily the actual memory organization).
In my case, I would need Big Endian + Network + LCD + 2*I2S + EMC + SD/MMC + high speed + a free (fast 65MHz+) GPIO[31:0] + a fair speed GPIO[15:0].
-So that's a chip with a lot of pins.
(Note: This was originally posted under Cortex-M7, as it's my wish that there will be Cortex-M7 microcontrollers with the mentioned features)
Yes, many Cortex-A (perhaps all) offer big endian support.
But I am positive that the Cortex-M could benefit strongly from having big endian implementations.
Most people don't know what big endian really offers, because they've worked with little endian all their lives; they know that big endian means bytes are arranged MSB first, etc., but they do not know the benefits of programming in a big endian environment.
I've been working with both big and little endian for many years, and I absolutely prefer big endian (everything is logical, also bit-shifting).
Some of the benefits are that the sign-bit is the most significant bit and is stored in the Most Significant Byte, thus it'll always be at byte[0].
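To make the sign-bit point concrete, here is a minimal sketch in C. The helper names `be_is_negative` and `le_is_negative` are made up for this illustration. With big endian storage the sign can be read from byte[0] without knowing the width of the value; with little endian storage the width is needed to find the last byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the signed integer stored big endian
 * at `bytes` is negative. Only byte[0] is inspected, whatever the width. */
static int be_is_negative(const uint8_t *bytes)
{
    assert(bytes != NULL);
    return (bytes[0] & 0x80u) != 0;
}

/* Hypothetical little endian counterpart: the sign bit sits in the last
 * byte, so the caller has to pass the width of the value in bytes. */
static int le_is_negative(const uint8_t *bytes, size_t width)
{
    assert(bytes != NULL && width > 0);
    return (bytes[width - 1] & 0x80u) != 0;
}
```

For example, -2 as a 32-bit big endian value is stored as FF FF FF FE, and `be_is_negative` only looks at the first FF.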
Reading fixed point values from memory also results in fewer operations.
Standard image (pixel) conversions can be done in fewer operations.
Network code can use the values directly, reducing code size and increasing performance (using less CPU time for networking)
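As a rough sketch of what this means in practice, this is the kind of portable C that network code ends up carrying for each multi-byte header field. The names `load_be32` and `store_be32` are hypothetical, not taken from any particular network stack. On a big endian core these would amount to a plain word load and store (alignment permitting); on a little endian core the shifts, or a byte reverse, are real work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit field that is stored in network
 * (big endian) byte order, independent of the CPU's own endianness. */
static uint32_t load_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: write a 32-bit value in network byte order. */
static void store_be32(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

For instance, the IPv4 address 192.168.0.1 arrives as the bytes C0 A8 00 01, and `load_be32` turns that into 0xC0A80001.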
Various peripherals expect bytes stored/sent as MSB first; I2S for instance, would benefit from this as well.
The rev instruction is a good instruction, however I wish that ARM had made load-with-byte-reverse and byte-reverse-store instructions.
That would have solved most of the performance-related problems.
If a load uses a single cycle, rev uses a single cycle and a store uses a single cycle, then converting a 32-bit value in memory would use 3 clock cycles.
Compared to the above, a load-with-byte-reverse and a store (or a load + store-with-byte-reverse) would use only 2/3 of the time.
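The three steps counted above can be sketched in C like this. The function names `bswap32` and `swap_words_in_place` are made up for the example, and the one-cycle-per-step figure is only the assumption used in this post, not a measured number. A compiler would typically be expected to turn the byte swap into a single rev, leaving a load, a rev and a store per word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable byte reverse of a 32-bit value (what rev does in one instruction). */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

/* Hypothetical helper: convert `count` words in memory between endiannesses. */
static void swap_words_in_place(uint32_t *words, size_t count)
{
    assert(words != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        uint32_t v = words[i];   /* step 1: load          */
        v = bswap32(v);          /* step 2: byte reverse  */
        words[i] = v;            /* step 3: store         */
    }
}
```

With a load-with-byte-reverse or a byte-reverse-store instruction, step 2 would be folded into step 1 or step 3, which is where the 2 out of 3 figure comes from.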
But in general, there are still problems when disassembling ARM code using objdump or OpenOCD.
If you dump a 16-bit value, you do not see the actual values, but they're byte-swapped "for clarity". This is highly confusing.
The same goes for 32-bit values.
And the disassembly is so confusing that I have not yet been able to find the pattern.
-So even these approaches to "fix" the little endian problem (I call it the little-endian bug, heh, no offense meant to anyone who likes little endian) have not been successful.
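To illustrate the mismatch between a value and its bytes in memory, here is a small sketch in C. The helper names are hypothetical, and this only shows the two possible byte orders for a 16-bit value; it does not claim to reproduce what objdump or OpenOCD actually print.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format the two bytes of `value` in the order a
 * little endian memory holds them (low byte at the lower address). */
static void format_le16_bytes(uint16_t value, char out[6])
{
    assert(out != NULL);
    snprintf(out, 6, "%02X %02X",
             (unsigned)(value & 0xFFu), (unsigned)(value >> 8));
}

/* Hypothetical helper: same value, big endian order (high byte first). */
static void format_be16_bytes(uint16_t value, char out[6])
{
    assert(out != NULL);
    snprintf(out, 6, "%02X %02X",
             (unsigned)(value >> 8), (unsigned)(value & 0xFFu));
}
```

For the value 0x1234, the little endian bytes come out as "34 12" while the big endian bytes come out as "12 34", so in the big endian case a raw byte dump and the value read the same way.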
I would expect that Freescale would be among the first companies to make a Cortex-M with big endian support. Especially because their previous architectures are big endian and they already know the benefits, but also because they seem to provide a broad palette of choices in their Cortex-M range.
I know that other companies are very open to suggestions (they're good listeners), so I believe there's a chance there will be more than one big endian provider, when it comes to Cortex-M.