ARMCC 5.02 has this nice new feature of inline assembly support for Cortex-M4 (I use STM32F4).
But unfortunately it seems to be impossible to convince the inline assembler to use direct register access for the low and high register of a long long (64-bit) register variable.
In the example Keil\examples\Inline\Inline.C they use the following macros:
#define lo64(a) (((unsigned int *)&a)[0]) /* Low 32-bits of a long long */
#define hi64(a) (((int *)&a)[1]) /* High 32-bits of a long long */
Unfortunately this produces extremely inefficient assembly code: even if a is a long long register variable (already residing in two adjacent registers), the compiler stores both registers to memory with STR and then reloads them with LDR into a register, instead of accessing the correct register directly.
It does this at every optimization level, even with "Optimize for time" selected.
Does anyone have an idea how to convince the inline assembler to use direct register access for such long long register variables?
(Or is there some intrinsic such as __loreg(...)? I assume not, otherwise it probably would have been used in the inline example.)
a should be a long long register variable
Too bad it's not allowed to take the address of a register variable. It would have made much more sense to stick with the obvious method that actually does what you want, instead of mucking around with pointers like this was still 1985:
#define UINT64_LO32(a) ((uint32_t)(a))
#define UINT64_HI32(a) ((uint32_t)((a) >> 32))
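For illustration, here is a minimal, self-contained sketch of the value-based approach in plain C. The helpers add_halves and join_halves are hypothetical names added only for this example, and the extra cast to uint64_t in the high-half macro is my own addition so that signed arguments are not sign-extended by the shift. Because no address is taken, nothing forces the operand into memory; whether ARMCC's inline assembler actually keeps both halves in registers with these macros has not been verified here.

```c
#include <stdint.h>

/* Value-based extraction: no address is taken, so nothing forces
   the 64-bit operand out of its registers and into memory. */
#define UINT64_LO32(a) ((uint32_t)(a))
#define UINT64_HI32(a) ((uint32_t)((uint64_t)(a) >> 32))

/* Hypothetical helper: add the low and high halves of a 64-bit value. */
static uint32_t add_halves(uint64_t v)
{
    return UINT64_LO32(v) + UINT64_HI32(v);
}

/* Hypothetical helper: rebuild a 64-bit value from its two halves. */
static uint64_t join_halves(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

Splitting a value with the two macros and passing the results to join_halves should give back the original value, which is a quick way to check the macros on a given toolchain.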
Thank you for your support - I think we need enough people complaining to get this done ... (the easiest way, I think, would be an intrinsic function __reg(variable, n) [n = 0, 1, 2, ...] intended specifically for inline assembly and not allowed in "normal" C code).