Hi, I would like to convert the following function for saturating 64-bit addition to inline assembly:
__asm LLONG qadd_ll( LLONG a, LLONG b)
{
    ADDS R0, R0, R2
    ADCS R1, R1, R3
    BVS  Oflw_qadd_ll
    BX   LR
Oflw_qadd_ll
    BPL  Uflw_qadd_ll
    // overflow into neg.: limit to 0x7FFFF...
    MOV  R0, #0
    SUBS R0, #1
    LSRS R1, R0, #1
    BX   LR
Uflw_qadd_ll
    // underflow into pos.: limit to 0x8000...
    MOV  R0, #0
    MOV  R1, #1
    LSLS R1, R1, #31
    BX   LR
}
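For comparison, the intended behaviour of the routine above can be written as a minimal sketch in portable C. The name qadd_ll_ref is hypothetical, and LLONG is assumed here to be a signed 64-bit integer (the post does not show its definition). This is only meant as a reference to test an assembly version against, not as a claim about what the compiler generates for it.

```c
#include <stdint.h>

typedef int64_t LLONG; /* assumption: LLONG is a signed 64-bit integer */

/* Hypothetical reference version of qadd_ll: saturating 64-bit addition. */
LLONG qadd_ll_ref(LLONG a, LLONG b)
{
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;

    /* Add as unsigned so that wraparound is well defined in C. */
    uint64_t sum = ua + ub;

    /* Signed overflow happened if a and b have the same sign and the
       sum has a different sign (this mirrors the V flag after ADCS). */
    if (((~(ua ^ ub)) & (ua ^ sum)) >> 63) {
        /* Both negative: limit to 0x8000...; both positive: limit to 0x7FFF... */
        return (a < 0) ? INT64_MIN : INT64_MAX;
    }

    /* No overflow, so the value fits; converting back relies on the usual
       two's-complement behaviour of common compilers. */
    return (LLONG)sum;
}
```

Calling qadd_ll_ref with the same arguments as the assembly routine should give the same result, including at the two saturation limits.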
This would speed up the critical parts of my software by a factor of 2, because it would avoid the function call overhead.
Does anyone have a hint how this could be done with the Keil C/C++ compiler 5.05 (MDK 5.12)?
The problem is accessing the low and high registers of an LLONG C variable.
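To illustrate what "low" and "high" mean here, the following is a minimal sketch in plain C that splits a 64-bit value into two 32-bit words with shifts and adds them with an explicit carry, in the same order as the ADDS/ADCS pair. The helper names ll_lo, ll_hi and add_by_halves are hypothetical, and LLONG is assumed to be a signed 64-bit integer. Whether compiler 5.05 keeps these values in registers has not been checked; the sketch only shows the arithmetic.

```c
#include <stdint.h>

typedef int64_t LLONG; /* assumption: LLONG is a signed 64-bit integer */

/* Hypothetical helpers: extract the low and high 32-bit words. */
static uint32_t ll_lo(LLONG v) { return (uint32_t)((uint64_t)v & 0xFFFFFFFFu); }
static uint32_t ll_hi(LLONG v) { return (uint32_t)((uint64_t)v >> 32); }

/* Non-saturating 64-bit addition done word by word. */
LLONG add_by_halves(LLONG a, LLONG b)
{
    uint32_t lo    = ll_lo(a) + ll_lo(b);        /* like ADDS: add the low words  */
    uint32_t carry = (lo < ll_lo(a)) ? 1u : 0u;  /* carry out of the low addition */
    uint32_t hi    = ll_hi(a) + ll_hi(b) + carry; /* like ADCS: add high + carry  */

    /* Reassemble the two words into one 64-bit value. */
    return (LLONG)(((uint64_t)hi << 32) | lo);
}
```

For example, add_by_halves(0xFFFFFFFF, 1) carries from the low word into the high word and yields 0x100000000.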
I tried the following:
typedef union{
    struct{
        int iLo;
        int iHi;
    };
    LLONG ll;
}U_LLONG;

LLONG func_c( LLONG a, LLONG b){
    U_LLONG ua;
    U_LLONG ub;
    U_LLONG uc;
    ua.ll= a;
    ub.ll= b;
    uc.ll= ua.ll+ub.ll;
    uc.ll= uc.ll+ub.ll;
    return uc.ll;
}

LLONG func_asm( LLONG a, LLONG b){
    U_LLONG ua;
    U_LLONG ub;
    U_LLONG uc;
    ua.ll= a;
    ub.ll= b;
    asm( "ADDS ub.iLo, ua.iLo, ub.iLo\n"
         "ADCS ub.iHi, ua.iHi, ub.iHi\n");
    asm( "ADDS uc.iLo, uc.iLo, ub.iLo\n"
         "ADCS uc.iHi, uc.iHi, ub.iHi\n");
    return uc.ll;
}
If I call func_c, the compiler generates nice register-only code.
If I call func_asm, it generates very cumbersome code that works through memory instead of registers, with a lot of unnecessary stack allocation.
Does anybody have a hint how to convince the compiler to use registers for the inline-assembly 64-bit addition as well?