I have code (borrowed from another platform) that attempts to optimize by using assembly code. The function adds two arrays and is a C function such as uint32_t AddNumbers ( uint32_t *p_addResult , uint32_t *p_left , uint32_t *p_right , const uint ARRAY_SIZE)
Can someone point me to the syntax for assigning the passed variables to registers and assembly operands?
uint32_t AddNumbers ( uint32_t *p_addResult , uint32_t *p_left , uint32_t *p_right , const uint ARRAY_SIZE)
{
    uint32_t l_counter = ARRAY_SIZE;
    uint32_t l_carry = 0;
    uint32_t l_left;
    uint32_t l_right;
    __asm(
        ..
        "ldmia %[lptr]!, {%[left]} \n\t"
        "ldmia %[rptr]!, {%[right]} \n\t"
        "lsrs %[carry], #1 \n\t"
        "adcs %[left], %[right] \n\t"
        ...
        : [dptr] "+l" (p_addResult),
          [lptr] "+l" (p_left),
          [rptr] "+l" (p_right),
          [ctr] "+l" (l_counter),
          [carry] "+l" (l_carry),
          [left] "=l" (l_left),
          [right] "=l" (l_right)
}
I am getting syntax errors for the part above that assigns the C variables to registers and assembly variables.
Thanks
On another platform (ARM based) we've experienced a 50% improvement in speed using assembler vs C code. On this platform I am unable to compare, as I haven't been able to get this going.
Who is "we" ?
Something doesn't quite add up: if you (they?) could do so well on that platform, how come you're now stuck on basics...?
Fair question.
The code for this module was sourced externally and has worked fine on our current platforms (ARM based), which do not use Keil as a toolchain.
Now, in working with a vendor, we want to port and test some of our benchmark code on their M0 based hardware, and their environment uses Keil. The issue we're facing is translating the assembly lines into something that Keil accepts.
The easy solution?
Don't implement any assembler functions in C/C++ source files - do it in a real assembler file.
Note also that the optimizer doesn't care whether a processor instruction comes from inlined assembler or from compiled C code. It will still try to remap registers, change the order of instructions etc. The only way to "own" the instruction sequence is by having it in an assembler file.
I am willing to explore this. Any good documentation on how I could go about doing this?
My use case is
A.c refers to a function in B.c. Some functions in B need assembly optimization. How do I specify these functions (like my addition example) and associate them with B.c?
Apologies for being such a noob!
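One way the C side of this could look, as a minimal sketch rather than a definitive answer: B.c only needs a prototype whose name and parameter order match the symbol exported by the assembler file, and the linker resolves the call. The build switch USE_ASM_ADDER and the caller AddWide128 are hypothetical names introduced for illustration; the C fallback is an assumption so that the file also builds and runs without the assembler version. The non-standard `uint` from the original prototype is written as uint32_t here.

```c
#include <stdint.h>

/* B.c: prototype for the routine implemented in a separate assembler file.
   The name and parameter order must match the symbol the .s file exports. */
uint32_t AddNumbers(uint32_t *p_addResult, uint32_t *p_left,
                    uint32_t *p_right, uint32_t ARRAY_SIZE);

#ifndef USE_ASM_ADDER   /* hypothetical build switch, not a Keil option */
/* Portable C fallback: adds two multi-word numbers, least significant
   word first, and returns the final carry (0 or 1). */
uint32_t AddNumbers(uint32_t *p_addResult, uint32_t *p_left,
                    uint32_t *p_right, uint32_t ARRAY_SIZE)
{
    uint32_t carry = 0;
    for (uint32_t i = 0; i < ARRAY_SIZE; ++i) {
        uint64_t sum = (uint64_t)p_left[i] + p_right[i] + carry;
        p_addResult[i] = (uint32_t)sum;       /* low 32 bits of the sum */
        carry = (uint32_t)(sum >> 32);        /* bit 32 is the carry out */
    }
    return carry;
}
#endif

/* Hypothetical caller elsewhere in B.c: it calls AddNumbers like any
   other C function, whichever implementation is linked in. */
uint32_t AddWide128(uint32_t result[4], uint32_t a[4], uint32_t b[4])
{
    return AddNumbers(result, a, b, 4);
}
```

A.c then calls the functions of B.c as before and never needs to know which implementation was linked.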
adder.s
        PRESERVE8
        THUMB
        AREA    |.text|, CODE, READONLY

; M0 version
AddNumbers PROC
        EXPORT  AddNumbers
        ; R0 = p_addResult
        ; R1 = p_left
        ; R2 = p_right
        ; R3 = size
        push    {r4, r5}
        movs    r5, #0          ; clear carry
_AddLoop
        ldm     r1!, { r4 }
        lsrs    r5, #1          ; get carry from low order R5
        ldm     r2!, { r5 }
        adcs    r4, r5
        stm     r0!, { r4 }
        adcs    r5, r5          ; get carry into R5
        subs    r3, #1
        bne     _AddLoop
        movs    r0, #1
        ands    r0, r5          ; return carry in R0
        pop     {r4, r5}
        bx      lr
        ENDP    ; AddNumbers
        END
Thanks for this snippet. Had a few questions
1. Where is the association done between the function params and the registers?
Specifically, where is this binding done?
        ; R0 = p_addResult
        ; R1 = p_left
        ; R2 = p_right
        ; R3 = size
2. Penned my understanding of the logic. Learning to fish here!
        push    {r4, r5}        // Push whatever is in r4 and r5 onto the stack.
        movs    r5, #0          ; clear carry // Clear the carry bit in register r5, but why?
_AddLoop
        ldm     r1!, { r4 }     // Copy value from r1 to r4
        lsrs    r5, #1          ; get carry from low order R5 // Shift right by 1? What is this achieving?
        ldm     r2!, { r5 }     // Copy value from r2 to r5
        adcs    r4, r5          // Add with carry r4 and r5
        stm     r0!, { r4 }     // Store the result in r4 to r0
        adcs    r5, r5          ; get carry into R5 // ?
        subs    r3, #1          // Decrement size? However, where are we advancing the pointers to point to the next array value? Where are we going from p_left[8] to using p_left[7] while iterating?
        bne     _AddLoop        // Loop back
        movs    r0, #1          // Moving 1 into r0.
        ands    r0, r5          ; return carry in R0 // AND'ing with r5, but why did we do this?
        pop     {r4, r5}
        bx      lr
Some questions(?) inlined within code
        PRESERVE8
        THUMB
        AREA    |.text|, CODE, READONLY

; M0 version
AddNumbers PROC
        EXPORT  AddNumbers
        ; R0 = p_addResult
        ; R1 = p_left
        ; R2 = p_right
        ; R3 = size
        push    {r4, r5}        // Push whatever is in r4 and r5 onto the stack
        movs    r5, #0          ; clear carry
_AddLoop
        ldm     r1!, { r4 }     // Load p_left into r4
        lsrs    r5, #1          ; get carry from low order R5 // Shift r5 right by 1? Perhaps I am missing the crux of addition with carry. The adcs below adds r4 and r5, so the carry is part of r5 and we're trying to extract it here?
        ldm     r2!, { r5 }     // Load p_right into r5
        adcs    r4, r5          // OK
        stm     r0!, { r4 }     // Result of addition into r0
        adcs    r5, r5          ; get carry into R5 // ? What does this achieve?
        subs    r3, #1          // Decrement size.
        bne     _AddLoop        // Loop again. However, where do we go from index to index - 1 for the C arrays?
        movs    r0, #1          // ? Why
        ands    r0, r5          ; return carry in R0 // ? ANDing r0 and r5 to extract the carry flag, but why do we do this?
        pop     {r4, r5}        // Restore context
        bx      lr              // Branch with link, but what's lr?
        ENDP    ; AddNumbers
        END
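It may help to read the loop as plain C. Nothing in the source file does the binding: under the ARM procedure call standard (AAPCS) the compiler passes the first four arguments in R0 to R3 and expects the result in R0, and R4/R5 must be preserved by the callee, hence the push/pop. `bx lr` simply returns, since LR holds the return address. The `!` on ldm/stm is write-back: the pointer register is advanced by one word after each access, so the arrays are walked upward from index 0 without any explicit index. The sketch below is a register-by-register model, not the vendor's or the original author's code; AddNumbersModel and the variable c (standing in for the CPU carry flag) are names made up for illustration. Like the assembler loop, it assumes a size of at least 1.

```c
#include <stdint.h>

/* C model of the M0 loop; parameter names mirror the argument registers. */
uint32_t AddNumbersModel(uint32_t *r0, const uint32_t *r1,
                         const uint32_t *r2, uint32_t r3)
{
    uint32_t r4;
    uint32_t r5 = 0;    /* movs r5, #0 : saved carry (bit 0) starts at 0 */
    uint32_t c;         /* models the carry flag in the status register  */

    do {
        r4 = *r1++;                     /* ldm r1!, {r4} : load, advance p_left  */
        c = r5 & 1u;                    /* lsrs r5, #1   : bit 0 of r5 moves     */
        r5 >>= 1;                       /*                 into the carry flag   */
        r5 = *r2++;                     /* ldm r2!, {r5} : load, advance p_right */
        uint64_t sum = (uint64_t)r4 + r5 + c;   /* adcs r4, r5 : r4 + r5 + C     */
        r4 = (uint32_t)sum;
        c = (uint32_t)(sum >> 32);      /* new carry flag                        */
        *r0++ = r4;                     /* stm r0!, {r4} : store, advance result */
        r5 = (r5 << 1) | c;             /* adcs r5, r5   : carry saved in bit 0  */
        r3--;                           /* subs r3, #1   : this overwrites the   */
                                        /* flags, which is why r5 keeps a copy   */
    } while (r3 != 0);                  /* bne _AddLoop                          */

    return r5 & 1u;                     /* movs r0, #1 ; ands r0, r5             */
}
```

The lsrs/adcs r5, r5 pair exists only because subs changes the flags: the carry from one word has to be parked in a register across the loop counter update and restored before the next adcs. The final AND masks off everything except that parked bit, so the function returns 0 or 1.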