Hi, I am trying to rewrite the ARM assembly code generated by gcc in release mode (for optimization purposes). The code below is in a loop, so it gets executed many times. Can someone please let me know how I can optimize it for cycles? Are there any bottlenecks in this code? Thanks in advance.
Assembly code:
ldr   r4, [r1, #8]
ldr   r6, [r1]
ldr   r7, [r7, #-12]
and   r11, r4, #7
ldr   r6, [r6, r4, lsr #3]
rev   r6, r6
lsl   r6, r6, r11
lsr   r6, r6, #23
lsl   r6, r6, #2
add   r11, r7, r6
ldrsh r6, [r7, r6]
ldrsh r7, [r11, #2]
add   r4, r7, r4
ldr   r7, [r1, #16]
cmp   r4, r7
strls r4, [r1, #8]
strhi r7, [r1, #8]
It is better to start from the C code when optimizing; there's no good indication here of what you're hoping the code will do. Changing the data structure or algorithm a bit is often the best path, and assembly should only be used if you are really desperate and the C can't be improved. Also, you haven't included the whole loop: there's no branch. At a quick glance, though, I'd wonder what 'ldrsh r6, [r7, r6]' is doing, as r6 doesn't seem to be used afterwards. Also, is there a big difference in how often each of the two predicated stores at the end is executed?
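For what it's worth, reading the posted instructions back into C, my best guess is that the fragment peeks 9 bits from a big-endian bitstream, uses them to index a table of 4-byte entries, advances the bit position by the entry's second halfword, and clamps the position to a limit. The sketch below is only that guess: the struct layouts and the names bit_reader, vlc_entry, load_be32 and read_symbol are all made up for illustration and are not taken from the original source.

```c
#include <stdint.h>

/* Hypothetical 4-byte table entry: two signed halfwords, which would match
 * the "lsl #2" scaling and the two ldrsh loads in the posted fragment. */
typedef struct {
    int16_t value;   /* first halfword (the r6 result, used later in the loop) */
    int16_t length;  /* second halfword, added to the bit position */
} vlc_entry;

/* Hypothetical reader state; the real structure reached through r1 clearly
 * has other fields between these, which are not modelled here. */
typedef struct {
    const uint8_t *buf;   /* start of the bitstream */
    uint32_t pos;         /* current position, in bits */
    uint32_t limit;       /* largest allowed position, in bits */
} bit_reader;

/* Portable big-endian 32-bit load; stands in for the ldr + rev pair. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Looks up the next 9 bits in the table and advances the reader.
 * Assumes at least 4 readable bytes starting at buf + (pos >> 3). */
static int16_t read_symbol(bit_reader *br, const vlc_entry *table)
{
    uint32_t pos   = br->pos;
    uint32_t word  = load_be32(br->buf + (pos >> 3));
    uint32_t index = (word << (pos & 7)) >> 23;      /* top 9 bits after alignment */
    const vlc_entry *e = &table[index];
    uint32_t next  = pos + (uint32_t)e->length;      /* sign-extended add, as ldrsh + add */

    /* Unsigned minimum: one store instead of the strls/strhi pair. */
    br->pos = (next > br->limit) ? br->limit : next;
    return e->value;
}
```

If this matches your C, the things to look at from the C side would be whether the table pointer and the limit can be kept in locals across the loop rather than reloaded every iteration, and whether the position needs to be stored back to memory on every pass at all.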
Thanks daith for the response. Actually, the assembly above is a small portion of a loop. You are right about r6, but it is used in the later part of the loop, which I have not included in the post. I have done the C optimization and reached a saturation point, so I have started writing assembly. One more thing: can I use r13 (link register) to store data, as I am running out of registers? Will it cause problems for any operations?
Do you mean the link register r14? If you're willing to ignore the procedure call standard and do without a good debug traceback, then you only need to worry about the sp, which is r13. Your system may require some other register to be preserved too, but not many do. The sp is required in case you need to handle a signal or, on Cortex-M, to save the registers when an interrupt happens. It is best not to mess with sp.
Yes, I mean the link register r14 (sorry for the mistake). I am not using any branch-with-link instructions in the function, so I think it is safe to use the link register (of course, I will take care of pushing it at the entry and popping it at the exit of the function). I agree with you that sp is not something to mess with. Also, could you please let me know whether there is any free software that will do pipeline stall analysis, instead of manually checking each instruction? Thanks