Here are the three pieces of code I need to change to shorten execution time.
Any help will be greatly appreciated.
I am a newbie to assembly language programming.
I need to make a few changes to the code to reduce execution time.
I have managed to replace the LDRs with a single LDMIA.
Can anyone point me in the right direction, please?
Here is a sample of the code, peterharris:
; copy and process blocks of 8 words
block_loop
LDMIA r0!,{r5-r12} ; get 8 words to copy as a block
MOV r4,r5 ; get first item
BL data_processing ; process first item
MOV r5,r4 ; keep first item
MOV r4,r6 ; get second item
BL data_processing ; process second item
MOV r6,r4 ; keep second item
MOV r4,r7 ; get third item
BL data_processing ; process third item
MOV r7,r4 ; keep third item
MOV r4,r8 ; get fourth item
BL data_processing ; process fourth item
MOV r8,r4 ; keep fourth item
MOV r4,r9 ; get fifth item
BL data_processing ; process fifth item
MOV r9,r4 ; keep fifth item
MOV r4,r10 ; get sixth item
BL data_processing ; process sixth item
MOV r10,r4 ; keep sixth item
MOV r4,r11 ; get seventh item
BL data_processing ; process seventh item
MOV r11,r4 ; keep seventh item
MOV r4,r12 ; get eighth item
BL data_processing ; process eighth item
MOV r12,r4 ; keep eighth item
STMIA r1!,{r5-r12} ; copy the 8 words
SUBS r3,r3,#1 ; move on to the next block
BNE block_loop ; continue until last block reached
PLEASE HELP!
Thanks Peter, I'm not quite sure what you mean by "branch overhead".
I'm thinking along the lines of doing the data processing in one single loop, so it doesn't keep getting called.
I'm really confused, to be honest.
Ignore the current code, and design the solution from a clean sheet; the best optimizations are those which solve the problem a different way rather than trying to move instructions about.
If someone told you that you needed to double four numbers as quickly as possible, how would you do it?
Unless you like making your code really convoluted you would probably end up with something simple like:
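LDMIA src, {r0-r3} ; load four words (src = register holding the source address)
ADD r0, r0, r0 ; double each value in place
ADD r1, r1, r1
ADD r2, r2, r2
ADD r3, r3, r3
STMIA dst, {r0-r3} ; store four results (dst = register holding the destination address)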
No moves, no branches. Can you apply the same principle to your code?
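As a rough sketch of what that could look like for the posted block loop, assume purely for illustration that data_processing just doubles the value it is given. The real routine is not shown in this thread, so the ADD lines below are placeholders for whatever it actually computes:

```asm
; copy and process blocks of 8 words, with the processing done in place
; ASSUMPTION: "double the value" stands in for the real body of data_processing
block_loop
LDMIA r0!,{r5-r12} ; get 8 words to copy as a block
ADD r5,r5,r5 ; process first item in place (placeholder operation)
ADD r6,r6,r6 ; process second item
ADD r7,r7,r7 ; process third item
ADD r8,r8,r8 ; process fourth item
ADD r9,r9,r9 ; process fifth item
ADD r10,r10,r10 ; process sixth item
ADD r11,r11,r11 ; process seventh item
ADD r12,r12,r12 ; process eighth item
STMIA r1!,{r5-r12} ; copy the 8 words
SUBS r3,r3,#1 ; move on to the next block
BNE block_loop ; continue until last block reached
```

Compared with the original loop, each block no longer needs the 16 MOVs and 8 BLs, nor the matching returns inside the routine. Whether the real processing can be inlined this directly depends on how long it is and how many scratch registers it needs.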
im not quite sure what you mean by "branch overhead".
Overhead = anything not helping compute the final value you want. Moves, branches, stack loads and stores, etc. are just overhead added by the "framework" needed to run the algorithm, but they are not helping generate the actual value the algorithm emits.
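Taking one item from the loop posted earlier in the thread as an example, the split looks roughly like this. The return instruction is an assumption, since the body of data_processing was not posted:

```asm
MOV r4,r5 ; overhead: move the item to where the routine expects it
BL data_processing ; overhead: branch to the routine
; inside data_processing: the useful work on r4 (not shown in the thread)
; followed by a return branch such as BX lr (assumed), which is also overhead
MOV r5,r4 ; overhead: move the result back
```

Only the work inside the routine contributes to the result; the two moves, the call and the return are repeated for every one of the eight items in a block.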
HTH, Pete
Thanks Pete, let me have a go at it and then I will let you know how far I get.