Here are the 3 pieces of code I need to change to shorten execution time.
Any help will be greatly appreciated.
Thanks Peter, I'm not quite sure what you mean by "branch overhead".
I'm thinking along the lines of doing the data processing in one single loop, so it doesn't keep getting called.
I'm really confused, to be honest.
Ignore the current code, and design the solution from a clean sheet; the best optimizations are those which solve the problem a different way rather than trying to move instructions about.
If someone told you you needed to double four numbers as quickly as possible how would you do it?
Unless you like making your code really convoluted you would probably end up with something simple like:
    LDMIA src, {r0-r3}
    ADD   r0, r0, r0
    ADD   r1, r1, r1
    ADD   r2, r2, r2
    ADD   r3, r3, r3
    STMIA dst, {r0-r3}
No moves, no branches. Can you apply the same principle to your code?
I'm not quite sure what you mean by "branch overhead".
Overhead = anything not helping compute the final value you want. Moves, branches, stack loads and stores, etc. are just overhead added by the "framework" needed to run the algorithm; they do not help generate the actual value the algorithm emits.
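To make the idea of overhead concrete, here is a rough C model of the "double four numbers" example. It is only a sketch: the function names `double4_loop` and `double4_straight` are made up for illustration, and a C compiler may well unroll the loop by itself, so the comments describe what a naive translation would do rather than what any particular compiler emits.

```c
#include <stdint.h>

/* Loop version: on top of the one useful add per element, a naive
   translation also pays for a counter update, a compare and a branch
   on every iteration. That extra work is the overhead. */
void double4_loop(const uint32_t *src, uint32_t *dst)
{
    for (int i = 0; i < 4; i++) {
        dst[i] = src[i] + src[i];
    }
}

/* Straight-line version, mirroring the LDM / four ADDs / STM sequence:
   load all four values, double each one, store all four. */
void double4_straight(const uint32_t *src, uint32_t *dst)
{
    uint32_t a = src[0], b = src[1], c = src[2], d = src[3]; /* like LDMIA */

    a += a; /* like ADD r0, r0, r0 */
    b += b; /* like ADD r1, r1, r1 */
    c += c; /* like ADD r2, r2, r2 */
    d += d; /* like ADD r3, r3, r3 */

    dst[0] = a; dst[1] = b; dst[2] = c; dst[3] = d;          /* like STMIA */
}
```

Both functions produce the same four results; the difference is only in how much work surrounds the four additions.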
HTH, Pete
Thanks Pete, let me have a go at it and then I will let you know how far I get.
Hi Pete, is this what you mean?
I have removed all the branching instructions.
; Perform block copying of data words from one memory location to another
; Before copying, the values are divided by 2 and then saturated to a maximum
; value of 5.
; It can be assumed that the data values are non-negative

; set up the exception addresses
        THUMB
        AREA    RESET, CODE, READONLY
        EXPORT  __Vectors
        EXPORT  Reset_Handler
__Vectors
        DCD     0x00180000              ; top of the stack
        DCD     Reset_Handler           ; reset vector - where the program starts

        AREA    Task2b_Code, CODE, READONLY
Reset_Handler
        ENTRY
num_words EQU   (end_source-source)/4   ; number of words to copy

start
        LDR     r0,=source              ; point to the start of the area of memory to copy from
        LDR     r1,=dest                ; point to the start of the area of memory to copy to
        MOV     r2,#num_words           ; get the number of words to copy

; find out how many blocks of 8 words need to be copied - it is assumed
; that it faster to load 8 data items at a time, rather than load
; individually
block
        MOVS    r3,r2,LSR #3            ; find the number of blocks of 8 words
        BEQ     individ                 ; if no blocks to copy, just copy individual words

; copy and process blocks of 8 words
block_loop
        LDMIA   r0!,{r5-r12}            ; get 8 words to copy as a block
        CMP     r5,#10                  ; check whether saturation is needed
        MOVLT   r5,r5,LSR #1            ; perform scaling
        MOVLE   r5,#5                   ; saturate to 5
        CMP     r6,#10                  ; check whether saturation is needed
        MOVLT   r6,r6,LSR #1            ; perform scaling
        MOVLE   r6,#5                   ; saturate to 5
        CMP     r7,#10                  ; check whether saturation is needed
        MOVLT   r7,r7,LSR #1            ; perform scaling
        MOVLE   r7,#5                   ; saturate to 5
        CMP     r8,#10                  ; check whether saturation is needed
        MOVLT   r8,r8,LSR #1            ; perform scaling
        MOVLE   r8,#5                   ; saturate to 5
        CMP     r9,#10                  ; check whether saturation is needed
        MOVLT   r9,r9,LSR #1            ; perform scaling
        MOVLE   r9,#5                   ; saturate to 5
        CMP     r10,#10                 ; check whether saturation is needed
        MOVLT   r10,r10,LSR #1          ; perform scaling
        MOVLE   r10,#5                  ; saturate to 5
        CMP     r11,#10                 ; check whether saturation is needed
        MOVLT   r11,r11,LSR #1          ; perform scaling
        MOVLE   r11,#5                  ; saturate to 5
        CMP     r12,#10                 ; check whether saturation is needed
        MOVLT   r12,r12,LSR #1          ; perform scaling
        MOVLE   r12,#5                  ; saturate to 5
        STMIA   r1!,{r5-r12}            ; copy the 8 words
        SUBS    r3,r3,#1                ; move on to the next block
        BNE     block_loop              ; continue until last block reached

; there may now be some data items available (fewer than 8)
; find out how many of these individual words need to be copied
individ
        ANDS    r3,r2,#7                ; find the number of words that remain to copy individually
        BEQ     exit                    ; skip individual copying if none remains

; copy the excess of words
individ_loop
        LDR     r4,[r0],#4              ; get next word to copy
        CMP     r4,#10                  ; check whether saturation is needed
        MOVLT   r4,r4,LSR #1            ; perform scaling
        MOV     r4,#5                   ; saturate to 5

        STR     r4,[r1],#4              ; copy the word
        SUBS    r3,r3,#1                ; move on to the next word
        BNE     individ_loop            ; continue until the last word reached

; languish in an endless loop once all is done
exit
        B       exit

; subroutine to scale a value by 0.5 and then saturate values to a maximum of 5

        AREA    Task2b_ROData, DATA, READONLY
source                                  ; some data to copy
        DCD     1,2,3,4,5,6,7,8,9,10,11,0,4,6,12,15,13,8,5,4,3,2,1,6,23,11,9,10
end_source

        AREA    Task2b_RWData, DATA, READWRITE
dest                                    ; copy to this area of memory
        SPACE   end_source-source
end_dest
        END
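When checking a listing like this, it can help to have a plain reference model of what the header comment asks for, so the contents of `dest` can be compared word by word. The following C sketch is an assumption about the intended behaviour (halve each word, truncating as `LSR #1` does, then clamp to 5); the names `source_words`, `scale_and_saturate` and `reference_copy` are made up for illustration and do not come from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* The 28 words from the `source` area of the listing. */
const uint32_t source_words[] = {
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 0, 4, 6,
    12, 15, 13, 8, 5, 4, 3, 2, 1, 6, 23, 11, 9, 10
};

#define NUM_WORDS (sizeof source_words / sizeof source_words[0])

/* Halve one value (truncating, like LSR #1) and clamp the result to 5. */
uint32_t scale_and_saturate(uint32_t x)
{
    uint32_t half = x >> 1;
    return half > 5 ? 5 : half;
}

/* Fill dst[0..n-1] with the processed copy of src[0..n-1]. */
void reference_copy(const uint32_t *src, uint32_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = scale_and_saturate(src[i]);
    }
}
```

Under that reading of the task, calling `reference_copy(source_words, out, NUM_WORDS)` gives 0, 1, 1, 2, 2, 3, 3, 4 for the first eight words, which is what the first block written to `dest` should contain.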
Hey peterharris,
I'm getting the wrong answers with the above changes.
Any idea why?
Hi Ali, can you please explain further why it's wrong? I don't fully understand.
Hi
In individ_loop:

    MOV r4,#5  ---------------->  MOVLE r4,#5

This is wrong:

    CMP   rx,#10
    MOVLT rx,rx,LSR #1  -------------->  rx=5 (always)
    MOV   rx,#5
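It may be worth checking the condition code as well as the missing suffix. `MOV` without the `S` suffix leaves the flags alone, so both conditional moves test the result of the original `CMP rx,#10`. Below is a small C model of three variants of the sequence, written for the non-negative data assumed in the listing. The function names are made up, and the `MOVGE` variant is an assumption about what the "saturate to 5" comment intends, not something posted in the thread.

```c
#include <stdint.h>

/* CMP rx,#10 / MOVLT rx,rx,LSR #1 / MOV rx,#5
   The final MOV is unconditional, so the result is always 5. */
uint32_t seq_mov(uint32_t x)
{
    uint32_t orig = x;        /* the value CMP compared against 10 */
    if (orig < 10) x >>= 1;   /* MOVLT: scale */
    x = 5;                    /* MOV: always overwrites */
    return x;
}

/* CMP rx,#10 / MOVLT rx,rx,LSR #1 / MOVLE rx,#5
   LE is still true whenever LT was, so every scaled value is overwritten
   with 5, and inputs above 10 are left untouched. */
uint32_t seq_movle(uint32_t x)
{
    uint32_t orig = x;
    if (orig < 10) x >>= 1;   /* MOVLT: scale */
    if (orig <= 10) x = 5;    /* MOVLE: fires for 0..10 */
    return x;
}

/* CMP rx,#10 / MOVLT rx,rx,LSR #1 / MOVGE rx,#5
   LT and GE are mutually exclusive: scale below 10, saturate otherwise. */
uint32_t seq_movge(uint32_t x)
{
    uint32_t orig = x;
    if (orig < 10) x >>= 1;   /* MOVLT: scale */
    if (orig >= 10) x = 5;    /* MOVGE: saturate */
    return x;
}
```

If this model is right, only the `MOVGE` form gives "halve, then cap at 5", and the same check would apply to the `MOVLE` instructions in block_loop.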