What changes can I make to ARM Cortex-M3 assembly source code to shorten execution time?

Here are the 3 source files I need to change to shorten execution time.

Any help will be greatly appreciated.

5734.zip
  • Given the names of the files, are you giving us your homework to do?

    What have you tried so far, and how did it go?

  • I am a newbie to assembly language programming.

    I need to make a few changes to the code to reduce execution time.

    I have managed to replace the LDRs with a single LDMIA.

    Can anyone point me in the right direction, please?

    Here is a sample of the code, peterharris:

      ; copy and process blocks of 8 words
    block_loop
      LDMIA r0!,{r5-r12}  ; get 8 words to copy as a block
      MOV r4,r5           ; get first item
      BL data_processing  ; process first item
      MOV r5,r4           ; keep first item
      MOV r4,r6           ; get second item
      BL data_processing  ; process second item
      MOV r6,r4           ; keep second item
      MOV r4,r7           ; get third item
      BL data_processing  ; process third item
      MOV r7,r4           ; keep third item
      MOV r4,r8           ; get fourth item
      BL data_processing  ; process fourth item
      MOV r8,r4           ; keep fourth item
      MOV r4,r9           ; get fifth item
      BL data_processing  ; process fifth item
      MOV r9,r4           ; keep fifth item
      MOV r4,r10          ; get sixth item
      BL data_processing  ; process sixth item
      MOV r10,r4          ; keep sixth item
      MOV r4,r11          ; get seventh item
      BL data_processing  ; process seventh item
      MOV r11,r4          ; keep seventh item
      MOV r4,r12          ; get eighth item
      BL data_processing  ; process eighth item
      MOV r12,r4          ; keep eighth item
      STMIA r1!,{r5-r12}  ; copy the 8 words
      SUBS r3,r3,#1       ; move on to the next block
      BNE block_loop      ; continue until last block reached

    PLEASE HELP!

  • The moves and branches don't add any value to the processing - they are just overhead because of the algorithm you are using, and the use of the data_processing function adds branch overhead.

    How would you consider removing it?

    If you have a short function in C what is the common means to remove the overhead of the function call, and can you apply this technique here?
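To make the C analogy concrete, here is a minimal sketch. It assumes, as the listing posted later in this thread describes, that data_processing halves a value and caps it at 5; the names process_item and process_block are hypothetical. Marking a short function `static inline` invites the compiler to paste its body at each call site, so the call and return disappear. The assembly counterpart would be writing the body out in place of each BL.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for data_processing: halve, then cap at 5.
   The exact operation is an assumption taken from the later listing. */
static inline uint32_t process_item(uint32_t x)
{
    x >>= 1;                      /* scale by 0.5 */
    return (x > 5u) ? 5u : x;     /* saturate to a maximum of 5 */
}

/* With the helper inlined, the loop body contains only the real work:
   no argument shuffling, no call, no return. */
static void process_block(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = process_item(src[i]);
    }
}
```

Whether a given compiler actually inlines the call is its own decision; in hand-written assembly the choice is entirely yours.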

  • Thanks Peter, I'm not quite sure what you mean by "branch overhead".

    I'm thinking along the lines of doing the data processing in one single loop, so it doesn't keep getting called.

    I'm really confused, to be honest.

  • Ignore the current code, and design the solution from a clean sheet; the best optimizations are those which solve the problem a different way rather than trying to move instructions about.

    If someone told you you needed to double four numbers as quickly as possible how would you do it?

    Unless you like making your code really convoluted you would probably end up with something simple like:

    LDMIA r4, {r0-r3}    ; r4 = src (pointer to the four input words)
    ADD r0, r0, r0
    ADD r1, r1, r1
    ADD r2, r2, r2
    ADD r3, r3, r3
    STMIA r5, {r0-r3}    ; r5 = dst (pointer to the four output words)

    No moves, no branches. Can you apply the same principle to your code?
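For readers more at home in C, the same "load everything, work in place, store everything" shape can be sketched as follows. The function name double4 and the local variable names are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Double four words with no helper calls and no copying between temporaries.
   Each local plays the role of one register in the assembly version. */
static void double4(uint32_t dst[4], const uint32_t src[4])
{
    /* one block load (the LDMIA) */
    uint32_t a = src[0], b = src[1], c = src[2], d = src[3];

    a += a;   /* each value is processed where it already lives */
    b += b;
    c += c;
    d += d;

    /* one block store (the STMIA) */
    dst[0] = a; dst[1] = b; dst[2] = c; dst[3] = d;
}
```

The point is that every statement contributes to the result; nothing exists only to move data to where a subroutine expects it.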

    You wrote: "I'm not quite sure what you mean by 'branch overhead'."

    Overhead = anything not helping compute the final value you want. Moves, branches, stack loads and stores, etc. are just overhead added by the "framework" needed to run the algorithm, but they are not helping generate the actual value the algorithm emits.

    HTH,
    Pete

  • Thanks Pete, let me have a go at it and then I will let you know how far I get.

  • Hi Pete, is this what you mean?

    I have removed all the branching instructions.

    ; Perform block copying of data words from one memory location to another
      ; Before copying, the values are divided by 2 and then saturated to a maximum
      ; value of 5.
      ; It can be assumed that the data values are non-negative

      ; set up the exception addresses
    ;  THUMB
      AREA RESET, CODE, READONLY
      EXPORT  __Vectors
      EXPORT Reset_Handler
    __Vectors
      DCD 0x00180000     ; top of the stack
      DCD Reset_Handler  ; reset vector - where the program starts

      AREA Task2b_Code, CODE, READONLY
    Reset_Handler
      ENTRY
     
    num_words EQU (end_source-source)/4  ; number of words to copy

    start
      LDR r0,=source     ; point to the start of the area of memory to copy from
      LDR r1,=dest       ; point to the start of the area of memory to copy to
      MOV r2,#num_words  ; get the number of words to copy
     
      ; find out how many blocks of 8 words need to be copied - it is assumed
      ; that it is faster to load 8 data items at a time, rather than load
      ; individually
    block
      MOVS r3,r2,LSR #3  ; find the number of blocks of 8 words
      BEQ individ        ; if no blocks to copy, just copy individual words
     
      ; copy and process blocks of 8 words
    block_loop
      LDMIA r0!,{r5-r12}  ; get 8 words to copy as a block
     
      CMP r5,#10           ; check whether saturation is needed
      MOVLT r5,r5,LSR #1     ; perform scaling
      MOVLE r5,#5            ; saturate to 5
     
      CMP r6,#10           ; check whether saturation is needed
      MOVLT r6,r6,LSR #1     ; perform scaling
      MOVLE r6,#5            ; saturate to 5
     
      CMP r7,#10           ; check whether saturation is needed
      MOVLT r7,r7,LSR #1     ; perform scaling
      MOVLE r7,#5            ; saturate to 5
     
      CMP r8,#10           ; check whether saturation is needed
      MOVLT r8,r8,LSR #1     ; perform scaling
      MOVLE r8,#5            ; saturate to 5
     
      CMP r9,#10           ; check whether saturation is needed
      MOVLT r9,r9,LSR #1     ; perform scaling
      MOVLE r9,#5            ; saturate to 5
      
      CMP r10,#10           ; check whether saturation is needed
      MOVLT r10,r10,LSR #1     ; perform scaling
      MOVLE r10,#5            ; saturate to 5
     
      CMP r11,#10           ; check whether saturation is needed
      MOVLT r11,r11,LSR #1     ; perform scaling
      MOVLE r11,#5            ; saturate to 5
     
      CMP r12,#10           ; check whether saturation is needed
      MOVLT r12,r12,LSR #1     ; perform scaling
      MOVLE r12,#5            ; saturate to 5
     
      STMIA r1!,{r5-r12}  ; copy the 8 words
      SUBS r3,r3,#1       ; move on to the next block
      BNE block_loop      ; continue until last block reached

      ; there may now be some data items available (fewer than 8)
      ; find out how many of these individual words need to be copied
    individ
      ANDS r3,r2,#7   ; find the number of words that remain to copy individually
      BEQ exit        ; skip individual copying if none remains

      ; copy the excess of words
    individ_loop
      LDR r4,[r0],#4      ; get next word to copy
     
      CMP r4,#10           ; check whether saturation is needed
      MOVLT r4,r4,LSR #1     ; perform scaling
      MOV r4,#5            ; saturate to 5

      STR r4,[r1],#4      ; copy the word
      SUBS r3,r3,#1       ; move on to the next word
      BNE individ_loop    ; continue until the last word reached

      ; languish in an endless loop once all is done
    exit   
      B exit

      ; subroutine to scale a value by 0.5 and then saturate values to a maximum of 5

      AREA Task2b_ROData, DATA, READONLY
    source  ; some data to copy
      DCD 1,2,3,4,5,6,7,8,9,10,11,0,4,6,12,15,13,8,5,4,3,2,1,6,23,11,9,10
    end_source

      AREA Task2b_RWData, DATA, READWRITE
    dest  ; copy to this area of memory
      SPACE end_source-source
    end_dest
      END
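When checking the contents of dest, a reference model in C can help. The sketch below assumes the intended behaviour is exactly what the listing's header comments state (divide by 2, then saturate to a maximum of 5, inputs non-negative). The function names scale_saturate and copy_scaled are hypothetical. It mirrors the listing's structure of whole blocks of 8 followed by the leftover words.

```c
#include <stdint.h>
#include <stddef.h>

/* Intended per-item operation, taken from the listing's comments. */
static uint32_t scale_saturate(uint32_t x)
{
    uint32_t half = x >> 1;            /* divide by 2 */
    return (half > 5u) ? 5u : half;    /* saturate to a maximum of 5 */
}

/* Same split as the assembly: blocks of 8 words, then the remainder. */
static void copy_scaled(uint32_t *dst, const uint32_t *src, size_t num_words)
{
    size_t blocks = num_words >> 3;    /* like MOVS r3,r2,LSR #3 */
    size_t rest   = num_words & 7u;    /* like ANDS r3,r2,#7 */

    for (size_t b = 0; b < blocks; b++) {
        for (size_t i = 0; i < 8; i++) {
            *dst++ = scale_saturate(*src++);
        }
    }
    for (size_t i = 0; i < rest; i++) {
        *dst++ = scale_saturate(*src++);
    }
}
```

For the 28 words in source, this model yields 0,1,1,2,2,3,3,4,4,5,5,0,2,3,5,5,5,4,2,2,1,1,0,3,5,5,4,5, which can be compared against dest in the debugger's memory view.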

  • Hey peterharris,

    I'm getting the wrong answers with the above changes.

    Any idea why?

  • Hi Ali, can you please explain further why it's wrong? I don't fully understand.

  • Hi,

    In individ_loop, change:

    MOV r4,#5     ---------------->     MOVLE r4,#5

  • Hi,

    This is wrong:

           CMP rx,#10
           MOVLT rx,rx,LSR #1
           MOV rx,#5                   --------------> rx=5  (always)
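To see what each variant of the three-instruction sequence computes, it can be modelled in C. This is a sketch under two assumptions: the intended result is "halve, then cap at 5" as the listing's comments say, and a plain MOV (without the S suffix) leaves the flags from CMP untouched, so both conditions are judged against the original value. The GE variant is a suggestion of this sketch, not something proposed in the thread, and the function names are hypothetical.

```c
#include <stdint.h>

/* CMP x,#10 ; MOVLT x,x,LSR #1 ; MOV x,#5   (last move unconditional) */
static int32_t seq_mov(int32_t x)
{
    int lt = (x < 10);      /* condition fixed by CMP */
    if (lt) x >>= 1;        /* MOVLT: scale */
    x = 5;                  /* MOV: always runs, overwrites the result */
    return x;
}

/* CMP x,#10 ; MOVLT x,x,LSR #1 ; MOVLE x,#5 */
static int32_t seq_movle(int32_t x)
{
    int lt = (x < 10), le = (x <= 10);
    if (lt) x >>= 1;        /* MOVLT: scale */
    if (le) x = 5;          /* MOVLE: fires for every value that was scaled */
    return x;
}

/* CMP x,#10 ; MOVLT x,x,LSR #1 ; MOVGE x,#5 */
static int32_t seq_movge(int32_t x)
{
    int lt = (x < 10), ge = (x >= 10);
    if (lt) x >>= 1;        /* MOVLT: scale values below 10 */
    if (ge) x = 5;          /* MOVGE: saturate values of 10 and above */
    return x;
}
```

Under these assumptions, only the GE form matches the stated intent: LT and GE are exact opposites, so precisely one of the two moves executes for any input.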

  • Sorry, what do MOVLT and MOVLE stand for?
  • It means read the manual, but perhaps someone can help by pointing you towards the manual so you can learn :)
  • I didn't see it in the instruction set manual. Please could someone help?
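MOVLT and MOVLE will not appear as separate entries because they are the MOV instruction with a condition-code suffix: LT means "signed less than" and LE means "signed less than or equal". The manual's section on condition codes or conditional execution lists the suffixes. As a rough sketch of how those conditions are decided from the flags that CMP sets, here is a small C model; the Flags type and function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* The four condition flags: Negative, Zero, Carry, oVerflow. */
typedef struct { bool n, z, c, v; } Flags;

/* Flags produced by CMP a, b, which computes a - b and discards the result. */
static Flags cmp_flags(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t r  = ua - ub;
    Flags f;
    f.n = (r >> 31) != 0;                        /* result is negative */
    f.z = (r == 0);                              /* result is zero */
    f.c = (ua >= ub);                            /* no borrow occurred */
    f.v = (((ua ^ ub) & (ua ^ r)) >> 31) != 0;   /* signed overflow */
    return f;
}

/* Signed condition codes as functions of the flags. */
static bool cond_lt(Flags f) { return f.n != f.v; }            /* LT */
static bool cond_le(Flags f) { return f.z || (f.n != f.v); }   /* LE */
static bool cond_ge(Flags f) { return f.n == f.v; }            /* GE */
static bool cond_gt(Flags f) { return !f.z && (f.n == f.v); }  /* GT */
```

So after CMP r4,#10, MOVLT r4,... executes only when r4 is below 10, and MOVLE r4,... executes when r4 is 10 or below.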