Large memcpy

We are using Keil C-51 V7.06.

There is a structure defined:

typedef struct
{
  unsigned char address_bus_h;
  unsigned char address_bus_l;
  unsigned char data_bus;
}instbus_raw_t;

Copying the structure to another one with three simple assignments enlarges the code by only 9 bytes, but this is not an elegant solution:

instbus_raw.address_bus_h = instbus_raw_local.address_bus_h;
instbus_raw.address_bus_l = instbus_raw_local.address_bus_l;
instbus_raw.data_bus = instbus_raw_local.data_bus;

Using the normal library function memcpy blows up the code by 300 bytes!

memcpy(&instbus_raw,&instbus_raw_local,sizeof(instbus_raw_t));

Using our own function my_memcpy, the code increases by 167 bytes:

void *(my_memcpy)(void *s1, const void *s2, size_t n)
{
  char *su1 = (char *)s1;
  const char *su2 = (const char *)s2;

  for (; 0 < n; --n)
    *su1++ = *su2++;
  return (s1);
}

my_memcpy(&instbus_raw,&instbus_raw_local,sizeof(instbus_raw_t));

In a project using a small chip with 2k of Flash, 300 bytes for copying a few bytes is considerable!

Has anyone else noticed library functions wasting resources like this?

Regards Peter

  • Aren't you making life hard for yourself?

    What's wrong with:

    instbus_raw=instbus_raw_local;

    ?
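
For reference, here is a minimal host-side sketch of the two approaches being compared: plain structure assignment versus member-by-member copying. The helper names (copy_by_assignment, copy_by_members, copies_match) are made up for illustration. In standard C both forms leave the destination with the same contents. How much code C51 generates for each depends on the compiler version and optimisation level, so the listing file is the place to check.

```c
/* Same structure as in the original post. */
typedef struct
{
  unsigned char address_bus_h;
  unsigned char address_bus_l;
  unsigned char data_bus;
} instbus_raw_t;

/* Copy by plain structure assignment; the compiler chooses how to do it. */
void copy_by_assignment(instbus_raw_t *dst, const instbus_raw_t *src)
{
  *dst = *src;
}

/* Copy member by member, as in the original three-line version. */
void copy_by_members(instbus_raw_t *dst, const instbus_raw_t *src)
{
  dst->address_bus_h = src->address_bus_h;
  dst->address_bus_l = src->address_bus_l;
  dst->data_bus      = src->data_bus;
}

/* Returns 1 when both copies hold the same member values as the source. */
int copies_match(void)
{
  instbus_raw_t src = { 0x12, 0x34, 0x56 };
  instbus_raw_t a = { 0, 0, 0 };
  instbus_raw_t b = { 0, 0, 0 };

  copy_by_assignment(&a, &src);
  copy_by_members(&b, &src);

  return a.address_bus_h == src.address_bus_h
      && a.address_bus_l == src.address_bus_l
      && a.data_bus      == src.data_bus
      && b.address_bus_h == src.address_bus_h
      && b.address_bus_l == src.address_bus_l
      && b.data_bus      == src.data_bus;
}
```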

  • Aren't you making life hard for yourself?

    Normally not!

    Nothing is wrong with it, but it does exactly the same thing! ..and wastes the same amount of ROM!

  • Has anyone else noticed library functions wasting resources like this?

    There's always a trade-off with vendor-provided library routines. And, no matter HOW we implemented a routine, someone would be unhappy.

    The memcpy routine in the library contains a subroutine for every memory space. When memcpy runs it first figures out the source memory type and the destination memory type and calls the appropriate routine (there is a routine for data to data, data to xdata, code to data, xdata to xdata, and so on). Each of these routines uses registers and is optimized for speed. Of course, there is a size penalty.

    The memcpy you provide executes very slowly. It uses generic pointers (3 bytes each) that are stored in the default memory space (XDATA in large model). Each read and each write requires a function call into the C runtime library. However, if you only need to copy something once or if you don't need a high-speed routine, this is probably just fine.

    Jon
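
To make the cost of generic pointers concrete, here is a rough host-side C model of a byte copy that has to check the memory type on every access, which is roughly what a loop over generic pointers amounts to. Everything in it is an assumption for illustration: the arrays sim_idata, sim_xdata and sim_code merely stand in for the 8051 address spaces, PDATA is treated as the first 256 bytes of the simulated XDATA, and the names generic_ptr, generic_read, generic_write and generic_copy are invented. The memory-type codes are the ones listed in the compact_memcpy comments later in this thread. It is not the library's implementation.

```c
#include <stddef.h>

/* Memory-type codes as listed for Keil generic pointers in this thread. */
enum { MEM_IDATA = 0x00, MEM_XDATA = 0x01, MEM_PDATA = 0xFE, MEM_CODE = 0xFF };

/* Host-side stand-ins for the 8051 address spaces (sizes are arbitrary). */
unsigned char sim_idata[256];
unsigned char sim_xdata[1024];
const unsigned char sim_code[8] = { 'K', 'E', 'I', 'L', '-', '5', '1', 0 };

/* Model of a generic pointer: a memory type plus a 16-bit address. */
typedef struct
{
  unsigned char  type;
  unsigned short addr;
} generic_ptr;

/* Read one byte; the memory type is examined on every single call. */
unsigned char generic_read(generic_ptr p)
{
  switch (p.type)
  {
    case MEM_IDATA: return sim_idata[p.addr & 0xFF];
    case MEM_XDATA: return sim_xdata[p.addr % sizeof sim_xdata];
    case MEM_PDATA: return sim_xdata[p.addr & 0xFF];
    default:        return sim_code[p.addr % sizeof sim_code];
  }
}

/* Write one byte; writes to code memory are silently ignored. */
void generic_write(generic_ptr p, unsigned char value)
{
  switch (p.type)
  {
    case MEM_IDATA: sim_idata[p.addr & 0xFF] = value; break;
    case MEM_XDATA: sim_xdata[p.addr % sizeof sim_xdata] = value; break;
    case MEM_PDATA: sim_xdata[p.addr & 0xFF] = value; break;
    default:        break;
  }
}

/* Byte-at-a-time copy: two type dispatches per byte copied. */
void generic_copy(generic_ptr dst, generic_ptr src, size_t n)
{
  while (n-- > 0)
  {
    generic_write(dst, generic_read(src));
    ++dst.addr;
    ++src.addr;
  }
}
```

A routine that decides the source and destination types once, before the loop, avoids the per-byte dispatch but needs a separate loop for each combination, which is the size penalty described above.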

  • One more thing. The my_memcpy is not reentrant (the one in the library is). So, be sure you don't invoke it from both the main program and an interrupt.

    Jon
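
As background, C51 by default keeps a function's parameters and locals at fixed addresses rather than on a stack, which is why a second, overlapping call can overwrite the state of the first. The sketch below imitates that on a host compiler by making the working variables static. The names fixed_memcpy, fake_isr, isr_body and interrupted_copy_count are hypothetical, and the "interrupt" is just a callback fired once in the middle of the copy.

```c
#include <stddef.h>

/* Hypothetical hook standing in for an interrupt; fires once if set. */
void (*fake_isr)(void) = NULL;

char isr_src[2] = { 'x', 'y' };
char isr_dst[2];

/* Byte copy whose working variables are static, imitating locals that
   live at fixed addresses instead of on a stack. */
void *fixed_memcpy(void *s1, const void *s2, size_t n)
{
  static char *dst;
  static const char *src;
  static size_t left;

  dst = (char *)s1;
  src = (const char *)s2;
  left = n;

  while (left > 0)
  {
    *dst++ = *src++;
    --left;

    if (fake_isr != NULL)            /* simulated interrupt point */
    {
      void (*isr)(void) = fake_isr;
      fake_isr = NULL;
      isr();
    }
  }
  return s1;
}

/* The "interrupt" also copies, overwriting dst, src and left. */
void isr_body(void)
{
  fixed_memcpy(isr_dst, isr_src, sizeof isr_src);
}

/* Returns how many of 4 bytes the interrupted main-line copy delivered. */
int interrupted_copy_count(void)
{
  char in[4]  = { 'A', 'B', 'C', 'D' };
  char out[4] = { 0, 0, 0, 0 };
  int i;
  int copied = 0;

  fake_isr = isr_body;
  fixed_memcpy(out, in, sizeof in);

  for (i = 0; i < 4; ++i)
    if (out[i] == in[i])
      ++copied;
  return copied;
}
```

In this model the nested call finishes correctly, but it leaves the shared counter at zero, so the interrupted copy stops after its first byte.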

  • "..and wastes the same amount of ROM!"

    Last time I looked, I found that structure assignment produced very tight, inline code!

  • Here is a compact version of memcpy() that is reentrant and is reasonably fast given its size.



    compact_memcpy.c:

    #include "compact_memcpy.h"
    
    //
    //  Compact Memory Copy
    //
    //  Author: Graham Cole
    //
    //  This function copies n bytes from s2 to s1.
    //
    //  The address of memory block s1 is a Keil generic pointer in R1/R2/R3.
    //
    //      R1 - low part of address.
    //      R2 - high part of address.
    //      R3 - memory type (see below).
    //
    //  The address of memory block s2 is a Keil generic pointer in R0/R4/R5.
    //
    //      R0 - low part of address.
    //      R4 - high part of address.
    //      R5 - memory type (see below).
    //
    //  The value of s2 is loaded from parameter passing memory.
    //
    //  Where memory type is as follows:
    //
    //      0x00 - indicates idata memory.
    //      0x01 - indicates xdata memory.
    //      0xFE - indicates pdata memory.
    //      0xFF - indicates code memory.
    //
    //  The value of n is held in registers R6/R7.
    //
    //  The first lines of C code will suppress the UNUSED variable warning and the
    //  Keil C51 compiler will optimise this code out. The return() statement is
    //  compiled to an unused RET instruction.
    //
    
    #pragma ASM
    
        $REGUSE _compact_memcpy( A, B, PSW, DPH, DPL, R0, R1, R2, R3, R4, R5, R6, R7 )
    
    #pragma ENDASM
    
    void *compact_memcpy(void *s1, void *s2, int n)
    {
        s1 = s1;                                            //Suppress UNUSED
        s2 = s2;                                            //Suppress UNUSED
        n  = n;                                             //Suppress UNUSED
    
            #pragma ASM
                                                        ;
            EXTRN CODE( primary_generic_write )         ;
            EXTRN CODE( auxilliary_generic_read )       ;
                                                        ;
                    MOV     R5,s2?041+00H               ;Load s2 into registers.
                    MOV     R4,s2?041+01H               ;(auxilliary generic pointer)
                    MOV     R0,s2?041+02H               ;
                                                        ;
                    MOV     R6,n?042+00H                ;Load n into registers.
                    MOV     R7,n?042+01H                ;
                                                        ;
            compact_memcpy:                             ;
                                                        ;
                    MOV     s1?040+00H,R3               ;Save s1 into memory.
                    MOV     s1?040+01H,R2               ;(primary generic pointer)
                    MOV     s1?040+02H,R1               ;
                                                        ;
                    MOV     A,R6                        ;
                    ORL     A,R7                        ;
                    JZ      ?compact_memcpy_generic_end ;If zero bytes to copy, terminate.
                                                        ;
                    MOV     A,R7                        ;Pre increment high byte for DJNZ,
                    JZ      ?compact_memcpy_loop        ;but only if the low byte is non-zero.
                    INC     R6                          ;
                                                        ;
            ?compact_memcpy_loop:                       ;
                                                        ;
                    CALL    auxilliary_generic_read     ;Read byte at pointer s2.
                    CALL    primary_generic_write       ;Write byte to pointer s1.
                                                        ;
                    INC     R1                          ;Increment s1 generic pointer...
                    CJNE    R1,#0x00,?compact_memcpy_generic_skip_0
                    INC     R2                          ;
                                                        ;
            ?compact_memcpy_generic_skip_0:             ;
                                                        ;
                    INC     R0                          ;Increment s2 generic pointer...
                    CJNE    R0,#0x00,?compact_memcpy_generic_skip_1
                    INC     R4                          ;
                                                        ;
            ?compact_memcpy_generic_skip_1:             ;
                                                        ;
                    DJNZ    R7,?compact_memcpy_loop     ;
                    DJNZ    R6,?compact_memcpy_loop     ;..and iterate.
                                                        ;
            ?compact_memcpy_generic_end:                ;
                                                        ;
                    MOV     R3,s1?040+00H               ;Restore s1 into registers.
                    MOV     R2,s1?040+01H               ;
                    MOV     R1,s1?040+02H               ;
                    RET                                 ;
                                                        ;
            #pragma ENDASM
    
    
        return( 0 );                                    // Dummy return.
    }
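
The pair of DJNZ instructions at the end of the loop acts as a 16-bit down-counter built from two 8-bit registers. Below is a small host-side C model of that idiom, counting how many times the loop body runs. The function name djnz16_iterations is hypothetical. The point it illustrates is that the high byte should be pre-incremented only when the low byte is non-zero. If it were incremented unconditionally, a count such as 0x0100 would run 256 iterations too many.

```c
/* Model of a 16-bit count held in two 8-bit registers and decremented
   with a pair of DJNZ instructions. Only the low 16 bits of n are used.
   Returns the number of times the loop body executes. */
unsigned long djnz16_iterations(unsigned int n)
{
  unsigned char hi = (unsigned char)((n >> 8) & 0xFF);   /* like R6 */
  unsigned char lo = (unsigned char)(n & 0xFF);          /* like R7 */
  unsigned long count = 0;

  if ((hi | lo) == 0)
    return 0;                  /* nothing to copy */

  if (lo != 0)
    ++hi;                      /* a partial first pass needs one extra outer pass */

  do
  {
    do
    {
      ++count;                 /* loop body: copy one byte */
    } while (--lo != 0);       /* DJNZ on the low byte */
  } while (--hi != 0);         /* DJNZ on the high byte */

  return count;
}
```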
    

  • And, of course, you will be needing this:

    compact_memcpy.h:

    #ifndef _COMPACT_MEMCPY_H_
    
    #define _COMPACT_MEMCPY_H_
    
    void *compact_memcpy(void *s1, void *s2, int n);
    
    #endif
    

    And this:

    //
    //  Primary Generic Read
    //
    //  This function reads from a location defined by a generic pointer
    //
    //  The arguments of the generic pointer are as follows:
    //
    //      R1 - low part of address.
    //      R2 - high part of address.
    //      R3 - memory type where:
    //
    //          0x00 - indicates idata memory.
    //          0x01 - indicates xdata memory.
    //          0xFE - indicates pdata memory.
    //          0xFF - indicates code memory.
    //
    //  This function is functionally identical to, and interchangeable
    //  with, the compiler function C?CLDPTR.
    //
    
    #pragma ASM
    
    PUBLIC primary_generic_read
    
    ?PR?primary_generic_read   SEGMENT CODE
    
        RSEG  ?PR?primary_generic_read
    
    primary_generic_read:
    
            MOV     A,R3                        ;
            JNB     Acc.0,?pgr_not_dptr         ;
            MOV     DPL,R1                      ;
            MOV     DPH,R2                      ;
            JB      Acc.1,?pgr_not_xdata        ;
            MOVX    A,@DPTR                     ;
            RET                                 ;
    ?pgr_not_xdata:                             ;
            CLR     A                           ;
            MOVC    A,@A+DPTR                   ;
            RET                                 ;
    ?pgr_not_dptr:                              ;
            JNB     Acc.1,?pgr_not_pdata        ;
            MOVX    A,@R1                       ;
            RET                                 ;
    ?pgr_not_pdata:                             ;
            MOV     A,@R1                       ;
            RET                                 ;
    
    #pragma ENDASM
    
    //
    //  Primary Generic Write
    //
    //  This function writes accumulator to a location defined by a generic pointer
    //
    //  The arguments of the generic pointer are as follows:
    //
    //      R1 - low part of address.
    //      R2 - high part of address.
    //      R3 - memory type where:
    //
    //          0x00 - indicates idata memory.
    //          0x01 - indicates xdata memory.
    //          0xFE - indicates pdata memory.
    //          0xFF - indicates code memory.
    //
    
    #pragma ASM
    
    PUBLIC primary_generic_write
    
    ?PR?primary_generic_write   SEGMENT CODE
    
        RSEG  ?PR?primary_generic_write
    
    primary_generic_write:
    
            MOV     B,R3                        ;
            JNB     B.0,?pgw_not_dptr           ;
            MOV     DPL,R1                      ;
            MOV     DPH,R2                      ;
            JB      B.1,?pgw_not_xdata          ;
            MOVX    @DPTR,A                     ;
            RET                                 ;
    ?pgw_not_xdata:                             ;
            RET                                 ;
    ?pgw_not_dptr:                              ;
            JNB     B.1,?pgw_not_pdata          ;
            MOVX    @R1,A                       ;
            RET                                 ;
    ?pgw_not_pdata:                             ;
            MOV     @R1,A                       ;
            RET                                 ;
    
    #pragma ENDASM
    
    //
    //  Auxilliary Generic Read
    //
    //  This function reads from a location defined by a generic pointer
    //
    //  The arguments of the generic pointer are as follows:
    //
    //      R0 - low part of address.
    //      R4 - high part of address.
    //      R5 - memory type where:
    //
    //          0x00 - indicates idata memory.
    //          0x01 - indicates xdata memory.
    //          0xFE - indicates pdata memory.
    //          0xFF - indicates code memory.
    //
    
    #pragma ASM
    
    PUBLIC auxilliary_generic_read
    
    ?PR?auxilliary_generic_read   SEGMENT CODE
    
        RSEG  ?PR?auxilliary_generic_read
    
    auxilliary_generic_read:
    
            MOV     A,R5                        ;
            JNB     Acc.0,?agr_not_dptr         ;
            MOV     DPL,R0                      ;
            MOV     DPH,R4                      ;
            JB      Acc.1,?agr_not_xdata        ;
            MOVX    A,@DPTR                     ;
            RET                                 ;
    ?agr_not_xdata:                             ;
            CLR     A                           ;
            MOVC    A,@A+DPTR                   ;
            RET                                 ;
    ?agr_not_dptr:                              ;
            JNB     Acc.1,?agr_not_pdata        ;
            MOVX    A,@R0                       ;
            RET                                 ;
    ?agr_not_pdata:                             ;
            MOV     A,@R0                       ;
            RET                                 ;
    
    #pragma ENDASM
    
    

  • ...actually, compact_memcpy() is not reentrant.

  • Many thanks for your postings!
    The routine that we need has to be reentrant. In our project, the chip has only DATA memory, and memcpy has to copy some bytes from DATA to DATA.

  • Then you will be wanting something like this:

    //
    //  Data Memory Copy
    //
    //  Author: Graham Cole
    //
    //  This function copies n bytes from s2 to s1.
    //
    //  The address of memory block s1 is a Keil data pointer in R7.
    //
    //  The address of memory block s2 is a Keil data pointer in R5.
    //
    //  The value of n is held in register R3.
    //
    //  The first lines of C code will suppress the UNUSED variable warning and the
    //  Keil C51 compiler will optimise this code out. The return() statement is
    //  compiled to an unused RET instruction.
    //
    
    #pragma ASM
    
        $REGUSE _data_memcpy( A, PSW, R0, R1, R3 )
    
    #pragma ENDASM
    
    char data *data_memcpy(char data *s1, char data *s2, unsigned char n)
    {
        s1 = s1;                                        //Suppress UNUSED
        s2 = s2;                                        //Suppress UNUSED
        n  = n;                                         //Suppress UNUSED
    
            #pragma ASM
                                                        ;
            data_memcpy:                                ;
                                                        ;
                    MOV     A,R7                        ;Load s1 into register R0.
                    MOV     R0,A                        ;
                    MOV     A,R5                        ;Load s2 into register R1.
                    MOV     R1,A                        ;
                                                        ;
                    MOV     A,R3                        ;
                    JZ      ?data_memcpy_generic_end    ;If zero bytes to copy, terminate.
                                                        ;
            ?data_memcpy_loop:                          ;
                                                        ;
                    MOV     A,@R1                       ;Read from pointer s2.
                    MOV     @R0,A                       ;Write to pointer s1.
                                                        ;
                    INC     R1                          ;Increment s2 pointer.
                    INC     R0                          ;Increment s1 pointer.
                                                        ;
            ?data_memcpy_generic_skip_1:                ;
                                                        ;
                    DJNZ    R3,?data_memcpy_loop        ;..and iterate.
                                                        ;
            ?data_memcpy_generic_end:                   ;
                                                        ;
                                                        ;Return with s1 still in R7.
                    RET                                 ;
                                                        ;
            #pragma ENDASM
    
    
        return( 0 );                                    // Dummy return.
    }
    
    Which, of course, I have not actually tested!

  • For a bit of amusement I compiled the following code under v7.01 optimisation level 9:

    unsigned char data *my_data_memcpy(unsigned char data *dest, unsigned char data *src, unsigned char n)
    {
    	unsigned char data *temp=dest;
    
    
    	while(n)
    	{
    		*temp=*src;
    		temp++;
    		src++;
    		n--;
    	}
    
    	return(dest);
    }
    

    and found that it produces code that is two instructions shorter than Graham's data_memcpy() function.

    Sadly I was unable to convince the compiler to generate a suitably short version that was also reentrant. I also noted that the code was probably less efficient than Graham's, although I didn't actually bother to count either the instruction cycles or the total opcode byte count.

    Still, it does show that one only needs to resort to assembler when one has very specific requirements.

    Finally I also noticed that the return(0); in Graham's function generates a MOV R7,#00H instruction as well as a RET, which is a shame.

  • Ah, yes, sometimes I just cannot help myself from getting into assembler... In fact, the C version could probably be slightly improved by using

        do
        {
        ...
        }while(--n != 0);
    
    In which case, I dare say the compiler code would be identical to my assembler.

    It is a pity that the compiler rules for passing parameters do not allow for two generic pointers and a count entirely in registers. This is quite a common requirement and the C51 compiler itself seems to be able to override these rules.

    Given that there are myriad ways of implementing memcpy() and other string.h functions, it would be very helpful for implementors to be able to write their own string.h libraries by having access to C51's special parameter passing rules. My guess is that this would not be too difficult to do, though it may require the addition of a new keyword to indicate use of the special parameter passing rules to the compiler. Such a facility could make a substantial difference to code size (as well as speed) and this could be very significant in the case of small applications.

    Implementors could choose between large and fast functions or slow but compact ones. Also, such functions could then easily be made reentrant.

    I have started a new thread on the subject of copying structures: http://www.keil.com/forum/docs/thread4380.asp#msg18741
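
One caveat with the do/while form suggested in the reply above: the body runs before the count is tested, so with an unsigned char count a call with n equal to 0 would wrap around and copy 256 bytes. The assembler version avoids this with its JZ test before the loop. A minimal portable sketch with the same guard follows. The name copy_do_while is hypothetical, and the Keil data qualifier is left out so the code also builds on a host compiler.

```c
/* Byte copy using a do/while loop with an 8-bit count.
   The explicit n == 0 check plays the role of the JZ before the loop. */
unsigned char *copy_do_while(unsigned char *dest, const unsigned char *src, unsigned char n)
{
  unsigned char *temp = dest;

  if (n == 0)
    return dest;               /* nothing to copy */

  do
  {
    *temp++ = *src++;
  } while (--n != 0);          /* one decrement and test per byte */

  return dest;
}
```

If every caller can guarantee a non-zero count, the guard can be dropped to save a few bytes.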

  • "Finally I also noticed that the return(0); in Graham's function generates a MOV R7,#00H instruction as well as a RET, which is a shame."

    Like I always say: if you need some assembler, have it properly as an assembler module - don't mess about with inline assembler in 'C' source files!

    Graham's function actually contains four lines whose sole function is to suppress compiler warnings.
    Luckily, the compiler happens to be smart enough to spot that the 1st three are irrelevant, and optimises them out.
    That just leaves the "spurious" RET.

    Using the SRC directive is a great way to create 'C'-compatible assembler source - with all the right calling & naming conventions, parameter passing, etc - but once you've done that, the 'C' file is of no further use; so throw it away!

  • Maybe a bit off topic, but I just wanted to share a thought.
    Take a look at the webpage of a new programming language called D:
    http://www.digitalmars.com/d/overview.html
    Here is a quote:

    Modern compiler technology has progressed to the point where language features for the purpose of compensating for primitive compiler technology can be omitted. (An example of this would be the 'register' keyword in C, a more subtle example is the macro preprocessor in C.) We can rely on modern compiler optimization technology to not need language features necessary to get acceptable code quality out of primitive compilers.

    Yet a lot of discussions around the use of C in microcontroller programming boil down to how to get more optimal code from a particular C compiler.
    Is it that Keil's compilers have not caught up with the latest and greatest in compiler technology? Or am I too picky?

    - mike

  • Is it that Keil's compilers have not caught up with the latest and greatest in compiler technology?

    Can you give me an example (manufacturer and version) of a compiler that is the latest and greatest in technology? That way, I can let you know if we've caught up with them.

    Jon