C51 not using dual data pointer

We assumed that this very good compiler would recognise "for" statements that move bytes from one buffer to another. This seems not to be the case. From the manual I can see that the dual DPTR is used only by a limited number of library functions (strcpy, memcpy...).
Does anyone have experience using the dual DPTR on the Dallas 320, and do we have to set any compiler directives to make the compiler use the dual DPTR throughout the program?

Henning

  • I think you're right - the only time C51 uses the extra DPTR(s) is in that very limited set of library functions.

    You could write some test routines & use the Performance Analyser to determine whether using memcpy with dual DPTRs is actually better than a 'for' loop.

    The trouble with memcpy is that it uses Generic pointers, which might negate the benefits of the 2nd DPTR?

    You could always write your own multi-DPTR-enabled library routines?
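    A minimal sketch of the two candidates such a test could time. The names `copy_loop` and `copy_lib` and the `XDATA` helper macro are hypothetical, not part of C51 or its library; the macro is only there so the same source also builds on a host compiler. The timing itself would still have to come from the Performance Analyser or a hardware timer.

```c
#include <string.h>

/* XDATA is a hypothetical helper macro: it expands to the C51 'xdata'
   keyword when built with Keil C51 (which predefines __C51__) and to
   nothing on a host compiler. */
#ifdef __C51__
#define XDATA xdata
#else
#define XDATA
#endif

/* Candidate 1: plain 'for' loop using memory-specific pointers. */
void copy_loop(unsigned char XDATA *dst, unsigned char XDATA *src,
               unsigned int len)
{
    unsigned int i;

    for (i = 0; i < len; i++) {
        dst[i] = src[i];
    }
}

/* Candidate 2: the library routine, which takes generic pointers. */
void copy_lib(unsigned char XDATA *dst, unsigned char XDATA *src,
              unsigned int len)
{
    memcpy(dst, src, len);
}
```

    Calling both with the same buffers and a range of lengths (a few bytes up to a few hundred) should show where, if anywhere, the library call overtakes the loop.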

  • You are right. Dual data pointers are only used by Keil library functions and any assembly language functions you may happen to write.

    The choice whether to use dual memory pointers is a target level option.

    Library functions such as memcpy will be significantly faster when dual memory pointers are selected. However, there is an impact on interrupt service routines. With dual memory pointers selected, interrupt service routines have to save two data pointers where there was only one before.

    If your application has high frequency interrupts, but low frequency library function calls, you may be better off keeping dual data pointers disabled.

  • Dual data pointers are SO overrated. There are some benchmarks available at the following URL:

    http://www.keil.com/support/docs/1173.htm

    However, you should note that they compare a 12MHz 8051 to a 12MHz Dallas 320. The 320 is ALREADY faster than the 8051 at the same clock speed. So, the performance increase due to the dual data pointers alone is not so dramatic.

    More information about why there is no in-line code generation for dual data pointers may be found at the following URL:

    http://www.keil.com/support/docs/1604.htm

    If you currently use C code to copy buffers, you should consider using the memcpy routine from the library. It will be faster.

    Jon

  • The benchmarks show a significant difference between "small" (on the order of 1-10) and "large" (on the order of hundreds & thousands).

    How much of this is due to the overhead of the Generic Pointers?

    Do you have corresponding benchmarks using memory-specific pointers; eg,

    void xdata *memcpy_xdata_xdata(
                   void xdata *dst,
                   void xdata *src,
                   unsigned char end,
                   unsigned int  len );
    

  • I guess we could have written a really crappy memcpy that doesn't know about the memory areas, but we didn't. :-)

    That's why you like the Keil compiler so much!

    The memcpy routine figures out which memory area you are reading and which area you are writing and invokes a specific routine to copy the data. Therefore, the routine you suggest would be faster only by 4 or 5 instructions (the ones that figure out the source and destination memory areas).

    Only the following routines are needed:

    CODE  -> XDATA
    CODE  -> IDATA
    XDATA -> XDATA
    XDATA -> IDATA
    IDATA -> XDATA
    IDATA -> IDATA
    

    Since there is no write line for CODE, we don't have routines to write to it.
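    The dispatch described above could be pictured roughly as follows. This is only a sketch under stated assumptions, not the library's actual implementation: the `mem_area` tags, the `tagged_ptr` struct and `tagged_copy` are hypothetical stand-ins for C51's generic pointers, and on a host compiler every table entry points at the same byte loop, whereas on a real 8051 each entry would be a separate routine built around MOVC, MOVX or MOV @Ri.

```c
#include <stddef.h>

/* Hypothetical memory-area tags (NOT the real generic-pointer encoding). */
typedef enum { AREA_IDATA = 0, AREA_XDATA = 1, AREA_CODE = 2 } mem_area;

/* Hypothetical stand-in for a generic pointer: an area tag plus address. */
typedef struct {
    mem_area       area;
    unsigned char *addr;
} tagged_ptr;

typedef void (*copy_fn)(unsigned char *dst, const unsigned char *src,
                        unsigned int len);

/* Host stand-in for every area-specific copy routine. */
static void copy_bytes(unsigned char *dst, const unsigned char *src,
                       unsigned int len)
{
    while (len--) {
        *dst++ = *src++;
    }
}

/* routines[source][destination]; NULL marks the missing CODE writers. */
static const copy_fn routines[3][3] = {
    /* dst:        IDATA       XDATA       CODE */
    /* IDATA */ { copy_bytes, copy_bytes, NULL },
    /* XDATA */ { copy_bytes, copy_bytes, NULL },
    /* CODE  */ { copy_bytes, copy_bytes, NULL }
};

/* Pick the routine for this source/destination pair and run it.
   Returns 1 on success, 0 if the destination area cannot be written. */
int tagged_copy(tagged_ptr dst, tagged_ptr src, unsigned int len)
{
    copy_fn fn = routines[src.area][dst.area];

    if (fn == NULL) {
        return 0;
    }
    fn(dst.addr, src.addr, len);
    return 1;
}
```

    The lookup happens once per call, which is why a hand-written memory-specific routine would only save the few instructions spent on it.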

    The following code is generated for an XDATA to XDATA copy using the memcpy function:

    loop:
    movx  a,@dptr  ; 2 cycles
    inc   dptr     ; 2 cycles
    xch   a,r0     ; 1 cycle
    xch   a,dpl    ; 1 cycle
    xch   a,r0     ; 1 cycle
    xch   a,r4     ; 1 cycle
    xch   a,dph    ; 1 cycle
    xch   a,r4     ; 1 cycle
    movx  @dptr,a  ; 2 cycles
    inc   dptr     ; 2 cycles
    xch   a,r0     ; 1 cycle
    xch   a,dpl    ; 1 cycle
    xch   a,r0     ; 1 cycle
    xch   a,r4     ; 1 cycle
    xch   a,dph    ; 1 cycle
    xch   a,r4     ; 1 cycle
    djnz  r7,loop  ; 2 cycles
    djnz  r6,loop  ; 2 cycles
    

    This sure looked like a lot of code to me. However, casual observation and perusal of the instruction set led me to nothing that was shorter or faster. Maybe someone else will suggest a faster xdata to xdata copy routine.

    The loop portion (counting only the inner-loop DJNZ since it is taken each time) takes 22 cycles to copy 1 byte.

    The following code is generated for the Dallas dual data pointers:

    loop:
    movx  a,@dptr   ; 2 cycles
    inc   dptr      ; 2 cycles
    inc   dps       ; 1 cycle
    movx  @dptr,a   ; 2 cycles
    inc   dptr      ; 2 cycles
    inc   dps       ; 1 cycle
    djnz  r7,loop   ; 2 cycles
    djnz  r6,loop   ; 2 cycles
    

    This looks a lot faster. The loop portion (counting only the inner-loop DJNZ) takes 12 cycles to copy 1 byte.

    So, assuming an apples-to-apples comparison, the Dallas Semiconductor dual data pointers let you copy data in 12/22 (about 55%) of the time a standard 8051 with only one data pointer needs. This is a BEST CASE time.
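    For anyone who wants to check the arithmetic, here is a small sketch that adds up the per-instruction cycle counts from the two listings (inner-loop DJNZ only, as above). The array and function names are hypothetical, and the figures ignore call and setup overhead.

```c
#include <stddef.h>

/* Cycle counts copied from the single-DPTR listing:
   movx, inc dptr, 6 x xch, movx, inc dptr, 6 x xch, djnz r7. */
static const unsigned char single_dptr_cycles[] = {
    2, 2, 1, 1, 1, 1, 1, 1,
    2, 2, 1, 1, 1, 1, 1, 1,
    2
};

/* Cycle counts copied from the dual-DPTR listing:
   movx, inc dptr, inc dps, movx, inc dptr, inc dps, djnz r7. */
static const unsigned char dual_dptr_cycles[] = {
    2, 2, 1,
    2, 2, 1,
    2
};

/* Sum the cycle cost of one pass through the inner loop (one byte). */
unsigned int cycles_per_byte(const unsigned char *costs, size_t n)
{
    unsigned int total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += costs[i];
    }
    return total;
}

/* Express 'part' as a percentage of 'whole', rounded to nearest. */
unsigned int percent_of(unsigned int part, unsigned int whole)
{
    return (part * 100u + whole / 2u) / whole;
}
```

    Summing the two arrays gives 22 and 12 cycles per byte, and `percent_of(12, 22)` rounds to 55.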

    As I pointed out in the link in my previous post, at about 100 bytes to copy, this starts to pay off.

    Jon

  • Does C51 ever use more than 2 DPTRs, if the chip supports it?

  • Yes,

    The Infineon C517 type devices are the only ones that I know of that have more than 2 data pointers. For these devices, C51 uses one PAIR of data pointers for each register bank.

    Jon