Hi there! I need to optimize the speed of xdata memory access with C51. I am sending some files via modem and the transmission could be a bit faster. At present I am reaching a transmission speed of around 2 kbyte/sec, which is kinda slow. Right now my buffers are large arrays in xdata memory, and before a byte reaches the UART it goes through a couple of buffers (for the protocol layers). I cannot prevent that. But maybe there is some way to speed up my access to xdata. I am not sure, but I have heard somewhere that C51 will be faster if pointers are used instead of arrays. I didn't find anything about that in the search (maybe due to my keywords). Does anybody know more about that topic?
I think it means the following: imagine we have to fill an array with zeros. There are at least 2 ways to do that:
int array[100];
int i;
for (i = 0; i < 100; i++)
    array[i] = 0;

int array[100];
int *ptr;
for (ptr = array + 100; ptr != array; )
    *--ptr = 0;   // or something like that
"The code generated for '*--ptr = 0' should be more efficient than the code generated for 'array[i] = 0'" This is not necessarily so; see: http://www.keil.com/forum/msgpage.asp?MsgID=4108 Note also that it's more efficient to have your for loop counting down to zero, as the DJNZ instruction can then be used. If you do use pointers, be sure to use memory-specific pointers. Make sure that the pointer or loop index is in DATA. You may find that it's best to use the library routines like memcpy, as they are (hopefully) pretty well optimised, or to write your own optimised version in assembler. Enabling extra DPTR(s) should help a bit. Does your processor have DMA?
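To make the points about memory-specific pointers and counting down a bit more concrete, here is a rough sketch of a byte copy between two xdata buffers. The names (src_buf, dst_buf, copy_down, BUF_LEN) and the buffer size are made up for illustration, and the #ifndef block is only there so the same source also builds on a non-C51 compiler for testing. Whether the compiler really emits DJNZ for the loop depends on the version and optimisation level, so check the listing.

```c
#include <string.h>

/* Keil C51 predefines __C51__. On any other compiler the memory-space
   keywords are defined away so this sketch still compiles for testing. */
#ifndef __C51__
#define xdata
#define data
#endif

/* Hypothetical size, kept at 255 or less so the count fits in 8 bits. */
#define BUF_LEN 200u

unsigned char xdata src_buf[BUF_LEN];
unsigned char xdata dst_buf[BUF_LEN];

/* Copy n bytes using memory-specific (xdata) pointers and an 8-bit
   counter in data that counts down to zero. A down-counting 8-bit loop
   gives the compiler the chance to use DJNZ. */
void copy_down(unsigned char xdata *dst,
               const unsigned char xdata *src,
               unsigned char n)
{
    unsigned char data count;

    for (count = n; count != 0; --count) {
        *dst = *src;
        ++dst;      /* increments kept in separate statements */
        ++src;
    }
}
```

A call would look like copy_down(dst_buf, src_buf, (unsigned char)BUF_LEN). It is worth comparing the generated code and timing against memcpy before deciding which one to keep.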
Another thought: can you use PDATA? This should be quicker, as it only needs an 8-bit address. If you're really desperate, you could move the PDATA page for buffers >256 bytes, but that would probably need assembler! Can you turn your clock frequency up?
Since you need to dereference only one pointer, and since you need to keep track of a count, either method is good.
   5          char xdata X[1000];
   6          extern void fn( char );
   7
   8          void main( void )
   9          {
  10   1          char xdata* pX;
  11   1          unsigned int i;
  12   1
  13   1          for( i = sizeof(X), pX = X; i != 0; --i, ++pX )
  14   1              fn( *pX );
  15   1
  16   1
  17   1          for( i = 0; i != sizeof(X); ++i )
  18   1              fn( X[ i ] );
  19   1      }

C51 COMPILER V6.20c  MAIN  01/11/2002 06:16:05 PAGE 2

ASSEMBLY LISTING OF GENERATED OBJECT CODE

             ; FUNCTION main (BEGIN)
                                           ; SOURCE LINE # 8
                                           ; SOURCE LINE # 9
                                           ; SOURCE LINE # 13
0000 750003      R     MOV     i,#03H
0003 7500E8      R     MOV     i+01H,#0E8H
0006 750000      R     MOV     pX,#HIGH X
0009 750000      R     MOV     pX+01H,#LOW X
000C         ?C0001:
000C E500        R     MOV     A,i+01H
000E 4500        R     ORL     A,i
0010 601D              JZ      ?C0002
                                           ; SOURCE LINE # 14
0012 850082      R     MOV     DPL,pX+01H
0015 850083      R     MOV     DPH,pX
0018 E0                MOVX    A,@DPTR
0019 FF                MOV     R7,A
001A 120000      E     LCALL   _fn
001D E500        R     MOV     A,i+01H
001F 1500        R     DEC     i+01H
0021 7002              JNZ     ?C0008
0023 1500        R     DEC     i
0025         ?C0008:
0025 0500        R     INC     pX+01H
0027 E500        R     MOV     A,pX+01H
0029 70E1              JNZ     ?C0001
002B 0500        R     INC     pX
002D         ?C0009:
002D 80DD              SJMP    ?C0001
002F         ?C0002:
                                           ; SOURCE LINE # 17
002F E4                CLR     A
0030 F500        R     MOV     i,A
0032 F500        R     MOV     i+01H,A
0034         ?C0004:
                                           ; SOURCE LINE # 18
0034 7400        R     MOV     A,#LOW X
0036 2500        R     ADD     A,i+01H
0038 F582              MOV     DPL,A
003A 7400        R     MOV     A,#HIGH X
003C 3500        R     ADDC    A,i
003E F583              MOV     DPH,A
0040 E0                MOVX    A,@DPTR
0041 FF                MOV     R7,A
0042 120000      E     LCALL   _fn
0045 0500        R     INC     i+01H
0047 E500        R     MOV     A,i+01H
0049 7002              JNZ     ?C0010
004B 0500        R     INC     i
004D         ?C0010:
004D B4E8E4            CJNE    A,#0E8H,?C0004
0050 E500        R     MOV     A,i
0052 B403DF            CJNE    A,#03H,?C0004
                                           ; SOURCE LINE # 19
0055         ?C0007:
0055 22                RET
             ; FUNCTION main (END)
My guess is that your protocol layers are killing you. Replace c = X[i] with c = 0, and see if you can even measure a speed increase.
You say that you have some large buffers. In that case you will have to place them in xdata. Presumably you are using an interrupt-driven UART driver. Consider making the buffer that is accessed by the ISR as small as possible so that it can at least be placed in pdata; that will keep your interrupts as fast as possible. If you can, keep your buffers down to 256 elements or less; on the 8051, 8-bit arithmetic is very much faster than 16-bit.

I assume that you are using circular buffers. Although having your buffers in xdata may be inevitable because of their size, don't place the buffer and the control variables in one structure; it looks neat but is generally slower. If you can, place the control variables (read and write pointers/indexes, count etc.) in the fastest available memory, e.g. pdata or preferably data.

With a circular buffer it is necessary to increment an index modulo the length of the buffer. Make your buffer 2^n elements long and C51 will convert your modulo operator to an AND mask, which is nice and quick. This is probably the main reason why, in similar contexts, I have found no real advantage in using pointers rather than indexes: a fast increment modulo 2^n is essential.

BTW: in general I have noticed that C51 is not very clever when pre/post increments/decrements are used within an expression, and that shorter, faster code generally results from placing these increments/decrements in separate C statements. The exceptions to this rule are:
unsigned char count; ... if ( --count ) { ... }
do { ... } while( --count != 0 );
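Putting the circular-buffer advice together, a minimal sketch might look like the following. All names (ring_buf, ring_head, ring_tail, ring_count, ring_put, ring_get) and the size of 64 are assumptions for illustration only. The #ifndef block just lets the code build on a non-C51 compiler for testing. The sketch also ignores interrupt locking: if ring_count is shared between an ISR and the main loop, the updates need protecting.

```c
/* Keil C51 predefines __C51__. On any other compiler the memory-space
   keywords are defined away so this sketch still compiles for testing. */
#ifndef __C51__
#define xdata
#define data
#endif

#define RING_SIZE 64u                /* hypothetical; must be 2^n and <= 128 here */
#define RING_MASK (RING_SIZE - 1u)   /* AND mask replaces the modulo */

/* Storage lives in xdata; the control variables are separate objects
   in data rather than members of one structure. */
unsigned char xdata ring_buf[RING_SIZE];
unsigned char data  ring_head;       /* next index to write */
unsigned char data  ring_tail;       /* next index to read */
unsigned char data  ring_count;      /* bytes currently stored */

/* Store one byte. Returns 1 on success, 0 if the buffer is full. */
unsigned char ring_put(unsigned char c)
{
    if (ring_count == RING_SIZE)
        return 0;
    ring_buf[ring_head] = c;
    ++ring_head;                     /* increment in its own statement */
    ring_head &= RING_MASK;          /* wrap modulo 2^n */
    ++ring_count;
    return 1;
}

/* Fetch one byte. Returns the byte (0..255), or -1 if the buffer is empty. */
int ring_get(void)
{
    unsigned char c;

    if (ring_count == 0)
        return -1;
    c = ring_buf[ring_tail];
    ++ring_tail;
    ring_tail &= RING_MASK;
    --ring_count;
    return c;
}
```

With 8-bit indexes and a power-of-two length, the wrap is a single AND on a byte, and no 16-bit arithmetic is needed for the bookkeeping.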