Does anyone have a good hint on how to implement a parity calculation of a byte in the most efficient way on a C51 device?
The comma operator is perfect for yet another nice trick: updating a 16-bit timer in C by adding the 16-bit value of the period constant to the timer registers:
///////////////////////////////////////////////////////////////////////////
// The following macros reload the 1ms timer, compensating for the interrupt
// latency to achieve an average period of 1ms, corrected at every period.
//
// The code uses a few C tricks:
// - Accesses the PSW carry directly, to obtain the overflow from an unsigned
//   char addition. We use the comma operator for that, to perform a side effect
//   addition right before a carry bit test, not allowing the compiler to
//   reorder the code and make the carry dirty.
//   The comma operator is used also to insert padding NOPs before the addition,
//   in the cases that the lower byte of the adjust value is {0,1,2}, because the
//   C51 compiler generates INC instead, or suppresses the operation.
//
// This code was verified with optimization levels from 0 to 9, and the compiler
// doesn't try to break the code with optimizations.
//
// The whole procedure is wrapped in a function-like macro.
//
// This code was verified in a P89C668 @18.432MHz, and the frequency deviation
// measured was ±20ppm, essentially the cpu crystal variation, completely
// removing the timer interrupt latency.
///////////////////////////////////////////////////////////////////////////
#ifdef ADJ_DELAY
#undef ADJ_DELAY
#endif
#ifdef RELOAD_TIMER0
#undef RELOAD_TIMER0
#endif

#if (((TMR_1MS & 0xff) < (0xff-8)) || ((TMR_1MS & 0xff) > (0xff-6)))
#define ADJ_DELAY 9
#define RELOAD_TIMER0() do { \
    TR0 = 0; \
    if (TL0 += ((TMR_1MS & 0xff) + ADJ_DELAY), CY) TH0++; \
    TH0 += (TMR_1MS >> 8); \
    TR0 = 1; \
} while(0)
#elif ((TMR_1MS & 0xff) == (0xff-6))
#define ADJ_DELAY 10
#define RELOAD_TIMER0() do { \
    TR0 = 0; \
    if (_nop_(), TL0 += ((TMR_1MS & 0xff) + ADJ_DELAY), CY) TH0++; \
    TH0 += (TMR_1MS >> 8); \
    TR0 = 1; \
} while(0)
#elif ((TMR_1MS & 0xff) == (0xff-7))
#define ADJ_DELAY 11
#define RELOAD_TIMER0() do { \
    TR0 = 0; \
    if (_nop_(), _nop_(), TL0 += ((TMR_1MS & 0xff) + ADJ_DELAY), CY) TH0++; \
    TH0 += (TMR_1MS >> 8); \
    TR0 = 1; \
} while(0)
#elif ((TMR_1MS & 0xff) == (0xff-8))
#define ADJ_DELAY 12
#define RELOAD_TIMER0() do { \
    TR0 = 0; \
    if (_nop_(), _nop_(), _nop_(), TL0 += ((TMR_1MS & 0xff) + ADJ_DELAY), CY) TH0++; \
    TH0 += (TMR_1MS >> 8); \
    TR0 = 1; \
} while(0)
#endif
In the timer handler, a call to RELOAD_TIMER0() expands the macro, which is synthesized at compile time for any reload value.
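If you want to check the arithmetic on a PC before trusting it on the target, here is a rough portable C model of the byte-wise add with manual carry. It is only a sketch: `timer_model_t`, `timer_add16`, `timer_value` and `timer_add16_check` are names I made up, the struct fields are plain variables rather than the real TH0/TL0 SFRs, and nothing here models cycle counts, the ADJ_DELAY compensation or the NOP padding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the TH0/TL0 register pair: plain variables,
   not real SFRs. */
typedef struct {
    uint8_t th;
    uint8_t tl;
} timer_model_t;

/* Add a 16-bit period constant to the running timer one byte at a time,
   carrying out of the low byte by hand, as the macro does with CY. */
static void timer_add16(timer_model_t *t, uint16_t period)
{
    uint16_t low_sum = (uint16_t)(t->tl + (period & 0xffu));

    t->tl = (uint8_t)low_sum;
    if (low_sum > 0xffu)        /* models the CY test after TL0 += ... */
        t->th++;
    t->th = (uint8_t)(t->th + (period >> 8));
}

/* Read the pair back as one 16-bit count. */
static uint16_t timer_value(const timer_model_t *t)
{
    return (uint16_t)(((uint16_t)t->th << 8) | t->tl);
}

/* Usage: the byte-wise add must agree with a plain 16-bit add (mod 65536). */
static void timer_add16_check(void)
{
    timer_model_t t = { 0x00, 0x12 };   /* timer already ran 0x12 counts */

    timer_add16(&t, 0xFFF0u);
    assert(timer_value(&t) == (uint16_t)(0x0012u + 0xFFF0u));
}
```

The point of adding to the running count instead of overwriting it is that whatever the timer counted during the interrupt latency stays in the sum, so the error does not accumulate from period to period.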
Another use of the comma operator: you can get the 16-bit result of multiplying two unsigned chars without any call to the math library:
unsigned char data a, b;
unsigned int data x;

a = 36;
b = 130;

// this is 400% faster than a x = (unsigned int) a * b
*(unsigned char*)&x = ((((unsigned char*)&x)[1] = a * b), B);
I am just too lazy: you could make the int variable a union to access its MSB and LSB parts, which improves readability, but the generated code is the same: 11 instruction cycles, against 40 cycles using a 16-bit cast.
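For what the union version might look like, here is a minimal host-side sketch. Everything in it is an assumption for illustration: `word16_t`, `msb_index` and `mul8x8` are made-up names, and since portable C has no B register, an ordinary wide multiply stands in for MUL AB. On Keil C51, which stores ints big-endian, `byte[0]` would be the MSB; the sketch detects byte order at run time so it also works on a little-endian PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union giving byte access to a 16-bit value. */
typedef union {
    uint16_t word;
    uint8_t  byte[2];
} word16_t;

/* Index of the most significant byte on the machine running this code. */
static int msb_index(void)
{
    word16_t probe;

    probe.word = 0x0100u;
    return (probe.byte[0] == 0x01u) ? 0 : 1;
}

/* 8x8 -> 16 multiply that fills the result one byte at a time. */
static uint16_t mul8x8(uint8_t a, uint8_t b)
{
    word16_t x;
    unsigned product = (unsigned)a * b;     /* stands in for MUL AB */
    int msb = msb_index();

    x.byte[msb ^ 1] = (uint8_t)product;         /* low byte: a * b as unsigned char */
    x.byte[msb]     = (uint8_t)(product >> 8);  /* high byte: what B holds on the 8051 */

    assert(x.word == (uint16_t)product);        /* both bytes landed in the right slot */
    return x.word;
}
```

With the values from the post, `mul8x8(36, 130)` gives 4680 (0x1248), i.e. 0x12 in the high byte and 0x48 in the low byte.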