A co-worker and I are programming an 8051 microcontroller in C. Last week we were struggling with the poor performance of our program. I talked to the assembly guy, and we figured out that our software timer handling was too slow: we were using 16-bit variables. I have since tweaked the timer software a bit and it now runs about 4x faster, which is fast enough.
Digging a little deeper into the generated assembly code, we found a 'persistent nuisance'. As a test I wrote these lines of code:
uint8 j;
for(j=10;j--;){
    rightTorqueArray[j] = j;
}
The array is an unsigned char array, but this is the assembly we observe:
        MOV     R7,#0AH
?C0001:
        MOV     R6,AR7
        DEC     R7
        MOV     A,R6
        JZ      ?C0002
;       rightTorqueArray[j] = j; }
                        ; SOURCE LINE # 52
        MOV     A,#LOW (rightTorqueArray)
        ADD     A,R7
        MOV     DPL,A
        CLR     A
        ADDC    A,#HIGH (rightTorqueArray)
        MOV     DPH,A
        MOV     A,R7
        MOVX    @DPTR,A
        SJMP    ?C0001
?C0002:
We noticed that the array is addressed with LOW and HIGH, so apparently it is treated as a 16-bit variable. My assembly is not that good, though, so please correct me if I am wrong.
I set the code optimization to level 8 (Reuse Common Entry Code), with the emphasis on Favor speed.
The assembly was produced as a .SRC file using #pragma SRC at the top of the C file.
The code may be a little faster if the arrays are located in pdata memory (a 256-byte page of XDATA). The compiler can then use the @R0 and @R1 addressing modes.
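A minimal sketch of what that placement might look like, assuming the Keil C51 `pdata` memory-type keyword. The `TORQUE_MEM` macro, `TORQUE_LEN` and `fillTorqueArray` are hypothetical names, added only so the same file also builds on a desktop compiler. The hardware setup of the PDATA page is not shown here.

```c
/* Keil C51 predefines __C51__; on other compilers the qualifier expands to nothing. */
#if defined(__C51__)
#define TORQUE_MEM pdata   /* one 256-byte page of XDATA, reachable via MOVX @Ri */
#else
#define TORQUE_MEM         /* hypothetical fallback for host-side testing */
#endif

#define TORQUE_LEN 10      /* hypothetical name for the array length */

unsigned char TORQUE_MEM rightTorqueArray[TORQUE_LEN];

/* Same loop as in the original post: each element receives its own index. */
void fillTorqueArray(void)
{
    unsigned char j;
    for (j = TORQUE_LEN; j--; ) {
        rightTorqueArray[j] = j;
    }
}
```

With the array in a single page, the index fits in one 8-bit register, so the compiler no longer needs to build a 16-bit DPTR address for every access. Whether it actually does so is best checked in the .SRC output.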
Indeed - good one!
Although this is probably another area where the OP will need help from the "Assembly Guy" (to explain the concept & operation; not write the code).
See http://www.keil.com/support/docs/1848.htm for starters.
It might even be that each of the arrays could be given its own page in PDATA ...
Rather than collecting the data into arrays, and then running through those arrays to sum them - could you not just sum the data as it arrives ... ?
I could, but I'd have to subtract the value of the sample from 3 cycles ago. So I still have to store all 4 samples either way.
The torque measurement is a continuous process. One of the four samples gets swapped for a new sample, and then the calculation over the 4 samples must be done.
I have yet to try out the pointers, but I am currently busy making some other changes.
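For illustration, the "swap one sample, then recompute" step described above can be sketched as a running sum over a four-entry circular buffer. This is only a sketch, not the poster's code: `SAMPLE_COUNT`, `samples`, `oldest`, `runningSum` and `torqueSumUpdate` are hypothetical names, and the wrap-around trick assumes the sample count is a power of two.

```c
#define SAMPLE_COUNT 4u              /* hypothetical name; the thread uses 4 samples */

unsigned char samples[SAMPLE_COUNT]; /* the last four torque readings */
unsigned char oldest;                /* index of the slot to overwrite next */
unsigned int  runningSum;            /* always equals the sum of samples[] */

/* Replace the oldest sample with newSample and return the updated sum. */
unsigned int torqueSumUpdate(unsigned char newSample)
{
    /* Drop the outgoing sample from the sum and add the incoming one. */
    runningSum = runningSum - samples[oldest] + newSample;
    samples[oldest] = newSample;

    /* Advance and wrap the index; the mask only works for a power-of-two count. */
    oldest = (unsigned char)((oldest + 1u) & (SAMPLE_COUNT - 1u));
    return runningSum;
}
```

The four samples are still stored, as noted above, but each update costs one subtraction and one addition instead of re-adding all four values.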
Are you doing a running average, then?
Here's an implementation which doesn't require keeping the old samples:
www.daycounter.com/.../Moving-Average.phtml
Interesting mathematics behind it. But I cannot imagine that this method actually executes faster, at least not significantly. At the moment I have to take 1 sample and add 4 unsigned char variables into 1 unsigned int variable before the calculation. From the description at that link, I would have to do 1 division, 1 subtraction and 2 shift operations. Because I doubted it would be quicker, I did not translate the calculation to C.
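For comparison, here is a rough C sketch of one common history-free approximation of a moving average. Whether it matches the linked page exactly is an assumption, and `AVG_SHIFT`, `avgAccumulator` and `approxAverageUpdate` are hypothetical names. Note that it behaves like an exponentially weighted filter, so its output is not identical to a true 4-sample average.

```c
#define AVG_SHIFT 2u          /* N = 4, so dividing by N is a right shift by 2 */

unsigned int avgAccumulator;  /* approximates N times the current average */

/* Fold newSample into the accumulator and return the approximate average. */
unsigned char approxAverageUpdate(unsigned char newSample)
{
    /* Remove one "average-sized" share, then add the new sample. */
    avgAccumulator = avgAccumulator - (avgAccumulator >> AVG_SHIFT) + newSample;
    return (unsigned char)(avgAccumulator >> AVG_SHIFT);
}
```

With 8-bit samples the accumulator stays well inside 16 bits. The cost per update is two shifts, one subtraction and one addition, so the doubt expressed above about a significant speed gain seems reasonable.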
Currently I use a switch statement with 4 cases, one for each sample (x2). From what I learned here and from what I see in the assembly output, this method is relatively quick. The addresses are fixed at compile time, and that makes a difference.
switch(torqueIndex){
    case 0:
        firstPollLeft = leftServoTorque;
        firstPollRight = rightServoTorque;
        break;
    case 1:
        secondPollLeft = leftServoTorque;
        secondPollRight = rightServoTorque;
        break;
    case 2:
        thirdPollLeft = leftServoTorque;
        thirdPollRight = rightServoTorque;
        break;
    case 3:
        fourthPollLeft = leftServoTorque;
        fourthPollRight = rightServoTorque;
        break;
}
torqueIndex++;
if(torqueIndex==SAMPLE_AMMOUNT) torqueIndex=0;
You may well be right.
As you're only taking 8 samples, can you fit them in DATA? That would certainly be faster than XDATA ...
Or PDATA?