Speed of code between simulator and target on XC164.

I am running a 10 kHz interrupt that executes some mathematical DSP code. It has to complete within 100 microseconds, and preferably in under 40 microseconds to leave me some spare processing power.

I decided to run the code on the software simulator, and its Performance Analyzer window reported that the interrupt routine was taking 33 microseconds, i.e. about a third of the overall CPU time. Great, I thought!

However, when I ran the code on the real target I found it was taking about 3 times as long, leaving barely any spare processing power for my other routines. Any ideas?

Some technical information:

- I am using PK166 V5.03

- Both the simulator and target are running at 40MHz.

- The code is a deterministic block of maths, i.e. there are no timeouts or waits on I/O.

- The interrupt routine has the highest priority; I also checked, by single-stepping the target, that no other interrupts are active.

- I extensively tested the speed of the target code using an oscilloscope and a spare I/O (see the sketch after this list). I found that no single piece of the code is taking disproportionately longer; the whole code just seems to run about 2 - 3 times slower than it does on the simulator.

- I know the target is running at 40MHz because all the PWM frequencies and CAN frequencies are as expected.
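
For reference, my scope measurement is along these lines (a simplified sketch rather than my actual code - the trap number, the P3.0 marker pin and the header name are only placeholders):

    #include <XC164CS.H>          /* SFR header - use the one for your exact derivative  */

    /* DP3 bit 0 must be configured as an output during init:  DP3 |= 0x0001;            */
    void dsp_10khz_isr (void) interrupt 0x22   /* Keil C166 ISR syntax, placeholder trap */
    {
        P3 |= 0x0001;             /* marker pin high on ISR entry                        */

        /* ... the deterministic DSP maths runs here ...                                 */

        P3 &= ~0x0001;            /* marker pin low on ISR exit                          */
    }

The high time of that pulse on the scope is what comes out around three times longer on the target than the 33 microseconds the Performance Analyzer predicts.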

Parents
  • Hi, thanks for replying.

    Yes, I should have mentioned that. I am using the XC164 in single-chip mode. Yesterday I checked the number of wait states and found it was set to 4. After consulting the Keil website I found that 2 wait states are recommended for XC16x chips running above 32 MHz.

    Great, I thought - back to full speed! But once I made the change I only got about a 20% boost in speed, so I'm still more than twice as slow as the Performance Analyzer says I should be (80 versus 33 microseconds).

    I've got all the CPU options turned on: branch prediction enabled, the instruction FIFO depth set to 3, and all that stuff.

    Hmmm, it must be something obvious, but I don't know what...
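
    One further check I plan to try: time a fixed, known-length dummy loop on both the simulator and the target (using the same spare-pin-and-scope trick). If a plain loop is also 2 - 3 times slower on hardware, then the core/memory timing as a whole is off and it has nothing to do with the DSP maths or the Performance Analyzer. A rough sketch, with the pin, header name and loop count as placeholders:

        #include <XC164CS.H>          /* SFR header - adjust for the exact derivative     */

        void throughput_check (void)
        {
            volatile unsigned int i;  /* volatile so the optimiser cannot remove the loop */

            DP3 |= 0x0001;            /* P3.0 as a scope marker pin (placeholder)         */
            P3  |= 0x0001;            /* marker high                                      */
            for (i = 0; i < 10000; i++)
            {
                ;                     /* empty body - pure fetch/loop overhead            */
            }
            P3  &= ~0x0001;           /* marker low                                       */
        }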

Children
  • Hi, I am not a DSP expert,

    but do you use the DSP library provided by Infineon?
    It contains hand-optimized assembler routines which use the DPRAM.
    (Maybe you can take some inspiration from it.)
    And if the problem is the speed of the DSP part itself, could an alternative be to
    execute that part from PSRAM?

    Stefan

  • That is me again,

    I remember that the default reset values for the PLL were changed between device steps.
    E.g. the AC steps have the maximum VCO base frequency / 16, while the BA/BB steps
    have the minimum VCO base frequency / 16.

    Did you check these settings?

    Stefan

  • Hi, thanks for replying,

    Yes, I am using the Infineon DSP library, plus a lot of 32/16-bit integer divides using the built-in maths routines.

    What bothers me is not so much how long it's taking; it's more that the simulator produces a different timing answer from reality. Maybe it doesn't properly account for the time spent in the maths routines, which are not inlined but are called from my 10 kHz function.
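
    One way I could test that hypothesis: loop just the divide between two toggles of the spare pin, and compare the simulator's figure with the scope for that fragment alone (my real code uses the built-in 32/16 routines; the operand values, loop count, P3.0 pin and header name below are only placeholders):

        #include <XC164CS.H>              /* SFR header - adjust for the exact derivative */

        void divide_check (void)
        {
            volatile long numerator = 1234567L;   /* volatile keeps the divides in place  */
            volatile int  divisor   = 321;
            volatile long quotient;
            unsigned int  i;

            P3 |= 0x0001;                 /* marker pin high (placeholder pin)            */
            for (i = 0; i < 1000; i++)
            {
                quotient = numerator / divisor;   /* the 32-bit divide the compiler emits */
            }
            P3 &= ~0x0001;                /* marker pin low                               */
        }

    If the simulator and the scope agree on this fragment, the missing time is somewhere else; if they disagree, the divide/library time is what the Performance Analyzer is under-reporting.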

    I have explicitly set the PLL. I am using an 8 MHz crystal with PLL settings of:

    _PLLODIV EQU 3      ; 0 .. 14   Fpll = Fvco / (PLLODIV+1)
                        ; 15 = reserved
    ;
    ; <o> PLLIDIV: PLL Input Divider (PLLCON.4 .. PLLCON.5) <0-3>
    ;     Fin = Fosc / (PLLIDIV+1)
    _PLLIDIV EQU 0      ; 0 .. 3    Fin = Fosc / (PLLIDIV+1)
    ;
    ; <o> PLLVB: PLL VCO Band Select (PLLCON.6 .. PLLCON.7)
    ;     <0=> Output: 100-150 MHz / Base: 20-80 MHz    <1=> Output: 150-200 MHz / Base: 40-130 MHz
    ;     <2=> Output: 200-250 MHz / Base: 60-180 MHz   <3=> (250...300 MHz) Reserved
    _PLLVB EQU 1        ; Value   VCO output frequency    Base frequency
                        ; 0     = 100...150 MHz           20...80 MHz
                        ; 1     = 150...200 MHz           40...130 MHz
                        ; 2     = 200...250 MHz [def.]    60...180 MHz
                        ; 3     = (250...300 MHz)         Reserved
    ;
    ; <o> PLLMUL: PLL Multiplication Factor (PLLCON.8 .. PLLCON.12) <6-31>
    ;     Fvco = Fin * (PLLMUL+1)
    _PLLMUL EQU 19      ; 7 .. 31   Fvco = Fin * (PLLMUL+1)
                        ; 0 .. 6 = reserved
    ;
    ; <o> PLLCTRL: PLL Operation Control (PLLCON.13 .. PLLCON.14)
    ;     <0=> Bypass PLL clock mult., the VCO is off     <1=> Bypass PLL clock mult., the VCO is running
    ;     <2=> VCO clock used, input clock switched off   <3=> VCO clock used, input clock connected
    _PLLCTRL EQU 3      ; 0 = Bypass PLL clock mult., the VCO is off
                        ; 1 = Bypass PLL clock mult., the VCO is running
                        ; 2 = VCO clock used, input clock switched off
                        ; 3 = VCO clock used, input clock connected
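
    Working those constants through: Fin = Fosc / (PLLIDIV+1) = 8 MHz / 1 = 8 MHz, Fvco = Fin * (PLLMUL+1) = 8 MHz * 20 = 160 MHz (inside the 150...200 MHz band selected by PLLVB = 1), and Fpll = Fvco / (PLLODIV+1) = 160 MHz / 4 = 40 MHz. So the PLL constants themselves do come out at the 40 MHz I expect.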

    I know the CPU clock is running at 40 MHz because the UART and CAPCOM6 signals run off the same clock tree and produce the expected output frequencies.

    I assume other people have verified code timings produced by this simulator?