Speed of code between simulator and target on XC164.

I am running a 10 kHz interrupt that executes some mathematical DSP code. It has to complete within 100 microseconds, and preferably in less than 40 microseconds, to leave me spare processing power.

I decided to run the code on the software simulator. Its Performance Analyzer window reported that the interrupt routine was taking 33 microseconds, i.e. a third of the overall CPU power. Great, I thought!

However, when I ran the code for real I found it was taking about 3 times as long, and there is barely any spare processing power for my other routines. Any ideas?

Some technical information:

- I am using PK166 V5.03

- Both the simulator and target are running at 40MHz.

- The code is a deterministic block of maths, i.e. there are no time outs or waiting for I/Os.

- The interrupt routine has the highest priority. I also checked by single-stepping the target that no other interrupts are active.

- I extensively tested the speed of the target code using an oscilloscope and a spare I/O pin. I found that no single piece of the code is taking longer; the whole code just seems to run about 2 to 3 times slower than in the simulator.

- I know the target is running at 40MHz because all the PWM frequencies and CAN frequencies are as expected.

  • If you are using internal FLASH to store your program, are you taking into account how many wait states you have configured in the IMBCTR register?

    FYI, Infineon has an early problem notification report that recommends how many to use depending on the exact device and clock frequency you are using.

  • Hi, thanks for replying.

    Yes, I should have mentioned that. I am using the XC164 in single-chip mode. Yesterday I checked the number of wait states and found it was set to 4. After consulting the Keil website I found that 2 wait states are recommended for XC16x chips running above 32 MHz.

    Great, I thought, back to full speed! But once I did this I only got a 20% boost in speed, so I'm still more than twice as slow as the Performance Analyzer says I should be (80 versus 33 microseconds).

    I've got all the CPU options turned on: branch prediction, instruction FIFOs set to 3, and all that stuff.

    Hmmm, it must be something obvious, but I don't know what...
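    To put the two timings in terms of CPU cycles and load, here is a rough sketch in C. It assumes the 40 MHz CPU clock and the 10 kHz interrupt rate stated in this thread; the function names are made up for illustration and are not part of any Keil or Infineon API.

```c
#include <stdint.h>

/* CPU clock cycles consumed by a routine that runs for duration_us
 * microseconds on a core clocked at f_cpu_mhz MHz (1 MHz = 1 cycle per us). */
uint32_t isr_cycles(uint32_t duration_us, uint32_t f_cpu_mhz)
{
    return duration_us * f_cpu_mhz;
}

/* CPU load in percent of an interrupt that fires rate_hz times per second
 * and runs for duration_us microseconds each time (rounded down). */
uint32_t isr_load_percent(uint32_t duration_us, uint32_t rate_hz)
{
    return (uint32_t)(((uint64_t)duration_us * rate_hz * 100u) / 1000000u);
}
```

    With the figures above, 33 microseconds is 1320 cycles (33% load) and 80 microseconds is 3200 cycles (80% load), so the target spends roughly 1900 more cycles per interrupt than the simulator predicts.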

  • Hi, I am not a DSP expert,

    but do you use the DSP library provided by Infineon?
    It contains hand-optimized assembler routines which use DPRAM.
    (Maybe you can take some inspiration from it.)
    And if the problem lies in the DSP part itself (speed), could an alternative be to
    execute this part from PSRAM?

    Stefan

  • That is me again,

    I remember that the default reset values for the PLL were changed.
    E.g. the AC steps have
    maximum VCO base frequency / 16
    and
    BA/BB steps have
    minimum VCO base frequency / 16.

    Did you check these settings?

    Stefan

  • Hi thanks for replying,

    Yes, I am using the Infineon DSP library and a lot of 32/16-bit integer divides using the built-in math routines.

    What upsets me is not so much how long it's taking; it's more that the simulator produces a different timing result from reality. Maybe it doesn't properly include the time spent in the maths routines, which are not inline but are called from my 10 kHz function.

    I have explicitly set the PLL. I am using an 8 MHz crystal with the following PLL settings:

    _PLLODIV  EQU  3    ; 0 .. 14  Fpll = Fvco / (PLLODIV+1)
                        ; 15       = reserved
    ;
    ; <o> PLLIDIV: PLL Input Divider (PLLCON.4 .. PLLCON.5) <0-3>
    ; Fin = Fosc / (PLLIDIV+1)
    _PLLIDIV  EQU  0    ; 0 .. 3   Fin = Fosc / (PLLIDIV+1)
    ;
    ; <o> PLLVB: PLL VCO Band Select (PLLCON.6 .. PLLCON.7)
    ; <0=> Output:100-150MHz / Base:20-80MHz
    ; <1=> Output:150-200MHz / Base:40-130MHz
    ; <2=> Output:200-250MHz / Base:60-180MHz
    ; <3=> (250...300 MHz) Reserved
    _PLLVB    EQU  1    ; Value  VCO output frequency     Base frequency
                        ; 0    = 100...150 MHz            20...80 MHz
                        ; 1    = 150...200 MHz            40...130 MHz
                        ; 2    = 200...250 MHz [def.]     60...180 MHz
                        ; 3    = (250...300 MHz)          Reserved
    ;
    ; <o> PLLMUL: PLL Multiplication Factor (PLLCON.8 .. PLLCON.12) <6-31>
    ; Fvco = Fin * (PLLMUL+1)
    _PLLMUL   EQU  19   ; 7 .. 31  Fvco = Fin * (PLLMUL+1)
                        ; 0 .. 6   = reserved
    ;
    ; <o> PLLCTRL: PLL Operation Control (PLLCON.13 .. PLLCON.14)
    ; <0=> Bypass PLL clock mult., the VCO is off
    ; <1=> Bypass PLL clock mult., the VCO is running
    ; <2=> VCO clock used, input clock switched off
    ; <3=> VCO clock used, input clock connected
    _PLLCTRL  EQU  3    ; 0 = Bypass PLL clock mult., the VCO is off
                        ; 1 = Bypass PLL clock mult., the VCO is running
                        ; 2 = VCO clock used, input clock switched off
                        ; 3 = VCO clock used, input clock connected

    I know the CPU clock is running at 40 MHz because the UART and CAPCOM6 signals run off the same clock tree and produce the expected output frequencies.

    I assume other people have verified code timings produced by this simulator?
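    As a cross-check, the formulas in the comments of the PLL settings can be evaluated directly. This is a small sketch in C; the struct and function names are hypothetical, and the band limits are simply copied from the PLLVB comments above rather than taken from the data sheet.

```c
#include <stdint.h>

/* Frequencies derived from the PLL settings, in kHz. */
typedef struct {
    uint32_t fin_khz;   /* Fin  = Fosc / (PLLIDIV+1) */
    uint32_t fvco_khz;  /* Fvco = Fin  * (PLLMUL+1)  */
    uint32_t fpll_khz;  /* Fpll = Fvco / (PLLODIV+1) */
} pll_freqs;

/* Apply the three formulas from the startup-file comments. */
pll_freqs pll_compute(uint32_t fosc_khz, uint32_t idiv,
                      uint32_t mul, uint32_t odiv)
{
    pll_freqs f;
    f.fin_khz  = fosc_khz / (idiv + 1u);
    f.fvco_khz = f.fin_khz * (mul + 1u);
    f.fpll_khz = f.fvco_khz / (odiv + 1u);
    return f;
}

/* Return 1 if Fvco lies in the output range listed for band vb (PLLVB),
 * 0 otherwise. Band 3 is marked reserved and is always rejected. */
int pll_vco_in_band(uint32_t fvco_khz, uint32_t vb)
{
    static const uint32_t lo_khz[3] = {100000u, 150000u, 200000u};
    static const uint32_t hi_khz[3] = {150000u, 200000u, 250000u};
    if (vb > 2u) {
        return 0;
    }
    return fvco_khz >= lo_khz[vb] && fvco_khz <= hi_khz[vb];
}
```

    For the 8 MHz crystal with PLLIDIV = 0, PLLMUL = 19 and PLLODIV = 3 this gives Fin = 8 MHz, Fvco = 160 MHz and Fpll = 40 MHz, and 160 MHz falls inside band 1 (150...200 MHz). So the listed settings are at least consistent with a 40 MHz CPU clock.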