How do I find in my software that the core has been stopped and released at the breakpoint?

Hi,

In my application I have timing synchronization. When I stop at a breakpoint, I lose that synchronization. When I resume the program after the breakpoint, I need to detect that execution was stopped so that I can re-synchronize my internal time.

The question is: how can my program detect that the core was stopped/halted at a breakpoint?
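
One possible approach, offered here only as a sketch and not taken from the thread, is to look for an unexpectedly long gap on a reference clock that keeps counting while the core is halted. Whether such a clock exists (for example an RTC, or a timer that is not frozen in debug mode) depends on the device, so that part is an assumption. The names `halt_detector_t`, `halt_detector_init` and `halt_detector_check` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for gap-based halt detection. */
typedef struct {
    uint32_t last_ticks;    /* reference-clock reading at the previous check */
    uint32_t max_gap_ticks; /* longest gap expected while running normally */
} halt_detector_t;

static void halt_detector_init(halt_detector_t *d, uint32_t now_ticks,
                               uint32_t max_gap_ticks)
{
    d->last_ticks = now_ticks;
    d->max_gap_ticks = max_gap_ticks;
}

/*
 * Call regularly (e.g. once per main-loop pass) with the current reading of
 * a clock that is ASSUMED to keep running while the core is halted.
 * Returns true when the gap since the previous call exceeds the limit,
 * which suggests the core was stopped in between.
 */
static bool halt_detector_check(halt_detector_t *d, uint32_t now_ticks)
{
    /* Unsigned subtraction gives the right gap across counter wraparound. */
    uint32_t gap = now_ticks - d->last_ticks;
    d->last_ticks = now_ticks;
    return gap > d->max_gap_ticks;
}
```

When `halt_detector_check` returns true, the application would run its own re-synchronization step. The threshold has to be larger than the worst-case gap in normal operation, otherwise a long interrupt or a slow loop pass would be reported as a halt.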

  • Original C MACRO:

    #define MC2M(x) \
    { \
        c1 = *coef; \
        coef++; \
        c2 = *coef; \
        coef++; \
        vLo = *(vb1+(x)); \
        vHi = *(vb1+(23-(x))); \
        sum1L = MADD64(sum1L, vLo, c1); \
        sum2L = MADD64(sum2L, vLo, c2); \
        sum1L = MADD64(sum1L, vHi, -c2); \
        sum2L = MADD64(sum2L, vHi, c1); \
    }

    This is used 8 times in a row, with x = 0 to 7, so 32 multiplies in total.
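
    As a plain-C reference for what those eight steps compute, here is a minimal sketch. It assumes `MADD64(sum, x, y)` means `sum + (int64_t)x * y`, which matches the SMLAL comments in the ARM listing; the function names `madd64` and `mc2m_x8` are hypothetical and not part of the original decoder.

    ```c
    #include <stdint.h>

    /* Assumed meaning of MADD64: 64-bit accumulate of a 32x32 signed product. */
    static int64_t madd64(int64_t sum, int32_t x, int32_t y)
    {
        return sum + (int64_t)x * y;
    }

    /* Hypothetical function form of MC2M(0) .. MC2M(7) run back to back.
     * coef must point at 16 coefficients (c1, c2 pairs), vb1 at 24 samples.
     * Note: -c2 overflows if c2 == INT32_MIN; not handled in this sketch. */
    static void mc2m_x8(const int32_t *coef, const int32_t *vb1,
                        int64_t *sum1L, int64_t *sum2L)
    {
        for (int x = 0; x < 8; x++) {
            int32_t c1 = *coef++;
            int32_t c2 = *coef++;
            int32_t vLo = vb1[x];
            int32_t vHi = vb1[23 - x];

            *sum1L = madd64(*sum1L, vLo, c1);
            *sum2L = madd64(*sum2L, vLo, c2);
            *sum1L = madd64(*sum1L, vHi, -c2);
            *sum2L = madd64(*sum2L, vHi, c1);
        }
    }
    ```

    A reference like this is mainly useful for checking an assembly version against known inputs.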

    LDR r12,[r2],#4 ;c1 = *coef++
    LDR r14,[r2],#4 ;c2 = *coef++
    LDR r0,[r1,#0] ;vLo = *(vb1+(x))
    LDR r3,[r1,#0x5c] ;vHi = *(vb1+(23-(x)))
    SMLAL r4,r5,r0,r12 ;sum1L = MADD64(sum1L, vLo, c1)
    SMLAL r6,r7,r0,r14 ;sum2L = MADD64(sum2L, vLo, c2)
    RSB r14,r14,#0 ;-c2
    SMLAL r4,r5,r3,r14 ;sum1L = MADD64(sum1L, vHi, -c2)
    SMLAL r6,r7,r3,r12 ;sum2L = MADD64(sum2L, vHi, c1)

    This is hand-written code for ARM v3 and later, so it only JUST fits into the registers.

    ;r0-r4 used by MULSHIFT32
    ;r12 lo & r5 hi of sum1L
    ;r14 lo & r6 hi of sum2L
    ;r7 base-address of vb1
    ;r8 c1
    ;r9 c2
    ;r10 vLo/vHi
    ;r11 index into vb1 & loop-count

    MC2M:
    mov r0,#4 ;
    negs r0,r0 ;setup inner-loop count.
    mov r11,r0 ;
    .inner_loop:
    pop {r0-r1} ;get c1 & c2
    mov r9,r1 ;store c2
    mov r8,r0 ;store c1

    mov r2,r11 ;
    add r2,#4 ;update index.
    mov r11,r2 ;

    ldr r0,[r7,r2] ;vLo
    mov r10,r0

    mulshift32

    add r12,r0 ;
    cmp r0,r12 ;sum1L += (vLo x c1)
    adcs r5,r1 ;

    mov r0,r9 ;c2
    mov r1,r10 ;vLo

    mulshift32

    add r14,r0 ;
    cmp r0,r14 ;sum2L += (vLo x c2)
    adcs r6,r1 ;

    mov r2,r11 ;
    mov r3,#$5c ;$5C - index
    sub r2,r3,r2 ;

    ldr r0,[r7,r2] ;vHi
    mov r10,r0

    mov r1,r9 ;-c2
    neg r1,r1 ;

    mulshift32

    add r12,r0 ;
    cmp r0,r12 ;sum1L += (vHi x -c2)
    adcs r5,r1 ;

    mov r0,r8 ;c1
    mov r1,r10 ;vHi

    mulshift32

    add r14,r0 ;
    cmp r0,r14 ;sum2L += (vHi x c1)
    adcs r6,r1 ;

    mov r0,r11 ;loop index (r10 holds vHi, not the count)
    cmp r0,#$1c ;8 passes: index 0 to $1c in steps of 4
    blt .inner_loop


    Above is my hand-written M0+ Thumb code to do the same thing. I had to use r13 (SP) to read values because r0-r4 are used by your multiply, the hi-words of the two 64-bit accumulators need lo registers to allow the ADCS instruction, and r7 is the base address of the data being processed. I COULD unroll it, but it would not be much faster since your multiply takes 17 cycles.
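
    The 64-bit accumulate that the add/cmp/adcs sequences are meant to perform can be modelled in C with two 32-bit halves. This is a sketch under my own naming (`acc64_t` and `acc64_add` are hypothetical). It relies on the fact that a carry out of the low word occurred exactly when the new low word is strictly smaller than the value that was added.

    ```c
    #include <stdint.h>

    /* Hypothetical 64-bit accumulator held as two 32-bit words. */
    typedef struct {
        uint32_t lo;
        uint32_t hi;
    } acc64_t;

    /* acc += (add_hi:add_lo), propagating the carry by hand. */
    static void acc64_add(acc64_t *acc, uint32_t add_lo, uint32_t add_hi)
    {
        acc->lo += add_lo;
        /* Wrapped past 2^32 if and only if the result is below the addend.
         * An equal result (old lo was 0) is NOT a carry. */
        uint32_t carry = (acc->lo < add_lo) ? 1u : 0u;
        acc->hi += add_hi + carry;
    }
    ```

    For signed products the same routine works unchanged, because two's-complement addition does not depend on signedness.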

    I have been looking for a routine that returns only the top 32 bits of a 64-bit product, but it seems that your 17-cycle masterpiece cannot be improved on.
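
    For reference, the "top 32 bits" operation can be written in portable C. The sketch below is not the 17-cycle routine mentioned above. It shows an assumed reference formulation (`mulshift32_ref`) and a version built from 16x16-bit partial products (`mulshift32_parts`), which is roughly the decomposition needed on a core whose multiply instruction only returns a 32-bit result. Both function names are hypothetical.

    ```c
    #include <stdint.h>

    /* Reference: high word of the signed 64-bit product.
     * Assumes >> on a negative int64_t is an arithmetic shift, which is
     * implementation-defined in C but true for common compilers. */
    static int32_t mulshift32_ref(int32_t x, int32_t y)
    {
        return (int32_t)(((int64_t)x * y) >> 32);
    }

    /* Same result using only 32-bit arithmetic and 16x16 partial products. */
    static int32_t mulshift32_parts(int32_t x, int32_t y)
    {
        uint32_t ux = (uint32_t)x, uy = (uint32_t)y;
        uint32_t xl = ux & 0xFFFFu, xh = ux >> 16;
        uint32_t yl = uy & 0xFFFFu, yh = uy >> 16;

        uint32_t ll = xl * yl;
        uint32_t lh = xl * yh;
        uint32_t hl = xh * yl;
        uint32_t hh = xh * yh;

        /* Middle column; at most three 16-bit values, so it cannot overflow. */
        uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);
        uint32_t hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);

        /* Convert the unsigned high word to the signed one. */
        if (x < 0) hi -= uy;
        if (y < 0) hi -= ux;
        return (int32_t)hi;
    }
    ```

    Dropping the low word means the final `ll & 0xFFFF` term never has to be added, which is where a top-32-bits-only routine can save work compared with a full 64-bit multiply.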

    I am writing this MP2/MP2.5/MP3 fixed-point decoder in the order of how many cycles each routine uses. I presume I can either disable all interrupts during this routine or find some way to use both MSP and PSP. If that were possible, I could store and restore the SP value in a simpler way.


    Note to all - if anyone has the source code for a fixed-point ACELP decoder then I would love to convert that into Thumb. While I appreciate that hardware solutions are possible for MP3 & ACELP, I'm also going to be adding LPC10, LPC10e & MELP i.e. a general audio decoder and 4 different speech decoders. I think an M0+ based audiobook player could be very cheap.