Hi,
In my application I have timing synchronization. When I stop at a breakpoint I lose that synchronization. When I resume the program after the breakpoint, I have to detect that execution was stopped and re-synchronize my internal time.
The question is: how can my program detect that the core was stopped/halted at a breakpoint?
Original C MACRO:

    #define MC2M(x) { \
        c1 = *coef; coef++; c2 = *coef; coef++; \
        vLo = *(vb1+(x)); vHi = *(vb1+(23-(x))); \
        sum1L = MADD64(sum1L, vLo, c1); sum2L = MADD64(sum2L, vLo, c2); \
        sum1L = MADD64(sum1L, vHi, -c2); sum2L = MADD64(sum2L, vHi, c1); \
    }

This is used 8 times in a row, with x = 0 to x = 7, so 32 multiplies.

    LDR   r12,[r2],#4      ;c1 = *coef++
    LDR   r14,[r2],#4      ;c2 = *coef++
    LDR   r0,[r1,#0]       ;vLo = *(vb1+(x))
    LDR   r3,[r1,#0x5c]    ;vHi = *(vb1+(23-(x)))
    SMLAL r4,r5,r0,r12     ;sum1L = MADD64(sum1L, vLo, c1)
    SMLAL r6,r7,r0,r14     ;sum2L = MADD64(sum2L, vLo, c2)
    RSB   r14,r14,#0       ;-c2
    SMLAL r4,r5,r3,r14     ;sum1L = MADD64(sum1L, vHi, -c2)
    SMLAL r6,r7,r3,r12     ;sum2L = MADD64(sum2L, vHi, c1)

This is hand-written ARM v3< code so it JUST fits into registers.
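For anyone following along without the decoder source, here is a minimal sketch of one MC2M step as a plain C function. It assumes MADD64 is a 64-bit multiply-accumulate (sum + x*y), which is what the SMLAL comments above suggest. The names mc2m_acc and mc2m_step are made up for illustration and are not part of the original code.

```c
#include <stdint.h>

/* Assumption: MADD64 is a signed 64-bit multiply-accumulate, sum + x*y. */
static inline int64_t MADD64(int64_t sum, int32_t x, int32_t y)
{
    return sum + (int64_t)x * (int64_t)y;
}

/* Hypothetical holder for the two 64-bit accumulators. */
typedef struct {
    int64_t sum1L;
    int64_t sum2L;
} mc2m_acc;

/* One MC2M(x) step: reads two coefficients and the mirrored pair
 * vb1[x] / vb1[23 - x], then does the four multiply-accumulates.
 * Returns the advanced coefficient pointer. */
static const int32_t *mc2m_step(mc2m_acc *acc, const int32_t *coef,
                                const int32_t *vb1, int x)
{
    int32_t c1  = *coef++;
    int32_t c2  = *coef++;
    int32_t vLo = vb1[x];
    int32_t vHi = vb1[23 - x];   /* byte offset 0x5c when x == 0 */

    acc->sum1L = MADD64(acc->sum1L, vLo, c1);
    acc->sum2L = MADD64(acc->sum2L, vLo, c2);
    acc->sum1L = MADD64(acc->sum1L, vHi, -c2);
    acc->sum2L = MADD64(acc->sum2L, vHi, c1);
    return coef;
}
```

Calling mc2m_step eight times with x = 0..7 and feeding the returned pointer back in as coef gives the 32 multiplies mentioned above.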
;r0-r4   used by MULSHIFT32
;r12     lo & r5 hi of sum1L
;r14     lo & r6 hi of sum2L
;r7      base-address of vb1
;r8      c1
;r9      c2
;r10     vLo/vHi
;r11     index into vb1 & loop-count
MC2M:
    mov r0,#4          ;
    negs r0,r0         ;setup inner-loop count.
    mov r11,r0         ;
.inner_loop:
    pop r0-r1          ;get c1 & c2
    mov r9,r1          ;store c2
    mov r8,r0          ;store c1

    mov r2,r11         ;
    add r2,#4          ;update index.
    mov r11,r2         ;

    mov r0,[r7,r2]     ;vLo
    mov r10,r0
    mulshift32
    add r12,r0         ;
    cmp r0,r12         ;sum1L += (vLo x c1)
    adcs r5,r1         ;

    mov r0,r9          ;c2
    mov r1,r10         ;vLo
    mulshift32
    add r14,r0         ;
    cmp r0,r14         ;sum2L += (vLo x c2)
    adcs r6,r1         ;

    mov r2,r11         ;
    mov r3,#$5c        ;$5C - index
    sub r2,r3,r2       ;

    mov r0,[r7,r2]     ;vHi
    mov r10,r0

    mov r1,r9          ;-c2
    neg r1,r1          ;

    add r12,r0         ;
    cmp r0,r12         ;sum1L += (vHi x -c2)
    adcs r5,r1         ;

    mov r0,r8          ;c1
    mov r1,r10         ;vHi

    add r14,r0         ;
    cmp r0,r14         ;sum2L += (vHi x c1)
    adcs r6,r1         ;
    mov r0,r10
    cmp r0,#$24
    bl .inner_loop

Above is my hand-written M0+ Thumb code to do the same thing. I had to use r13 (SP) to read values because r0-r4 are used by your multiply, the hi-words of the two 64-bit accumulators need a lo register to allow the ADCS instruction, and r7 is the base address of the data being processed. I COULD unroll it, but it would not be much faster in percentage terms since your multiply takes 17 cycles.

I have been looking for a routine that only returns the top 32 bits of a 64-bit value, but it seems that your 17-cycle masterpiece cannot be improved on. I am writing this MP2/MP2.5/MP3 fixed-point decoder in order of how many cycles each routine uses. I presume I can either disable all interrupts during this routine or find some way to use both MSP & PSP. If that were possible, I could store & restore the SP value in a simpler way.

Note to all - if anyone has the source code for a fixed-point ACELP decoder then I would love to convert that into Thumb. While I appreciate that hardware solutions are possible for MP3 & ACELP, I'm also going to be adding LPC10, LPC10e & MELP, i.e. a general audio decoder and 4 different speech decoders. I think an M0+ based audiobook player could be very cheap.
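To make the "top 32 bits of a 64-bit value" operation concrete, here is a rough C sketch of a signed 32x32 multiply that returns only the high word, built from four 16x16 partial products because the M0+ only has a 32x32->32 MULS. This is not the 17-cycle routine mentioned above, just an illustration of the arithmetic. The name mulshift32_16 is made up, and the sketch assumes that right-shifting a negative signed value is an arithmetic shift, as it is with the usual ARM compilers.

```c
#include <stdint.h>

/* High 32 bits of the signed 64-bit product a*b, using only
 * 32x32->32 multiplies (one per 16-bit partial product).
 * Assumption: >> on a negative int32_t is an arithmetic shift. */
int32_t mulshift32_16(int32_t a, int32_t b)
{
    int32_t ah = a >> 16;                           /* signed high halves   */
    int32_t bh = b >> 16;
    int32_t al = (int32_t)((uint32_t)a & 0xFFFFu);  /* unsigned low halves  */
    int32_t bl = (int32_t)((uint32_t)b & 0xFFFFu);

    uint32_t lo = (uint32_t)al * (uint32_t)bl;      /* al*bl, < 2^32        */
    int32_t  m1 = ah * bl;                          /* cross terms fit in   */
    int32_t  m2 = al * bh;                          /* a signed 32-bit word */
    int32_t  hh = ah * bh;                          /* weight 2^32          */

    /* Fold the top half of lo into one cross term (cannot overflow). */
    int32_t t = m1 + (int32_t)(lo >> 16);

    /* Carry out of the low 16 bits when adding the two cross terms. */
    uint32_t frac = ((uint32_t)t & 0xFFFFu) + ((uint32_t)m2 & 0xFFFFu);

    return hh + (t >> 16) + (m2 >> 16) + (int32_t)(frac >> 16);
}
```

The two cross terms are shifted separately and the carry is recovered from their low halves, so no intermediate sum needs more than 32 bits.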
Wrong thread?!