How do I find in my software that the core has been stopped and released at the breakpoint?

Hi,

In my application I have timing synchronization. When I stop at a breakpoint I lose the timing synchronization. When I run the program again after the breakpoint, I have to detect that program execution was stopped and re-synchronize my internal time.

The question is: how can my program detect that the core was stopped/halted at a breakpoint?
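
One approach that is sometimes used for this, and which nobody in this thread confirms, is to compare two counters: one that stops while the core is halted (on Cortex-M the SysTick counter does not decrement in halted debug state) and one that keeps running (for example an RTC or a peripheral timer that is not frozen by the debugger, which is vendor-specific and needs checking). If the free-running counter has advanced much further than the core-side counter, the core was probably halted in between. The sketch below is hardware-agnostic; `halt_detector_t`, `halt_detector_init` and `halt_detector_poll` are hypothetical names, and both readings are assumed to be scaled to the same tick rate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: remembers the last reading of both counters. */
typedef struct {
    uint32_t last_core_ticks; /* counter that stops while the core is halted */
    uint32_t last_wall_ticks; /* counter that keeps running during a halt    */
} halt_detector_t;

static void halt_detector_init(halt_detector_t *d,
                               uint32_t core_now, uint32_t wall_now)
{
    d->last_core_ticks = core_now;
    d->last_wall_ticks = wall_now;
}

/* Returns true when the free-running counter advanced more than `tolerance`
 * ticks further than the core-side counter since the previous poll.
 * Assumption: both readings use the same tick rate. */
static bool halt_detector_poll(halt_detector_t *d,
                               uint32_t core_now, uint32_t wall_now,
                               uint32_t tolerance)
{
    /* Unsigned subtraction gives the right delta across a counter wrap. */
    uint32_t core_elapsed = core_now - d->last_core_ticks;
    uint32_t wall_elapsed = wall_now - d->last_wall_ticks;

    d->last_core_ticks = core_now;
    d->last_wall_ticks = wall_now;

    return wall_elapsed > core_elapsed &&
           (wall_elapsed - core_elapsed) > tolerance;
}
```

Calling `halt_detector_poll` periodically from the main loop, and re-synchronizing whenever it returns true, keeps the detection independent of which debugger is attached.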

0 Sean Dunlevy over 3 years ago

Original C MACRO:

#define MC2M(x) \
{ \
c1 = *coef; \
coef++; \
c2 = *coef; \
coef++; \
vLo = *(vb1+(x)); \
vHi = *(vb1+(23-(x))); \
sum1L = MADD64(sum1L, vLo, c1); \
sum2L = MADD64(sum2L, vLo, c2); \
sum1L = MADD64(sum1L, vHi, -c2); \
sum2L = MADD64(sum2L, vHi, c1); \
}

This is used 8 times in a row, with x = 0 to 7, so 32 multiplies.
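
For readers who want to check the arithmetic, here is a minimal C sketch of the eight macro expansions written as a loop. It assumes `MADD64(sum, x, y)` means `sum + (int64_t)x * y`, which is how a portable C fallback would typically define it; the function names `madd64` and `mc2m_block` are hypothetical and not from the original source.

```c
#include <stdint.h>

/* Assumed meaning of MADD64: 64-bit accumulate of a signed 32x32 product. */
static int64_t madd64(int64_t sum, int32_t x, int32_t y)
{
    return sum + (int64_t)x * (int64_t)y;
}

/* Hypothetical loop form of MC2M(0) .. MC2M(7).
 * coef must hold 16 values (c1, c2 pairs); vb1 must hold 24 values.
 * Assumption: no coefficient equals INT32_MIN, so -c2 cannot overflow. */
static void mc2m_block(const int32_t *coef, const int32_t *vb1,
                       int64_t *sum1L, int64_t *sum2L)
{
    for (int x = 0; x < 8; x++) {
        int32_t c1  = *coef++;
        int32_t c2  = *coef++;
        int32_t vLo = vb1[x];
        int32_t vHi = vb1[23 - x];

        *sum1L = madd64(*sum1L, vLo, c1);
        *sum2L = madd64(*sum2L, vLo, c2);
        *sum1L = madd64(*sum1L, vHi, -c2);
        *sum2L = madd64(*sum2L, vHi, c1);
    }
}
```

Each pass performs four multiplies, which gives the 32 multiplies mentioned above.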

LDR r12,[r2],#4 ;c1 = *coef++
LDR r14,[r2],#4 ;c2 = *coef++
LDR r0,[r1,#0] ;vLo = *(vb1+(x))
LDR r3,[r1,#0x5c] ;vHi = *(vb1+(23-(x)))
SMLAL r4,r5,r0,r12 ;sum1L = MADD64(sum1L, vLo, c1)
SMLAL r6,r7,r0,r14 ;sum2L = MADD64(sum2L, vLo, c2)
RSB r14,r14,#0 ;-c2
SMLAL r4,r5,r3,r14 ;sum1L = MADD64(sum1L, vHi, -c2)
SMLAL r6,r7,r3,r12 ;sum2L = MADD64(sum2L, vHi, c1)

This is hand-written ARM v3< code so it JUST fits into registers.

;r0-r4 used by MULSHIFT32
;r12 lo & r5 hi of sum1L
;r14 lo & r6 hi of sum2L
;r7 base-address of vb1
;r8 c1
;r9 c2
;r10 vLo/vHi
;r11 index into vb1 & loop-count

MC2M:
mov r0,#4 ;
negs r0,r0 ;setup inner-loop count.
mov r11,r0 ;
.inner_loop:
pop r0-r1 ;get c1 & c2
mov r9,r1 ;store c2
mov r8,r0 ;store c1

mov r2,r11 ;
add r2,#4 ;update index.
mov r11,r2 ;

ldr r0,[r7,r2] ;vLo
mov r10,r0

mulshift32

add r12,r0 ;
cmp r0,r12 ;sum1L += (vLo x c1)
adcs r5,r1 ;

mov r0,r9 ;c2
mov r1,r10 ;vLo

mulshift32

add r14,r0 ;
cmp r0,r14 ;sum2L += (vLo x c2)
adcs r6,r1 ;

mov r2,r11 ;
mov r3,#$5c ;$5C - index
sub r2,r3,r2 ;

ldr r0,[r7,r2] ;vHi
mov r10,r0

mov r1,r9 ;-c2
neg r1,r1 ;

mulshift32

add r12,r0 ;
cmp r0,r12 ;sum1L += (vHi x -c2)
adcs r5,r1 ;

mov r0,r8 ;c1
mov r1,r10 ;vHi

mulshift32

add r14,r0 ;
cmp r0,r14 ;sum2L += (vHi x c1)
adcs r6,r1 ;

mov r0,r11 ;index / loop count
cmp r0,#$24
blt .inner_loop

Above is my hand-written M0+ Thumb code to do the same thing. I had to use r13 (SP) to read values because r0-r4 are used by your multiply, the hi-words of the two 64-bit accumulators need a lo register to allow the ADCS instruction, and r7 is the base address of the data being processed. I COULD unroll it, but it would not be much faster since your multiply takes 17 cycles.
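
The add / cmp / adcs groups in the listing are meant to add a 64-bit product into a 64-bit accumulator held as two 32-bit halves. As a reference for checking the carry logic, the same step can be sketched in C: the low-word add has wrapped exactly when the new low word is smaller (unsigned) than the value just added. `acc64_t` and `acc64_add` are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Hypothetical 64-bit accumulator split into two 32-bit halves,
 * mirroring the lo/hi register pairs in the Thumb listing. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} acc64_t;

/* Adds the 64-bit value (add_hi:add_lo) into the accumulator. */
static void acc64_add(acc64_t *acc, uint32_t add_lo, uint32_t add_hi)
{
    acc->lo += add_lo;

    /* The add wrapped if and only if the result is below the addend. */
    uint32_t carry = (acc->lo < add_lo) ? 1u : 0u;

    /* Equivalent of ADCS on the high words. */
    acc->hi += add_hi + carry;
}
```

Note the case where the old low word is zero: the sum then equals the addend and no carry must be produced, which is worth testing against any assembly version of this step.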

I have been looking for a routine that returns only the top 32 bits of a 64-bit value, but it seems that your 17-cycle masterpiece cannot be improved on.
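
For reference, the operation in question (the top 32 bits of a signed 32x32 product) can be sketched in C in two ways: directly with a 64-bit multiply, and from 16-bit partial products, which is roughly the shape needed on a core whose multiply only returns a 32-bit result. This is not the 17-cycle routine mentioned above, just an illustration; the names `mulshift32_ref`, `umulhi32` and `mulshift32_16bit` are hypothetical, and the code assumes two's-complement conversion and an arithmetic right shift for negative values, as on common compilers.

```c
#include <stdint.h>

/* Reference: high word of the signed 64-bit product. */
static int32_t mulshift32_ref(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * (int64_t)y) >> 32);
}

/* High 32 bits of an unsigned 32x32 product, using only
 * multiplies whose results fit in 32 bits. */
static uint32_t umulhi32(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t ll = a_lo * b_lo;
    uint32_t lh = a_lo * b_hi;
    uint32_t hl = a_hi * b_lo;
    uint32_t hh = a_hi * b_hi;

    /* Sum of the middle 16-bit columns; cannot overflow 32 bits. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    return hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}

/* Signed result: correct the unsigned high word for negative inputs. */
static int32_t mulshift32_16bit(int32_t x, int32_t y)
{
    uint32_t ux = (uint32_t)x;
    uint32_t uy = (uint32_t)y;
    uint32_t hi = umulhi32(ux, uy);

    if (x < 0) hi -= uy;
    if (y < 0) hi -= ux;

    return (int32_t)hi;
}
```

Comparing `mulshift32_16bit` against `mulshift32_ref` on a host machine is a cheap way to validate a hand-written assembly version before counting cycles.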

I am writing this MP2/MP2.5/MP3 fixed-point decoder in order of how many cycles each routine uses. I am presuming I can either disable all interrupts during this routine or find some way to use both MSP and PSP. If that were possible, I could store and restore the SP value in a simpler way.

Note to all - if anyone has the source code for a fixed-point ACELP decoder then I would love to convert that into Thumb. While I appreciate that hardware solutions are possible for MP3 & ACELP, I'm also going to be adding LPC10, LPC10e & MELP i.e. a general audio decoder and 4 different speech decoders. I think an M0+ based audiobook player could be very cheap.

0 42Bastian Schick over 3 years ago in reply to Sean Dunlevy

Wrong thread?!