<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="https://community.arm.com/utility/feedstylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>How do I find in my software that the core has been stopped and released at the breakpoint?</title><link>https://community.arm.com/developer/ip-products/processors/f/cortex-m-forum/47954/how-do-i-find-in-my-software-that-the-core-has-been-stopped-and-released-at-the-breakpoint</link><description> Hi, 
 In my application I have timing synchronization. When I stop at a breakpoint I lose the timing synchronization. When I run the program again after the breakpoint, I have to detect that program execution was stopped and re-synchronize my internal</description><dc:language>en-US</dc:language><generator>Telligent Community 10</generator><item><title>RE: How do I find in my software that the core has been stopped and released at the breakpoint?</title><link>https://community.arm.com/thread/168485?ContentTypeID=1</link><pubDate>Fri, 06 Nov 2020 04:51:23 GMT</pubDate><guid isPermaLink="false">dd9e70c8-6d3c-4c71-b136-2456382a7b5c:122836a5-004f-42b9-b63a-10eb393c22f5</guid><dc:creator>42Bastian Schick</dc:creator><description>&lt;p&gt;Wrong thread?!&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item><item><title>RE: How do I find in my software that the core has been stopped and released at the breakpoint?</title><link>https://community.arm.com/thread/168481?ContentTypeID=1</link><pubDate>Thu, 05 Nov 2020 20:23:06 GMT</pubDate><guid isPermaLink="false">dd9e70c8-6d3c-4c71-b136-2456382a7b5c:5c7d3669-61a8-4ff6-9f56-1b3071b17941</guid><dc:creator>Sean Dunlevy</dc:creator><description>&lt;p&gt;Original C macro:&lt;br /&gt;&lt;br /&gt;#define MC2M(x) \&lt;br /&gt;{ \&lt;br /&gt; c1 = *coef; \&lt;br /&gt; coef++; \&lt;br /&gt; c2 = *coef; \&lt;br /&gt; coef++; \&lt;br /&gt; vLo = *(vb1+(x)); \&lt;br /&gt; vHi = *(vb1+(23-(x))); \&lt;br /&gt; sum1L = MADD64(sum1L, vLo, c1); \&lt;br /&gt; sum2L = MADD64(sum2L, vLo, c2); \&lt;br /&gt; sum1L = MADD64(sum1L, vHi, -c2); \&lt;br /&gt; sum2L = MADD64(sum2L, vHi, c1); \&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;This is used 8 times in a row, with x = 0 to 7, so 32 multiplies.&lt;br /&gt;&lt;br /&gt; LDR r12,[r2],#4 ;c1 = *coef++&lt;br /&gt; LDR r14,[r2],#4 ;c2 = *coef++&lt;br /&gt; LDR r0,[r1,#0] ;vLo = *(vb1+(x))&lt;br /&gt; LDR r3,[r1,#0x5c] ;vHi = *(vb1+(23-(x)))&lt;br /&gt; SMLAL r4,r5,r0,r12 ;sum1L = MADD64(sum1L, vLo, c1)&lt;br /&gt;
 SMLAL r6,r7,r0,r14 ;sum2L = MADD64(sum2L, vLo, c2)&lt;br /&gt; RSB r14,r14,#0 ;-c2&lt;br /&gt; SMLAL r4,r5,r3,r14 ;sum1L = MADD64(sum1L, vHi, -c2)&lt;br /&gt; SMLAL r6,r7,r3,r12 ;sum2L = MADD64(sum2L, vHi, c1)&lt;br /&gt;&lt;br /&gt;This is hand-written ARM v3&amp;lt; code so it JUST fits into registers.&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;;r0-r4 used by MULSHIFT32&lt;br /&gt;;r12 lo &amp;amp; r5 hi of sum1L&lt;br /&gt;;r14 lo &amp;amp; r6 hi of sum2L&lt;br /&gt;;r7 base-address of vb1&lt;br /&gt;;r8 c1&lt;br /&gt;;r9 c2&lt;br /&gt;;r10 vLo/vHi&lt;br /&gt;;r11 index into vb1 &amp;amp; loop-count&lt;/p&gt;
&lt;p&gt;MC2M:&lt;br /&gt; mov r0,#4 ;&lt;br /&gt; negs r0,r0 ;setup inner-loop count.&lt;br /&gt; mov r11,r0 ;&lt;br /&gt;.inner_loop:&lt;br /&gt; pop {r0-r1} ;get c1 &amp;amp; c2&lt;br /&gt; mov r9,r1 ;store c2&lt;br /&gt; mov r8,r0 ;store c1&lt;/p&gt;
&lt;p&gt;mov r2,r11 ;&lt;br /&gt; add r2,#4 ;update index.&lt;br /&gt; mov r11,r2 ;&lt;/p&gt;
&lt;p&gt;ldr r0,[r7,r2] ;vLo&lt;br /&gt; mov r10,r0&lt;br /&gt; mov r1,r8 ;c1&lt;br /&gt; &lt;br /&gt; mulshift32 &lt;br /&gt; &lt;br /&gt; add r12,r0 ;&lt;br /&gt; cmp r0,r12 ;sum1L += (vLo x c1)&lt;br /&gt; adcs r5,r1 ;&lt;/p&gt;
&lt;p&gt;mov r0,r9 ;c2&lt;br /&gt; mov r1,r10 ;vLo&lt;/p&gt;
&lt;p&gt;mulshift32&lt;/p&gt;
&lt;p&gt;add r14,r0 ;&lt;br /&gt; cmp r0,r14 ;sum2L += (vLo x c2)&lt;br /&gt; adcs r6,r1 ;&lt;/p&gt;
&lt;p&gt;mov r2,r11 ;&lt;br /&gt; mov r3,#$5c ;$5C - index&lt;br /&gt; sub r2,r3,r2 ;&lt;/p&gt;
&lt;p&gt;ldr r0,[r7,r2] ;vHi&lt;br /&gt; mov r10,r0&lt;/p&gt;
&lt;p&gt;mov r1,r9 ;-c2&lt;br /&gt; neg r1,r1 ;&lt;/p&gt;
&lt;p&gt;mulshift32&lt;/p&gt;
&lt;p&gt;add r12,r0 ;&lt;br /&gt; cmp r0,r12 ;sum1L += (vHi x -c2)&lt;br /&gt; adcs r5,r1 ;&lt;/p&gt;
&lt;p&gt;mov r0,r8 ;c1&lt;br /&gt; mov r1,r10 ;vHi&lt;/p&gt;
&lt;p&gt;mulshift32&lt;/p&gt;
&lt;p&gt;add r14,r0 ;&lt;br /&gt; cmp r0,r14 ;sum2L += (vHi x c1)&lt;br /&gt; adcs r6,r1 ;&lt;/p&gt;
&lt;p&gt;mov r0,r11 ;index&lt;br /&gt; cmp r0,#$24&lt;br /&gt; blt .inner_loop ;loop while index &amp;lt; $24&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Above is my hand-written M0+ Thumb code to do the same thing. I had to use r13 (SP) to read the coefficients because r0-r4 are used by your multiply, the hi-words of the two 64-bit accumulators need lo registers to allow the ADCS instruction, and r7 is the base address of the data being processed. I COULD unroll it, but it would not be much faster since your multiply takes 17 cycles.&lt;br /&gt;&lt;br /&gt;I have been looking for a routine that returns only the top 32 bits of a 64-bit product, but it seems that your 17-cycle masterpiece cannot be improved on.&lt;br /&gt;&lt;br /&gt;I am writing this MP2/MP2.5/MP3 fixed-point decoder in order of how many cycles each routine uses. I am presuming I can either disable all interrupts during this routine OR find some way to use both MSP &amp;amp; PSP. If that were possible, I could store &amp;amp; restore the SP value in a simpler way.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Note to all - if anyone has the source code for a fixed-point ACELP decoder then I would love to convert that into Thumb. While I appreciate that hardware solutions are possible for MP3 &amp;amp; ACELP, I&amp;#39;m also going to be adding LPC10, LPC10e &amp;amp; MELP, i.e. a general audio decoder and 4 different speech decoders. I think an M0+ based audiobook player could be very cheap.&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;</description></item></channel></rss>