I use the DWT on a Cortex-M4 to catch instructions that read or write memory. The problem is that the core doesn't stop at the instruction I expect: it stops 2-3 instructions later than it should, by which point register contents have been overwritten and sometimes execution has already gone through a branch.
Where can I find any documentation about the reason for this behavior?
Is it because of instruction execution ordering? Can I disable it somehow?
The Armv7-M Architecture Reference Manual does mention it:
Section C.1.5.3 Debug event prioritization
The following are asynchronous debug events:
- Watchpoint debug events, including PC match watchpoints.
- DHCSR.C_HALT halt request debug events.
- EDBGRQ external halt request debug events.
With a breakpoint, the processor pipeline can be stopped while the instruction is still in the decode stage. With a load/store instruction, we don't know what address it is going to use until it starts executing and the address is generated. It would be possible to compare the address and stop the processor immediately, but that would require the DWT comparators to work in the same clock cycle as the address generation phase, which could reduce the maximum clock frequency of the processor. That is why the watchpoint is reported a few instructions after the access that triggered it.
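For reference, here is a minimal sketch of how a data watchpoint on DWT comparator 0 is typically armed, so it is clear which comparison runs after address generation. The names `dwt_comp_regs_t` and `dwt_set_watchpoint` are hypothetical helpers, not CMSIS APIs. The addresses in the comments (DEMCR at `0xE000EDFC`, DWT_COMP0 at `0xE0001020`) and the FUNCTION encodings are my reading of the Armv7-M register map, so check them against the manual for your part. The registers are passed in as pointers so the logic can also be exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

/* One DWT comparator group: COMPn, MASKn, FUNCTIONn (consecutive words). */
typedef struct {
    volatile uint32_t comp;      /* address to match                        */
    volatile uint32_t mask;      /* number of low address bits to ignore    */
    volatile uint32_t function;  /* what kind of access raises the event    */
} dwt_comp_regs_t;

#define DEMCR_TRCENA    (1u << 24)  /* global enable for the DWT block      */
#define DWT_FUNC_READ   0x5u        /* watchpoint on read                   */
#define DWT_FUNC_WRITE  0x6u        /* watchpoint on write                  */
#define DWT_FUNC_RW     0x7u        /* watchpoint on read or write          */

/*
 * Arm one comparator as a data watchpoint.
 * Assumed on-target arguments:
 *   demcr = (volatile uint32_t *)0xE000EDFC
 *   regs  = (dwt_comp_regs_t *)0xE0001020   (comparator 0)
 * The event is asynchronous: when the core halts, the PC is already
 * past the load/store that matched.
 */
void dwt_set_watchpoint(volatile uint32_t *demcr, dwt_comp_regs_t *regs,
                        uint32_t addr, uint32_t mask_bits, uint32_t func)
{
    assert(mask_bits < 32u);
    /* The watched address must be aligned to the masked range. */
    assert((addr & ((1u << mask_bits) - 1u)) == 0u);

    *demcr |= DEMCR_TRCENA;   /* DWT registers are inactive without this   */
    regs->function = 0u;      /* disable the comparator while reprogramming */
    regs->comp = addr;
    regs->mask = mask_bits;
    regs->function = func;    /* arm it last                                */
}
```

On target this would be called as, for example, `dwt_set_watchpoint((volatile uint32_t *)0xE000EDFC, (dwt_comp_regs_t *)0xE0001020, 0x20000100u, 2u, DWT_FUNC_WRITE);` to watch a 4-byte word for writes. Whether the match halts the core or raises a DebugMonitor exception depends on how debug is enabled, and in both cases the stop lands after the accessing instruction, as described above.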