We are investigating the watchpoint function on the A53. We wrote a simple driver that hooks the debug exception with our own handler, aml_watchpoint_handler, in place of the default watchpoint handler.
In our watchpoint handler, we first disable the watchpoint control, then handle the debug exception. Once handling is finished, we re-enable the watchpoint control and return from the exception.
In theory, triggering a watchpoint event once should cause the watchpoint exception to be taken once. But we found that if we re-enable the watchpoint at the end of the handler, the A53 CPU re-enters the exception again and again. This lasts a long time, and sometimes it never ends. If we do not re-enable the watchpoint in the exception handler, but instead re-enable it from a work_struct after a short sleep (1 ms is enough), the exception is taken only once. However, with this method we may lose watchpoint events during the period in which the watchpoint is disabled.
We do not understand the detailed behavior of the watchpoint exception: why it is entered again and again, and how to exit the watchpoint handler safely. Can anyone help us with this issue?
Watchpoints are synchronous and precise on ARMv8, which means they are taken in the same way as an MMU fault, *before* the access is visible to the memory system. If you just return from the watchpoint handler, it will return to re-execute the instruction that triggered the watchpoint, and naturally trigger it again.
The normal way to deal with this is:
* Disable the watchpoint.
* Single step the instruction (using Software Step exception).
* Re-enable the watchpoint.
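The difference between returning directly and the three steps above can be illustrated with a toy model. This is only a sketch in Python, not kernel code: the `ToyCPU` class, its fields, and the handler names are all hypothetical, and the model assumes only what is described above (the watchpoint is checked before the access takes effect, and the return address is the triggering instruction).

```python
# Toy model of a precise, synchronous watchpoint. Everything here is
# hypothetical and exists only to illustrate the control flow.

class ToyCPU:
    def __init__(self, program, watch_addr):
        self.program = program        # list of ("store", addr) or ("nop", None)
        self.pc = 0
        self.watch_addr = watch_addr
        self.wp_enabled = True
        self.step_armed = False       # stands in for software step being enabled
        self.wp_hits = 0
        self.memory_writes = []

    def run(self, on_watchpoint, on_step, max_exceptions=20):
        exceptions = 0
        while self.pc < len(self.program) and exceptions < max_exceptions:
            op, addr = self.program[self.pc]
            # The watchpoint is checked before the access is performed.
            if op == "store" and self.wp_enabled and addr == self.watch_addr:
                self.wp_hits += 1
                exceptions += 1
                on_watchpoint(self)
                continue              # return address is the same instruction
            if op == "store":
                self.memory_writes.append(addr)
            self.pc += 1
            if self.step_armed:
                # Step exception is taken after exactly one instruction.
                self.step_armed = False
                exceptions += 1
                on_step(self)
        return exceptions


def naive_wp_handler(cpu):
    cpu.wp_enabled = False            # disable while handling
    cpu.wp_enabled = True             # re-enable before returning


def stepping_wp_handler(cpu):
    cpu.wp_enabled = False            # 1. disable the watchpoint
    cpu.step_armed = True             # 2. arm a single step


def step_handler(cpu):
    cpu.wp_enabled = True             # 3. re-enable after the instruction ran


program = [("nop", None), ("store", 0x1000), ("nop", None)]

naive = ToyCPU(program, watch_addr=0x1000)
naive.run(naive_wp_handler, step_handler)

fixed = ToyCPU(program, watch_addr=0x1000)
fixed.run(stepping_wp_handler, step_handler)

print("naive:", naive.wp_hits, "hits, pc =", naive.pc)
print("fixed:", fixed.wp_hits, "hits, pc =", fixed.pc)
```

In the naive case the store never completes and the program counter never moves past it, so the model only stops because of the `max_exceptions` cap. With the step in between, the store completes once, with the watchpoint disabled for that one instruction only.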
I found the software step exception a bit difficult to use.
I also wrote a handler and hooked it in place of the default software step handler in the Linux kernel. But if I re-enable the watchpoint in the software step exception handler, the watchpoint event still triggers again and again.
Could you explain how to use the software step exception in more detail?
Are you able to confirm that, after stepping the instruction, the PC has advanced from the instruction that triggered the watchpoint?
Here is the log:
[ 161.707466@0] ---- watch point 0 triggered, watch addr:ffffffc00256b500 ----
[ 161.708870@0] [4500]sh, fault addr:ffffffc00256b500, esr:d6000062, mdscr:a000
[ 161.715936@0] Call trace:
[ 161.718532@0] [<ffffffc0018f225c>] dbg_en_store+0x50/0x68
[ 161.723875@0] [<ffffffc001492424>] class_attr_store+0x3c/0x54
[ 161.729567@0] [<ffffffc001238e50>] sysfs_kf_write+0x58/0x74
[ 161.735087@0] [<ffffffc00123d15c>] kernfs_fop_write+0xf8/0x154
[ 161.740865@0] [<ffffffc0011c1108>] vfs_write+0xac/0x1b4
[ 161.746039@0] [<ffffffc0011c1b44>] SyS_write+0x50/0xb0
[ 161.751126@0] aml_watchpoint_handler, pstate:20200145, mdscr:a001
[ 161.757166@0] aml_single_step_handler, addr:ffffffc00256b500, esr:ce000022, awp:ffffffc05604f900
[ 161.765875@0] aml_single_step_handler, pstate:200001c5, mdscr:a000
[ 161.772121@0] ---- watch point 0 triggered, watch addr:ffffffc00256b500 ----
[ 161.778989@0] [4500]sh, fault addr:ffffffc00256b500, esr:d6000062, mdscr:a000
[ 161.786057@0] Call trace:
[ 161.788655@0] [<ffffffc0018f225c>] dbg_en_store+0x50/0x68
[ 161.793996@0] [<ffffffc001492424>] class_attr_store+0x3c/0x54
[ 161.799688@0] [<ffffffc001238e50>] sysfs_kf_write+0x58/0x74
[ 161.805207@0] [<ffffffc00123d15c>] kernfs_fop_write+0xf8/0x154
[ 161.810989@0] [<ffffffc0011c1108>] vfs_write+0xac/0x1b4
[ 161.816160@0] [<ffffffc0011c1b44>] SyS_write+0x50/0xb0
[ 161.821247@0] aml_watchpoint_handler, pstate:20200145, mdscr:a001
[ 161.827288@0] aml_single_step_handler, addr:ffffffc00256b500, esr:ce000022, awp:ffffffc05604f900
[ 161.835997@0] aml_single_step_handler, pstate:200001c5, mdscr:a000
[ 161.842238@0] ---- watch point 0 triggered, watch addr:ffffffc00256b500 ----
[ 161.849112@0] [4500]sh, fault addr:ffffffc00256b500, esr:d6000062, mdscr:a000
[ 161.856179@0] Call trace:
[ 161.858774@0] [<ffffffc0018f225c>] dbg_en_store+0x50/0x68
[ 161.864117@0] [<ffffffc001492424>] class_attr_store+0x3c/0x54
[ 161.869809@0] [<ffffffc001238e50>] sysfs_kf_write+0x58/0x74
[ 161.875330@0] [<ffffffc00123d15c>] kernfs_fop_write+0xf8/0x154
[ 161.881110@0] [<ffffffc0011c1108>] vfs_write+0xac/0x1b4
[ 161.886281@0] [<ffffffc0011c1b44>] SyS_write+0x50/0xb0
[ 161.891368@0] aml_watchpoint_handler, pstate:20200145, mdscr:a001
[ 161.897409@0] aml_single_step_handler, addr:ffffffc00256b500, esr:ce000022, awp:ffffffc05604f900
[ 161.906118@0] aml_single_step_handler, pstate:200001c5, mdscr:a000
You can see from the printed pstate and mdscr values that single step is enabled before leaving the watchpoint handler. In the single step handler, single step is disabled, but after leaving the single step handler, the watchpoint is triggered again and again. Please help take a look.
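For anyone reading the raw hex values in the log, here is a small decoding sketch in Python. The bit positions are the ARMv8-A ones (MDSCR_EL1.SS is bit 0, KDE bit 13, MDE bit 15; the saved PSTATE has D at bit 9 and SS at bit 21; the exception class is ESR bits [31:26]). The function names are made up for this post, and it assumes the "pstate" printed by the handlers is the saved PSTATE of the interrupted context.

```python
# Decode the debug-related fields of the values printed in the log.
MDSCR_SS = 1 << 0     # MDSCR_EL1.SS: software step enable
MDSCR_KDE = 1 << 13   # MDSCR_EL1.KDE: debug exceptions enabled at EL1
MDSCR_MDE = 1 << 15   # MDSCR_EL1.MDE: breakpoint/watchpoint exceptions enabled
PSTATE_D = 1 << 9     # debug exception mask bit
PSTATE_SS = 1 << 21   # software step bit

EC_NAMES = {
    0x32: "software step, lower EL",
    0x33: "software step, same EL",
    0x34: "watchpoint, lower EL",
    0x35: "watchpoint, same EL",
}


def decode_mdscr(value):
    return {
        "SS": bool(value & MDSCR_SS),
        "KDE": bool(value & MDSCR_KDE),
        "MDE": bool(value & MDSCR_MDE),
    }


def decode_pstate(value):
    return {"SS": bool(value & PSTATE_SS), "D": bool(value & PSTATE_D)}


def decode_esr(value):
    ec = (value >> 26) & 0x3F
    return {"EC": ec, "name": EC_NAMES.get(ec, "other")}


# Values copied from the log above.
samples = [
    ("aml_watchpoint_handler exit", 0x20200145, 0xA001),
    ("aml_single_step_handler exit", 0x200001C5, 0xA000),
]
for label, pstate, mdscr in samples:
    print(label, decode_pstate(pstate), decode_mdscr(mdscr))

for esr in (0xD6000062, 0xCE000022):
    print(hex(esr), decode_esr(esr))
```

Read this way, the values agree with the description: both SS bits are set when leaving the watchpoint handler and clear when leaving the single step handler, and the D mask is clear in both cases. The two ESR values decode to a watchpoint and a software step exception, each taken from the same exception level.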