I am upgrading my operating system to run on the Raspberry Pi 3B, which uses the Cortex-A53 processor. I am having trouble with the step command in my debugger. It appears that when the debugger swaps the breakpoint instruction with the instruction it was placed on, the change does not make it into the instruction cache. I tried the isb and dsb instructions to flush the cache, but this had no effect.
I have decided to try the built-in debug features of the Cortex-A53 rather than fix the old strategy. So far I haven't succeeded with that, but I'm still working on it.
You could try an I/D cache synchronization sequence like this:
AArch64
P1
    STR W11, [X1]      ; W11 contains a new instruction to be stored in program memory
    DC CVAU, X1        ; clean to PoU makes the new instruction visible to the instruction cache
    DSB ISH            ; ensures completion of the clean on all PEs
    IC IVAU, X1        ; ensures instruction cache/branch predictor discards stale data
    DSB ISH            ; ensures completion of the instruction cache/branch predictor
                       ; invalidation on all PEs
    STR W0, [X2]       ; sets flag to signal completion
    ISB                ; synchronizes context on this PE
    BR X1              ; branches to new code
P2-Px
    WAIT ([X2] == 1)   ; waits for flag signalling completion
    ISB                ; synchronizes context on this PE
    BR X1              ; branches to new code
You should make sure that the right barriers, such as DSB and ISB, are used.
One example of I/D cache synchronization in Linux can be found here:
https://elixir.bootlin.com/linux/latest/source/arch/arm64/mm/cache.S#L27
/*
 *	__flush_cache_user_range(start,end)
 *
 *	Ensure that the I and D caches are coherent within specified region.
 *	This is typically used when code has been written to a memory region,
 *	and will be executed.
 *
 *	- start   - virtual start address of region
 *	- end     - virtual end address of region
 */
SYM_FUNC_START(__flush_cache_user_range)
	uaccess_ttbr0_enable x2, x3, x4
alternative_if ARM64_HAS_CACHE_IDC
	dsb	ishst
	b	7f
alternative_else_nop_endif
	dcache_line_size x2, x3
	sub	x3, x2, #1
	bic	x4, x0, x3
1:
user_alt 9f, "dc cvau, x4", "dc civac, x4", ARM64_WORKAROUND_CLEAN_CACHE
	add	x4, x4, x2
	cmp	x4, x1
	b.lo	1b
	dsb	ish

7:
alternative_if ARM64_HAS_CACHE_DIC
	isb
	b	8f
alternative_else_nop_endif
	invalidate_icache_by_line x0, x1, x2, x3, 9f
8:	mov	x0, #0
1:
	uaccess_ttbr0_disable x1, x2
	ret
9:
	mov	x0, #-EFAULT
	b	1b
SYM_FUNC_END(__flush_icache_range)
SYM_FUNC_END(__flush_cache_user_range)
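The core of that routine is the loop that rounds the start address down to a cache-line boundary (the `sub`/`bic` pair) and then steps through the region one line at a time. A minimal C sketch of just that address walk is below. The names `for_each_cache_line`, `line_op_fn` and `count_line` are made up for illustration, and the per-line operation is passed in as a callback rather than being a real `dc`/`ic` instruction, so this shows the arithmetic only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: stands in for the per-line cache operation. */
typedef void (*line_op_fn)(uintptr_t line_addr, void *ctx);

/*
 * Visit every cache line that overlaps [start, end).
 * line_size must be a power of two, as cache line sizes are.
 * Unlike the assembly loop, which always issues at least one operation,
 * this version does nothing when the range is empty.
 */
void for_each_cache_line(uintptr_t start, uintptr_t end, size_t line_size,
                         line_op_fn op, void *ctx)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

    /* Same effect as "sub x3, x2, #1" followed by "bic x4, x0, x3". */
    uintptr_t p = start & ~((uintptr_t)line_size - 1);

    /* Same effect as "add x4, x4, x2; cmp x4, x1; b.lo 1b". */
    for (; p < end; p += line_size)
        op(p, ctx);
}

/* Example callback: counts how many lines were visited. */
void count_line(uintptr_t line_addr, void *ctx)
{
    (void)line_addr;
    (*(size_t *)ctx)++;
}
```

On real hardware the callback would be replaced by the clean or invalidate operation for that line, with the barriers placed after the loop as in the listing above.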
Thanks for this detailed answer. Unfortunately my code is all AArch32 and the dc and ic instructions are not available or need some translation. I'll have to look them up.
For AArch32, the equivalent sequence is:
AArch32
P1
    STR R11, [R1]      ; R11 contains a new instruction to be stored in program memory
    DCCMVAU R1         ; clean to PoU makes the new instruction visible to the instruction cache
    DSB                ; ensures completion of the clean on all PEs
    ICIMVAU R1         ; ensures instruction cache discards stale data
    BPIMVA R1          ; ensures branch predictor discards stale data
    DSB                ; ensures completion of the instruction cache and branch predictor
                       ; invalidation on all PEs
    STR R0, [R2]       ; sets flag to signal completion
    ISB                ; synchronizes context on this PE
    BX R1              ; branches to new code
P2-Px
    WAIT ([R2] == 1)   ; waits for flag signalling completion
    ISB                ; synchronizes context on this PE
    BX R1              ; branches to new code
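Note that DCCMVAU, ICIMVAU and BPIMVA are names of CP15 operations rather than assembler mnemonics, so in AArch32 code they are issued with MCR. Here is a rough C sketch of how a debugger might swap one instruction and then make it visible to instruction fetch. It assumes GCC or Clang inline assembly, an ARMv7-A or later target, and code running in a privileged mode (these operations are undefined in user mode). The function names and the `BARE_METAL_PL1` build flag are made up for illustration. Without that flag it falls back to the compiler's `__builtin___clear_cache`.

```c
#include <assert.h>
#include <stdint.h>

/* Make one freshly written 32-bit instruction visible to instruction fetch. */
void sync_patched_instruction(volatile uint32_t *addr)
{
    assert(addr != 0);
#if defined(__arm__) && defined(BARE_METAL_PL1) /* hypothetical build flag */
    __asm__ volatile(
        "mcr p15, 0, %0, c7, c11, 1\n\t" /* DCCMVAU: clean D-cache line to PoU */
        "dsb\n\t"                        /* wait for the clean to complete */
        "mcr p15, 0, %0, c7, c5, 1\n\t"  /* ICIMVAU: invalidate I-cache line */
        "mcr p15, 0, %0, c7, c5, 7\n\t"  /* BPIMVA: invalidate branch predictor */
        "dsb\n\t"                        /* wait for the invalidations */
        "isb\n\t"                        /* resynchronize this PE's pipeline */
        :
        : "r"(addr)
        : "memory");
#else
    /* Portable fallback: let the compiler or OS do the equivalent work. */
    __builtin___clear_cache((char *)addr, (char *)(addr + 1));
#endif
}

/* Write new_insn at addr, synchronize, and return the instruction replaced. */
uint32_t patch_instruction(volatile uint32_t *addr, uint32_t new_insn)
{
    uint32_t old = *addr;
    *addr = new_insn;
    sync_patched_instruction(addr);
    return old;
}
```

The returned value is what a debugger would keep so it can put the original instruction back when the breakpoint is removed or stepped over.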
Thanks! That worked.
I'm glad that it worked.