In the Cortex-A series, most cores have a BTIC (branch target instruction cache), which is contained in the prefetch unit (PFU). The ARM documentation describes it like this:
Branch Target Instruction Cache
The PFU also contains a four-entry deep Branch Target Instruction Cache
(BTIC). Each entry stores up to two instruction cache fetches and enables
the branch shadow of predicted taken B and BL instructions to be
eliminated. The BTIC implementation is architecturally transparent, so it
does not have to be flushed on a context switch.
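
To make the question concrete, here is my rough mental model of that description as a toy C simulation. Only the two sizes (four entries, up to two fetches per entry) come from the quoted text. Everything else is my own assumption and not from ARM: the 64-bit fetch width, tagging entries by the branch address, round-robin replacement, the one-cycle miss penalty, and all the names (`btic_lookup`, `btic_fill`, `btic_taken_branch`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BTIC_ENTRIES 4  /* from the quoted text: four-entry deep */
#define BTIC_FETCHES 2  /* from the quoted text: up to two I-cache fetches per entry */
#define MISS_BUBBLE  1  /* ASSUMPTION: cycles lost when the target must come from the I-cache */

typedef struct {
    bool     valid;
    uint32_t branch_pc;            /* ASSUMPTION: entry is tagged by the address of the B/BL */
    uint64_t fetch[BTIC_FETCHES];  /* ASSUMPTION: each instruction cache fetch is 64 bits wide */
    int      nfetch;               /* how many fetches are actually stored (1 or 2) */
} btic_entry;

typedef struct {
    btic_entry e[BTIC_ENTRIES];
    int        next;               /* ASSUMPTION: round-robin replacement pointer */
} btic;

/* Clear every entry; an all-zero entry has valid == false. */
static void btic_init(btic *b) { memset(b, 0, sizeof *b); }

/* Return the entry holding the target instructions for this branch, or NULL on a miss. */
static const btic_entry *btic_lookup(const btic *b, uint32_t branch_pc) {
    for (int i = 0; i < BTIC_ENTRIES; i++)
        if (b->e[i].valid && b->e[i].branch_pc == branch_pc)
            return &b->e[i];
    return NULL;
}

/* Store the first n fetches found at the branch target, replacing the oldest slot. */
static void btic_fill(btic *b, uint32_t branch_pc, const uint64_t *data, int n) {
    assert(n >= 1 && n <= BTIC_FETCHES);
    btic_entry *e = &b->e[b->next];
    b->next = (b->next + 1) % BTIC_ENTRIES;
    e->valid = true;
    e->branch_pc = branch_pc;
    e->nfetch = n;
    memcpy(e->fetch, data, (size_t)n * sizeof data[0]);
}

/* Model a predicted-taken B/BL and return the number of bubble cycles.
 * Hit: target instructions come straight from the BTIC, so no bubble.
 * Miss: pay the assumed penalty and remember the target instructions. */
static int btic_taken_branch(btic *b, uint32_t branch_pc,
                             const uint64_t *target_data, int n) {
    if (btic_lookup(b, branch_pc) != NULL)
        return 0;
    btic_fill(b, branch_pc, target_data, n);
    return MISS_BUBBLE;
}
```

In this model the first execution of a taken branch pays the bubble and the following executions do not, until four other branches have pushed its entry out. Since the entries only hold copies of instructions, nothing architectural depends on them, which is how I read the "architecturally transparent" remark.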
How does the BTIC (branch target instruction cache) work?