In the Cortex-A series, most cores have a BTIC (Branch Target Instruction Cache). It is contained in the prefetch unit (PFU).
Branch Target Instruction Cache
The PFU also contains a four-entry deep Branch Target Instruction Cache
(BTIC). Each entry stores up to two instruction cache fetches and enables
the branch shadow of predicted taken B and BL instructions to be
eliminated. The BTIC implementation is architecturally transparent, so it
does not have to be flushed on a context switch.
How does the BTIC (Branch Target Instruction Cache) work?
You're very welcome. I am glad that my explanation helped.
1. Yes. The BTIC stores the first few instructions found at the target of a branch, not the instructions that follow the branch in program order. Normally, when a branch is predicted taken, the prefetch unit has to redirect the fetch to the target address, and the pipeline sees a bubble (the "branch shadow") until the instruction cache returns the target instructions. If the branch hits in the BTIC, the instructions at the target are supplied directly from the BTIC, so the bubble is eliminated and execution speeds up.
- The BTAC (Branch Target Address Cache) predicts branch target addresses, not the branch outcome (i.e. taken or not taken). The outcome prediction is done in the BTB. The BTAC is used by instructions that hold the target address (partially or completely) in a register, e.g. BX <reg>. An interesting question is what happens with a POP or load instruction that overwrites the PC: do the BTAC and/or BTB keep track of these instructions in addition to the standard branches? I don't know the answer, as I haven't gone that far into the ARM architecture yet. BTW, branches that encode the target address or offset as an immediate value don't need the BTAC, because the fetch unit can calculate the target address directly from the immediate value and the PC.
2. I am not sure whether the BTIC is used exclusively by the B and BL instructions or by any conditional branch (as I would suppose it should be). For some reason, the ARM documentation only mentions B and BL.
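
To illustrate the last point in the BTAC note above (a branch with an immediate offset lets the target be computed from the instruction and the PC alone), here is a minimal Python sketch of that arithmetic for the A32 B/BL encoding. The function name `a32_branch_target` is my own. It only covers the A32 immediate form of B and BL, not Thumb or BLX, and it shows the calculation rather than how any particular core's fetch unit is actually built.

```python
def a32_branch_target(instr: int, addr: int) -> int:
    """Return the target address of an A32 B/BL (immediate) at address `addr`.

    `instr` is the 32-bit instruction word. Hypothetical helper for
    illustration only; it does not model Thumb or BLX encodings.
    """
    cond = (instr >> 28) & 0xF
    # Bits [27:25] == 0b101 identify B/BL; cond == 0b1111 is BLX (immediate).
    if (instr >> 25) & 0b111 != 0b101 or cond == 0xF:
        raise ValueError("not an A32 B/BL immediate branch")

    imm24 = instr & 0x00FFFFFF
    # Sign-extend the 24-bit field.
    if imm24 & 0x800000:
        imm24 -= 1 << 24
    # The offset is counted in words, so scale it by 4.
    offset = imm24 * 4

    # In A32 state the PC reads as the instruction address plus 8.
    return (addr + 8 + offset) & 0xFFFFFFFF


# 0xEAFFFFFE is "b ." (branch to self): offset -8 cancels the PC+8 bias.
print(hex(a32_branch_target(0xEAFFFFFE, 0x8000)))  # 0x8000
# 0xEB000001 is a BL with imm24 = 1, i.e. one word past PC+8.
print(hex(a32_branch_target(0xEB000001, 0x1000)))  # 0x100c
```

Everything needed is in the instruction word and its own address, which is why no lookup structure is required to know where such a branch goes. A register-based branch such as BX <reg> has no such field, so its target has to be predicted.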