Cortex-A9 MMU and cache initialization

I'm working on a bare-metal project and trying to get the right sequence of operations for initializing the MMU and the L1 and L2 caches. This is my current code:

uint32_t _mmu_init(void)
{
    uint32_t *L1_TBL_ptr;
    KERNEL_DATA_STRUCT_PTR kernel_data;
    uint32_t sctlr;
    uint32_t dacr;
    uint32_t actlr;

#if PSP_HAS_DATA_CACHE && PSP_HAS_MMU
    _int_disable();

    _L2_pl310_cache_disable();
    _dcache_invalidate();
    _icache_invalidate();

    // Enable dcache prefetch
    actlr = __MRC(15, 0, 1, 0, 1);
    actlr |= BM_ACTLR_PP;
    __MCR(15, 0, actlr, 1, 0, 1);

    // Enable branch prediction
    sctlr = __MRC(15, 0, 1, 0, 0);  // read SCTLR
    sctlr |= BM_SCTLR_Z;            // set branch prediction enable
    __MCR(15, 0, sctlr, 1, 0, 0);   // write it back
    __DSB();
    __ISB();

    // Allocate the first-level (L1) MMU translation table
    L1_TBL_ptr = _mem_alloc_align(MMU_L1_TBL_SIZE, MMU_L1_TBL_ALIGN);

    // Clear the table - sets all sections to FAULT (see section 9.4)
    _mem_zero(L1_TBL_ptr, MMU_L1_TBL_SIZE);

    // Write the L1 table address to TTBR0
    __MCR(15, 0, (uint32_t)L1_TBL_ptr, 2, 0, 0);

    // Save table address in kernel data
    kernel_data->L1_TBL_PTR = L1_TBL_ptr;

    // Create the basic translation table entries that are known - ROM boot area,
    // peripherals, ROM, RAM, etc. These will be part/design specific
    _mmu_iMX6DQ(kernel_data->CORE_ID);

    // Set Client mode for all domains - see section 9.6.4
    dacr = 0x55555555;
    __MCR(15, 0, dacr, 3, 0, 0);    // write DACR

    // Enable SMP
    actlr = __MRC(15, 0, 1, 0, 1);
    actlr |= BM_ACTLR_SMP;
    __MCR(15, 0, actlr, 1, 0, 1);

    // Enable the SCU - see section 2.2.1 in the ARM A9 MPCore TRM
    // SCU_CNTRL |= SCU_CTRL_ENB;

    _int_enable();
#endif

    __DSB();
    __ISB();

    _dcache_enable();
    _icache_enable();

    // Invalidate the TLB - see sec B4.2.2 in the ARMv7 Architecture RM
    __MCR(15, 0, 0, 8, 7, 0);

    // Enable the memory management unit
    sctlr = __MRC(15, 0, 1, 0, 0);  // read SCTLR
    sctlr = sctlr | BM_SCTLR_M;     // set MMU enable bit
    __MCR(15, 0, sctlr, 1, 0, 0);   // write modified SCTLR

    _int_enable();

    return 0;
}

The processor is an iMX6Q, and the memory in use is flat mapped (virtual address equals physical address). All of the MMU table entries are 0 (fault) except the following:

0x00000000 - 0x00900000: 1MB sections, device, RW all, shareable - boot ROM and some peripherals

0x00900000 - 0x00A00000: 1MB section, strongly ordered, RW all, shareable - OCRAM

0x00A00000 - 0x10000000: 1MB sections, device, RW all, shareable

0x10000000 - 0x10100000: 1MB section, WB, RO all - 1MB code memory for core 0

0x10100000: 1MB section, WB all, RW, nX - 1MB data memory for core 0
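
For reference, the map above can be written out as ARMv7-A short-descriptor first-level section entries. This is only a minimal sketch and not the actual _mmu_iMX6DQ implementation: every macro and function name in it is hypothetical, and it assumes TEX remap is disabled (SCTLR.TRE = 0), domain 0 for all sections, and "WB" meaning write-back without write-allocate.

```c
#include <stdint.h>

/* Hypothetical names. Bit positions follow the ARMv7-A short-descriptor
 * first-level section format; assumes SCTLR.TRE = 0 and domain 0. */
#define SECT_TYPE       (0x2u)                       /* bits[1:0] = 0b10: section */
#define SECT_B          (1u << 2)                    /* bufferable */
#define SECT_C          (1u << 3)                    /* cacheable */
#define SECT_XN         (1u << 4)                    /* execute never */
#define SECT_TEX(t)     (((uint32_t)(t) & 0x7u) << 12)
#define SECT_S          (1u << 16)                   /* shareable (normal memory) */
#define SECT_AP_RW_ALL  (0x3u << 10)                 /* AP[2:0] = 0b011: RW at any privilege */
#define SECT_AP_RO_ALL  ((1u << 15) | (0x3u << 10))  /* AP[2:0] = 0b111: RO at any privilege */

/* Memory types with TEX remap disabled */
#define MEM_STRONGLY_ORDERED  (SECT_TEX(0))                    /* TEX=000 C=0 B=0 */
#define MEM_DEVICE_SHARED     (SECT_TEX(0) | SECT_B)           /* TEX=000 C=0 B=1 */
#define MEM_NORMAL_WB         (SECT_TEX(0) | SECT_C | SECT_B)  /* TEX=000 C=1 B=1 */

/* Build one section descriptor for a 1MB-aligned physical address. */
static inline uint32_t section_desc(uint32_t phys_addr, uint32_t attrs)
{
    return (phys_addr & 0xFFF00000u) | attrs | SECT_TYPE;
}

/* Flat-map [start, end) in 1MB steps: table index == VA[31:20] == PA[31:20]. */
static void map_sections(uint32_t *l1_table, uint32_t start, uint32_t end,
                         uint32_t attrs)
{
    for (uint32_t idx = start >> 20; idx < (end >> 20); idx++) {
        l1_table[idx] = section_desc(idx << 20, attrs);
    }
}
```

Under these assumptions, the code section at 0x10000000 would be section_desc(0x10000000, MEM_NORMAL_WB | SECT_AP_RO_ALL), and the data section at 0x10100000 would add SECT_XN and use SECT_AP_RW_ALL instead. Comparing values built this way against a dump of the real table is one way to check the entries bit by bit.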

I've looked at the MMU table, and the entries in use appear correct. The error occurs when I enable interrupts (the final _int_enable() call, just before the return): a prefetch abort exception is raised (the vector table is at 0x1000008C0).
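
To narrow down a prefetch abort like this one, the Instruction Fault Status Register (IFSR) can be decoded in the abort handler. The following is a minimal sketch with hypothetical function names. It assumes the short-descriptor fault status encoding, where FS[4] is IFSR bit 10 and FS[3:0] is IFSR bits [3:0], and it lists only the more common fault codes.

```c
#include <stdint.h>

/* Hypothetical helper: extract the 5-bit fault status from an IFSR value.
 * FS[4] comes from bit 10, FS[3:0] from bits [3:0]. */
static inline uint32_t ifsr_fs(uint32_t ifsr)
{
    return ((ifsr >> 6) & 0x10u) | (ifsr & 0xFu);
}

/* Hypothetical helper: map the fault status to a readable name
 * (short-descriptor format, common instruction-side codes only). */
static const char *ifsr_fault_name(uint32_t ifsr)
{
    switch (ifsr_fs(ifsr)) {
    case 0x02: return "debug event";
    case 0x03: return "access flag fault, section";
    case 0x05: return "translation fault, section";
    case 0x06: return "access flag fault, page";
    case 0x07: return "translation fault, page";
    case 0x08: return "synchronous external abort";
    case 0x09: return "domain fault, section";
    case 0x0B: return "domain fault, page";
    case 0x0C: return "external abort on table walk, first level";
    case 0x0D: return "permission fault, section";
    case 0x0E: return "external abort on table walk, second level";
    case 0x0F: return "permission fault, page";
    default:   return "other or reserved";
    }
}
```

In the prefetch abort handler, and assuming the same intrinsic argument order as in the code above, the IFSR would be read with __MRC(15, 0, 5, 0, 1) and the faulting address (IFAR) with __MRC(15, 0, 6, 0, 2). A translation fault on a section would point at a missing table entry for the fetched address, while a permission fault would point at the AP or XN bits of an existing entry.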

Trying to determine what's going on. Any ideas?