HW_UINT32 dtcm, itcm;
HW_UINT32 enable_memory_sharing;

#define HW_TCM_ENABLE    0x1
#define HW_ITCM_SIZE_32K 6
#define HW_DTCM_SIZE_16K 5

enable_memory_sharing = 0;

/* DTCM: read the region register, OR in base address, size and enable bit, write it back */
__asm__ __volatile__("MRC p15, 0, %0, c9, c1, 0" : "=r"(dtcm));
dtcm |= ( 0x10104000 | (HW_DTCM_SIZE_16K << 2) | HW_TCM_ENABLE );
__asm__ __volatile__("MCR p15, 0, %0, c9, c1, 0" :: "r"(dtcm));
enable_memory_sharing |= AT91C_CCFG_DTCM_SIZE_16KB;

/* ITCM: same sequence, using the second region register */
__asm__ __volatile__("MRC p15, 0, %0, c9, c1, 1" : "=r"(itcm));
itcm |= ( 0x10108000 | (HW_ITCM_SIZE_32K << 2) | HW_TCM_ENABLE );
__asm__ __volatile__("MCR p15, 0, %0, c9, c1, 1" :: "r"(itcm));
enable_memory_sharing |= AT91C_CCFG_ITCM_SIZE_32KB;

/* Tell the chip configuration block how much SRAM is assigned to each TCM */
AT91C_BASE_CCFG->CCFG_TCMR = enable_memory_sharing;
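For checking the register values on a host machine, here is a minimal sketch of how the two region values in the snippet above are put together. The field layout (enable in bit 0, size code shifted left by 2) is taken from the snippet itself. The helper name `tcm_region_value` and the base-address mask are my own assumptions for illustration, not part of any vendor header. Note that this builds the value from scratch, whereas the snippet ORs the fields into whatever was read back from the register.

```c
#include <stdint.h>

#define TCM_ENABLE      0x1u         /* bit 0: region enable (as in the snippet) */
#define TCM_SIZE_SHIFT  2            /* size code is shifted left by 2 (as in the snippet) */
#define TCM_SIZE_16K    5u           /* same value as HW_DTCM_SIZE_16K */
#define TCM_SIZE_32K    6u           /* same value as HW_ITCM_SIZE_32K */
#define TCM_BASE_MASK   0xFFFFF000u  /* assumption: base address lives in the upper bits */

/* Hypothetical helper: compose a TCM region register value from a base
 * address and a size code, with the enable bit set. */
uint32_t tcm_region_value(uint32_t base, uint32_t size_code)
{
    return (base & TCM_BASE_MASK) | (size_code << TCM_SIZE_SHIFT) | TCM_ENABLE;
}
```

With the constants from the snippet, `tcm_region_value(0x10104000u, TCM_SIZE_16K)` gives `0x10104015` and `tcm_region_value(0x10108000u, TCM_SIZE_32K)` gives `0x10108019`, which is handy to compare against what you read back from the registers after the writes.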
TCM is only the same speed as cache *if* the SRAM provided in the ASIC is single-cycle, zero-wait-state memory. If the design uses the bulk SRAM to provide the TCM, there is no reason why the TCM should suddenly be faster than that SRAM: the physical RAM hasn't changed. If the SRAM is (for example) three-cycle access, I'd expect the TCM to be three-cycle access as well. Could this mean that the TCM is slower than the cache? Yes, certainly.
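To put rough numbers on that, here is a back-of-the-envelope sketch. It is only a cost model, not a measurement: the function name `access_cycles` is hypothetical, the three-cycle figure is the example from the text above, and the one-cycle figure assumes an ideal cache hit.

```c
#include <stdint.h>

/* Hypothetical cost model: total cycles spent on `accesses` memory accesses
 * when every access costs `cycles_per_access` cycles. */
uint64_t access_cycles(uint64_t accesses, uint32_t cycles_per_access)
{
    return accesses * (uint64_t)cycles_per_access;
}
```

Under those assumptions, 1000 accesses cost `access_cycles(1000, 1)` = 1000 cycles from a single-cycle memory and `access_cycles(1000, 3)` = 3000 cycles from a three-cycle SRAM mapped as TCM. What the TCM gives you in that case is predictable timing, not speed.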