Hi,
I am trying to evaluate and compare the performance of the Cortex-M4 and Cortex-M7 by running the CMSIS function arm_dot_q15. The cycle counts I obtained are as follows:
CM4 ======> 934
CM7 ======> 934
My understanding is that the CM7 should perform better than the CM4, yet both report the same cycle count.
Could you please explain why?
The memory system is not simulated. Thus, any extra cycles required for memory accesses are counted using the following code:
// variable definitions
uint32_t clock_cycles_counter;
volatile uint32_t *DWT_CYCCNT  = (volatile uint32_t *)0xE0001004; // address of the cycle count register
volatile uint32_t *DWT_CONTROL = (volatile uint32_t *)0xE0001000; // address of the DWT control register
volatile uint32_t *SCB_DEMCR   = (volatile uint32_t *)0xE000EDFC; // address of the debug exception and monitor control register

// configure and start the clock cycles counter
clock_cycles_counter = 0;
*SCB_DEMCR |= 0x01000000;  // set TRCENA to enable the DWT unit
*DWT_CYCCNT = 0;           // reset the cycle counter
*DWT_CONTROL |= 1;         // set CYCCNTENA to start counting

algorithm();

// stop and get the counter value
*DWT_CONTROL &= ~1u;
clock_cycles_counter = *DWT_CYCCNT;

// print the counter value
printf("%lu\n\r", (unsigned long)clock_cycles_counter);
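For comparison, here is a minimal sketch of an alternative way to derive the count: take two raw DWT_CYCCNT readings around algorithm() without stopping the counter, then subtract them. The helper name cycles_elapsed and its overhead parameter (the cost of an empty measurement) are hypothetical and not part of CMSIS.

```c
#include <stdint.h>

/* Hypothetical helper: cycles elapsed between two DWT_CYCCNT readings.
 * Unsigned 32-bit subtraction gives the right result even if the counter
 * wrapped around once between the two readings.
 * 'overhead' is the count measured for an empty section (assumed to be
 * determined separately); it is subtracted and the result clamped at 0. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end, uint32_t overhead)
{
    uint32_t raw = end - start;              /* modulo 2^32 arithmetic */
    return (raw > overhead) ? (raw - overhead) : 0u;
}
```

On the target this would be used as: start = *DWT_CYCCNT; algorithm(); end = *DWT_CYCCNT; and then cycles_elapsed(start, end, overhead) gives the measured count.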
Could someone please advise on the above query?