I want to know the execution time of every instruction shown in the disassembly window, for an LPC1317 Cortex-M3 MCU.
For example MOV and LDRD. The stopwatch in Keil doesn't seem to be correct.
Before jumping to conclusions, measure the execution time of 100 NOPs in a row. Averaging over a long run should reduce the influence of instruction cache and other effects, as well as the overhead of the measurement itself.
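A minimal sketch of the arithmetic behind that measurement, assuming you read a free-running 32-bit cycle counter before and after the block of NOPs. The function names and the idea of subtracting a separately measured overhead (the cost of the two counter reads with no NOPs in between) are illustrative assumptions, not something from Keil or the LPC1317 documentation.

```c
#include <stdint.h>

/* Elapsed cycles between two reads of a free-running 32-bit counter.
   Unsigned subtraction stays correct across a single wraparound. */
static inline uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Average cost per instruction in thousandths of a cycle.
   measured: cycles for the whole block (e.g. 100 NOPs)
   overhead: cycles measured for an empty block (hypothetical calibration run)
   n:        number of instructions in the block
   Returns 0 if the inputs make no sense. */
static inline uint32_t milli_cycles_per_inst(uint32_t measured,
                                             uint32_t overhead,
                                             uint32_t n)
{
    if (n == 0u || measured < overhead) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)(measured - overhead) * 1000u) / n);
}
```

If the result for NOPs is noticeably above 1000 (one cycle each), flash wait states or fetch stalls are probably part of what you are measuring.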
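If the core implements the DWT unit (the cycle counter belongs to the DWT, not the ETM), you should be able to enable and use DWT_CYCCNT to benchmark code. Trace has to be enabled first by setting TRCENA in DEMCR.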
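Enabling the cycle counter could look roughly like this. It is a sketch, assuming the standard ARMv7-M addresses (DEMCR at 0xE000EDFC, DWT_CTRL at 0xE0001000, DWT_CYCCNT at 0xE0001004) and that your part actually implements CYCCNT. The struct and function names are made up for the example; with CMSIS headers you would normally use `CoreDebug->DEMCR`, `DWT->CTRL` and `DWT->CYCCNT` instead.

```c
#include <stdint.h>

#define DEMCR_TRCENA       (1u << 24)  /* enables DWT/ITM blocks */
#define DWT_CTRL_CYCCNTENA (1u << 0)   /* starts the cycle counter */

/* Register pointers are passed in so the logic can also be exercised off-target. */
typedef struct {
    volatile uint32_t *demcr;
    volatile uint32_t *ctrl;
    volatile uint32_t *cyccnt;
} cyc_regs_t;

/* Hypothetical helper: register set at the architectural Cortex-M3 addresses. */
static inline cyc_regs_t cyccnt_regs_cm3(void)
{
    cyc_regs_t r = {
        (volatile uint32_t *)(uintptr_t)0xE000EDFCu,  /* DEMCR      */
        (volatile uint32_t *)(uintptr_t)0xE0001000u,  /* DWT_CTRL   */
        (volatile uint32_t *)(uintptr_t)0xE0001004u   /* DWT_CYCCNT */
    };
    return r;
}

/* Turn on trace, clear the counter, then start counting. */
static inline void cyccnt_enable(const cyc_regs_t *r)
{
    *r->demcr |= DEMCR_TRCENA;
    *r->cyccnt = 0u;
    *r->ctrl  |= DWT_CTRL_CYCCNTENA;
}

/* Read the current cycle count. */
static inline uint32_t cyccnt_read(const cyc_regs_t *r)
{
    return *r->cyccnt;
}
```

On the target you would call `cyccnt_enable()` once with the result of `cyccnt_regs_cm3()`, then read the counter immediately before and after the instructions under test and take the difference.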
Be conscious of APB/AHB bus speeds, flash wait states, line buffering, DMA contention, etc. Test code under real-world conditions using data sets that reflect expected utilization.