Why does LDR take two cycles to execute?

Hello everyone,

I am currently working on a Cortex-M0 microcontroller (LPC1114). I have looked through all the instruction descriptions I could find, but none of them explains why some instructions take two cycles to execute.

For example, ANDS and MOVS take only one cycle to execute. Why do LDR and STR need two cycles?

  • Hi,

    In Cortex-M3 and Cortex-M4, a single-load LDR takes two cycles. This is documented in

    http://infocenter.arm.com/help/topic/com.arm.doc.100166_0001_00_en/ric1417175925887.html

    The Cortex-M3 and Cortex-M4 have a three-stage pipeline with a simple fetch-decode-execute arrangement.

    The LDR's address cycle is the first cycle of execute, and the read data is available in the next cycle, hence a single LDR takes two cycles. For load/store multiple instructions or back-to-back loads, the processor can detect that the next operation is also a data memory access and generate its address while the pipeline is waiting for the first data. Therefore, a multiple load that reads N words takes N+1 cycles.
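
The N+1 rule above can be written as a tiny cycle model. This is only a sketch: the function names are made up for illustration, and it assumes zero-wait-state memory and no other pipeline stalls, which real code may not satisfy.

```c
#include <stdint.h>

/* Pipelined loads (LDM, or back-to-back LDRs): one address cycle up front,
 * then one data cycle per word, because each following address phase
 * overlaps the previous data phase.
 * Assumption: zero wait states and no other stalls. */
uint32_t pipelined_load_cycles(uint32_t n_words)
{
    if (n_words == 0u) {
        return 0u;              /* nothing to load */
    }
    return n_words + 1u;        /* N data cycles + 1 address cycle */
}

/* The same loads issued as isolated single LDRs with no overlap:
 * each one pays its own address cycle and data cycle. */
uint32_t isolated_load_cycles(uint32_t n_words)
{
    return 2u * n_words;
}
```

For example, under these assumptions four words cost 5 cycles when pipelined and 8 cycles as four isolated LDRs.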

    For stores, the address is also output in the execute stage, but the processor does not need to wait for the write to complete in the next cycle because there is a write buffer at the bus interface. Hence stores take only one cycle.

    The address generation unit uses a combinatorial path to compute the address (Rn+Rm) or (Rn+offset) and outputs it to the bus immediately without registering it. As a result it doesn't take an extra cycle, but it means the timing constraint on the bus interface is tight.

    Please note that if you measure timing with the DWT cycle counter while single-stepping, the enabling and disabling of that counter is not guaranteed to match the execution cycles at halting and unhalting.
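
One way to avoid the halting effect is to let the code run freely, read the counter before and after the instructions of interest, and subtract the cost of the reads themselves. The helper below is a sketch of that arithmetic only. The name `elapsed_cycles` and the `overhead` parameter are hypothetical, and it assumes the two readings come from a free-running 32-bit counter (on Cortex-M3/M4 with CMSIS headers that would typically be `DWT->CYCCNT`).

```c
#include <stdint.h>

/* Cycles spent between two readings of a free-running 32-bit counter.
 * `overhead` is the measured cost of an empty start/end pair, which the
 * caller is assumed to have calibrated beforehand (hypothetical parameter). */
uint32_t elapsed_cycles(uint32_t start, uint32_t end, uint32_t overhead)
{
    /* Unsigned subtraction wraps modulo 2^32, so a single counter
     * rollover between the two readings is still handled correctly. */
    uint32_t raw = end - start;

    /* Clamp at zero in case the calibration was slightly too large. */
    return (raw > overhead) ? (raw - overhead) : 0u;
}
```

The result is only meaningful if nothing else, such as an interrupt, runs between the two readings.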

    regards,

    Joseph
