Instruction timings - ARM Cortex-M3

I am using the following assembly sequences on an ARM Cortex-M3: one reads a memory-mapped I/O register and saves each value to RAM, the other reads the same I/O register into multiple registers. I want to know exactly how many CPU cycles each would take to complete. In other words, how fast am I reading the register?

1) Read and save to memory: can LDR-STR-LDR-STR be tightly pipelined (with the address phase of one instruction overlapping the data phase of the previous instruction), in which case the following would take only 9 cycles?

     486:   781a      ldrb        r2, [r3, #0]
     488:   7002      strb        r2, [r0, #0]
     48a:   781a      ldrb        r2, [r3, #0]
     48c:   7042      strb        r2, [r0, #1]
     48e:   781a      ldrb        r2, [r3, #0]
     490:   7082      strb        r2, [r0, #2]
     492:   781a      ldrb        r2, [r3, #0]
     494:   70c2      strb        r2, [r0, #3]

2) Read to multiple registers: I am assuming these instructions take 5 cycles.

     486:   781a      ldrb        r2, [r3, #0]
     488:   781c      ldrb        r4, [r3, #0]
     48a:   781d      ldrb        r5, [r3, #0]
     48c:   781e      ldrb        r6, [r3, #0]

I appreciate any insight you can provide.

Thanks,

  • This may depend on more than one thing. I think jyiu might be able to give you a more complete answer than I can provide.

    I think the I/O timing may depend on the vendor's implementation.
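Since the bus timing is implementation-specific, measuring on the actual device is probably the most reliable way to settle it. Here is a minimal sketch in C, assuming your part implements the DWT cycle counter (it is optional on Cortex-M3). The register addresses are the architecturally defined ARMv7-M ones; `cyccnt_enable`, `cycles_between` and `time_four_reads` are hypothetical helper names, and the compiler is not guaranteed to emit exactly the ldrb sequence from the question.

```c
#include <stdint.h>

/* Architecturally defined ARMv7-M debug registers. The cycle counter is an
   optional feature, so check that your device actually implements it. */
#define DEMCR       (*(volatile uint32_t *)(uintptr_t)0xE000EDFCu)
#define DWT_CTRL    (*(volatile uint32_t *)(uintptr_t)0xE0001000u)
#define DWT_CYCCNT  (*(volatile uint32_t *)(uintptr_t)0xE0001004u)

#define DEMCR_TRCENA        (1u << 24)  /* enables the DWT unit */
#define DWT_CTRL_CYCCNTENA  (1u << 0)   /* starts the cycle counter */

/* Hypothetical helper: switch the cycle counter on and reset it. */
static inline void cyccnt_enable(void)
{
    DEMCR |= DEMCR_TRCENA;
    DWT_CYCCNT = 0;
    DWT_CTRL |= DWT_CTRL_CYCCNTENA;
}

/* Elapsed cycles between two counter samples. Unsigned subtraction gives
   the right answer even if the 32-bit counter wrapped once in between. */
static inline uint32_t cycles_between(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical helper: time four byte reads from a memory-mapped port.
   Only meaningful on the target, after cyccnt_enable(). */
static inline uint32_t time_four_reads(const volatile uint8_t *port,
                                       uint8_t out[4])
{
    uint32_t t0 = DWT_CYCCNT;
    out[0] = *port;
    out[1] = *port;
    out[2] = *port;
    out[3] = *port;
    uint32_t t1 = DWT_CYCCNT;
    return cycles_between(t0, t1);
}
```

The result includes the cost of the two counter reads themselves, so timing an empty region first and subtracting that should give a better estimate.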

    As far as I remember, instruction alignment is important.

    If you use any 32-bit load or store instructions (e.g. ldrb.w or strb.w instead of ldrb.n or strb.n), then make sure those instructions are aligned on a 4-byte boundary.

    If a 32-bit instruction is not aligned on a 4-byte boundary, I think you will not get the expected timing, because fetching it can take an extra cycle.

    Thus ...

    If you're using only low registers (r0-r7), then you can use 16-bit instructions. If you're using a high register (r8-r15), then you need to use ldrb.w / strb.w instead.

    The assembler automatically selects the necessary instruction size, but you may explicitly add the .w or .n suffix.
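To illustrate the size selection, here is a small sketch in C of the rule as I understand it for the immediate-offset byte loads and stores used in the question. The function names are hypothetical. The 16-bit form needs both registers in r0-r7 and an offset of 0 to 31, and a Thumb halfword starts a 32-bit instruction when its top five bits are 0b11101, 0b11110 or 0b11111.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `ldrb/strb Rt, [Rn, #imm]` can use the 16-bit encoding:
   both registers must be low (r0-r7) and the offset must fit in 5 bits. */
static inline bool byte_ldst_fits_narrow(unsigned rt, unsigned rn, unsigned imm)
{
    return rt <= 7u && rn <= 7u && imm <= 31u;
}

/* True if this halfword is the first half of a 32-bit Thumb instruction,
   i.e. bits [15:11] are 0b11101, 0b11110 or 0b11111. */
static inline bool thumb_is_32bit(uint16_t first_halfword)
{
    return (first_halfword >> 11) >= 0x1Du;
}
```

For example, the 0x781a from the listing in the question (ldrb r2, [r3, #0]) is a 16-bit instruction, while replacing r2 with r8 would force the 32-bit form.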

    To make sure your instructions are aligned on a 4-byte boundary, you can use ...

         .align     2

    ... when using the GNU Assembler. Note that 2 does not mean 2-byte alignment; it means (1 << 2) = 4-byte alignment.
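The alignment arithmetic can be sketched in C as follows (the helper names are hypothetical). It treats the `.align` argument as a power of two, as described above, and computes how many padding bytes would be needed at a given address.

```c
#include <stdbool.h>
#include <stdint.h>

/* `.align n` (GNU as, ARM targets) requests 2^n-byte alignment. */
static inline uint32_t align_bytes(unsigned n)
{
    return 1u << n;
}

/* True if `addr` already sits on a 2^n-byte boundary. */
static inline bool is_aligned(uint32_t addr, unsigned n)
{
    return (addr & (align_bytes(n) - 1u)) == 0u;
}

/* Padding bytes needed to move `addr` up to the next 2^n-byte boundary. */
static inline uint32_t padding_needed(uint32_t addr, unsigned n)
{
    return (0u - addr) & (align_bytes(n) - 1u);
}
```

Applied to the listing in the question, the first ldrb at address 0x486 is 2-byte aligned but not 4-byte aligned, so `.align 2` in front of it would add 2 bytes of padding; the strb at 0x488 is already on a 4-byte boundary.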

    So if any of your load or store instructions uses a high register, I recommend using ldrb.w / strb.w for all of them. That way every instruction in the sequence stays on a 4-byte boundary. Aligning the instructions individually instead will insert a NOP, which usually costs you 1 clock cycle, so your timing will be affected.

    Other things you should know: Some devices have limits on their GPIO speeds. Some devices have high-speed GPIO pins that follow the CPU speed, so you have nothing to worry about. On some devices, such as STM32, the GPIO pin speed can be configured (you can choose between low, medium, high and very high speeds).

    Also, if you're using bit-bang, make sure no interrupts can disturb you while you're reading/writing - but you probably know that already.
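One way to keep interrupts out of the read sequence is to mask them around it. This is a minimal sketch in C using PRIMASK, with hypothetical helper names; the host stubs exist only so the code also builds off-target. Note that PRIMASK does not block NMI or HardFault, and that the loop adds branch overhead compared with the unrolled sequence in the question.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* Save the current PRIMASK value, then mask configurable interrupts. */
static inline uint32_t irq_save_disable(void)
{
    uint32_t primask;
    __asm__ volatile ("mrs %0, primask\n\tcpsid i"
                      : "=r" (primask) : : "memory");
    return primask;
}

/* Restore the PRIMASK value returned by irq_save_disable(). */
static inline void irq_restore(uint32_t primask)
{
    __asm__ volatile ("msr primask, %0" : : "r" (primask) : "memory");
}
#else
/* Off-target stubs (assumption: only used for building on a host). */
static inline uint32_t irq_save_disable(void) { return 0u; }
static inline void irq_restore(uint32_t primask) { (void)primask; }
#endif

/* Hypothetical helper: read `count` bytes from a memory-mapped port into
   `dst` without being interrupted by maskable interrupts. */
static inline void read_port_burst(const volatile uint8_t *port,
                                   uint8_t *dst, size_t count)
{
    uint32_t saved = irq_save_disable();
    for (size_t i = 0; i < count; i++) {
        dst[i] = *port;
    }
    irq_restore(saved);
}
```

Saving and restoring the previous state, rather than unconditionally re-enabling, keeps the helper safe to call from code that already runs with interrupts masked.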

    (If you have a dual-core configuration, then one core might also affect the timing of the other core; I believe this is due to contention on memory read/write accesses. But as you're using a Cortex-M3, you're likely not using a dual-core configuration.)
