How long are the Cortex-M7 pipeline stages?

Hello experts,

ARM recently updated its Cortex-M7 information.

I think the biggest change is that the pipeline details have been disclosed.

The new information says that the integer pipeline has 4 stages and the floating-point pipeline has 5 stages.

However, the earlier information said that the pipeline had 6 stages.

Where does this difference come from?

I would like a concrete explanation of each stage.

What is the first stage, the second stage, the third stage, the fourth stage, and so on?

[Attached images: CM7_PIPE1.jpg, CM7_PIPLE2.jpg]

Best regards,

Yasuhiko Koumoto.

  • This is the first time I have looked at the Cortex-M7, so thanks for sharing this info. My first observation is that this pipeline diagram looks more like a CISC pipeline (instructions with different latencies) than a pure RISC pipeline, hence the confusion. The shortest ALU operation takes 4 cycles, which explains the 4-stage pipe. The FPU takes an additional cycle to access the FP register file (FP-RF) and therefore uses 5 stages.

    I would just ignore the old diagram, as it has incorrect info (it might have been created by the marketing department without consulting the engineering team). For example, the write/store in the ALU pipe is not a separate stage, because writes are executed at the end of the execute stage. The same goes for the prefetch: it is only activated when predicting branches, so it is not really a separate pipeline stage. In normal program execution, instructions are fetched in order.

    PS: I doubt that ARM will share with you the architectural details for each pipeline stage.

    HBL
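
To make the stage counts in the reply above concrete, here is a rough sketch of how pipeline depth relates to cycle count in an idealized pipeline. The depths (4 for integer, 5 for floating point) are the figures quoted in this thread, not values taken from an ARM specification, and the function name `ideal_cycles` is hypothetical. The model assumes single issue with no stalls, branches, or memory waits, so it ignores the Cortex-M7's dual-issue capability and should not be read as a timing model of the real core.

```python
# Stage counts as quoted in this thread (assumption: not verified
# against ARM documentation).
INTEGER_STAGES = 4
FLOAT_STAGES = 5


def ideal_cycles(depth: int, n_instructions: int) -> int:
    """Cycles for n back-to-back instructions to leave an ideal pipeline.

    Assumes one instruction enters per cycle and nothing ever stalls:
    the first instruction needs `depth` cycles, and each later one
    completes one cycle after its predecessor.
    """
    if n_instructions <= 0:
        return 0
    return depth + n_instructions - 1


if __name__ == "__main__":
    for name, depth in (("integer", INTEGER_STAGES),
                        ("floating point", FLOAT_STAGES)):
        print(f"{name}: 1 instruction = {ideal_cycles(depth, 1)} cycles, "
              f"100 instructions = {ideal_cycles(depth, 100)} cycles")
```

Under these assumptions the extra floating-point stage adds one cycle of latency to a single instruction but only one cycle in total to a long run of instructions, which is why the depth matters mostly for latency and for the cost of refilling the pipeline rather than for steady-state throughput.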
