Hi experts and ARM designers,
I found the "ARM® Cortex®-M7 Processor Technical Reference Manual Revision r0p2" on the ARM site, and reading it raised a question. "Figure 1-3 Cortex-M7 functional diagram" shows all TCM accesses going through the TCU. Does this mean the CPU cannot access ITCM and DTCM simultaneously? If so, placing data in DTCM would bring no performance benefit, because instruction and data accesses could not proceed in parallel as they do in a Harvard architecture. Is my understanding correct?
Best regards,
Yasuhiko Koumoto.
I'm not 100% sure about the Cortex-M family, but on older ARM cores with TCM it is possible to access both concurrently (otherwise TCM would be a little hard to justify). However, there are a couple of corner cases.
The main corner case is where you have literal pool data stored in the ITCM, which means you have some data accesses into the ITCM; this can cost an extra cycle (you can't get literal pool data and an instruction at the same time). This is less of an issue with the newer ISA versions, as you have fewer literal pool accesses (you can use wide constant move instructions instead), and I believe some compilers allow you to disable literal pools entirely in favour of using only wide constant moves.
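As a rough illustration of why wide constant moves avoid the data access, here is a small C sketch that models what a MOVW/MOVT pair does to a register. The function names (`movw`, `movt`, `load_constant_wide`) are made up for this example and are not compiler intrinsics; whether a given compiler actually emits MOVW/MOVT or a literal pool load depends on the toolchain and its options.

```c
#include <stdint.h>

/* Models MOVW: write the low halfword and clear the high halfword. */
static uint32_t movw(uint16_t imm16)
{
    return (uint32_t)imm16;
}

/* Models MOVT: replace the high halfword and keep the low halfword. */
static uint32_t movt(uint32_t reg, uint16_t imm16)
{
    return (reg & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}

/* Hypothetical helper: build a 32-bit constant from two immediates.
 * Both halves come from the instruction stream, so no separate data
 * read (and no literal pool access into the ITCM) is needed. */
static uint32_t load_constant_wide(uint32_t value)
{
    uint32_t reg = movw((uint16_t)(value & 0xFFFFu));
    return movt(reg, (uint16_t)(value >> 16));
}
```

Calling `load_constant_wide(0x20001234u)` returns `0x20001234`. With a literal pool, the same constant would instead be stored as a data word next to the code and fetched with a load, which is the data access into the ITCM mentioned above.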
HTH,
Pete
Hi Peter,
What I would like to know about is the Cortex-M7 TCU. My question is simple: can an access from the PFU to ITCM and an access from the LSU to D0/1TCM occur concurrently? There is no explanation of the TCU in the Cortex-M7 Technical Reference Manual other than its block diagram. The following picture is the Cortex-R5 TCM interface (from the Technical Reference Manual). From the PFU and LSU to ATCM and BTCM, the bus matrix seems to be multi-layer, so I think concurrent accesses from the PFU to ATCM and from the LSU to BTCM should be possible. Is it the same on the Cortex-M7? Do you know the answer?
Hi Peter and all,
I have found the following statements in the Cortex-M7 Technical Reference Manual.
5.7.3 TCM arbitration
Each TCM interface receives requests from the LSU, PFU, and AHBS. In most cases, the LSU has the highest priority, followed by the PFU, with the AHBS interface having lowest priority. When a higher-priority device is accessing a TCM interface, an access from a lower-priority device stalls.
From these statements, concurrent access to ITCM and DTCM seems to be impossible. Can anyone confirm this?
Hi all.
I misunderstood the Cortex-R specifications. The Cortex-R5 has the same arbitration scheme as the Cortex-M7. Therefore Cortex-R and Cortex-M7 cannot make concurrent accesses to ITCM and DTCM. By contrast, I think the legacy ARM cores (such as ARM9 or ARM11) could make concurrent accesses to ITCM and DTCM, because their paths from the PFU to ITCM and from the LSU to DTCM are separate.
Hi Yasuhiko,
It depends on what you mean by simultaneous accesses. The following accesses can happen in parallel:
- PFU access to I-TCM
- LSU access to D0/1-TCM
In the TRM statement you quoted, please note the word "each":
Each TCM interface receives requests from the LSU, PFU, and AHBS.
So the I-TCM has an arbiter that deals with multiple accesses, and the same goes for the D0/1-TCM, which has a separate arbiter.
What you cannot do is have
- the PFU (instruction fetch) and the LSU (data access) both access the I-TCM in the same cycle, or
- the PFU and the LSU both access the D0/1-TCM in the same cycle
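
To make the two cases concrete, here is a minimal C sketch of per-interface arbitration. It is a toy model only, not a description of the actual hardware: the type and function names are invented for this example, D0TCM and D1TCM are lumped together as a single "DTCM" interface, and the fixed LSU > PFU > AHBS order ignores whatever exceptions the TRM's "in most cases" covers.

```c
#include <stdbool.h>

/* Requesters named in TRM section 5.7.3. */
typedef enum { REQ_NONE, REQ_LSU, REQ_PFU, REQ_AHBS } requester_t;

/* Which requesters want one TCM interface in a given cycle. */
typedef struct {
    bool lsu;
    bool pfu;
    bool ahbs;
} tcm_requests_t;

/* Grants issued in one cycle, one per interface. */
typedef struct {
    requester_t itcm_grant;
    requester_t dtcm_grant;
} cycle_grants_t;

/* Fixed priority for a single TCM interface: LSU, then PFU, then AHBS.
 * Assumption: the "in most cases" exceptions are not modelled. */
requester_t tcm_arbitrate(tcm_requests_t r)
{
    if (r.lsu)  return REQ_LSU;
    if (r.pfu)  return REQ_PFU;
    if (r.ahbs) return REQ_AHBS;
    return REQ_NONE;
}

/* Each interface has its own arbiter, so the two decisions are
 * independent: a grant on the ITCM never blocks a grant on the DTCM. */
cycle_grants_t tcm_cycle(tcm_requests_t itcm, tcm_requests_t dtcm)
{
    cycle_grants_t grants;
    grants.itcm_grant = tcm_arbitrate(itcm);
    grants.dtcm_grant = tcm_arbitrate(dtcm);
    return grants;
}
```

In this model, a cycle with only the PFU requesting the ITCM and only the LSU requesting the DTCM grants both, so the fetch and the data access proceed in parallel. If the PFU and the LSU both request the ITCM in the same cycle, the ITCM arbiter grants the LSU and the fetch waits.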
regards,
Joseph
Hi Joseph,
Thank you for the clarification. I am relieved. Cortex-M7 is great.
Best regards,
Yasuhiko Koumoto.