Hello,
I am preparing for the AAE. ARM is a modified Harvard architecture. What is this about? I searched on the internet and got various answers.
Can anyone tell me exactly what is this modified Harvard architecture in ARM's point of view? Please point to any relevant literature.
Regards,
Deepu
Hi Deepu,
Your source of information described the ARM processors as a modified Harvard architecture because recent high-performance versions of the ARM architecture provide separate instruction and data buses to the core, but these buses access a unified memory space.
That said, if you read the technical documents relating to these processors, the facts can be confusing. The excerpts below designate some cores as Harvard architecture while others are von Neumann:
1.1.2 Memory access
The ARM7TDMI core has a Von Neumann architecture, with a single 32-bit data bus carrying both instructions and data.
1.1 About the ARM920T
The ARM9TDMI processor core is a Harvard architecture device implemented using a five-stage pipeline consisting of Fetch, Decode, Execute, Memory, and Write stages.
The ARM920T processor is a Harvard cache architecture processor that is targeted at multiprogrammer applications where full memory management, high performance, and low power are all-important. The separate instruction and data caches in this design are 16KB each in size, with an 8-word line length.
The ARM920T interface to the rest of the system is over unified address and data buses.
1.1 About the Cortex-M0 processor and core peripherals
The Cortex-M0 processor is built on a high-performance processor core, with a 3-stage pipeline von Neumann architecture, making it ideal for demanding embedded applications.
1.1 About the Cortex-M4 processor and core peripherals
The Cortex-M4 processor is built on a high-performance processor core, with a 3-stage pipeline Harvard architecture, making it ideal for demanding embedded applications.
1.1 About the Cortex-M7 processor and core peripherals
The Cortex-M7 processor is built on a high-performance processor core, with a 6-stage pipeline Harvard architecture, making it ideal for demanding embedded applications.
Regardless of how a particular core is categorized, you will find that the processors access a unified memory space. Read the last line for the ARM920T above. Open the respective (ARM Architecture) Reference Manual and search for any of these keywords:
address space, memory model, memory space
where you will be able to read
unified or single, flat address space
Based on my understanding, the ARM architecture was originally a von Neumann design, and even to this day the processor cores retain this inherent attribute. Advances in technology allowed the designers to employ separate paths for instructions and data from a single memory space to the core. As you can see from the list, Harvard architecture features are implemented only at the pipeline and cache and this modification is generally transparent to the programmer. Note, though, that processors other than ARM, such as the 80x86 and the POWER Architecture, implemented such a modification before ARM did.
The ARM architecture can generally be classified under this particular variant of modified Harvard architecture, but cores designed for lower cost might still use a von Neumann architecture. You can observe this from the list: the older ARM7TDMI and the relatively new but low-cost Cortex-M0 are the ones categorized as von Neumann. Processors that implement a cache can be configured with either separate (Harvard) or unified caches.
It's very unlikely that I have told you "exactly what is this modified Harvard architecture in ARM's point of view" so please try to get more dependable information from these books:
ARM Architecture Reference Manual, edited by David Seal
The Definitive Guide to the ARM Cortex-M books, by Joseph Yiu
Computer Organization and Design, by David Patterson and John Hennessy
Good Luck on your AAE.
Goodwin
goodwin wrote:
As you can see from the list, Harvard architecture features are implemented only at the pipeline and cache and this modification is generally transparent to the programmer.
As an addition to that, while it is still technically transparent to the programmer, it is possible that the type of access actually makes it onto the unified bus -- AHB HPROT[0] and AXI AxPROT[2] signals will usually tell you, if it is at all differentiated once it gets past the core, whether it originated as an instruction or data access. Obviously this makes far more sense on the read path (towards the core) where either the data or instruction side may be reading what it intends to be instructions or data -- however a unified cache at any level may obfuscate this. A write transaction over the bus would never be marked as an instruction access even if the data contains instruction opcodes, because explicit loads and stores are data accesses, and an instruction cache has no business writing to memory.
Where it may not be transparent to the programmer is if particular peripherals within the SoC design actually respected PROT[2] and denied or somehow differently arbitrated an instruction access. For instance, it makes no sense at all for instructions to be fetched from a DMA controller peripheral register set, but it does from a (RAM or ROM) memory region. One may mark particular areas of Flash peripherals as instruction or data only access, depending on the peripheral design. In that sense, you would see actual feedback as to whether the access was valid or not, and therefore some semblance of a Harvard architecture.
The real distinction is that a modified Harvard architecture still has separated instruction and data paths, but doesn't split the memory space. In a true Harvard architecture, a data access could never happen to the instruction address space, and an instruction fetch could never happen to the data address space -- in theory they'd be completely demarcated. This is really inconvenient for self-modifying code, however, and makes loading applications on demand really complicated, so pretty much all modern computing steers well clear of it.
For the Cortex-R and Cortex-M cores that have TCMs, the address space for the instruction and data TCMs is exactly the same. Although some cores allow overlapping TCM regions, it is usual for data accesses to be prioritized above instruction accesses, meaning you may see a different view from an instruction-fetch perspective than you would from a load/store or a bus-generated access to the same address.