Modified harvard architecture

Hello,

I am preparing for the AAE. ARM is described as a modified Harvard architecture. What does this mean? I searched the internet and found differing answers.

Can anyone tell me exactly what modified Harvard architecture means from ARM's point of view? Please point me to any relevant literature.

Regards,

Deepu

Parents
  • Hi Deepu,

    Your source of information describes ARM processors as a modified Harvard architecture because recent high-performance implementations of the ARM architecture provide separate instruction and data buses to the core, but these buses access a unified memory space.

    That said, the technical documents for these processors can be confusing. The excerpts below designate some cores as Harvard architecture and others as von Neumann:

    1.1.2 Memory access

    The ARM7TDMI core has a Von Neumann architecture, with a single 32-bit data bus carrying both instructions and data.

    1.1 About the ARM920T

    The ARM9TDMI processor core is a Harvard architecture device implemented using a five-stage pipeline consisting of Fetch, Decode, Execute, Memory, and Write stages.

    The ARM920T processor is a Harvard cache architecture processor that is targeted at multiprogrammer applications where full memory management, high performance, and low power are all-important. The separate instruction and data caches in this design are 16KB each in size, with an 8-word line length.

    The ARM920T interface to the rest of the system is over unified address and data buses.

    1.1 About the Cortex-M0 processor and core peripherals

    The Cortex-M0 processor is built on a high-performance processor core, with a 3-stage pipeline von Neumann architecture, making it ideal for demanding embedded applications.

    1.1 About the Cortex-M4 processor and core peripherals

    The Cortex-M4 processor is built on a high-performance processor core, with a 3-stage pipeline Harvard architecture, making it ideal for demanding embedded applications.

    1.1 About the Cortex-M7 processor and core peripherals

    The Cortex-M7 processor is built on a high-performance processor core, with a 6-stage pipeline Harvard architecture, making it ideal for demanding embedded applications.

    Regardless of how a particular core is categorized, you will find that the processors access a unified memory space. Read the last line for the ARM920T above. Open the respective (ARM Architecture) Reference Manual and search for any of these keywords:

    address space, memory model, memory space

    and you will find the memory described as a

    unified or single, flat address space

    As far as I understand, the ARM architecture was originally a von Neumann architecture, and to this day the processor cores retain this inherent attribute. Advances in technology allowed the designers to provide separate paths for instructions and data from a single memory space to the core. As you can see from the excerpts, Harvard architecture features are implemented only at the pipeline and cache level, and this modification is generally transparent to the programmer. Note, though, that processors other than ARM, such as the 80x86 and POWER Architecture, implemented such a modification before ARM did.

    The ARM architecture can generally be classified under this particular variant of modified Harvard architecture, but cores designed for lower cost might still use the von Neumann architecture. You can see this in the excerpts: the older ARM7TDMI and the relatively new but low-cost Cortex-M0 are the ones categorized as von Neumann. Processors that implement a cache can be configured to have a Harvard or unified cache.
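
To make the "separate paths, single memory space" idea concrete, here is a minimal Python sketch. It is a toy model, not a description of any real ARM core, and the class `ToyCore` and its method names are hypothetical. It puts separate instruction and data caches in front of one flat memory, and it shows why the split is only "generally" transparent: after a store, an instruction fetch from the same address can return a stale value until the caches are synchronized.

```python
class ToyCore:
    """Toy model: one flat memory behind separate I- and D-caches (hypothetical)."""

    def __init__(self, size):
        self.memory = [0] * size   # single, unified address space
        self.icache = {}           # address -> value, filled by instruction fetches
        self.dcache = {}           # address -> value, write-back data cache

    def fetch(self, addr):
        """Instruction-side read: goes through the I-cache only."""
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def load(self, addr):
        """Data-side read: goes through the D-cache only."""
        if addr not in self.dcache:
            self.dcache[addr] = self.memory[addr]
        return self.dcache[addr]

    def store(self, addr, value):
        """Data-side write: held in the D-cache until it is cleaned."""
        self.dcache[addr] = value

    def sync(self, addr):
        """Clean the D-cache entry to memory and invalidate the I-cache entry."""
        if addr in self.dcache:
            self.memory[addr] = self.dcache[addr]
        self.icache.pop(addr, None)


core = ToyCore(16)
core.memory[4] = 0x1111
first = core.fetch(4)     # instruction side caches the old value
core.store(4, 0x2222)     # data side writes to the same address
stale = core.fetch(4)     # still the old value: the two paths are separate
core.sync(4)
fresh = core.fetch(4)     # same address, now the new value
```

Both paths use the same address for the same location, which is the unified memory space; only the route to it differs.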

    It's very unlikely that I have told you "exactly what is this modified Harvard architecture in ARM's point of view" so please try to get more dependable information from these books:

    • ARM Architecture Reference Manual, Second Edition

    Edited by David Seal

    • The Definitive Guide to ARM Cortex-M0 and Cortex-M0+ Processors, Second Edition

    by Joseph Yiu

    • The Definitive Guide to ARM Cortex-M3 and Cortex-M4 Processors, Third Edition

    by Joseph Yiu

    • Computer Organization and Design: The Hardware/Software Interface - ARM Edition, Fourth Edition

    by David Patterson and John Hennessy

    Good luck with your AAE.

    Goodwin

Children
  • There's a slight addendum to that: some embedded processors have TCM (tightly coupled memory) that is attached only to the instruction bus or only to the data bus. These don't allow instruction fetches from the data TCM but, to cut down on problems, I believe they usually do allow data accesses to the instruction TCM.

  • Hi Goodwin,

    Thanks a lot for your detailed explanation. It sheds more light on the subject for me. As part of the AAE I am now focusing on the Cortex-A series.

  • And to add to that, the new ARMv8-M looks like it will support proper Harvard with completely separate code and data. It has a few extra instructions that make it easier to avoid having a literal pool in the code, and it has execute-only memory so you can protect the code area from data accesses. So what I said about cutting down problems by allowing data access to instruction memory will no longer hold.

  • ARMv8-M will retain the modified Harvard architecture, since this architecture suffers from neither the bottleneck of a pure von Neumann architecture nor the constraints of a strict Harvard architecture. ARMv8-M will only add eXecute-Only-Memory (XOM) to protect the on-chip firmware from intrusive access.

    A related technique is familiar from processors classified under the other meaning of modified Harvard architecture. In those processors, modified Harvard architecture means having separate address spaces for instructions and data, while data can also be placed alongside instructions in program memory. Processors under this definition include the 8051, AVR, Z86, ADSP-21xx, etc. They provide code protection through lock bit(s) that the user sets once the program is finalized, which inhibit instructions that read from (internal) program memory.

    XOM will complement eXecute Never (XN), which, in contrast to XOM, prevents execution from a memory area that is supposed to contain pure data.

    In ARMv8-M, addresses still point to a unified memory space. Only section(s) of the memory space will be defined as XOM; the rest of the memory space will operate as it did in previous versions of the ARM architecture.
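
The contrast between XN and XOM can be sketched as two per-region attributes on one flat address space. The Python below is only an illustration of the description above, not of how ARMv8-M or any particular device implements these checks. The `Region` type, the `allowed` function, and all addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int
    end: int            # exclusive
    xn: bool = False    # eXecute Never: instruction fetches are refused
    xom: bool = False   # eXecute-Only-Memory: data reads are refused


def allowed(regions, addr, access):
    """Return True if `access` ('fetch' or 'read') is permitted at `addr`."""
    if access not in ("fetch", "read"):
        raise ValueError("access must be 'fetch' or 'read'")
    for region in regions:
        if region.start <= addr < region.end:
            if access == "fetch":
                return not region.xn
            return not region.xom
    return False  # address is not mapped in this toy model


# One unified address space; only the attributes differ per region.
# The address ranges are made up for illustration.
memory_map = [
    Region(0x0000, 0x4000, xom=True),   # protected firmware
    Region(0x4000, 0x8000),             # ordinary code plus literal data
    Region(0x8000, 0xC000, xn=True),    # pure data
]

firmware_fetch = allowed(memory_map, 0x1000, "fetch")   # True
firmware_read = allowed(memory_map, 0x1000, "read")     # False
data_fetch = allowed(memory_map, 0x9000, "fetch")       # False
```

In this model the middle region behaves as memory did before either attribute existed: both fetches and data reads succeed.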

  • Yes, I should have said "could support Harvard" rather than "will". There's always the problem of either loading the code memory from flash, which means there is a data path to the code memory, or executing directly from flash, in which case one would really want to be able to access data there as well. Either way there is a data path to the code memory, and access to it must be restricted. This, I guess, is where TrustZone comes in: it must be able to restrict changes to the XOM property. I guess one could use a different address range for data access, but that is messy. I look forward to seeing more details about the ARMv8-M additions and how they all fit together. I heard they let some hackers loose on it to test the security.

  • goodwin wrote:

    As you can see from the list, Harvard architecture features are implemented only at the pipeline and cache and this modification is generally transparent to the programmer.

    As an addition to that, while it is still technically transparent to the programmer, it is possible that the type of access actually makes it onto the unified bus -- AHB HPROT[0] and AXI AxPROT[2] signals will usually tell you, if it is at all differentiated once it gets past the core, whether it originated as an instruction or data access. Obviously this makes far more sense on the read path (towards the core) where either the data or instruction side may be reading what it intends to be instructions or data -- however a unified cache at any level may obfuscate this. A write transaction over the bus would never be marked as an instruction access even if the data contains instruction opcodes, because explicit loads and stores are data accesses, and an instruction cache has no business writing to memory.

    Where it may not be transparent to the programmer is if particular peripherals within the SoC design actually respect the protection signal (HPROT[0] or AxPROT[2]) and deny, or arbitrate differently, an instruction access. For instance, it makes no sense at all for instructions to be fetched from the register set of a DMA controller, but it does from a (RAM or ROM) memory region. Depending on the peripheral design, particular areas of a flash peripheral may be marked as instruction-only or data-only. In that case you would see actual feedback as to whether the access was valid, and therefore some semblance of a Harvard architecture.
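
As a rough illustration of the signals mentioned above, the following Python sketch decodes the instruction/data bit from a protection value and applies it in a made-up slave policy. The bit polarities follow the AMBA definitions (HPROT[0] is 0 for an opcode fetch, AxPROT[2] is 1 for an instruction access). The function names and the `peripheral_accepts` policy are hypothetical and do not model any real peripheral.

```python
def ahb_is_instruction(hprot):
    """AHB: HPROT[0] is 0 for an opcode fetch and 1 for a data access."""
    return (hprot & 0b0001) == 0


def axi_is_instruction(axprot):
    """AXI: AxPROT[2] is 1 for an instruction access and 0 for a data access."""
    return (axprot & 0b100) != 0


def peripheral_accepts(axprot, executable):
    """Hypothetical slave policy: refuse instruction accesses to
    targets that are not meant to hold code."""
    return executable or not axi_is_instruction(axprot)


# An instruction fetch aimed at a DMA controller's registers is refused,
# while the same fetch aimed at RAM is accepted.
dma_fetch_ok = peripheral_accepts(0b100, executable=False)
ram_fetch_ok = peripheral_accepts(0b100, executable=True)
dma_read_ok = peripheral_accepts(0b000, executable=False)
```

Note that the two buses use opposite polarities for this bit, which is easy to get wrong when bridging between them.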

    The real distinction is that a modified Harvard architecture still has separate instruction and data paths but doesn't split the memory space. In a true Harvard architecture, a data access could never happen to the instruction address space, and an instruction fetch could never happen to the data address space -- in theory they'd be completely demarcated. This is really inconvenient for self-modifying code, however, and makes loading applications on demand really complicated, so pretty much all modern computing steers well clear of it.
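
The loading problem can be shown with a small Python sketch, assuming two toy machines whose class and function names are hypothetical. A loader that copies a program image using ordinary data stores works on the modified Harvard model, but on the strict Harvard model the stores land in data memory and never reach the space that instructions are fetched from. Caches are left out to keep the model short.

```python
class StrictHarvard:
    """Two separate address spaces: address 0 exists once in each."""

    def __init__(self, size):
        self.program = [0] * size
        self.data = [0] * size

    def fetch(self, addr):
        return self.program[addr]

    def load(self, addr):
        return self.data[addr]

    def store(self, addr, value):
        self.data[addr] = value      # can never reach program memory


class ModifiedHarvard:
    """Separate fetch and load/store paths into one address space."""

    def __init__(self, size):
        self.memory = [0] * size

    def fetch(self, addr):
        return self.memory[addr]

    def load(self, addr):
        return self.memory[addr]

    def store(self, addr, value):
        self.memory[addr] = value


def load_program(machine, base, words):
    """Loader that copies code using ordinary data stores."""
    for offset, word in enumerate(words):
        machine.store(base + offset, word)


# Illustrative 32-bit words standing in for instructions.
image = [0xE1A00000, 0xE12FFF1E]

strict = StrictHarvard(8)
modified = ModifiedHarvard(8)
load_program(strict, 0, image)
load_program(modified, 0, image)

fetched_modified = [modified.fetch(i) for i in range(len(image))]
fetched_strict = [strict.fetch(i) for i in range(len(image))]
```

On the strict machine the image ends up readable with `load` but invisible to `fetch`, so a real design of that kind needs some other dedicated way to write program memory.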

    For the Cortex-R and Cortex-M cores that have TCMs, the address space for the instruction and data TCMs is exactly the same -- although some cores allow overlapping TCM regions, it is usual for data accesses to be prioritized above instruction accesses, meaning you see a different view from an instruction fetch than you would from a load/store or a bus-generated access to the same address.