
Cortex-M3 vs ARM7TDMI

Note: This was originally posted on 21st April 2011 at http://forums.arm.com

Trying to turn what was a simple hobby project based on a Cortex-M3, with some serious enhancements/additions, into a semi-commercial product (I might sell it at cost, or even open-source the design/software). However, I am at a crossroads of sorts and need to decide whether to stick with the Cortex-M3 or move to the ARM7TDMI. The two seem to be comparably priced, although from a max-DMIPS standpoint the ARM7TDMI parts are much lower than the latest Cortex-M3s. My question is: how much of a factor is DMIPS in my circumstance?

Given some very rough (and not very scientific) estimates, I'd require around 120 DMIPS of performance, which I might be able to squeeze into, say, 80 DMIPS if I go with commercial libraries and invest significantly (time/effort) in optimizing. Given the nature of the project (the desire to sell at cost or open-source it), that will be difficult, if not impossible. Where I do hit the limits of the Cortex-M3 is the onboard SRAM. This is where I am unsure: should I try a switch to the ARM7TDMI? Would a Cortex-M3 with external-memory support help address my issue?

~Jay
  • Note: This was originally posted on 23rd April 2011 at http://forums.arm.com


    Joseph, the software-based nested interrupt management is something I am familiar with from 8-bit MCU programming, but getting it right can be a lot of work.


    Well, Joseph is the guru here and I have respect for him, but I still disagree about the interrupts. I think the ARM7 has the better interrupt system.

    Interrupt latency is an issue for simple 8-bit controllers. Here we are talking about controllers with dozens of peripherals, and since there are so many, the core cannot pay instant attention to every one of them; that is impossible by definition. That's why a typical ARM peripheral has a FIFO or DMA, or is otherwise designed to work without demanding too much from the CPU. So in most cases you just have to service each interrupt once in a while, and it doesn't matter whether you do it 10 or 20 clocks earlier or later.
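    As an illustration of the "service once in a while" point, here is a minimal sketch of a UART receive ISR that drains the whole hardware FIFO into a software ring buffer in one pass. The register names and addresses (UART0_DR, UART0_FR, the RXFE flag) are hypothetical placeholders for a generic memory-mapped UART, not any specific part:

        #include <stdint.h>

        #define UART0_DR   (*(volatile uint32_t *)0x40000000u)  /* data register      */
        #define UART0_FR   (*(volatile uint32_t *)0x40000018u)  /* flag register      */
        #define UART_RXFE  (1u << 4)                            /* RX FIFO empty flag */

        #define RX_BUF_SIZE 256u
        static volatile uint8_t  rx_buf[RX_BUF_SIZE];
        static volatile uint32_t rx_head;

        void uart0_rx_isr(void)
        {
            /* Drain everything the FIFO has collected since the last
             * service. A few dozen clocks of extra entry latency cost
             * nothing, as long as the FIFO does not overflow meanwhile. */
            while ((UART0_FR & UART_RXFE) == 0u) {
                rx_buf[rx_head % RX_BUF_SIZE] = (uint8_t)UART0_DR;
                rx_head++;
            }
        }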

    The same logic applies to the IRQ handler. Yes, you need a few instructions to dispatch, but that is a disadvantage only on a simple 8-bit system. Here we are talking about controllers with 5-6 UARTs, for example. How do you structure the software? Do you write 5-6 different ISRs? Why? The hardware is the same, so the ISRs should also be identical unless you want a mess.

    Also, when you have many peripherals you need multitasking, and the ISRs must not interfere with it. If the ISRs are called directly (as vectors), you have to put synchronization logic in every ISR: the same logic in many places, which is overhead. Instead I prefer a single IRQ handler that contains the synchronization logic, so the individual ISRs do not have to do anything special. They can be plain C functions, and every function receives a pointer to its driver instance as a parameter (see the sketch below). I just counted my handler's instructions: the worst case on entry is 13 instructions, which includes the dispatch, the context saving, and the mode change that enables interrupt nesting. On exit it is 5 or 9 instructions for the interrupt acknowledge and the context change.
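    Here is a minimal sketch of that single-handler scheme, assuming a vectored interrupt controller whose current-source register (VIC_IRQ_SOURCE below, a hypothetical name and address) returns the number of the active interrupt. All the entry/exit bookkeeping (context save, mode change for nesting, acknowledge) is assumed to live in one common assembly stub that calls irq_dispatch(); the per-class ISRs are plain C functions parameterised by a driver-instance pointer:

        #include <stdint.h>

        #define VIC_IRQ_SOURCE  (*(volatile uint32_t *)0xFFFFF004u) /* hypothetical */
        #define NUM_IRQ_SOURCES 32u

        typedef void (*isr_fn_t)(void *instance);

        typedef struct irq_entry {
            isr_fn_t fn;        /* one routine per driver class...          */
            void    *instance;  /* ...parameterised by the driver instance */
        } irq_entry_t;

        static irq_entry_t irq_table[NUM_IRQ_SOURCES];

        typedef struct uart_driver {
            volatile uint32_t *regs;  /* this instance's register block */
            /* ... ring buffers, state ... */
        } uart_driver_t;

        /* One ISR shared by all the UARTs: the instance pointer selects
         * which hardware it talks to. */
        static void uart_isr(void *instance)
        {
            uart_driver_t *uart = instance;
            (void)uart;  /* drain FIFO, update state, etc. */
        }

        void irq_register(uint32_t src, isr_fn_t fn, void *instance)
        {
            irq_table[src].fn = fn;
            irq_table[src].instance = instance;
        }

        /* Called from the common assembly IRQ entry, after the context
         * save and the mode change that allows nesting. */
        void irq_dispatch(void)
        {
            uint32_t src = VIC_IRQ_SOURCE;
            if (src < NUM_IRQ_SOURCES && irq_table[src].fn != 0)
                irq_table[src].fn(irq_table[src].instance);
        }

    Registering five or six UARTs then means five or six irq_register() calls against the same uart_isr with different driver instances, rather than five or six separate ISRs.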

    Now, the Cortex-M3 requires a substantially lower number of instructions, but that is not because it does away with the common IRQ handler. It is due to the improved instruction set, the removal of the CPU modes (so I don't have to change them), and the automatic interrupt acknowledge. These are the real advantages. The individual IRQ vectors do not improve the speed, at least not in my scenario with a common ISR routine for a given driver class and common context processing.
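    For contrast, a minimal Cortex-M3 sketch of the same idea, assuming a CMSIS-style startup file whose vector table points straight at C functions (the name UART0_IRQHandler is whatever that startup file expects; it is an assumption here, not a fixed name). The hardware stacks R0-R3, R12, LR, PC and xPSR on entry and acknowledges the interrupt itself, so there is no assembly wrapper, no mode change and no manual acknowledge:

        /* Entered directly from the vector table. With suitable NVIC
         * priorities a higher-priority interrupt can preempt this one
         * with no extra code, so nesting comes for free. */
        void UART0_IRQHandler(void)
        {
            /* ... service the peripheral ... */
        }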

    Sorry this was a little bit off topic ;-)
