Registers and Cache on M0

Hi,
Coming from a games-coding background, I always try to find the very limits of what a CPU can do. Now we have PragmatIC and very cheap CPUs but, much more importantly, vastly cheaper MROM (mask ROM). With this in mind, I wanted to know how many registers I could REALLY use on the Atmel SAMD21, and found some very interesting ideas and possibilities.

For subroutines:

1) R14 (LR) can be stacked and unstacked, which frees it up as an extra register.
2) R13 (SP) can be stored in memory as long as no interrupts can occur and the subroutine doesn't call anything.
3) R15 (PC) is not used on said Atmel product if the code and data of the subroutine are all in the cache.

These things work, although the last one seems to have some rules that I am still divining, and it is not much use if it is technically part of the design errata.

Now I'm interested in the special registers, or rather in the instructions that access them (MRS, MSR). The ARM Infocenter notes that they perform a read-modify-write sequence and lists the special registers as:

APSR
IPSR
EPSR
IEPSR
IAPSR
EAPSR
PSR
MSP
PSP
PRIMASK
CONTROL

It appears that the field governing which special register is read or written is a 5-bit field and, what is more (for low-cost debug, I'm guessing), if you select a value outside the range of the special registers, it acts on the general-purpose registers. I'm interested in knowing whether people can see optimizations in this. Code from Flash often has a 1-cycle penalty, so a single instruction that performs a read-modify-write will be faster.
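
For anyone who wants to probe this, here is a minimal C sketch of how I understand the MRS/MSR encodings. It assumes the standard 32-bit Thumb encoding from the ARMv6-M Architecture Reference Manual, where the register selector (SYSm) is the low 8 bits of the second halfword and its top five bits pick the register group. The helper names (encode_mrs, encode_msr, sysm_group, sysm_is_defined) are made up for illustration and are not from any ARM or Atmel header. As far as I can tell the manual treats SYSm values outside the listed set as UNPREDICTABLE, so whatever a particular part does with them is implementation behaviour and not something to rely on.

```c
#include <assert.h>
#include <stdint.h>

/* SYSm selector values for the ARMv6-M special registers. */
enum sysm {
    SYSM_APSR = 0, SYSM_IAPSR = 1, SYSM_EAPSR = 2, SYSM_XPSR = 3,
    SYSM_IPSR = 5, SYSM_EPSR  = 6, SYSM_IEPSR = 7,
    SYSM_MSP  = 8, SYSM_PSP   = 9,
    SYSM_PRIMASK = 16, SYSM_CONTROL = 20
};

/* Hypothetical helper: encode "MRS Rd, <special>".
 * The first halfword sits in the top 16 bits of the result. */
static uint32_t encode_mrs(unsigned rd, unsigned sysm)
{
    assert(rd < 16u && sysm < 256u);
    return 0xF3EF8000u | ((uint32_t)rd << 8) | (uint32_t)sysm;
}

/* Hypothetical helper: encode "MSR <special>, Rn". */
static uint32_t encode_msr(unsigned sysm, unsigned rn)
{
    assert(rn < 16u && sysm < 256u);
    return 0xF3808800u | ((uint32_t)rn << 16) | (uint32_t)sysm;
}

/* Top five bits of SYSm: 0 = xPSR views, 1 = stack pointers,
 * 2 = PRIMASK/CONTROL. */
static unsigned sysm_group(unsigned sysm)
{
    return (sysm >> 3) & 0x1Fu;
}

/* Returns 1 only for the SYSm values the architecture defines. */
static int sysm_is_defined(unsigned sysm)
{
    switch (sysm) {
    case SYSM_APSR: case SYSM_IAPSR: case SYSM_EAPSR: case SYSM_XPSR:
    case SYSM_IPSR: case SYSM_EPSR:  case SYSM_IEPSR:
    case SYSM_MSP:  case SYSM_PSP:
    case SYSM_PRIMASK: case SYSM_CONTROL:
        return 1;
    default:
        return 0;
    }
}
```

For example, encode_mrs(0, SYSM_PRIMASK) gives 0xF3EF8010 (the halfwords F3EF 8010 for "mrs r0, primask"), and encode_msr(SYSM_PRIMASK, 0) gives 0xF3808810. Looping over all 256 selector values and skipping those where sysm_is_defined() returns 1 yields the candidate encodings to try on real hardware.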

I know these are extreme cases, but getting a fixed-point implementation of MP3 decode running, for example, will really mean scratching around for stray bus cycles. Plastic is 20 years behind silicon and will be very cheap, so I'm jumping the gun by a couple of years because I believe the M0 and M0+ running Mbed will become the de facto baseline processor. It is only bitter experience with MROMs (order 80000 units of Chuck Rock Jr for the Megadrive, sell 60000, and you make a loss) that has put people off. Now, especially with simple CRC 10:8, MROM will provide a yield so close to 100% that it will be reborn.