Originally this blog post was intended to be all-in-one, but it was suggested that I split it into smaller parts.
So what I'll do is mention the features I'd like in my ARM processor, one at a time, piece by piece.
The purpose of this is to throw new ideas (good and bad) at the ARM engineers.
These are features that may be able to make a difference, especially ones that would help software and hardware developers get to new places.
Now let's start...
Currently, the only processor I know of that supports 128-bit floating point calculation is the PowerPC (combining two 64-bit registers).
If we had 128-bit floating point registers, we could do high-precision math very quickly.
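For readers who have not met the "two 64-bit registers" approach: it is usually called double-double arithmetic, where one value is kept as the sum of a high and a low double. Below is a minimal Python sketch of the idea. The names `two_sum` and `dd_add` are made up for illustration and are not taken from any PowerPC library, and the addition shown is the simple variant, not a fully accurate one.

```python
def two_sum(a, b):
    """Error-free addition (Knuth): returns (s, e) with s = a + b rounded
    to a double and e the rounding error, so that a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two double-double values, each given as a (hi, lo) pair.
    Hypothetical helper: a simple variant, not the most accurate one."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return two_sum(s, e)  # renormalise into (hi, lo)


if __name__ == "__main__":
    one = (1.0, 0.0)
    tiny = (1e-20, 0.0)
    print(1.0 + 1e-20 == 1.0)   # a plain double drops the tiny term
    print(dd_add(one, tiny))    # the pair keeps it in the low word
```

Every double-double addition costs several ordinary additions, which is why a native 128-bit unit would be so much faster than doing this in software.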
I'd use such a feature to make billions of planet gravity calculations per second.
These mainly involve multiply, add, subtract and square-root operations.
Having a high-precision vector unit would definitely give an insane performance boost here.
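To make the operation mix concrete, here is a rough sketch of one pairwise gravity step in Python, using ordinary 64-bit floats. The function name `acceleration` and the tuple layout are assumptions for illustration only. Note that it also needs one division per pair, on top of the operations listed above.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2 (rounded)


def acceleration(pos_i, pos_j, mass_j):
    """Acceleration of body i caused by body j (Newtonian gravity).
    Positions are (x, y, z) tuples in metres, mass in kilograms."""
    dx = pos_j[0] - pos_i[0]          # subtract
    dy = pos_j[1] - pos_i[1]
    dz = pos_j[2] - pos_i[2]
    r2 = dx * dx + dy * dy + dz * dz  # multiply and add
    r = math.sqrt(r2)                 # square root
    k = G * mass_j / (r2 * r)         # one division per pair
    return (k * dx, k * dy, k * dz)


if __name__ == "__main__":
    # Earth-like mass at a Moon-like distance along the x axis
    print(acceleration((0.0, 0.0, 0.0), (3.84e8, 0.0, 0.0), 5.97e24))
```

A simulation repeats this for every pair of bodies on every time step, so rounding errors from each step feed into the next; that is where the extra precision would pay off.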
I know we will get there some day (after Cortex-A57), but the sooner we get it, the sooner we'll get the cool end results.
Perhaps it'll be the next Cortex-A that delivers impressive performance when it comes to precision math, opening up further possibilities.
If you had a 128-bit precision floating point unit, what would you use it for, or what kind of things do you think it could be used for?
IBM implements 128-bit floating point in decimal as well as binary. In some ways decimal floating point is more saleable, in that it can be used for financial transactions easily. The scaling might not be used often, but in places with high inflation you can easily exceed the long int limit. And you'd also really want to be able to convert to packed decimal for COBOL; how long before ARM starts running COBOL programs? ;-)
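As a software stand-in for what such hardware would do, Python's standard `decimal` module can show both points: decimal fractions stay exact, and amounts can outgrow a 64-bit integer. This is only a sketch. The precision of 34 digits is chosen to match the IEEE 754 decimal128 format, and the amount is a made-up figure.

```python
from decimal import Decimal, getcontext

getcontext().prec = 34  # decimal128 carries 34 significant decimal digits

# Binary doubles cannot represent 0.10 or 0.20 exactly.
binary_ok = (0.10 + 0.20 == 0.30)
decimal_ok = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))

# A made-up amount in a high-inflation currency, stored with two decimals.
INT64_MAX = 2**63 - 1
amount = Decimal("12345678901234567890.12")
amount_in_cents = int(amount * 100)
too_big_for_int64 = amount_in_cents > INT64_MAX

if __name__ == "__main__":
    print(binary_ok, decimal_ok, too_big_for_int64)
    print(amount + Decimal("0.01"))
```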
For large calculations 128-bit binary floating point can be very useful; it is amazing how hard it is to get even a decent algorithm to find the roots of a quadratic accurately, given the way errors can get magnified. It would also make it much easier to get the last bit exact in the maths library routines operating on doubles.
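The quadratic case is easy to reproduce. The sketch below compares the textbook formula with a common rearrangement that avoids subtracting two nearly equal numbers. The function names are illustrative, and the code assumes real roots and a non-zero `a`.

```python
import math


def roots_naive(a, b, c):
    """Textbook formula; loses digits when b*b is much larger than 4*a*c."""
    d = math.sqrt(b * b - 4.0 * a * c)
    return ((-b + d) / (2.0 * a), (-b - d) / (2.0 * a))


def roots_stable(a, b, c):
    """Rearranged form: add same-signed quantities first, then get the
    other root from the product of the roots (c / a)."""
    d = math.sqrt(b * b - 4.0 * a * c)
    q = -0.5 * (b + math.copysign(d, b))
    return (c / q, q / a)


if __name__ == "__main__":
    # x^2 + 1e8*x + 1 = 0 has roots close to -1e-8 and -1e8
    print(roots_naive(1.0, 1e8, 1.0))
    print(roots_stable(1.0, 1e8, 1.0))
```

With doubles, the small root from the naive version is off by a large fraction, while the rearranged version gets it essentially right. Wider floats would not remove the cancellation, but they would push it far enough down that the naive formula is good enough in many more cases.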
Now if we're talking about the moon, how about some direct memory under the SIMD unit, turning it into an old-style bit array processor like the ICL Distributed Array Processor (see the Wikipedia article of that name), where all 32 registers would be changed at once? That would make it a supercomputer!
Or, for something a bit cheaper and more practical, how about support for fast programmable associative lookup, which either quickly finds a match or lets the user fill an entry if it doesn't? This might help things like hash tables, method lookup in dynamic languages, or JIT branches.
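In software terms, the "find a match or fill an entry" behaviour might look like the small direct-mapped cache below. This is purely a sketch of the semantics. The class name `LookupCache`, the slot count and the `find_or_fill` method are all invented for illustration and do not describe any existing ARM feature.

```python
_EMPTY = object()  # marker for a slot that has never been filled


class LookupCache:
    """Direct-mapped cache: each key maps to exactly one slot."""

    def __init__(self, slots=8):
        self.keys = [_EMPTY] * slots
        self.values = [None] * slots

    def find_or_fill(self, key, make_value):
        """Return (value, hit). On a miss, compute the value with
        make_value(key), store it in the slot, and report hit=False."""
        i = hash(key) % len(self.keys)
        if self.keys[i] is not _EMPTY and self.keys[i] == key:
            return self.values[i], True
        value = make_value(key)
        self.keys[i] = key        # overwrite whatever was in the slot
        self.values[i] = value
        return value, False


if __name__ == "__main__":
    cache = LookupCache()
    print(cache.find_or_fill("draw", len))  # miss, entry gets filled
    print(cache.find_or_fill("draw", len))  # same key again, now a hit
```

The hardware version would do the compare and the slot selection in a single step, which is the part that is slow in a software method-lookup cache.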
I agree with Jens; I think that 128-bit registers (for floats and more) would be very useful,
especially with hardware packing/unpacking capability for conversions to other bases
(e.g. 3/4/5/6-bit video encodings, etc.)
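For reference, this is the kind of shifting and masking that such packing hardware would replace. A minimal Python sketch, with made-up function names, that packs fixed-width fields into one integer and unpacks them again:

```python
def pack_fields(values, bits):
    """Pack small unsigned integers into one word, first value lowest."""
    word = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << bits):
            raise ValueError("value does not fit in %d bits" % bits)
        word |= v << (i * bits)
    return word


def unpack_fields(word, bits, count):
    """Inverse of pack_fields: extract count fields of the given width."""
    mask = (1 << bits) - 1
    return [(word >> (i * bits)) & mask for i in range(count)]


if __name__ == "__main__":
    # Three 5-bit channels, as in a 15-bit colour pixel
    pixel = pack_fields([31, 0, 17], 5)
    print(pixel, unpack_fields(pixel, 5, 3))
```

A 128-bit register would hold 25 such 5-bit fields, and each one costs a shift and a mask in software.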
Personally I would like to see more Analog Capabilities in the smaller & medium size ARM processors.
The only folks (that I'm aware of...) working on this at present are Cypress with their PSoC4 (M0) & PSoC5 (M3).
Adding programmable analog greatly reduces the power budget compared to using high-speed DSP operations to
accomplish the same thing in the digital domain. To be more specific, I'd like to see:
1) Programmable-gain op-amps
2) DACs that can be fed by DMA without processor intervention
3) ADCs that can be sampled using DMA without processor intervention
4) Switched-capacitor filters
5) Internal routing multiplexers that don't require external pins to connect 1/2/3/4 together.
And I'd like to see those on a higher performance part, like an M4R/M4F type part.
My brother and I discussed the use of 128-bit floats a few years back.
He needs them desperately and wanted to try out some things on my Mac, even though he's not at all interested in Mac/PPC.
He mentioned that he's disappointed that Intel long doubles are only 96 bits (which also sounds strange to me).
64-bit integer registers will be OK for a while. Having 128-bit integer registers in addition to the 128-bit float registers would make it easier to transfer values between the units, but that is perhaps not required.
As long as a 64-bit integer value could be transferred to a 64-bit float, I think we'd probably do fine there.
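One caveat worth keeping in mind here: a 64-bit float has only 53 bits of mantissa, so not every 64-bit integer survives the transfer unchanged. A short Python check, where the helper name `survives_float64` is made up for illustration:

```python
def survives_float64(n):
    """True if integer n converts to a 64-bit float and back unchanged."""
    return int(float(n)) == n


if __name__ == "__main__":
    print(survives_float64(2**53))      # still exact
    print(survives_float64(2**53 + 1))  # first integer that gets rounded
    print(survives_float64(2**63 - 1))  # largest signed 64-bit value
```

A 128-bit float in the IEEE binary128 layout has a 113-bit mantissa, so every 64-bit integer would fit exactly.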
In addition to what I mentioned in the above post, there will be large benefits in audio compression as well. New video and audio formats will be invented, since compression ratios will be higher and lossy compression quality will be improved too.
I must say that the idea jonnydoin mentions is very interesting. As long as it does not reduce the number of registers and register values available at the same time, the idea is indeed worth looking further into.
Perhaps the register file could be 'semi shared', which means the integer registers could have two register-banks; one shared with the floating point unit and one private. I know that partly disagrees with Jonny's idea about saving silicon, but I would prefer being able to keep twice as many values in registers.
Funny you should mention this. After the 128-bit rumour that iandrew put right in "128 bits is 64 bits too many", I could see a thread starting on this on LinkedIn:
"After 64 bits what about 128 bits?" (LinkedIn). You should be able to read it without needing an account.