Hi experts,
I want to know why there are 4 cores per cluster in the ARM big.LITTLE architecture.
Is it possible to have more cores per cluster? If not, what is the limitation?
Hi Ash & Peter,
So, are there no technical limitations on having more than 4 cores in a cluster? For example, cache coherence, the GIC, or anything else?
Is the only reason that "4 is enough" for performance?
In the future, could we design an SoC with 8 cores in a cluster, or with more clusters?
We can already see the Helio X20 SoC, which provides three clusters and 10 cores (4+4+2).
As Ash has already stated in his first answer, for the ARM Cortex-A family there is a limit of 4. There isn't any specific technical reason - it's just a design choice based on what we see our partners needing for their designs.
It's like asking "Why do cars have 4 wheels?" - a company could build a car with three wheels (Reliant Robin), or with six wheels (Tyrrell P34), but in most cases the answer is four. Why? It's enough to do what the car needs to do, so why add more?
The ARM architecture doesn't have a limit, so this could change in future, or a partner with an architecture license could build their own CPU today with as many cores in a cluster as they wanted.
HTH, Pete
So an IP customer cannot just "plug" n Cortex-A53 cores together, but instead buys a single-, dual- or quad-core Cortex-A53 IP?
You see, we would just like to understand why certain companies (such as the aforementioned NXP) build chips with 8 cores but in 2 clusters.
BTW: Cars mostly have four wheels for physical/technical reasons. So there is surely a technical reason for the max. 4 cores/cluster choice. But it is OK if ARM does not want to share this with everyone ;-)
42bis,
The technical reason is that allowing more configurations at RTL synthesis time means more combinations to validate, which is a lot of work. There was a design decision to limit it to 4, a long time ago, for those products. You could imagine that the logic that connects the cores together only has 4 ports in the design. If you configure 2 cores, some of that logic is optimized away. But there are only 4 combinations to work out - 1, 2, 3 and 4 cores in a cluster. Imagine if we supported 32 cores in a cluster - that'd be 8x the work to validate. To cover that, we might decide that you can only have 1, 2, 3, 4, 8, 16 and 32 (which is not even twice the work). Aside from that, you have to cover power consumption and area concerns with larger systems. That's the limitation. We only wish it were as simple as "copy & paste".
That's an extremely simple view of things, but you get the idea, right?
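The counting argument in this reply can be sketched in a few lines. This is only an illustration under the same simplifying assumption the reply makes, namely that validation effort scales with the number of supported core counts; the function name `validation_configs` is made up for this example and does not correspond to any real tool.

```python
# Hypothetical illustration: each supported core count per cluster is
# treated as one configuration that has to be validated.
def validation_configs(allowed_core_counts):
    """Return how many distinct cluster configurations need validating."""
    return len(set(allowed_core_counts))

baseline = validation_configs(range(1, 5))                 # 1, 2, 3, 4 cores
every_size = validation_configs(range(1, 33))              # every size from 1 to 32
restricted = validation_configs([1, 2, 3, 4, 8, 16, 32])   # selected sizes only

# Relative effort compared with the 4-core limit (assumed linear scaling).
print(baseline, every_size / baseline, restricted / baseline)  # 4 8.0 1.75
```

Under that assumption, supporting every size up to 32 costs 8x the baseline, while restricting the menu to seven sizes costs 1.75x, which matches "not even twice the work".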
Ta,
Matt
Thanks Matt for the insights. I get the idea.