In this discussion, Peter Harris explained that the Arm Mali-G72 MP3 GPU can run 1152 threads concurrently. Can someone please explain where this number comes from? I am just starting to learn about this, and all I can understand from the specs is that it has 3 cores, which I thought was very low for any parallelization.
I appreciate the fast reply, but can you please elaborate on this? This 384 number is per core, right? To my understanding, every core has 3 execution engines that can each group instructions into a group of 4, so 3 * 4 is 12 per core. Why is it 384?
384 threads per core.
3 engines per core = 128 threads per engine (384 / 3).
4-wide warps = 32 warps per engine (128 / 4), with 4 threads per warp.
Oh, thanks so much. One final question, though: where exactly does it say there are 32 warps per engine, so I can read more on this?
It doesn't, but you can work it out from the other counts: warps per engine = (threads per core) / (engines per core * warp width).
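For anyone who wants to check the arithmetic, here is a minimal sketch that redoes the calculation with the figures quoted in this thread (3 cores, 384 threads per core, 3 engines per core, 4-wide warps). The constant and function names are made up for illustration; they do not come from any Arm documentation or tool.

```python
# Figures quoted in this thread for the Mali-G72 MP3.
# All names below are illustrative, not from any Arm API.
CORES = 3
THREADS_PER_CORE = 384
ENGINES_PER_CORE = 3
WARP_WIDTH = 4  # threads per warp


def warps_per_engine(threads_per_core, engines_per_core, warp_width):
    """Apply (threads) / (engines * warp width) from the reply above."""
    return threads_per_core // (engines_per_core * warp_width)


# Threads that each execution engine keeps resident.
threads_per_engine = THREADS_PER_CORE // ENGINES_PER_CORE

# Warps per engine, derived rather than read from a spec sheet.
warps = warps_per_engine(THREADS_PER_CORE, ENGINES_PER_CORE, WARP_WIDTH)

# Total concurrent threads across the whole GPU.
total_threads = CORES * THREADS_PER_CORE

print(threads_per_engine, warps, total_threads)
```

The same numbers also multiply back up: 3 cores * 3 engines * 32 warps * 4 threads per warp gives the 1152 from the original question.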