About the Mali-G76 MP12 GPU and its micro-architecture

1) Does the GPU context-switch between warps to hide memory access latency when a kernel performs memory operations?

2) The datasheet lists a maximum thread count of 768. Does that mean 256 threads per execution engine?

As far as I know, each execution engine has 8 lanes (one 8-wide warp). How can the GPU run 768 threads simultaneously? Is it done through context switching, or are there more lanes?
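
To make the arithmetic behind question 2 concrete, here is a rough sketch. It assumes the 768-thread figure applies to one shader core and is split evenly across 3 execution engines; that split is my guess, not something I found in the datasheet, and the variable names are only for illustration.

```python
# Figures taken from the question; the even split across engines is an assumption.
MAX_THREADS_PER_CORE = 768   # datasheet "max thread count"
ENGINES_PER_CORE = 3         # assumed number of execution engines per core
WARP_WIDTH = 8               # lanes per execution engine (8-wide warp)

# Threads each engine would have to keep resident under an even split.
threads_per_engine = MAX_THREADS_PER_CORE // ENGINES_PER_CORE

# Number of 8-wide warps that corresponds to.
resident_warps_per_engine = threads_per_engine // WARP_WIDTH

# Lanes physically available per core for issuing work.
lanes_per_core = ENGINES_PER_CORE * WARP_WIDTH

print(threads_per_engine)         # 256
print(resident_warps_per_engine)  # 32
print(lanes_per_core)             # 24
```

If that split holds, each engine would hold 32 warps but only have enough lanes to issue one of them at a time, which is why I suspect some form of warp switching.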

I want to understand how threads are executed at the micro-architecture level.

3) If a core can run 768 threads simultaneously and the work-group size is only 24, does the core run that work-group as three 8-wide warps (one per execution engine, 24 lanes in total), all with the same work-group ID?

If the work-group size is 8, do the remaining lanes (24 lanes - 8 = 16 lanes) sit idle?

(On NVIDIA GPUs, multiple warps from the same work-group run on one SM.)
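
Here is a small sketch of the mapping I have in mind for question 3. It assumes a work-group is cut into warps of consecutive threads and that a partly filled last warp leaves its unused lanes idle; both are my assumptions about the hardware, and `warps_for_work_group` is a made-up helper, not a real API.

```python
def warps_for_work_group(work_group_size, warp_width=8):
    """Return (warp_count, idle_lanes_in_last_warp) for one work-group.

    Assumes threads are packed into warps in order, so only the last
    warp can be partly filled.
    """
    warp_count = -(-work_group_size // warp_width)  # ceiling division
    idle_lanes = warp_count * warp_width - work_group_size
    return warp_count, idle_lanes


for size in (24, 8, 20):
    warp_count, idle_lanes = warps_for_work_group(size)
    print(size, warp_count, idle_lanes)
```

Under these assumptions a work-group of 24 needs 3 warps and a work-group of 8 needs only 1, so my question is whether the other two engines then take warps from other work-groups or stay unused.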

Please help!