Hi.
I'm an SoC design engineer and want to give my customers the Mali-G51's nominal ML performance in terms of GFLOPS, GOPS, or GMAC/s.
My GPU is a Mali-G51 MP4 at 800 MHz, and the following is the information I gathered from articles.
- G51MP4 = 6 execution engines.
- Each execution engine = 4 SIMD lanes.
- Each lane = 4 MACs per cycle (I'm not sure which data type this assumes: fp32, fp16, or int8?)
Then my calculation of the nominal (ideal) ML processing capability is:
(6 engines/MP4) x (4 lanes/engine) x (4 MACs/lane/cycle) x 800 MHz = 76.8 GMAC/s = 153.6 GFLOPS (counting one MAC as two floating-point operations).
Is it correct?
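To make the arithmetic easy to check, here is the same calculation as a small Python sketch. The engine, lane, and MAC counts are just the figures I quoted from articles (not confirmed numbers), counting one MAC as two FLOPs is my assumption, and the function name is only for illustration.

```python
def peak_throughput(engines, lanes_per_engine, macs_per_lane, clock_hz):
    """Return (MACs per cycle, GMAC/s, GFLOP/s) for an ideal, fully utilized GPU."""
    # Total multiply-accumulate operations issued per clock cycle.
    macs_per_cycle = engines * lanes_per_engine * macs_per_lane
    # Scale by the clock and convert to giga-operations per second.
    gmacs = macs_per_cycle * clock_hz / 1e9
    # Assumption: one MAC counts as two floating-point operations (multiply + add).
    gflops = 2 * gmacs
    return macs_per_cycle, gmacs, gflops


# Figures quoted in this post (unverified): 6 engines, 4 lanes, 4 MACs/lane, 800 MHz.
macs_per_cycle, gmacs, gflops = peak_throughput(6, 4, 4, 800e6)
print(f"{macs_per_cycle} MACs/cycle, {gmacs:.1f} GMAC/s, {gflops:.1f} GFLOP/s")
# prints: 96 MACs/cycle, 76.8 GMAC/s, 153.6 GFLOP/s
```

If any of the three counts is wrong, or the 4 MACs per lane applies to a different data type than I assumed, only the arguments need to change.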
If the data type assumed above is fp32, can I expect double these numbers for fp16?
And what about the int8 data type and the corresponding GOPS calculation?