While reading the materials about the Mali-G76, especially its texture unit, I ran into a question about its performance for 2D bilinear interpolation. The documentation says that for Mali-G76 the best-case performance (bilinear filtered samples) is 0.5 cycles per sample. It also gives some related details: the counter increments for every texture filtering issue cycle, some instructions take more than one cycle due to multi-cycle data access and filtering operations, and the costs per 4-sample quad are: (i) 2D bilinear filtering takes two cycles; (ii) 2D trilinear filtering takes four cycles; (iii) 3D bilinear filtering takes four cycles; and (iv) 3D trilinear filtering takes eight cycles. My question is whether the OpenCL functions read_imagef() and write_imagef() get the same performance when they use the texture unit for 2D bilinear interpolation.
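To make the arithmetic behind those figures explicit, here is a minimal Python sketch that converts the quoted per-quad costs into per-sample costs. The names `CYCLES_PER_QUAD`, `SAMPLES_PER_QUAD`, and `cycles_per_sample` are made up for illustration and are not part of any Mali tool or API; the numbers are only the ones quoted above.

```python
# Texture filtering issue cycles per 4-sample quad, as quoted above.
# These names are illustrative only, not part of any Mali tool or API.
CYCLES_PER_QUAD = {
    "2D bilinear": 2,
    "2D trilinear": 4,
    "3D bilinear": 4,
    "3D trilinear": 8,
}

SAMPLES_PER_QUAD = 4


def cycles_per_sample(mode: str) -> float:
    """Return the filtering cost of one sample for the given mode."""
    return CYCLES_PER_QUAD[mode] / SAMPLES_PER_QUAD


if __name__ == "__main__":
    for mode in CYCLES_PER_QUAD:
        print(f"{mode}: {cycles_per_sample(mode)} cycles per sample")
```

With these inputs, 2D bilinear works out to 2 / 4 = 0.5 cycles per sample, which matches the best-case figure quoted for the Mali-G76.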
Hi xwentian, the imageLoad/imageStore path on all current Mali GPUs runs at 1 sample per cycle.
Kind regards, Pete
Thanks for your explanation. So for each sample it takes 1 cycle to load the raw data, 0.5 cycles for bilinear filtering, and 1 cycle to store the result, which makes 2.5 cycles per pixel in total?