Hi,
I am cropping an image and writing the result to a destination buffer. The kernel uses a vector load and store of 8 uchars per work-item. Can someone help me optimize this kernel? Are any Mali-G72 GPU-specific changes required?
uchar* src_y : source pointer to Y data of the image
uchar* dst_y : destination pointer to Y data of the image
uchar* src_uv : source pointer to UV data of the image
uchar* dst_uv : destination pointer to UV data of the image
dst_uv_h = image height / 2 // height of the UV plane, which is copied along with Y
global_size {dst_w,dst_h}; //destination width , destination height
int x = get_global_id(0) * 8;
int y = get_global_id(1);
int src_pos = mad24(y, src_stride, x); // y * src_stride + x: source position
int dst_pos = mad24(y, dst_stride, x); // y * dst_stride + x: destination position
vstore8(vload8(0, src_y + src_pos), 0, dst_y + dst_pos); // copy 8 bytes of Y
if (y < dst_uv_h) {
    vstore8(vload8(0, src_uv + src_pos), 0, dst_uv + dst_pos); // copy UV part of image
}
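For checking the kernel output on the host, the indexing above can be emulated in plain C. This is only a rough sketch under several assumptions that are not stated in the post: the function names `crop_work_item` and `crop_reference` are hypothetical, the source pointers already point at the crop origin, `dst_w` is a multiple of 8, and the x dimension of the grid is `dst_w / 8` (each work-item covers 8 bytes, so a global width of `dst_w` would index past the end of each row).

```c
#include <stddef.h>
#include <string.h>

/* Emulates one work-item of the kernel: copies 8 bytes of Y, and 8 bytes
 * of UV when the row lies inside the half-height UV plane. */
static void crop_work_item(const unsigned char *src_y, unsigned char *dst_y,
                           const unsigned char *src_uv, unsigned char *dst_uv,
                           int src_stride, int dst_stride, int dst_uv_h,
                           int gid_x, int gid_y)
{
    int x = gid_x * 8;
    int y = gid_y;
    int src_pos = y * src_stride + x; /* same value as mad24(y, src_stride, x) */
    int dst_pos = y * dst_stride + x; /* same value as mad24(y, dst_stride, x) */

    memcpy(dst_y + dst_pos, src_y + src_pos, 8); /* stands in for vload8 + vstore8 */
    if (y < dst_uv_h) {
        memcpy(dst_uv + dst_pos, src_uv + src_pos, 8);
    }
}

/* Runs the emulated work-item over an assumed (dst_w / 8) x dst_h grid.
 * Assumption: dst_w is a multiple of 8 and dst_uv_h is dst_h / 2. */
static void crop_reference(const unsigned char *src_y, unsigned char *dst_y,
                           const unsigned char *src_uv, unsigned char *dst_uv,
                           int src_stride, int dst_stride,
                           int dst_w, int dst_h)
{
    int dst_uv_h = dst_h / 2;
    for (int gy = 0; gy < dst_h; ++gy) {
        for (int gx = 0; gx < dst_w / 8; ++gx) {
            crop_work_item(src_y, dst_y, src_uv, dst_uv,
                           src_stride, dst_stride, dst_uv_h, gx, gy);
        }
    }
}
```

Comparing the GPU result against `crop_reference` byte for byte is a cheap way to confirm that any later optimization (wider vectors, several rows per work-item) still produces the same image.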