Hi,
I do not know whether using an NPU would be worthwhile for what I would like to do.
So let me explain my need.
1) I want to compare a 64*64 matrix against a set of forms, using a data mask for the comparison.
2) Until now I have used the CPU with SIMD to do this, as a simple double loop over an array [64][64][number of forms to compare = 64].
3) Using a GPU I could do the same, but in that case I would need to add a flag for each form, because the GPU processes work items in no guaranteed order. So I do not think it will be faster. I tried, and it is not, but I may be wrong in the way I implemented the kernel and organized the data.
4) If I were using an NPU, would I get better performance?
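To make points 1) to 3) concrete, here is a minimal sketch in C of the kind of loop I mean. The details are assumptions for illustration only: the element type (uint8_t), the meaning of "compare" (count the cells that differ), the mask convention (0 = ignore the cell) and the name count_mismatches are all hypothetical. The form index is kept innermost, as in the [64][64][64] layout, so the inner loop reads 64 contiguous bytes and a compiler can usually vectorize it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N 64      /* matrix side, from the post */
#define FORMS 64  /* number of forms to compare, from the post */

/* Hypothetical layout (an assumption, not the real code):
 *   input[y * N + x]               one value per cell
 *   mask[y * N + x]                0 = ignore this cell
 *   forms[(y * N + x) * FORMS + f] form index innermost
 * mismatches[f] receives the number of unmasked cells where
 * form f differs from the input (at most 4096, fits in uint16_t). */
void count_mismatches(const uint8_t *input,
                      const uint8_t *mask,
                      const uint8_t *forms,
                      uint16_t *mismatches)
{
    assert(input && mask && forms && mismatches);
    memset(mismatches, 0, FORMS * sizeof mismatches[0]);

    for (int y = 0; y < N; y++) {
        for (int x = 0; x < N; x++) {
            const int cell = y * N + x;
            if (!mask[cell])
                continue;                 /* cell excluded by the mask */
            const uint8_t v = input[cell];
            const uint8_t *row = forms + (size_t)cell * FORMS;
            /* Contiguous over f: this is the part SIMD handles well. */
            for (int f = 0; f < FORMS; f++)
                mismatches[f] += (uint16_t)(row[f] != v);
        }
    }
}
```

On the CPU each form has its own accumulator and the cells are visited in a fixed order, so no extra flag is needed. In a GPU version where threads are spread over cells, several threads would update the same per-form result, which is where the per-form flag (or an atomic update) comes in.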
I have read many things about NPUs, and I understand that they can be faster while using less energy. But they are used for CNN models and, as I understood it, some calculations are faster because NPUs integrate dedicated ALUs for them. I do not need all that stuff, though; my approach is different from a CNN.
So in my case, would an NPU be useful?
Thanks in advance.
Thanks,
Just to be more precise: what would be a good workload size, and below what size would it not be useful?