Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs

Anonymous