Double loop: CPU vs GPU

Hi,

I have a technical question about loops. Let's take an example.

int A[3000][4];

int B[3000][4];

int C[3000][4];

On the CPU this is very simple: I compare every row of A with every row of B.

for (int x = 0; x < 3000; x++) {

    for (int y = 0; y < 3000; y++) {

          // check whether A[x] matches B[y] and write the result to C

    }

}
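
To make the CPU version concrete, here is a minimal sketch of the double loop in plain C. It assumes that two rows "match" when all 4 values are equal and that C[x] receives a copy of the matching row; both are assumptions for illustration only, and the names ROWS, COLS, rows_match and compare_all are made up for this example.

```c
#include <string.h>

#define ROWS 3000
#define COLS 4

static int A[ROWS][COLS];
static int B[ROWS][COLS];
static int C[ROWS][COLS];

/* Hypothetical match rule: two rows match when all 4 values are equal. */
static int rows_match(const int a[COLS], const int b[COLS])
{
    return memcmp(a, b, sizeof(int) * COLS) == 0;
}

/* Compare every row of A with every row of B.
   When A[x] has a match in B, copy A[x] into C[x].
   Returns the number of rows of A that found a match. */
static int compare_all(void)
{
    int matched = 0;
    for (int x = 0; x < ROWS; x++) {
        for (int y = 0; y < ROWS; y++) {
            if (rows_match(A[x], B[y])) {
                memcpy(C[x], A[x], sizeof C[x]);
                matched++;
                break; /* one match is enough for this row of A */
            }
        }
    }
    return matched;
}
```

Calling compare_all() after filling A and B does at most 3000 * 3000 = 9,000,000 row comparisons on a single core.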

If I want to do the same thing on the GPU, I will need to call the same kernel 3000 times, sending each row of A to be compared against all of B. In this case, which would be faster: the CPU or the GPU?

On the CPU I can use multi-core threading, and I need to do the whole comparison 8 times. So on the GPU I will need to run 24,000 kernels (3000 * 8), each with a range of (16*16) and a buffer of (400,32). That is 50 work-groups per kernel and 1,200,000 work-groups in total for the whole processing.
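
The arithmetic above can be double-checked with a few lines of C. This is only a sketch: it assumes that (400,32) is the global range and (16*16) is the work-group size, and the function names are made up for the example.

```c
/* Number of work-groups in a 2D range, assuming each global dimension
   is an exact multiple of the matching work-group dimension. */
static long groups_per_kernel(long global_x, long global_y,
                              long local_x, long local_y)
{
    return (global_x / local_x) * (global_y / local_y);
}

/* Total work-groups when one kernel is launched per row of A,
   and the whole comparison is repeated `repeats` times. */
static long total_groups(long rows, long repeats, long per_kernel)
{
    return rows * repeats * per_kernel;
}
```

groups_per_kernel(400, 32, 16, 16) gives 25 * 2 = 50, and total_groups(3000, 8, 50) gives 1,200,000, which agrees with the numbers above.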

I hope that the question is not stupid.

Thanks in advance.