Does Mali support 8-bit integer vector operations that avoid overflow, the way scalar operations do?
For example (I tested on a Mali-G72), in a scalar operation:
uchar a = 255;
uchar b = 255;
int c = a + b;
The result in c is 510.
But in the vector case:
uchar4 a = {255, 255, 255, 255};
uchar4 b = {255, 255, 255, 255};
int4 c = a + b;
It prints the wrong answer.
So my questions are:
1. Scalar operations use a general-purpose register, which is 32 bits wide. Is that why the scalar operation gives the correct result?
2. Why don't vector operations support automatic casting the way scalar operations do? Do they not use general-purpose registers like scalar operations?
3. I heard that the G52 supports int8 operations. Does that mean the G52 has 8-bit vector registers that would resolve the second case above?
Hello Unarmed guy. As it's been a few days with no responses here, I'm moving this across to our Graphics & Multimedia forum, where there is more discussion of Mali. Many thanks, Georgia
Unarmed guy said: Scalar operation uses general purpose register and it is 32bit register. That's why scalar operation results correctly. Am I right?
How the hardware works isn't really relevant here; this is simply the behavior that the language specification defines.
Just like "normal" C programming, integer scalar types that are smaller than an int are promoted up to an int when an operation is performed on them. (Search for "integer promotion" in the OpenCL C spec).
uchar a = 255; uchar b = 255; int c = a + b;
... is effectively:
uchar a = 255; uchar b = 255; int c = ((int)a) + ((int)b);
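The promotion rule described above can be checked on any host compiler, since plain C promotes small integer types in the same way. The following is a minimal sketch, assuming `uint8_t` as a stand-in for OpenCL C's `uchar`; the function names `add_promoted` and `add_narrowed` are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Both operands are promoted to int before the addition, so the sum
   keeps its full value when it is stored in an int. */
int add_promoted(uint8_t a, uint8_t b) {
    return a + b;
}

/* Same addition, but the result is explicitly narrowed back to 8 bits,
   which discards the carry (the value wraps modulo 256). */
uint8_t add_narrowed(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}
```

With both inputs set to 255, `add_promoted` returns 510 and `add_narrowed` returns 254, which is the difference between keeping and losing the promoted intermediate.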
Unarmed guy said: 2. Why does Vector operation not support auto cast like scalar operation? Does it not support general purpose register like in scalar operation?
... because the specification says so. See section "6.2.1 Implicit Conversions"; it explicitly states:
"Implicit conversions between built-in vector data types are disallowed".
To be honest, I'm actually surprised the code compiles at all - the conversion from a uchar4 sum to an int4 result is an implicit conversion so I would expect that to have generated a compile error.
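If a compiler does accept the vector code, one plausible explanation for the wrong answer is that the `uchar4` addition is carried out per lane in the 8-bit element type, with no promotion, and only the already-wrapped result is widened. The sketch below emulates that in plain C under that assumption; the types `uchar4_emul` and `int4_emul` and both function names are hypothetical stand-ins, not OpenCL C types.

```c
#include <stdint.h>

/* Hypothetical stand-ins for OpenCL C's uchar4 and int4. */
typedef struct { uint8_t s[4]; } uchar4_emul;
typedef struct { int32_t s[4]; } int4_emul;

/* Add in the 8-bit element type, then widen: each lane wraps
   modulo 256 before the widening happens. */
int4_emul add_then_widen(uchar4_emul a, uchar4_emul b) {
    int4_emul c;
    for (int i = 0; i < 4; ++i) {
        uint8_t lane = (uint8_t)(a.s[i] + b.s[i]);
        c.s[i] = lane;
    }
    return c;
}

/* Widen each lane to 32 bits first, then add: no lane can overflow. */
int4_emul widen_then_add(uchar4_emul a, uchar4_emul b) {
    int4_emul c;
    for (int i = 0; i < 4; ++i) {
        c.s[i] = (int32_t)a.s[i] + (int32_t)b.s[i];
    }
    return c;
}
```

In an OpenCL C kernel, the widen-first form would be written with the explicit conversion built-ins, for example `int4 c = convert_int4(a) + convert_int4(b);`, which also avoids relying on an implicit vector conversion.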
HTH, Pete
To answer your third question: Mali-G52 adds a dedicated vector instruction for 8-bit integer dot products, which effectively provides a cross-lane FMA for machine learning kernels. The instruction behaves as if all of the intermediate products are 32 bits wide, so the result is not clipped.
See the following OpenCL extension for usage information in OpenCL kernels:
https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_integer_dot_product.txt
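To make the "32-bit intermediates" behavior concrete, here is a minimal plain C emulation of the arithmetic only. It is not the extension's API: the function name `dot4_u8` is hypothetical, and the extension text linked above remains the authority for the actual built-ins and their signatures.

```c
#include <stdint.h>

/* 4-element dot product of unsigned 8-bit values, with every product
   and the running sum held in 32 bits, so nothing is clipped. */
uint32_t dot4_u8(const uint8_t a[4], const uint8_t b[4]) {
    uint32_t acc = 0;
    for (int i = 0; i < 4; ++i) {
        acc += (uint32_t)a[i] * (uint32_t)b[i];
    }
    return acc;
}
```

With all eight inputs set to 255, each product is 65025 and the sum is 260100, far beyond what an 8-bit result could hold.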
Cheers, Pete