
Int8 operation in G72

Does Mali support 8-bit integer vector operations that avoid the overflow issue, the way scalar operations do?

I tested the following on a G72.

In the scalar case,

--------------------------------

uchar a = 255;
uchar b = 255;
int c = a + b;

--------------------------------

This gives 510 in c.

But in the vector case,

--------------------------------

uchar4 a = {255, 255, 255, 255};
uchar4 b = {255, 255, 255, 255};
int4 c = a + b;

--------------------------------

This prints the wrong answer.

So my questions are:

1. Scalar operations use a general-purpose register, which is 32 bits wide. Is that why the scalar operation gives the correct result?

2. Why do vector operations not support automatic casting like scalar operations? Do they not use general-purpose registers the way scalar operations do?

3. I heard that G52 supports int8 operations. Does that mean G52 has 8-bit vector registers that resolve the second case above?

  • Hello. As it's been a few days with no responses here, I'm moving this across to our Graphics & Multimedia forum, where there is more discussion of Mali.

    Many thanks,
    Georgia

  • 1. Scalar operations use a general-purpose register, which is 32 bits wide. Is that why the scalar operation gives the correct result?

    How the hardware works is largely irrelevant here; this is simply the behavior the language specification requires.

    Just like "normal" C programming, integer scalar types that are smaller than an int are promoted to int when an operation is performed on them (search for "integer promotion" in the OpenCL C spec).

    uchar a = 255;
    uchar b = 255;
    int c = a + b;

    ... is effectively:

    uchar a = 255;
    uchar b = 255;
    int c = ((int)a) + ((int)b);

    2. Why do vector operations not support automatic casting like scalar operations? Do they not use general-purpose registers the way scalar operations do?

    ... because the specification says so. See section "6.2.1 Implicit Conversions"; it explicitly states:

    "Implicit conversions between built-in vector data types are disallowed".

    To be honest, I'm actually surprised the code compiles at all - assigning the uchar4 sum to an int4 result is an implicit conversion between vector types, so I would expect it to generate a compile error.

    HTH, 
    Pete
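The difference between the scalar and vector cases can be modelled in plain C. This is only a sketch: `uchar4_model`, `int4_model`, and the three functions are hypothetical stand-ins written for illustration, not OpenCL types or built-ins. It assumes that a `uchar4 + uchar4` sum is evaluated per lane at 8-bit width, which would make each lane wrap modulo 256, while widening the operands before the add keeps the full result.

```c
#include <stdint.h>

/* Hypothetical stand-ins for OpenCL's uchar4 and int4 (illustrative names only). */
typedef struct { uint8_t s[4]; } uchar4_model;
typedef struct { int32_t s[4]; } int4_model;

/* Scalar case: C integer promotion widens both operands to int before the add. */
int scalar_sum(uint8_t a, uint8_t b) {
    return a + b;
}

/* Vector case, assuming each lane is added at 8-bit width:
 * the result is truncated back to 8 bits, so it wraps modulo 256. */
uchar4_model vector_sum_u8(uchar4_model a, uchar4_model b) {
    uchar4_model r;
    for (int i = 0; i < 4; ++i) {
        r.s[i] = (uint8_t)(a.s[i] + b.s[i]);
    }
    return r;
}

/* Explicit widening: convert every lane to 32 bits first, then add. */
int4_model vector_sum_widened(uchar4_model a, uchar4_model b) {
    int4_model r;
    for (int i = 0; i < 4; ++i) {
        r.s[i] = (int32_t)a.s[i] + (int32_t)b.s[i];
    }
    return r;
}
```

With every lane set to 255, `scalar_sum` and `vector_sum_widened` give 510 while `vector_sum_u8` gives 254 per lane. In an OpenCL C kernel the widening step would be written with the explicit conversion built-in, for example `int4 c = convert_int4(a) + convert_int4(b);`.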

  • To answer your third question: Mali-G52 adds a dedicated vector instruction for 8-bit integer dot products, which effectively provides a cross-lane FMA for machine learning kernels. The instruction behaves as if all of the multiplication intermediates are 32 bits wide, so the result is not clipped.

    See the following OpenCL extension for usage information in OpenCL kernels:

    https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_integer_dot_product.txt

    Cheers, 
    Pete
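As a rough illustration of "behaves as if all of the multiplication intermediates are 32 bits wide", the arithmetic of a 4-lane unsigned 8-bit dot product can be sketched in plain C. The helper name `dot4_u8` is hypothetical and is not the extension's built-in; the extension document linked above remains the authority for the actual function names and signatures.

```c
#include <stdint.h>

/* Hypothetical scalar model of a 4-lane 8-bit dot product.
 * Each product is formed at 32-bit width and accumulated at 32-bit width,
 * so neither the products nor the sum are clipped to 8 bits. */
uint32_t dot4_u8(const uint8_t a[4], const uint8_t b[4]) {
    uint32_t acc = 0;
    for (int i = 0; i < 4; ++i) {
        acc += (uint32_t)a[i] * (uint32_t)b[i];
    }
    return acc;
}
```

For the worst case of all eight inputs equal to 255, this model returns 4 * 255 * 255 = 260100, far outside the 8-bit range but well inside 32 bits.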