Which type of quantization does ARM NPU support?

I want to know whether the ARM NPU supports both symmetric and asymmetric quantization, or only symmetric quantization.

  • There are many ways to do quantisation.


    First off, you generally need a way to represent negative values. There are two ways to do this: symmetric around zero (i.e. a standard signed integer), or asymmetric around zero (an unsigned value plus a per-tensor zero offset that the user provides). Independently of that choice, you can use either a single scale for the whole tensor or a separate scale per channel.

    Our neural network hardware generally supports both methods. Note the following from the TensorFlow Lite 8-bit quantization specification:
    "Note: In the past our quantization tooling used per-tensor, asymmetric, uint8 quantization. New tooling, reference kernels, and optimized kernels for 8-bit quantization will use this spec."

    See: Remove support for asymmetric uint8 quantization in TensorFlow Lite Micro (github.com/.../44912)
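To make the symmetric/asymmetric and per-tensor/per-channel distinction concrete, here is a minimal pure-Python sketch. It is illustrative only: the function names are hypothetical, the min/max range selection is an assumption, and it is not taken from any Arm driver or TensorFlow Lite source.

```python
def symmetric_params(values, num_bits=8):
    """Scale for symmetric signed quantization; the zero point is fixed at 0."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return scale, 0


def asymmetric_params(values, num_bits=8):
    """Scale and zero point for asymmetric unsigned quantization."""
    qmin, qmax = 0, 2 ** num_bits - 1  # 0..255 for uint8
    # Assumption: widen the range to include 0.0 so that real zero
    # maps exactly onto an integer (the zero point).
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin, qmax):
    """Map real values to integers, clamping to [qmin, qmax]."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point))
            for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate real values."""
    return [(q - zero_point) * scale for q in q_values]


def per_channel_symmetric_scales(channels, num_bits=8):
    """One symmetric scale per channel instead of one for the whole tensor."""
    return [symmetric_params(channel, num_bits)[0] for channel in channels]
```

In this sketch the symmetric case needs only a scale, while the asymmetric case carries a scale and a zero offset per tensor. The per-channel helper returns one scale for each channel, which helps when channels have very different value ranges.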
