AI and ML forum Which type of quantization does ARM NPU support?

State Accepted Answer

PYC over 3 years ago

I want to know whether the Arm NPU supports both symmetric and asymmetric quantization, or only symmetric quantization.

+1 Sandeep Singh over 3 years ago

PYC

There are many ways to do quantisation.

First off, you generally need a way to represent negative values. You can do this in two ways: symmetric around zero (i.e. a standard signed integer), or asymmetric around zero (an unsigned value plus a per-tensor zero offset that the user provides). Apart from this, you can use either a single scale for the whole tensor or a separate scale per channel.
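To make the two representations concrete, here is a minimal sketch in plain Python. It is an illustration only, not Arm's or TensorFlow Lite's implementation: the function names are hypothetical, and it assumes the common affine mapping real = scale * (q - zero_point), 8-bit values, and min/max calibration taken from the data itself.

```python
def quantize_symmetric(values, num_bits=8):
    """Signed integers with the zero point fixed at 0 (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale, 0


def quantize_asymmetric(values, num_bits=8):
    """Unsigned integers plus a per-tensor zero offset (assumed scheme)."""
    qmax = 2 ** num_bits - 1                  # 255 for 8 bits
    lo = min(min(values), 0.0)                # keep 0.0 inside the range
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def quantize_per_channel(channels):
    """One symmetric scale per channel instead of one for the whole tensor."""
    return [quantize_symmetric(ch) for ch in channels]


def dequantize(q, scale, zero_point):
    """Map integers back to approximate real values."""
    return [scale * (x - zero_point) for x in q]


tensor = [-1.0, 0.0, 0.5, 2.0]
sym_q, sym_scale, sym_zp = quantize_symmetric(tensor)
asym_q, asym_scale, asym_zp = quantize_asymmetric(tensor)
per_channel = quantize_per_channel([[-1.0, 0.5], [0.0, 2.0]])
```

With this input the symmetric version spends half of its signed range on values below -1.0 that never occur, while the asymmetric version shifts the zero offset so the whole unsigned range covers -1.0 to 2.0. The per-channel variant gives each channel its own scale, so a channel with small values is not forced onto the scale of a channel with large ones.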

Our neural networking hardware generally supports both methods. Note, however, what the TensorFlow Lite 8-bit quantization specification says:
"Note: In the past our quantization tooling used per-tensor, asymmetric, uint8 quantization. New tooling, reference kernels, and optimized kernels for 8-bit quantization will use this spec."

See also: "Remove support for asymmetric uint8 quantization in Tensorflow Lite Micro" (github.com/.../44912)