I want to know whether the Arm NPU supports both symmetric and asymmetric quantization, or only symmetric quantization.
There are many ways to do quantisation.
First off, you generally need a way to represent negative values. You can do this in two ways: symmetric around zero (i.e. a standard signed integer), or asymmetric around zero (an unsigned value plus a per-tensor zero offset, which the user provides). Apart from this, you can use either a single scale for the whole tensor or a separate scale per channel.
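To make the difference concrete, here is a minimal pure-Python sketch of the two schemes and of per-channel scales. It is only an illustration, not the implementation used by our hardware or by TensorFlow Lite. The function names (`quantize_symmetric`, `quantize_asymmetric`, `dequantize`, `per_channel_scales`) are made up for this example, and the min/max range selection is an assumption; real tooling usually calibrates the range more carefully.

```python
def quantize_symmetric(values, num_bits=8):
    """Signed integers; the zero point is fixed at 0."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Clamp to the symmetric range [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def quantize_asymmetric(values, num_bits=8):
    """Unsigned integers plus a single zero point for the tensor."""
    qmax = 2 ** num_bits - 1                    # 255 for 8 bits
    lo = min(min(values), 0.0)                  # range must contain 0.0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)             # integer that represents 0.0
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point=0):
    """Map integers back to approximate real values."""
    return [scale * (x - zero_point) for x in q]


def per_channel_scales(rows, num_bits=8):
    """One symmetric scale per channel (here: per row) instead of per tensor."""
    return [quantize_symmetric(row, num_bits)[1] for row in rows]


if __name__ == "__main__":
    data = [-1.0, 0.0, 0.5, 2.0]
    print(quantize_symmetric(data))
    print(quantize_asymmetric(data))
    print(per_channel_scales([[1.0, -2.0], [0.1, 0.05]]))
```

In the symmetric case the real value 0.0 always maps to the integer 0, so only a scale has to be stored. In the asymmetric case the whole unsigned range is spent on the interval [min, max], at the cost of carrying a zero point alongside the scale.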
Our neural networking hardware generally supports both methods. Be aware, though, of what the TensorFlow Lite 8-bit quantization specification says: "Note: In the past our quantization tooling used per-tensor, asymmetric, uint8 quantization. New tooling, reference kernels, and optimized kernels for 8-bit quantization will use this spec."
See also: "Remove support for asymmetric uint8 quantization in Tensorflow Lite Micro" (github.com/.../44912)