I trained a model using TF2 and Keras. The model includes the following layers (from tf.keras.layers):
I first trained the model using the fit() function, then performed quantization-aware training with the TensorFlow Model Optimization API and converted the result to .tflite:
# disable quantization for dense layer
annotated_model = tf.keras.models.clone_model(model, clone_function=custom_quantization)
q_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)
q_aware_model.compile(...)
q_aware_model.fit(...)
q_aware_model.save('q_model.hdf5')
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(quantized_tflite_model)
I then tried to run inference with PyArmNN on the NPU/CpuAcc backends using this code:
parser = ann.ITfLiteParser()
network = parser.CreateNetworkFromBinaryFile(path)
graph_id = 0
input_names = parser.GetSubgraphInputTensorNames(graph_id)
input_binding_info = parser.GetNetworkInputBindingInfo(graph_id, input_names[0])
input_tensor_id = input_binding_info[0]
input_tensor_info = input_binding_info[1]
options = ann.CreationOptions()
runtime = ann.IRuntime(options)
preferredBackends = [ann.BackendId('VsiNpu'), ann.BackendId('CpuAcc'), ann.BackendId('CpuRef')]
opt_network, messages = ann.Optimize(network, preferredBackends, runtime.GetDeviceSpec(), ann.OptimizerOptions())
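The rest of the flow would have been the usual PyArmNN load-and-enqueue sequence (a rough sketch; input_data is a placeholder for my preprocessed input array):

# Load the optimized network onto the runtime and run one inference.
net_id, _ = runtime.LoadNetwork(opt_network)
output_names = parser.GetSubgraphOutputTensorNames(graph_id)
output_binding_info = parser.GetNetworkOutputBindingInfo(graph_id, output_names[0])
input_tensors = ann.make_input_tensors([input_binding_info], [input_data])
output_tensors = ann.make_output_tensors([output_binding_info])
runtime.EnqueueWorkload(net_id, input_tensors, output_tensors)
results = ann.workload_tensors_to_ndarray(output_tensors)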
However, I got the following warnings:
RuntimeError: WARNING: Layer of type Quantize is not supported on requested backend VsiNpu for input data type Float32 and output data type QAsymmS8 (reason: Npu quantize: output type not supported.), falling back to the next backend.
WARNING: Layer of type Convolution2d is not supported on requested backend VsiNpu for input data type QAsymmS8 and output data type QAsymmS8 (reason: Npu convolution2d: Uint8UnbiasedConvolution not supported. Npu convolution2d: input is not a supported type. Npu convolution2d: output is not a supported type. Npu convolution2d: weights is not a supported type. Npu convolution2d: input and weights types mismatched.), falling back to the next backend.
WARNING: Layer of type Activation is not supported on requested backend VsiNpu for input data type QAsymmS8 and output data type QAsymmS8 (reason: Npu activation: input type not supported. Npu activation: output type not supported.), falling back to the next backend.
And also this error:
ERROR: Layer of type Mean is not supported on any preferred backend [VsiNpu CpuAcc CpuRef ]
What is the reason behind this? Is there a fix?
Hey! A quick question first, then a response:
What NPU are you using for this?
Fundamentally, those layers are not supported by ArmNN's backends for that hardware; that's the translation of the error. There are two options, both of which are tricky:
1. Add support for those layers to ArmNN's TfLiteParser yourself.
2. Use the ArmNN TFLite delegate, which is newer. I haven't tried it myself, but you can check the docs (and the sketch below):
https://arm-software.github.io/armnn/21.02/delegate.xhtml
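If you go the delegate route, loading it from Python should look roughly like this (a sketch based on those docs; the .so path and the backend list are assumptions for your board):

import tflite_runtime.interpreter as tflite

# Point this at wherever libarmnnDelegate.so lives on your device.
armnn_delegate = tflite.load_delegate(
    library="libarmnnDelegate.so",
    options={"backends": "VsiNpu,CpuAcc,CpuRef", "logging-severity": "info"},
)

interpreter = tflite.Interpreter(
    model_path="converted_model.tflite",
    experimental_delegates=[armnn_delegate],
)
interpreter.allocate_tensors()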
I'm currently looking into some of your other errors, so I'll get back to you.