Hello there,

I have recently been developing a model that needs some LSTM layers. I found that the Ethos-U65 does indeed support LSTM: https://developer.arm.com/documentation/102023/0000/Programmers-model/Operators-and-performance/Supported-data-types-and-operators.
I have found that with most frameworks these layers are hard to quantize, as their performance is poor after post-training quantization (PTQ). Can you suggest the right path for integrating LSTM layers with the NPU? Perhaps dynamic quantization?

Thank you,
Tymo
Hi Tymoteusz,
Thanks for raising these U65-related questions in the Arm Community, and sorry that this forum is not monitored very closely.
I think the most likely cause is that the generated LSTM model is decomposed rather than fused, which can lead to poor performance once the model has been processed by Vela and is run on the U65.
Please check the flow below for a fused LSTM:
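The original flow is not reproduced in this thread. As a rough sketch of what such a flow can look like, assuming TensorFlow 2.x with the TFLite converter: a Keras LSTM built with a fixed batch size and unroll=False is normally the form the converter can lower to a single fused op, and full-integer PTQ is done with a representative dataset. The shapes, layer sizes, random calibration data and output file name below are hypothetical placeholders, and whether the result is actually fused depends on the TensorFlow/Keras version, so it should be verified on the converted file.

```python
import numpy as np
import tensorflow as tf

# Hypothetical shapes, chosen only for illustration.
TIMESTEPS, FEATURES, UNITS = 16, 8, 32


def build_model():
    # Fixed batch size of 1 and unroll=False: the usual preconditions for the
    # converter to emit one fused LSTM op instead of a decomposed graph.
    inputs = tf.keras.Input(shape=(TIMESTEPS, FEATURES), batch_size=1)
    x = tf.keras.layers.LSTM(UNITS, unroll=False)(inputs)
    outputs = tf.keras.layers.Dense(4)(x)
    return tf.keras.Model(inputs, outputs)


def representative_dataset():
    # Placeholder calibration data; real PTQ needs samples drawn from the
    # actual input distribution, otherwise the quantization ranges are wrong.
    rng = np.random.default_rng(0)
    for _ in range(32):
        yield [rng.standard_normal((1, TIMESTEPS, FEATURES)).astype(np.float32)]


def convert_int8(model):
    # Full-integer post-training quantization with int8 inputs and outputs.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    return converter.convert()


if __name__ == "__main__":
    # In practice the model would be trained before conversion.
    tflite_bytes = convert_int8(build_model())
    with open("lstm_int8.tflite", "wb") as f:  # hypothetical file name
        f.write(tflite_bytes)
```

After conversion, it is worth opening the .tflite file in a model viewer to confirm that it contains a single UNIDIRECTIONAL_SEQUENCE_LSTM operator rather than a WHILE loop or a chain of small element-wise ops. The file can then be compiled with something like `vela --accelerator-config ethos-u65-256 lstm_int8.tflite`, and the Vela summary shows which operators were placed on the NPU.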
Thanks,
Will