CNN inference too slow on an i.MX6 Cortex-A9 (TVM)

romankr over 4 years ago

Hello to all:

I'm using an i.MX6 with a Cortex-A9. When I run neural networks (MLPs) that I wrote myself in plain C++, they are about 10 times slower than on my i7, which seems reasonable. My problem appears when I run a CNN (LFFD in my case) with TVM: the difference between my computer and the i.MX6 is then 40 times, which is 4 times worse than with my own code. This is what I can't understand: why do I get this extra 4x when using TVM? (I have also tried TFLite, with a similar outcome.)

The original format of the file is .onnx and the conversion to TVM is done using optimization without performance loss.

Does anyone know how to overcome this issue?

Thanks in advance.

Top replies

Ronan Synnott over 4 years ago +1 suggested

It is interesting that you are seeing a difference of 4x - perhaps the inefficient implementation is not making use of the Neon instructions? https://developer.arm.com/architectures/instruction-sets...
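One quick way to start checking the Neon hypothesis is to confirm that the kernel on the board reports Neon at all. The following is a minimal sketch, assuming a Linux system where `/proc/cpuinfo` contains a `Features` line (as 32-bit ARM kernels normally provide); the helper names `cpu_features` and `has_neon` are made up for this illustration. It only shows that the hardware and kernel expose Neon, not that the compiled model actually uses it.

```python
from pathlib import Path


def cpu_features(cpuinfo_text: str) -> set:
    """Collect the flags listed on 'Features' lines of /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # ARM kernels label the flag list "Features"; x86 kernels use "flags".
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_neon(cpuinfo_text: str) -> bool:
    """Return True if the kernel reports Advanced SIMD support."""
    # 32-bit ARM kernels report "neon"; 64-bit ARM kernels report "asimd".
    return bool({"neon", "asimd"} & cpu_features(cpuinfo_text))


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("Neon reported by kernel:", has_neon(path.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only applies to Linux")
```

If Neon is reported, the next thing to inspect would be the target string the model was compiled with. TVM's own ARM examples pass `-mattr=+neon` for ARMv7 boards; whether that flag was set in this particular build is not known from the thread.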

Brian859 over 3 years ago

I have exactly the same problem. Anyone got a suggestion?