ARM VEXT.32 q1,q1,q1,#3 is slow

Hi, I am trying to do an in-place bitwise rotate with VEXT.32 qx,qx,qx,#3 on an ARM7A, but I notice the instruction is very slow. Does anyone have tips to optimize it?

Is there another instruction I can replace it with?
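
To make sure we are talking about the same operation, here is a rough scalar C model of what I understand VEXT.32 Qd,Qn,Qm,#imm to do. The name vext32_model is made up for this post (it is not a real intrinsic), and the lane ordering is my reading of the instruction, so treat it as a sketch rather than a reference.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of VEXT.32 Qd, Qn, Qm, #imm.
   Assumption: the result is four consecutive 32-bit lanes taken from
   the pair {Qn, Qm}, starting at lane imm of Qn (Qn is the low half). */
static void vext32_model(uint32_t d[4], const uint32_t n[4],
                         const uint32_t m[4], unsigned imm)
{
    uint32_t pair[8];

    assert(imm < 4);                          /* only 0..3 is valid for .32 on Q registers */
    memcpy(pair, n, 4 * sizeof pair[0]);      /* lanes 0..3 come from Qn */
    memcpy(pair + 4, m, 4 * sizeof pair[0]);  /* lanes 4..7 come from Qm */

    /* pair is a local copy, so d may safely alias n and/or m */
    memcpy(d, pair + imm, 4 * sizeof pair[0]);
}
```

With q1 = {a0, a1, a2, a3} passed as all three operands and imm = 3, this model leaves q1 = {a3, a0, a1, a2}, which is the rotate by 32-bit lanes I am after.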

Thanks a lot


