Fastest way to transpose an array on Cortex-M4?

I keep running into situations where I want to take an array of 32 32-bit elements and copy it so that all the bit 0s end up in element 0, all the bit 1s in element 1, and so on for all 32 elements. This is always the processing bottleneck in applications that do this. Is there a faster way to do it using the Cortex-M4 SIMD instructions? And is there a better name for this operation that would give me better search results?

Thanks!
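
For anyone searching later: the operation described above is usually called a bit-matrix transpose (here a 32x32 bit transpose). As far as I know, the Cortex-M4 SIMD instructions work on 8- and 16-bit lanes and have no bit-level permute, so they probably don't help directly. A common alternative is the recursive block-swap approach using plain shifts, masks and XORs. The sketch below is a minimal version of that idea, not a tuned Cortex-M4 implementation. The function name `transpose32` and the bit ordering (row i is `a[i]`, column b is bit b with the LSB as bit 0) are my assumptions, since the original post doesn't specify them.

```c
#include <stdint.h>

/* In-place transpose of a 32x32 bit matrix stored as 32 words.
 * Assumed convention: row i is a[i], column b is bit b (LSB = bit 0),
 * so afterwards bit i of a[b] equals the old bit b of a[i]. */
void transpose32(uint32_t a[32])
{
    uint32_t m = 0x0000FFFFu;   /* selects columns whose index has bit j clear */

    /* Swap progressively smaller off-diagonal blocks: 16x16, 8x8, ..., 1x1.
     * j is halved first, then the mask is refined for the new block size. */
    for (unsigned j = 16; j != 0; j >>= 1, m ^= m << j) {
        /* Visit every row k whose index has bit j clear. */
        for (unsigned k = 0; k < 32; k = (k + j + 1) & ~j) {
            /* Bits that differ between the high-column block of row k
             * and the low-column block of row k + j. */
            uint32_t t = ((a[k] >> j) ^ a[k + j]) & m;
            a[k]     ^= t << j;   /* row k takes the bits from row k + j */
            a[k + j] ^= t;        /* row k + j takes the bits from row k */
        }
    }
}
```

This does 5 passes of 16 word-pair swaps instead of moving 1024 bits one at a time. It works in place, so copy the source array first if the original has to be preserved. Whether it is actually the fastest option on a given part would need to be measured.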