When we write code like the following, can the Keil compiler automatically translate it into STM (store multiple) instructions?
do {
    *p++ = 1;
} while (i--);
We write the code this way hoping that the compiler can use STM to optimize it.
If the compiler can do this, which option does it need? I assume this is not the default behaviour.
Hi Zhi Yang,
One more thing occurred to me while looking at your code. Is your pointer p pointing at 16-bit or 32-bit values? If it points at 32-bit values, then you can do no better than the STM instruction or multiple store instructions. However, if it points at 16-bit values, then on the M4 you could take advantage of some of the 16-bit SIMD instructions and pack two 16-bit values into a single store. Let me know whether you are accessing 16-bit values and whether you are using the M3 or the M4. If so, we could provide some example code that uses the SIMD intrinsics on the M4.
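In the meantime, here is a rough sketch of the packing idea for the 16-bit case, written in portable C rather than with intrinsics. It assumes p points at uint16_t data; the function name fill_u16 and its parameters are made up for illustration and do not come from your code. Whether the compiler turns the 32-bit stores into STR or STM instructions depends on the toolchain and optimization level, so treat this as a starting point and check the generated assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill `count` 16-bit elements at `dst` with `value`,
 * writing two elements per 32-bit store wherever possible. */
static void fill_u16(uint16_t *dst, size_t count, uint16_t value)
{
    assert(dst != NULL || count == 0);

    /* Two copies of the 16-bit value packed into one 32-bit word.
     * Both halves are identical, so byte order does not matter here.
     * (On the M4, CMSIS offers pack-halfword intrinsics such as __PKHBT
     * for the general case of two different values.) */
    const uint32_t packed = ((uint32_t)value << 16) | value;

    /* Store one halfword first if dst is not 4-byte aligned. */
    if (count > 0 && ((uintptr_t)dst & 3u) != 0) {
        *dst++ = value;
        count--;
    }

    /* Main loop: each 32-bit store covers two 16-bit elements.
     * memcpy avoids strict-aliasing problems; compilers normally
     * reduce it to a single word store. */
    while (count >= 2) {
        memcpy(dst, &packed, sizeof packed);
        dst += 2;
        count -= 2;
    }

    /* Trailing element when the remaining count is odd. */
    if (count > 0) {
        *dst = value;
    }
}
```

A call such as fill_u16(p, n, 1) would then replace a loop that stores 1 to n consecutive 16-bit elements. Note that your do/while form with i-- executes i + 1 times, so n would be i + 1 in that case.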
-Paul