
CMSIS confusion

Note: This was originally posted on 27th October 2012 at http://forums.arm.com

Hello all
I have a few questions about CMSIS. I have looked through some of the library's basic functions, and in the Cortex-M3/M4 code there seems to be no loop for the remaining samples. For example, from the q31 abs function:

/*loop Unrolling */
  blkCnt = blockSize >> 2u;

  /* First part of the processing with loop unrolling.  Compute 4 outputs at a time.  
   ** a second loop below computes the remaining 1 to 3 samples. */
  while(blkCnt > 0u)
  {
    /* C = |A| */
    /* Calculate absolute of input (if -1 then saturated to 0x7fffffff) and then store the results in the destination buffer. */
    in = *pSrc++;
    *pDst++ = (in > 0) ? in : ((in == 0x80000000) ? 0x7fffffff : -in);
    in = *pSrc++;
    *pDst++ = (in > 0) ? in : ((in == 0x80000000) ? 0x7fffffff : -in);
    in = *pSrc++;
    *pDst++ = (in > 0) ? in : ((in == 0x80000000) ? 0x7fffffff : -in);
    in = *pSrc++;
    *pDst++ = (in > 0) ? in : ((in == 0x80000000) ? 0x7fffffff : -in);

    /* Decrement the loop counter */
    blkCnt--;
  }

  /* If the blockSize is not a multiple of 4, compute any remaining output samples here.  
   ** No loop unrolling is used. */
  blkCnt = blockSize % 0x4u;



I think there should be a loop for the remaining samples. Can you explain why there are 4 operations in one loop iteration? Is it a speed optimization?

Best Regards
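For readers following the question, the pattern under discussion (an unrolled main loop followed by a remainder loop) can be sketched as below. This is only an illustration under stated assumptions, not the library's source: the names `abs_sat_q31` and `abs_q31_sketch` are made up for this example, and plain `int32_t` stands in for the library's q31 type.

```c
#include <stdint.h>

/* Hypothetical helper: saturating absolute value of a q31 sample.
 * INT32_MIN (0x80000000) has no positive counterpart, so it maps to INT32_MAX. */
static int32_t abs_sat_q31(int32_t in)
{
    return (in > 0) ? in : ((in == INT32_MIN) ? INT32_MAX : -in);
}

/* Hypothetical sketch of the unrolled-loop-plus-remainder pattern. */
void abs_q31_sketch(const int32_t *pSrc, int32_t *pDst, uint32_t blockSize)
{
    /* Number of complete groups of 4 samples. */
    uint32_t blkCnt = blockSize >> 2u;

    /* Unrolled part: 4 outputs per iteration, so the counter decrement
     * and branch are paid once per 4 samples instead of once per sample. */
    while (blkCnt > 0u)
    {
        *pDst++ = abs_sat_q31(*pSrc++);
        *pDst++ = abs_sat_q31(*pSrc++);
        *pDst++ = abs_sat_q31(*pSrc++);
        *pDst++ = abs_sat_q31(*pSrc++);
        blkCnt--;
    }

    /* Remainder part: the 0 to 3 samples left when blockSize
     * is not a multiple of 4, one sample per iteration. */
    blkCnt = blockSize % 4u;
    while (blkCnt > 0u)
    {
        *pDst++ = abs_sat_q31(*pSrc++);
        blkCnt--;
    }
}
```

With blockSize = 7, for instance, the first loop runs once (4 samples) and the second loop runs three times. Unrolling trades a larger code size for fewer loop-control instructions per sample.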
  • Note: This was originally posted on 27th October 2012 at http://forums.arm.com

    Hi Paul
    Yes, I have noticed that there is a loop, but its comment says /* Run the below code for Cortex-M0 */. In some functions, like arm_add_q31, there is one loop for Cortex-M3/M4 and a second one for M0. Why? Your explanation of the 4 samples per loop iteration is clear to me :) Thanks!

    Best Regards
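    One way a source file can carry both variants is conditional compilation: the unrolled path is compiled only for the larger cores, and a single one-sample-per-iteration loop serves both as the remainder loop there and as the whole loop on the small core. The sketch below is a guess at that layout, not the library's code. The macro `DEMO_SMALL_CORE` and the function names are hypothetical, and the saturating add is written in portable C rather than with any core-specific instruction.

```c
#include <stdint.h>

/* Hypothetical helper: saturating 32-bit add done in 64-bit arithmetic. */
static int32_t sat_add_q31(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Hypothetical sketch: DEMO_SMALL_CORE is an assumed macro name that a
 * build would define when targeting a small core. */
void add_q31_sketch(const int32_t *pSrcA, const int32_t *pSrcB,
                    int32_t *pDst, uint32_t blockSize)
{
    uint32_t blkCnt;

#ifndef DEMO_SMALL_CORE
    /* Larger cores: process 4 samples per iteration first. */
    blkCnt = blockSize >> 2u;
    while (blkCnt > 0u)
    {
        *pDst++ = sat_add_q31(*pSrcA++, *pSrcB++);
        *pDst++ = sat_add_q31(*pSrcA++, *pSrcB++);
        *pDst++ = sat_add_q31(*pSrcA++, *pSrcB++);
        *pDst++ = sat_add_q31(*pSrcA++, *pSrcB++);
        blkCnt--;
    }
    /* Only the 0 to 3 leftover samples remain for the loop below. */
    blkCnt = blockSize % 4u;
#else
    /* Small core: no unrolling, the loop below handles every sample. */
    blkCnt = blockSize;
#endif

    /* Shared loop: remainder on larger cores, full block on the small core. */
    while (blkCnt > 0u)
    {
        *pDst++ = sat_add_q31(*pSrcA++, *pSrcB++);
        blkCnt--;
    }
}
```

    Under this layout, a comment placed just above the final loop describes only the small-core case, even though the same loop also finishes the leftover samples on the larger cores.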

