This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

Cortex M4 (SIMD) - Fastest way to un-pack 1 (one) uint32 to 4 (four) uint8

Hi to you all,
In my current project I need to send over a serial bus an array of integers:

  • type = unsigned 32 bit integers
  • length = 4096

The driver I'm using (actually USB CDC VCOM from NXP, which is embedded in LPCOpen) takes pointer to unit8 and does a bulk transfer using DMA. I can output long strings, no problem. But of course I need to output data and so 32 bit integers.

Here's the function I'm trying to call:

   /** \fn uint32_t WriteEP(USBD_HANDLE_T hUsb, uint32_t EPNum, uint8_t *pData, uint32_t cnt)
   *  Function to write data to be sent on the requested endpoint.
   *
   *  This function is called by USB stack and the application layer to send data
   *  on the requested endpoint.
   *  
   *  \param[in] hUsb Handle to the USB device stack. 
   *  \param[in] EPNum  Endpoint number as per USB specification. 
   *                    ie. An EP1_IN is represented by 0x81 number.
   *  \param[in] pData Pointer to the data buffer from where data is to be copied. 
   *  \param[in] cnt  Number of bytes to write. 
   *  \return Returns the number of bytes written.
   */
 uint32_t (*WriteEP)(USBD_HANDLE_T hUsb, uint32_t EPNum, uint8_t *pData, uint32_t cnt);

I need to do the parsing in 8 bit tokens as fast as possible because the project as some serious time constraints. Is anyone aware of some SIMD instruction to un-pack data to this purpose?

Any help would be highly appreciated.
Thanks in advance,
Andrea