Cortex M4 (LPC4370): fastest way to sum offset binary samples

Hi to you all,
I'm working on a project involving the LPC Link2 to evaluate its LPC4370 (the one on the board is actually the LPC4370JFET100) for real-time data processing; a more detailed description of my work was given in this question.
What I need to do is:

  • acquire samples at 40 MSPS (done)
  • move them into the central memory using DMA (done)
  • trigger the data processing when the input signal crosses a certain threshold (done)
  • process the data as fast as possible (TODO)

Basically, I just need to sum the acquired samples. The ADC packs two 12-bit samples in offset binary (because the firmware uses thresholds) into one 32-bit word.
Thanks to 's code I was able to extract maximum and minimum very fast: now I'm trying to adapt another of his algorithms (kindly published on his interesting blog m4-unleashed.com).
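
For reference, the result I am after can be written as a plain scalar loop. This is only a minimal sketch for checking faster versions against: the function name `sum_packed_reference` is hypothetical, and it assumes that each 16-bit halfword of a packed word holds one sample in bits 11..0, which should be verified against the ADC documentation.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed layout (not confirmed here): each 16-bit halfword of a packed
 * word carries one 12-bit offset-binary sample in bits 11..0. */
#define SAMPLE_MASK   0x0FFFu
#define SAMPLE_OFFSET 2048    /* mid-scale code of a 12-bit offset-binary value */

/* Hypothetical reference routine: sums all samples as signed values,
 * i.e. with the offset-binary mid-scale removed from every sample. */
int32_t sum_packed_reference(const uint32_t *src, size_t words)
{
    int32_t sum = 0;
    for (size_t i = 0; i < words; i++) {
        int32_t lo = (int32_t)(src[i] & SAMPLE_MASK);          /* lower sample */
        int32_t hi = (int32_t)((src[i] >> 16) & SAMPLE_MASK);  /* upper sample */
        sum += (lo - SAMPLE_OFFSET) + (hi - SAMPLE_OFFSET);
    }
    return sum;
}
```

Any optimized version should return the same value as this loop for the same buffer.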

Here's my code:

__RAMFUNC(RAM) void sum_SMLAD(int32_t* pSrc, uint32_t pSize)
{
    int32_t sum = 0;
    uint32_t pair, loop = pSize >> 2;

    /* Main loop: two packed words (four samples) per iteration. */
    while ((loop-- > 0) && wordsLeft)
    {
        pair = *__SIMD32(pSrc)++;
        sum = __SMLAD(pair, 0b10000000000000011000000000000001u, sum);
        pair = *__SIMD32(pSrc)++;
        sum = __SMLAD(pair, 0b10000000000000011000000000000001u, sum);
        wordsLeft -= 2;
    }
    /* Tail: one remaining packed word (two samples). */
    if (pSize & 0x2)
    {
        pair = *__SIMD32(pSrc)++;
        sum = __SMLAD(pair, 0b10000000000000011000000000000001u, sum);
        wordsLeft -= 1;
    }
}

Unfortunately, I always run into a HardFault within the first couple of iterations of the main while loop.
Here's how SMLAD is defined in core_cm4_simd.h:

__attribute__( ( always_inline ) ) __STATIC_INLINE uint32_t __SMLAD (uint32_t op1, uint32_t op2, uint32_t op3)
{
    uint32_t result;

    __ASM volatile ("smlad %0, %1, %2, %3" : "=r" (result) : "r" (op1), "r" (op2), "r" (op3) );
    return(result);
}
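
Because the header only wraps the instruction in inline assembly, a portable C model can help when reasoning about operands on a host machine. The sketch below is an assumption-labelled stand-in, not part of CMSIS: `smlad_model` is a hypothetical name, and it models only the result value (the real instruction also sets the Q flag when the accumulation overflows).

```c
#include <stdint.h>

/* Hypothetical portable model of SMLAD for host-side checks.
 * Both operands are treated as two signed 16-bit lanes. */
uint32_t smlad_model(uint32_t op1, uint32_t op2, uint32_t acc)
{
    /* Bottom halfword of op1 times bottom halfword of op2. */
    int32_t lo = (int32_t)(int16_t)(op1 & 0xFFFFu) * (int16_t)(op2 & 0xFFFFu);
    /* Top halfword of op1 times top halfword of op2. */
    int32_t hi = (int32_t)(int16_t)(op1 >> 16) * (int16_t)(op2 >> 16);
    /* Accumulate with 32-bit wrap-around (Q flag is not modelled). */
    return acc + (uint32_t)lo + (uint32_t)hi;
}
```

With 0x00010001 as the second operand, both lanes are multiplied by one, so the call adds the two halfwords of op1 to the accumulator. A lane value such as 0x8001 has bit 15 set and is read as -32767, not as one.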


Since the SMLAD intrinsic basically takes two words, computes (first_top_halfword * second_top_halfword) + (first_bottom_halfword * second_bottom_halfword) and adds the result to the accumulator, I'm trying to multiply by one, and I'm possibly facing two issues:

  1. I have 12-bit samples, not 16-bit ones (maybe a shift is needed?)
  2. They are in offset binary, and I'm a bit confused about what multiplying by one means in this case.
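
To make the two points concrete, here is a minimal host-side sketch under stated assumptions. A 12-bit offset-binary code equals the signed value plus 2048, so summing the raw codes and subtracting 2048 once per sample gives the signed sum. The helper names (`dual_mac`, `sum_offset_binary`) are hypothetical, `dual_mac` is a plain-C stand-in for the intrinsic, and the mask assumes the samples sit in bits 11..0 of each halfword.

```c
#include <stdint.h>
#include <stddef.h>

#define LANE_MASK 0x0FFF0FFFu  /* assumption: sample in bits 11..0 of each halfword */
#define ONES      0x00010001u  /* the value 1 in both 16-bit lanes */

/* Portable stand-in for __SMLAD: two signed 16x16 products added to acc. */
static uint32_t dual_mac(uint32_t a, uint32_t b, uint32_t acc)
{
    int32_t lo = (int32_t)(int16_t)(a & 0xFFFFu) * (int16_t)(b & 0xFFFFu);
    int32_t hi = (int32_t)(int16_t)(a >> 16) * (int16_t)(b >> 16);
    return acc + (uint32_t)lo + (uint32_t)hi;
}

/* Hypothetical helper: sums 2 * words samples, then removes the offset once. */
int32_t sum_offset_binary(const uint32_t *src, size_t words)
{
    uint32_t raw = 0;
    for (size_t i = 0; i < words; i++) {
        /* Masked lanes are 0..4095, so they are positive as 16-bit values
         * and multiplying by one needs no shift. */
        raw = dual_mac(src[i] & LANE_MASK, ONES, raw);
    }
    /* raw is the sum of the codes; each code is (signed value + 2048). */
    return (int32_t)(raw - 2048u * 2u * (uint32_t)words);
}
```

Under these assumptions the raw sum of N codes is at most 4095 * N, which stays below 2^31 for up to roughly 524,000 samples, so a 32-bit accumulator is enough for buffers of that size.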

Thanks for your patience: any help would be highly appreciated! 

Regards,

Andrea
