
Question about accumulator word length in A8 core

Hi,

I have used some 32-bit (non-ARM) microprocessor cores that have a long word-length accumulator for DSP operations, to avoid overflow and similar problems. After checking the A8 core documentation, I was surprised not to find any specification for this. It looks like the accumulator is 32 bits, the same as the registers. For a FIR filter with 24-bit data and 16-bit coefficients, an accumulator of at least 48 bits is needed. How can I get satisfactory results with the A8 core?

Thanks,

  • Hello,

    how about using NEON?

    Best regards,

    Yasuhiko Koumoto.

  • For arbitrary arithmetic, you might have to use the carry bit and instructions like ADC ("Add With Carry") to handle wide data types. However, the core instruction set has an SMLAL ("Signed Multiply Accumulate Long") instruction which probably does exactly what you want: it multiplies two 32-bit values and adds the 64-bit product to a 64-bit value held in a pair of registers. There's also an unsigned variant (UMLAL).

    That said, NEON is probably a good choice here. The VMLAL instruction, for example, performs the same widening multiply-accumulate operation on vectors. With 64-bit accumulators it can handle two elements per instruction.

  • Yes, that's exactly right. As for coding, you could consider a package from a supplier, who might also have optimized the memory access. If you write the code yourself, there are three levels of increasing speed and difficulty to consider: C code using int for the operands and long long int for the 64-bit accumulator, which would use SMLAL; C with the ARM 'NEON intrinsics' extensions; and straight assembler using NEON. The last two would both use VMLAL.
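
The first of the three levels mentioned above (plain C with a 64-bit accumulator) could look roughly like the sketch below. It assumes the 24-bit samples have already been sign-extended into int32_t and the coefficients are stored as int16_t; the function name fir_sample is hypothetical and not taken from any library. A compiler targeting ARM will often turn this widening multiply-add into SMLAL, but that depends on the compiler and its options, so it is worth checking the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes one FIR output sample.
 * x: input samples, 24-bit values sign-extended into int32_t (assumption)
 * h: filter coefficients, 16-bit signed (assumption)
 * Each product needs up to 40 bits, so the sum is kept in 64 bits. */
int64_t fir_sample(const int32_t *x, const int16_t *h, size_t ntaps)
{
    int64_t acc = 0;
    for (size_t i = 0; i < ntaps; ++i) {
        /* Widen one operand before multiplying so the product is 64-bit. */
        acc += (int64_t)x[i] * h[i];
    }
    return acc;
}
```

With 24-bit data and 16-bit coefficients, a 64-bit accumulator leaves 24 guard bits above the 40-bit products, so overflow is not a practical concern for realistic filter lengths.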