Loss of information - SMMUL

Why does the Cortex-M4 instruction SMMUL (32 = 32 x 32b) preserve a redundant sign bit and discard one useful bit of information? What could possibly be the justification for such blatant disregard of the Fract format from ISO/IEC TR 18037?
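To make the complaint concrete, here is a minimal C sketch comparing the two results. It assumes that SMMUL (truncating form) returns bits [63:32] of the signed 64-bit product and that a Q31 fractional multiply wants bits [62:31]. The function names `smmul_model`, `q31_mul_trunc` and `smmul_demo_checks` are made up for illustration, and the right shift of a negative value is assumed to be arithmetic, as it is with GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Model of SMMUL: bits [63:32] of the signed 64-bit product.
   Assumes >> on a negative int64_t is an arithmetic shift. */
static int32_t smmul_model(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;
    return (int32_t)(p >> 32);
}

/* Q31 fractional multiply without saturation: bits [62:31] of the product.
   The single overflow case (-1.0 x -1.0) is deliberately not handled here. */
static int32_t q31_mul_trunc(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;
    return (int32_t)(p >> 31);
}

/* Illustrative checks with a = b = 0.5 + 2^-31 in Q31 (0x40000001). */
static void smmul_demo_checks(void)
{
    /* SMMUL keeps two copies of the sign bit, so its result is in Q30. */
    assert(smmul_model(0x40000001, 0x40000001) == 0x10000000);
    /* The Q31 result carries one more low-order bit ... */
    assert(q31_mul_trunc(0x40000001, 0x40000001) == 0x20000001);
    /* ... which cannot be recovered by shifting the SMMUL result left. */
    assert(smmul_model(0x40000001, 0x40000001) * 2 == 0x20000000);
}
```

Shifting the SMMUL result left by one restores the Q31 scaling but leaves the least significant bit at zero, which is the lost bit the question refers to.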

0 Jens Bauer over 8 years ago

I believe this is intended to be used for getting the highword result of a topword x topword multiplication.
(note: by topword, I mean the most significant word of each factor; they could for instance be a 32-bit value multiplied by a 32-bit value or the highword of a 64-bit value multiplied by the highword of a 64-bit value; the product would go into the highest word of a 128-bit value).
Without it, I think it would be cumbersome to multiply two more-than-32-bit signed values.
Which operation do you need to perform? Could you describe it in more detail?

0 Petr over 8 years ago in reply to Jens Bauer
Jens,
I need to perform these operations:
FIR filter
IIR filter
controllers (PI, PID)
numerical integrators (trapezoidal/backward Euler)
observers for electrical motor control
Park/Clarke transforms
FFT
All of the above operations require the standard fractional multiplication for optimal accuracy.
Could you please give me an example of a DSP application on a low-power 32-bit MCU where you would need to multiply two more-than-32-bit signed values?

0 daith over 8 years ago in reply to Jens Bauer

SMULL gives both the high and low parts, so one can do everything with it, and it is implemented in the Cortex-M3.
SMMUL gives just the high part and is part of the DSP extension; you need a Cortex-M4 for SMMUL.

0 G. Goodwin L. Pitos over 8 years ago in reply to daith

This is just to avoid misinterpretation.
"SMULL gives both the high and low parts, one can do everything with that and it is implemented in the Cortex-M3"
SMULL is still present in the Cortex-M4.

0 daith over 8 years ago in reply to G. Goodwin L. Pitos

Misinterpretation. Yes, I could easily get paranoid about people misinterpreting what I've said! It just seems to happen so easily despite one's best efforts.

0 daith over 8 years ago in reply to daith

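By the way, some of the Cortex-M4 processors have single-precision floating point, and that is quite quick.
I just had a look at what gcc does for this, and it doesn't do saturation. I think it could do the multiplication with saturation in the same time using:
smull lo, hi, x, y      @ signed 64-bit product: lo = bits [31:0], hi = bits [63:32]
lsr   lo, lo, #31       @ keep only bit 31 of the low word
qdadd result, lo, hi    @ result = sat(lo + sat(2 * hi))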
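As a cross-check of the sequence above, here is a minimal C model of it. It assumes that SMULL delivers the low and high words of the signed 64-bit product and that QDADD computes sat(Rm + sat(2 * Rn)). The names `sat32`, `q31_mul_sat_model` and `q31_mul_sat_checks` are hypothetical, and the right shift of a negative value is assumed to be arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturation of a 64-bit value to the 32-bit range. */
static int32_t sat32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Model of the three-instruction saturating Q31 multiply. */
static int32_t q31_mul_sat_model(int32_t x, int32_t y)
{
    int64_t  p  = (int64_t)x * (int64_t)y;     /* smull: 64-bit product   */
    int32_t  hi = (int32_t)(p >> 32);          /* bits [63:32]            */
    uint32_t lo = (uint32_t)p;                 /* bits [31:0]             */

    lo >>= 31;                                 /* lsr: keep bit 31 only   */

    int32_t doubled = sat32(2 * (int64_t)hi);  /* qdadd: sat(2 * hi) ...  */
    return sat32((int64_t)lo + doubled);       /* ... then sat(lo + that) */
}

/* Illustrative checks, including the one case that needs saturation. */
static void q31_mul_sat_checks(void)
{
    assert(q31_mul_sat_model(0x40000000, 0x40000000) == 0x20000000);
    assert(q31_mul_sat_model(0x40000001, 0x40000001) == 0x20000001);
    assert(q31_mul_sat_model(INT32_MIN, INT32_MIN) == INT32_MAX);
}
```

Only -1.0 x -1.0 (both inputs `INT32_MIN`) overflows the Q31 range, and the doubling step of QDADD is what clamps it to `INT32_MAX`.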
Cancel
Up 0 Down

Cancel
0 Petr over 8 years ago in reply to daith

Yes, your code example corresponds to the saturated multiplication that I have been using. It takes 3 cycles to complete. The single precision floating point is faster (VMUL.F32 takes 1 cycle) but the 24-bit mantissa has lower resolution than the 32-bit Fract so it can't be considered a direct replacement.
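The resolution point can be illustrated with a small C sketch. It assumes IEEE 754 single precision with round-to-nearest, and the helper names `q31_to_float` and `q31_float_checks` are made up for illustration. Two Q31 values that differ by one LSB near 0.5 map to the same float, because a float has only a 24-bit significand.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a Q31 value to float by dividing by 2^31. */
static float q31_to_float(int32_t q)
{
    return (float)q / 2147483648.0f;
}

/* 0x40000000 is exactly 0.5; 0x40000001 is 0.5 + 2^-31 in Q31. */
static void q31_float_checks(void)
{
    assert(q31_to_float(0x40000000) == 0.5f);
    /* The extra LSB is rounded away in single precision. */
    assert(q31_to_float(0x40000001) == 0.5f);
}
```

Floats do resolve finer steps close to zero, so this shows the loss for values of large magnitude rather than across the whole range.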

0 G. Goodwin L. Pitos over 8 years ago in reply to daith

daith, sorry if I posted that. I didn't have much time and was about to log out, but there was a young engineer (whom I recently convinced to also study ARM instead of being too dedicated to AVR and PICmicro) who read your reply about SMULL/SMMUL and wondered whether SMULL was excluded from the Cortex-M4. I then decided to post that response, hoping it would help prevent other readers, especially new users of Cortex-M, from also misinterpreting the information.

0 G. Goodwin L. Pitos over 8 years ago in reply to Petr

Hi Petr,
I'm not sure whether the use Jens described was the main reason for the SMMUL instruction. Note, however, that the Cortex-M4 is strictly speaking not a DSP but an MCU with a DSP extension, so multiplication of more-than-32-bit signed values may have applications aside from DSP.
I hope you can visit here more often. You can share your knowledge about DSP by participating in discussions, posting blogs, etc. My impression is that you already have extensive experience in DSP, especially using DSPs/DSCs rather than MCUs.
Regards,
Goodwin

+1 daith over 8 years ago in reply to G. Goodwin L. Pitos

I was agreeing with you. Misinterpreting happens all the time and is very hard to guard against.

0 G. Goodwin L. Pitos over 8 years ago in reply to daith

This is not an answer to Petr's question; I just found myself comparing the way some RISC processors introduced hardware support for multiplication.
The MUL instruction was added in ARMv2, SMULL in ARMv3M.
The i960 has multiply instructions that generate (the least significant) 32 bits and an extended multiply instruction that generates 64 bits stored in two 32-bit registers.
When I was studying the PowerPC (using older generations), I had to learn that performing 32-bit x 32-bit = 64-bit takes two instructions: one for the high-order 32 bits and one for the low-order 32 bits of the result.
When multiplying, MIPS32 uses special registers for storing the high- and low-order words of the result.