Loss of information - SMMUL

Petr over 8 years ago

Why does the Cortex-M4 instruction SMMUL (32 = 32 x 32b) preserve a redundant sign bit and discard one useful bit of information? What could possibly be the justification for such blatant disregard of the ISO/IEC TR 18037 standard Fract format?

Top replies

daith over 8 years ago in reply to G. Goodwin L. Pitos +1 verified

I was agreeing with you. Misinterpreting happens all the time and is very hard to guard against.


Thibaut ZEISSLOFF over 8 years ago
For my part, I think that ARM intended to define instructions that C compilers can use.
Regarding multiplication, ANSI C states that "the result of the binary * operator is the product of the operands."
Therefore, for the following C code to compile to efficient code, SMULL and SMMUL have to be defined as they are today:
int64_t result64 = (int64_t)(int32_t)operand1 * (int64_t)(int32_t)operand2; // Translates to SMULL r0,r1,r1,r0
int32_t result32 = (int32_t)(((int64_t)(int32_t)operand3 * (int64_t)(int32_t)operand4) >> 32); // Translates to SMMUL r0,r2,r3
Also, to make the error introduced by SMMUL truncation symmetrical, you can use its alternative SMMULR, which rounds before extracting the 32 most significant bits.
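
To make the difference concrete, here is a rough sketch in plain C that models the two instructions next to the Q31 fractional multiply the original question was asking for. The function names (smmul_model, smmulr_model, q31_mul_model) are made up for illustration and are not intrinsics. The sketch assumes that >> on a negative int64_t is an arithmetic shift, which ISO C leaves implementation-defined but which GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Model of SMMUL: top 32 bits of the signed 64-bit product, truncated.
 * Assumption: >> on a negative int64_t is an arithmetic shift. */
static int32_t smmul_model(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Model of SMMULR: add 0x80000000 to the product before taking the
 * top 32 bits, so the result is rounded rather than truncated. */
static int32_t smmulr_model(int32_t a, int32_t b)
{
    return (int32_t)((((int64_t)a * (int64_t)b) + 0x80000000LL) >> 32);
}

/* Q31 x Q31 -> Q31 fractional multiply: shift by 31, not 32.
 * Not saturated: (-1.0) * (-1.0) does not fit in the result. */
static int32_t q31_mul_model(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 31);
}

/* A few hand-checked cases. */
static void smmul_examples(void)
{
    /* 0.5 * 0.5 in Q31: SMMUL yields 0.25 in Q30, i.e. half the Q31 bit pattern. */
    assert(smmul_model(0x40000000, 0x40000000) == 0x10000000);
    assert(q31_mul_model(0x40000000, 0x40000000) == 0x20000000);

    /* Product 3 * 2^31: the Q31 result is 3, but SMMUL drops the low bit. */
    assert(q31_mul_model(0x40000000, 6) == 3);
    assert(smmul_model(0x40000000, 6) == 1);

    /* Truncation rounds toward minus infinity; SMMULR rounds to nearest. */
    assert(smmul_model(-1, 1) == -1);
    assert(smmulr_model(-1, 1) == 0);
}
```

Shifting the SMMUL result left by one restores the Q31 scaling, but the bit that was discarded is gone, which is the loss the question describes.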

daith over 8 years ago in reply to Thibaut ZEISSLOFF

What was wanted was what the NEON instructions VQDMULH or VQRDMULH do, so ARM certainly thought the operation was worthwhile implementing when they designed NEON.
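
For comparison, a scalar sketch in plain C of what one 32-bit lane of those two NEON instructions computes, as far as I understand them: a doubling multiply that returns the high half, with saturation, and optionally with rounding. The names vqdmulh_model and vqrdmulh_model are hypothetical and are not the NEON intrinsics. Doubling and then taking the top 32 bits is written here as a shift by 31, and the code again assumes an arithmetic right shift on negative values.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit lane of VQDMULH, modelled as sat((2 * a * b) >> 32).
 * Doubling then taking the high half equals shifting the product by 31.
 * Assumption: >> on a negative int64_t is an arithmetic shift. */
static int32_t vqdmulh_model(int32_t a, int32_t b)
{
    /* The only overflowing case is (-1.0) * (-1.0), which saturates. */
    if (a == INT32_MIN && b == INT32_MIN)
        return INT32_MAX;
    return (int32_t)(((int64_t)a * (int64_t)b) >> 31);
}

/* One 32-bit lane of VQRDMULH: same, but rounded before truncation. */
static int32_t vqrdmulh_model(int32_t a, int32_t b)
{
    if (a == INT32_MIN && b == INT32_MIN)
        return INT32_MAX;
    return (int32_t)((((int64_t)a * (int64_t)b) + (1LL << 30)) >> 31);
}

/* A few hand-checked cases. */
static void vqdmulh_examples(void)
{
    /* 0.5 * 0.5 = 0.25, all in Q31, with no redundant sign bit. */
    assert(vqdmulh_model(0x40000000, 0x40000000) == 0x20000000);

    /* (-1.0) * (-1.0) saturates to the largest positive Q31 value. */
    assert(vqdmulh_model(INT32_MIN, INT32_MIN) == INT32_MAX);

    /* Product 3 * 2^30 is 1.5 result units: truncates to 1, rounds to 2. */
    assert(vqdmulh_model(0x40000000, 3) == 1);
    assert(vqrdmulh_model(0x40000000, 3) == 2);
}
```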

G. Goodwin L. Pitos over 8 years ago in reply to Thibaut ZEISSLOFF

The examples in Cortex-M4 Devices Generic User Guide, 3.6.8. SMMUL use SMULL instead of SMMUL.