Why does the Cortex-M4 instruction SMMUL (32 = 32 x 32b) preserve a redundant sign bit and discard one useful bit of information? What could possibly justify such blatant disregard of the Fract format defined in ISO/IEC TR 18037?
What I wanted is what the NEON instructions VQDMULH and VQRDMULH do, so ARM evidently thought the operation was worth implementing when they designed NEON.
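
To make the difference concrete, here is a minimal C sketch of how I understand the three operations on Q31 (Fract-style) operands. The function names `smmul_model`, `vqdmulh_model` and `vqrdmulh_model` are hypothetical; they are plain-C models, not intrinsics. The sketch assumes the compiler implements `>>` on negative signed values as an arithmetic shift, which the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Model of SMMUL: top 32 bits of the signed 64-bit product.
 * For Q31 inputs the result is Q30: bits 31 and 30 are both sign bits. */
static int32_t smmul_model(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Clamp a 64-bit intermediate to the int32_t range (saturation). */
static int32_t sat32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Model of one 32-bit lane of VQDMULH: doubled product, high half,
 * saturated. (2*a*b) >> 32 is the same as (a*b) >> 31, so the result
 * stays in Q31 and keeps the bit that SMMUL drops. */
static int32_t vqdmulh_model(int32_t a, int32_t b)
{
    return sat32(((int64_t)a * (int64_t)b) >> 31);
}

/* Model of one 32-bit lane of VQRDMULH: as above, but rounded by
 * adding half an LSB of the result before the shift. */
static int32_t vqrdmulh_model(int32_t a, int32_t b)
{
    return sat32((((int64_t)a * (int64_t)b) + ((int64_t)1 << 30)) >> 31);
}
```

With 0.5 x 0.5 in Q31 (both operands `0x40000000`), `smmul_model` returns `0x10000000`, which is 0.25 only if read as Q30, whereas `vqdmulh_model` returns `0x20000000`, which is 0.25 in Q31. Saturation only comes into play for `INT32_MIN` x `INT32_MIN`, where the doubled product would otherwise overflow.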