ARM MUL instruction

More instruction details are giving me a headache.

This time it's the MUL instruction.

What the heck does this mean:

Multiply multiplies two register values. The least significant 32 bits of the result are written to the destination register. These 32 bits do not depend on whether the source register values are considered to be signed values or unsigned values.

(The bolded part)
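
The quoted statement can be checked with a small C sketch. It is not from the ARM documentation; the function names are made up for illustration. The idea is that the same two bit patterns are multiplied once as unsigned values and once as signed values, and only the low 32 bits of each product are kept.

```c
#include <stdint.h>

/* Low 32 bits of the product, operands interpreted as unsigned.
   Unsigned arithmetic in C wraps modulo 2^32, which matches what
   a 32-bit MUL writes to its destination register. */
static uint32_t mul_lo_unsigned(uint32_t a, uint32_t b)
{
    return a * b;
}

/* Low 32 bits of the product, operands interpreted as signed.
   The multiply is done in 64 bits to avoid signed overflow,
   then truncated to the low 32 bits. */
static uint32_t mul_lo_signed(int32_t a, int32_t b)
{
    return (uint32_t)((int64_t)a * (int64_t)b);
}
```

For example, the bit pattern 0xFFFFFFFF is 4294967295 as an unsigned value and -1 as a signed value. Multiplied by 3, the full products are 0x2FFFFFFFD and -3, and both have 0xFFFFFFFD as their low 32 bits. Only the upper 32 bits differ, and MUL discards those.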

Sean Dunlevy, over 7 years ago, in reply to Juha Aaltonen

I am developing on a SAMD21G18 SoC, so MUL is the only multiply instruction I have. Recently a developer in the ARM community posted a 32-bit x 32-bit -> 64-bit multiply in 17 cycles, and it is certainly an interesting routine. My problem is that I am converting a 64kb/s mono MP3 decoder and there are tens of thousands of MULSHIFT32 macros all over the code. As the name suggests, it performs a 32-bit x 32-bit -> 64-bit multiply, but only bits 32-63 of the product are required.
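
For readers who have not met the macro, here is a minimal C sketch of what a MULSHIFT32-style operation computes using only 32-bit multiplies. It is the schoolbook split into 16-bit halves, not the 17-cycle routine mentioned above and not the decoder's own code. The names mulshift32_ref and mulshift32_sketch are hypothetical, and the code assumes that >> on a negative signed value is an arithmetic shift (true for GCC and Clang on ARM, but implementation-defined in C).

```c
#include <stdint.h>

/* Reference: bits 32-63 of the signed 64-bit product.
   Assumes arithmetic right shift on negative values. */
static int32_t mulshift32_ref(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * (int64_t)y) >> 32);
}

/* Same result from four 32-bit multiplies on 16-bit halves. */
static int32_t mulshift32_sketch(int32_t x, int32_t y)
{
    int32_t xh = x >> 16;        /* signed high half, -32768..32767 */
    int32_t yh = y >> 16;
    int32_t xl = x & 0xFFFF;     /* unsigned low half, 0..65535 */
    int32_t yl = y & 0xFFFF;

    uint32_t ll = (uint32_t)xl * (uint32_t)yl;  /* low x low, unsigned */
    int32_t  lh = xl * yh;                      /* fits in 32 bits signed */
    int32_t  hl = xh * yl;
    int32_t  hh = xh * yh;

    /* Carry out of bits 16-31 of the full product into bit 32. */
    uint32_t carry = ((ll >> 16)
                      + ((uint32_t)lh & 0xFFFFu)
                      + ((uint32_t)hl & 0xFFFFu)) >> 16;

    return hh + (lh >> 16) + (hl >> 16) + (int32_t)carry;
}
```

The cross terms are split into their own high and low halves before being added, so no intermediate sum can overflow 32 bits. This shows the arithmetic only; it makes no claim about cycle counts on the SAMD21.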

I have just about exhausted the various methodologies within the programming fraternity as well as the pure maths branch of science. One GOOD result is that for people using an SoC with a 32-cycle multiply, Karatsuba multiplication is some 22 cycles faster, which is of use to people writing for the very smallest ARM cores.

For myself, I have exhausted tricks like finding the least significant bits within a register so that just 2 multiplies will produce the correct results... but boy, is it slow.

If anyone has just a name for me to search for, I would really appreciate it. The system works at 48MHz, but the slower I can run the SoC, the less power it uses, and with the product aiming to use a single (rechargeable) AA battery, improving battery life is vital.

There are quite a few interesting & novel aspects to the ARM processors. It is always interesting although the C bit treated as a 'borrow' rather than a carry spoils some looping and makes some maths... interesting.
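
The remark about the C bit can be modelled in a line of C. This is only a sketch with a made-up function name: after an ARM SUBS or CMP of a and b, the C flag is set when no borrow occurred, that is, when a is greater than or equal to b as unsigned values.

```c
#include <stdint.h>

/* Model of the ARM C flag after SUBS/CMP a, b.
   C = 1 means "no borrow" (a >= b unsigned); C = 0 means a borrow occurred. */
static int carry_after_subs(uint32_t a, uint32_t b)
{
    return a >= b;
}
```

So a loop that counts down with SUBS leaves C set until the counter wraps below zero, which is the opposite sense from architectures where the flag is a true borrow.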

I would like to take this chance to thank Yasuhiko Koumoto who has always been a superb source of information explaining 'branch shadows' which I think is unique to ARM. I coded many of the RISC chips developed in the 80s (used in consoles in the 90s) so I came from 'branch delay-slot' instructions.