My Cortex-M0+ code calculates a 32-bit x 32-bit --> 64-bit result in 17 cycles. For the most part I only need bits 32-63. Does anyone know a method for calculating just the top 32 bits? Saving 1 cycle out of 17 (for example) doesn't seem like a big deal, but it's the dividing line between the possible and the impossible.
I would point out that it is the last instruction that calculates bits 32-63, so there are no savings there.
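To illustrate why the top word is hard to get cheaply, here is a rough C sketch of an exact high-word multiply built from four 16 x 16 partial products. It is not the original poster's 17-cycle routine; the name `mulhi_u32` is made up for this example, and it assumes unsigned operands. Every partial product feeds into bits 32-63 through the carries, so none of them can be skipped if the result must be exact.

```c
#include <stdint.h>

/* Hypothetical helper: exact upper 32 bits of an unsigned 32 x 32 product,
   using only 32-bit multiplies (each operand half fits in 16 bits). */
uint32_t mulhi_u32(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t ll = a_lo * b_lo;   /* contributes to bits 0-31  */
    uint32_t lh = a_lo * b_hi;   /* contributes to bits 16-47 */
    uint32_t hl = a_hi * b_lo;   /* contributes to bits 16-47 */
    uint32_t hh = a_hi * b_hi;   /* contributes to bits 32-63 */

    /* Sum the column at bits 16-31; its overflow is the carry into bit 32.
       Each term is at most 0xFFFF, so this cannot overflow 32 bits. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    return hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

For example, `mulhi_u32(0x80000000u, 2u)` gives 1, since the full product is 0x100000000.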
What if you just took the top 16 bits of each input and did a 16 x 16 ==> 32-bit multiply?
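A minimal C sketch of that idea, assuming unsigned operands (the name `mulhi_u32_approx` is invented for illustration). It only approximates bits 32-63: the dropped cross terms mean the result never exceeds the exact high word and falls short of it by less than 2^17, so whether it is usable depends on how much precision the application needs.

```c
#include <stdint.h>

/* Hypothetical helper: approximate upper 32 bits of an unsigned 32 x 32
   product using only the top halves of the operands.
   Result <= exact high word, and the shortfall is less than 2^17. */
uint32_t mulhi_u32_approx(uint32_t a, uint32_t b)
{
    /* Both factors fit in 16 bits, so the product fits in 32 bits. */
    return (a >> 16) * (b >> 16);
}
```

For instance, with both inputs at 0xFFFFFFFF this returns 0xFFFE0001, while the exact high word is 0xFFFFFFFE.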