1-cycle multiply, 64-bit result, reciprocal?

Can someone tell me how many extra gates the 1-cycle multiply uses? If it produced a 64-bit result, how many more gates would be needed? Could these gates also be used to find the reciprocal of a number, so that instead of dividing, the coder multiplies by the reciprocal? In the 90s, Nvidia sent a coder to optimize Tomb Raider on their video cards. He explained that they didn't use a Z-buffer but rather a W-buffer, which held the reciprocal of Z. This meant that the draw engine used multiplies rather than divides when calculating texture coordinates, lighting and other vertex controls. As far as I know, they still do.
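To make the reciprocal idea concrete, here is a minimal sketch in C, assuming a Q16.16 reciprocal and a 32x32->64 product. The names recip_q16 and div_by_recip are made up for illustration and are not from any library. The divide happens once, when the reciprocal is built; every later "divide" is a multiply and a shift. The result is an approximation and can be off by one for some inputs, because the reciprocal is rounded.

```c
#include <stdint.h>

/* Hypothetical helper: Q16.16 reciprocal of a non-zero divisor d,
 * i.e. round(65536 / d). This is the only real divide. */
static uint32_t recip_q16(uint32_t d)
{
    return (65536u + d / 2u) / d;
}

/* Hypothetical helper: approximate x / d as (x * recip) >> 16.
 * The 64-bit intermediate keeps the full 32x32 product. */
static uint32_t div_by_recip(uint32_t x, uint32_t recip)
{
    return (uint32_t)(((uint64_t)x * recip) >> 16);
}
```

For example, div_by_recip(1000, recip_q16(8)) gives 125, and the same reciprocal can be reused for every value that has to be divided by 8.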

MP3 on the M0 (or 2 x M0s) uses a LOT of multiplies for the FFTs. The BBC Microbit has a Nordic Semiconductor nRF51822 Bluetooth chip which, like the CPU, is clocked at 48MHz. If the RAM of this chip could be mapped into the CPU address space, it would be possible to decode one channel on the Nordic chip and the other on the CPU.

I'm looking to use 16:16 fixed-point by modifying the Minimp3 player, with some extra speed/space tradeoffs, so that the player can go right up to 320 kbit/s.
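For reference, a 16:16 multiply is usually written along these lines. This is only a sketch, not code from Minimp3: the type name q16_16 and the function q16_mul are assumptions for illustration. It shows why the 64-bit result matters, since the product of two 16:16 values needs up to 64 bits before it is shifted back down.

```c
#include <stdint.h>

/* Hypothetical 16:16 fixed-point type: 16 integer bits, 16 fraction bits. */
typedef int32_t q16_16;

#define Q16_ONE ((q16_16)1 << 16)   /* the value 1.0 in 16:16 */

/* Multiply two 16:16 values. The 32x32->64 product carries 32 fraction
 * bits, so shifting right by 16 returns it to 16:16 (truncating).
 * Right-shifting a negative value is implementation-defined in C;
 * this sketch assumes the usual arithmetic shift. */
static q16_16 q16_mul(q16_16 a, q16_16 b)
{
    return (q16_16)(((int64_t)a * b) >> 16);
}
```

With this, q16_mul(3 * Q16_ONE / 2, 2 * Q16_ONE) is 1.5 x 2.0 and comes out as 3 * Q16_ONE.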

Thanks in advance.

Sean

  • Yep - if the CPU is 16MHz while the Bluetooth is 48MHz, I was considering asking for a small amount of the program RAM for the Bluetooth CPU to do all of those FFT calculations. My other option is not good: specify SanDisk USB sticks and reprogram the program RAM in them to place the MP3 decode inside the memory stick. Of course, nobody has specified the speed the memory stick runs at. I know that on a PC they run at 100MHz, so there is plenty of power for MP3. If all else fails, ADPCM is simple but not so compact. I really need to find a codec that is designed to output 8-bit data...

    I wrote a multiply for the 8080-like CPU in the Game Boy in which I split the numbers into 4-bit blocks and used a lookup table. I wonder if a similar trick would beat the usual 32x32=64 bit software multiply, which itself takes 31 cycles IF the core has the 1-cycle multiply. Without it, it's about 250 cycles!
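For comparison, one common way to build a 32x32=64 bit multiply from a multiplier that only returns the low 32 bits is to split each operand into 16-bit halves and add up four partial products. The sketch below assumes that approach; the function name umul32x32_64 is made up for illustration, and this is not necessarily the routine the cycle counts above refer to.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 32x32 -> 64 multiply using only
 * 16x16 -> 32 products, so each multiply fits a low-32-bit multiplier. */
static uint64_t umul32x32_64(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Four partial products; none can overflow 32 bits. */
    uint32_t ll = a_lo * b_lo;
    uint32_t lh = a_lo * b_hi;
    uint32_t hl = a_hi * b_lo;
    uint32_t hh = a_hi * b_hi;

    /* Sum the column at bit 16; at most 3 * 0xFFFF, so it fits easily. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    uint32_t lo = (ll & 0xFFFFu) | (mid << 16);                 /* bits 0..31  */
    uint32_t hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);   /* bits 32..63 */

    return ((uint64_t)hi << 32) | lo;
}
```

The same splitting idea scales down to 4-bit blocks with a lookup table, but the number of partial products grows with the square of the number of blocks, which is the cost a table-driven version would have to beat.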
