This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

1-cycle multiply, 64-bit result,  reciprocal?

Can someone tell me how many extra gates the 1-cycle multiply uses? If there was a 64-bit result, how many more gates would be used? Can these gates also be used to find the reciprocal of a number so instead of divides, the coder multiplies the reciprocal? In the 90s, Nvidia sent a coder to optimize Tomb Raider on their video cards. He explained that they didn't use a Z-buffer but rather a W-buffer which was the reciprocal of the Z. This meant that the draw engine used multiplies rather than divides when calculating texture coordinates, lighting and other vertex controls.. As far as I know, they still do.

MP3 on the M0 (or 2 x M0s) uses a LOT of multiplies for the FFTs. Since the BBC Microbit has a Nordic Semiconductors nRF51822 bluetooth chip. Like the CPU, it's clocked at 48MHz. If the RAM of this CPU could be mapped into the CPU address space, it would be possible to build 1 channel using the Nordic chip & the other using the CPU.

I'm looking to use 16:16 fixed-point by modification of the Minimp3 player with some extra speed/space tradeoff so that the player can go right up to 320Kb/S.

Thanks in advance.

Sean

Parents
  • I'm somewhat confused with the questions in the first paragraph and I'll just try to share information the way I interpret the objective of the questions.

    I suppose you have a notion that a circuit for finding the reciprocal of an operand is coupled or integrated in a multiplier. In this case, when division is to be executed, the divisor is passed to the reciprocal circuit and the result is used as the multiplier (operand) while the dividend is applied as the multiplicand to the multiplier (circuit).

    Newer processors (e.g., ARMv7, Itanium, PowerPC - later versions, etc.) allow the reciprocal estimator (or initial estimator) to work independently and a reciprocal estimate instruction (or instruction to support reciprocal estimation) is available.

    To determine how many extra/more gates is used, I suggest you consult a book/reference in advanced digital/microprocessor/arithmetic circuit design. For our space I think it would suffice to remind us that division in digital computer circuit consumes far more resources compared to multiplication. Hence, division is avoided by transforming the operation into multiplication of the dividend and the reciprocal of the divisor. It may seem absurd because finding the reciprocal of a number is also division. However, division is truly eluded because reciprocal estimation is performed by iterating through

    Xn+1 = Xn(2 - dXn)

    which does not involve division (see for example the ARM® Architecture Reference Manual ARMv7-A and ARMv7-R edition page 86 to see a more readable equation). This sprung from Newton-Raphson root-finding method. From the function

    f(x) = d - (1/x)

    f(x) = 0 when x is the reciprocal of d.

    For details you may consult a reference in Numerical Methods/Analysis, Scientific Computing, and related subjects.

    Lastly, we have to be reminded that the method of division we are discussing does not apply to integer arithmetic circuit and the reciprocal estimate instruction belongs to floating-point operations of microprocessors. If a nonzero number is not exactly equal to 1, the number itself or its reciprocal involves a fractional part.

Reply
  • I'm somewhat confused with the questions in the first paragraph and I'll just try to share information the way I interpret the objective of the questions.

    I suppose you have a notion that a circuit for finding the reciprocal of an operand is coupled or integrated in a multiplier. In this case, when division is to be executed, the divisor is passed to the reciprocal circuit and the result is used as the multiplier (operand) while the dividend is applied as the multiplicand to the multiplier (circuit).

    Newer processors (e.g., ARMv7, Itanium, PowerPC - later versions, etc.) allow the reciprocal estimator (or initial estimator) to work independently and a reciprocal estimate instruction (or instruction to support reciprocal estimation) is available.

    To determine how many extra/more gates is used, I suggest you consult a book/reference in advanced digital/microprocessor/arithmetic circuit design. For our space I think it would suffice to remind us that division in digital computer circuit consumes far more resources compared to multiplication. Hence, division is avoided by transforming the operation into multiplication of the dividend and the reciprocal of the divisor. It may seem absurd because finding the reciprocal of a number is also division. However, division is truly eluded because reciprocal estimation is performed by iterating through

    Xn+1 = Xn(2 - dXn)

    which does not involve division (see for example the ARM® Architecture Reference Manual ARMv7-A and ARMv7-R edition page 86 to see a more readable equation). This sprung from Newton-Raphson root-finding method. From the function

    f(x) = d - (1/x)

    f(x) = 0 when x is the reciprocal of d.

    For details you may consult a reference in Numerical Methods/Analysis, Scientific Computing, and related subjects.

    Lastly, we have to be reminded that the method of division we are discussing does not apply to integer arithmetic circuit and the reciprocal estimate instruction belongs to floating-point operations of microprocessors. If a nonzero number is not exactly equal to 1, the number itself or its reciprocal involves a fractional part.

Children