Count Leading Zeros

Note: This was originally posted on 6th April 2009 at http://forums.arm.com

Regarding the ARM9 CLZ instruction: in general, how often is it actually needed, and does that usage justify its inclusion?
  • Note: This was originally posted on 7th April 2009 at http://forums.arm.com

    > Conversely, what are the disadvantages of including the CLZ instruction in the ARM9, if any?

    > Well, it probably takes a few extra gates to decode the instruction, but other than that not much...

    Okay, I see. So this extra circuitry is justified by the reduction in clock cycles and memory usage. Thanks.

    But how many clock cycles does CLZ require to execute?
  • Note: This was originally posted on 7th April 2009 at http://forums.arm.com

    > For general-purpose code it is commonly used for integer normalization (placing the MSB of the integer at a known location).

    > In more practical terms, normalized integers are used for optimized Newton-Raphson software integer division, as Jacob mentioned, but also for things like integer-to-floating-point conversion and bit-field priority decoders.

    Thanks for the replies!

    Conversely, what are the disadvantages of including the CLZ instruction in the ARM9, if any?
  • Note: This was originally posted on 7th April 2009 at http://forums.arm.com

    Normalization is the act of shifting the fractional part left until the bit immediately to the left of the binary point is one. It is used in IEEE-754 compatible binary floating-point addition and subtraction. Say you have 20 leading zeros in the sum: without CLZ you may need 20 instructions to shift the leading '1' up to the MSB while counting the leading zeros. This count is then used to set the exponent value. CLZ gives you the leading-zero count in a single instruction, which speeds up normalization during binary floating-point addition and subtraction. One widely used application is digital signal processing.
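
The normalization step described above can be sketched in C roughly as follows. This is only an illustration, not code from the thread: `normalized_t` and `normalize_u32` are hypothetical names, `__builtin_clz` is the GCC/Clang builtin (which a compiler will typically map to CLZ on ARM cores that have it), and `unsigned int` is assumed to be 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper type: the result of normalizing a 32-bit value. */
typedef struct {
    uint32_t mantissa; /* input shifted left so that bit 31 is set (0 if input was 0) */
    int      shift;    /* number of leading zeros that were shifted out */
} normalized_t;

/* Shift x left until its most significant set bit sits at bit 31.
 * The returned shift count is what a floating-point routine would
 * use to adjust the exponent. Assumes unsigned int is 32 bits. */
static normalized_t normalize_u32(uint32_t x)
{
    normalized_t r = { 0u, 0 };

    if (x == 0u) {
        /* __builtin_clz(0) is undefined, so zero is handled separately. */
        return r;
    }

    r.shift = __builtin_clz(x); /* one CLZ instead of a shift-and-test loop */
    r.mantissa = x << r.shift;
    return r;
}
```

For the case mentioned in the post, a value with 20 leading zeros such as `0x00000800` would come back with `shift == 20` and `mantissa == 0x80000000`.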
  • Note: This was originally posted on 7th April 2009 at http://forums.arm.com

    For general-purpose code it is commonly used for integer normalization (placing the MSB of the integer at a known location).

    In more practical terms, normalized integers are used for optimized Newton-Raphson software integer division, as Jacob mentioned, but also for things like integer-to-floating-point conversion and bit-field priority decoders.
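
As a rough illustration of the bit-field priority decoder use mentioned above, the sketch below finds the highest set bit in a pending-request mask. The function name `highest_priority` and the convention that a higher bit number means higher priority are assumptions made for this example; `__builtin_clz` is the GCC/Clang builtin and `unsigned int` is assumed to be 32 bits wide.

```c
#include <stdint.h>

/* Return the index (31..0) of the highest set bit in a pending mask,
 * or -1 if no bit is set. Bit 31 is treated as the highest priority,
 * which is an assumption of this sketch. */
static int highest_priority(uint32_t pending)
{
    if (pending == 0u) {
        /* __builtin_clz(0) is undefined, so report "nothing pending". */
        return -1;
    }
    return 31 - __builtin_clz(pending);
}
```

With bits 4 and 1 pending (`0x00000012`), this would return 4 without scanning the mask bit by bit.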
  • Note: This was originally posted on 7th April 2009 at http://forums.arm.com

    > Conversely, what are the disadvantages of including the CLZ instruction in the ARM9, if any?

    Well, it probably takes a few extra gates to decode the instruction, but other than that not much...
  • Note: This was originally posted on 7th April 2009 at http://forums.arm.com

    > But how many clock cycles does CLZ require to execute?

    For an ARM9, assuming you are not targeting the PC as the destination register (because that would be a bit crazy):

    One cycle if not using the shifter to shift the second input operand, or using a constant shift.
    Two cycles if using the shifter to shift the second operand by an amount held in a register.

    This is the same as any other ARM9 data processing instruction (ADD, ORR, etc).
  • Note: This was originally posted on 6th April 2009 at http://forums.arm.com

    It's surprisingly useful when hand-optimizing assembler. I think it has benefits for some cryptographic routines, and it speeds up division routines.

    I don't have any performance figures, but implementing the same functionality without using CLZ requires quite a few instructions.
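
To give a feel for what "quite a few instructions" can mean, here is one common way to count leading zeros in portable C without a CLZ instruction, using a binary search over the word. It is a sketch rather than a measured baseline: `clz32_soft` is a hypothetical name, and the zero-input result of 32 is chosen to match what the ARM CLZ instruction returns.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit value without a CLZ instruction.
 * Each step tests whether the top half of the remaining window is
 * empty and, if so, shifts the value up and adds to the count. */
static int clz32_soft(uint32_t x)
{
    int n = 0;

    if (x == 0u) {
        return 32; /* same result as the ARM CLZ instruction for zero */
    }
    if ((x & 0xFFFF0000u) == 0u) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0u) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0u) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0u) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0u) { n += 1; }
    return n;
}
```

Each of the five steps costs a test, a conditional add, and a conditional shift, so even this compact version expands to well over ten instructions where a core with CLZ needs one.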