Hi,
According to the instruction guide, the `UADDLV` instruction adds every vector element in the source SIMD&FP register together and writes the scalar result to the destination SIMD&FP register.
My question is: what happens to the higher bits in the destination vector register? Are they deterministically cleared to 0, or are they undefined (so it is unsafe to make any assumptions)?
The context for this question is that I'm seeing this codegen (https://godbolt.org/z/xjohrjocd). In particular, for the following sequence, I'm not sure about the content of the upper 16 bits of register `s0`. Thanks for any input!
uaddlv h0, v0.8b
fmov w9, s0
str x9, [x8]
Edit with more context
The example in the godbolt link above is a little special: the relevant function counts the number of bits that are set in a 64-bit integer, so the result is guaranteed to be no greater than 64, and a 16-bit integer is sufficient to store it.
Also, in the `-print-after-all` output of LLVM's `llc`, there are `undef` flags on a defined register. According to the comments in LLVM's MachineOperand.h:
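To make the range argument concrete, here is a minimal portable C sketch of what a per-byte count followed by `UADDLV` computes. The function names (`cnt_8b`, `uaddlv_8b`, `popcount64`) are made up for illustration, and the assumption that the compiler pairs `UADDLV` with a per-byte count instruction is mine; this is a software model, not the code behind the godbolt link.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a per-byte population count on a 64-bit value:
 * afterwards each of the eight bytes holds the number of set bits (0..8)
 * that were in that byte. */
static uint64_t cnt_8b(uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return x;
}

/* Hypothetical model of UADDLV Hd, Vn.8B: add eight unsigned byte lanes
 * into a 16-bit result. The largest possible sum is 8 * 255 = 2040,
 * so 16 bits can never overflow. */
static uint16_t uaddlv_8b(uint64_t lanes) {
    uint16_t sum = 0;
    for (int i = 0; i < 8; i++) {
        sum = (uint16_t)(sum + ((lanes >> (8 * i)) & 0xFF));
    }
    return sum;
}

/* Bit count of a 64-bit integer built from the two models above. */
static unsigned popcount64(uint64_t x) {
    uint16_t sum = uaddlv_8b(cnt_8b(x));
    assert(sum <= 64); /* eight lanes, each at most 8 */
    return sum;
}
```

In this model the 16-bit sum is at most 64 for a bit count, and at most 2040 for arbitrary byte lanes, so the value itself always fits in the `h0` view; the open question is only what the bits above it contain.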
/// IsUndef - True if this register operand reads an "undef" value, i.e. the
/// read value doesn't matter. This flag can be set on both use and def
/// operands. On a sub-register def operand, it refers to the part of the
/// register that isn't written. On a full-register def operand, it is a
/// noop. See readsReg().
Any write to a register will always clear the other bits to 0, unless the instruction explicitly states otherwise.
An example of such an instruction is ADDHN/ADDHN2 (https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/ADDHN--ADDHN2--Add-returning-High-Narrow-), where we explicitly state what happens because it deviates from the norm.
See the full architecture description document, https://developer.arm.com/documentation/ddi0487/ha/?lang=en, and look at section C1.2 on page C1-228 (if you are looking at revision H.a of the document).
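As a rough illustration of that rule applied to the sequence in the question, here is a small portable C model of a 128-bit SIMD&FP register. The type `VReg` and the helpers `write_h`, `uaddlv_h_8b`, and `read_s` are hypothetical names invented for this sketch; it only encodes the behaviour described above and is not a substitute for the architecture manual or for running on hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 128-bit V register, byte 0 = least significant. */
typedef struct { uint8_t b[16]; } VReg;

/* Scalar write through the 16-bit H view: per the rule above, all bits
 * outside the written 16 bits are cleared to zero. */
static void write_h(VReg *v, uint16_t value) {
    memset(v->b, 0, sizeof v->b);
    v->b[0] = (uint8_t)(value & 0xFF);
    v->b[1] = (uint8_t)(value >> 8);
}

/* Model of `uaddlv h<d>, v<n>.8b`: sum the low eight byte lanes, then do a
 * scalar H write. The sum is computed first, so vd may alias vn. */
static void uaddlv_h_8b(VReg *vd, const VReg *vn) {
    uint16_t sum = 0;
    for (int i = 0; i < 8; i++) {
        sum = (uint16_t)(sum + vn->b[i]);
    }
    write_h(vd, sum);
}

/* Model of `fmov w<d>, s<n>`: read the low 32 bits of the register. */
static uint32_t read_s(const VReg *v) {
    return (uint32_t)v->b[0]
         | ((uint32_t)v->b[1] << 8)
         | ((uint32_t)v->b[2] << 16)
         | ((uint32_t)v->b[3] << 24);
}

/* Usage: start from a register full of ones, as a worst case. */
void demo_uaddlv(void) {
    VReg v0;
    memset(v0.b, 0xFF, sizeof v0.b);
    uaddlv_h_8b(&v0, &v0);            /* uaddlv h0, v0.8b */
    uint32_t w9 = read_s(&v0);        /* fmov   w9, s0    */
    assert(w9 == 2040u);              /* 8 * 255 */
    assert((w9 >> 16) == 0);          /* upper 16 bits of s0 are zero */
}
```

Under this model, reading `s0` after `uaddlv h0, v0.8b` gives the 16-bit sum zero-extended to 32 bits, which matches the answer's statement that the remaining bits are cleared.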