Is there a way to write an equivalent of a 32-bit saturating instruction such as QADD or QSUB on the Armv8 architecture without using the NEON saturating instructions?
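To make the intended behavior concrete, here is a minimal C sketch of signed 32-bit saturating add and subtract using only ordinary scalar operations. The names `sat_add32` and `sat_sub32` are hypothetical, and widening to 64 bits before clamping is just one assumption about how this could be expressed. It is not necessarily the shortest or fastest sequence a compiler would emit for AArch64.

```c
#include <stdint.h>

/* Hypothetical helper: signed 32-bit saturating add.
 * The sum is computed in 64 bits, so it cannot overflow,
 * and is then clamped to the int32_t range. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Hypothetical helper: signed 32-bit saturating subtract,
 * using the same widen-then-clamp approach. */
int32_t sat_sub32(int32_t a, int32_t b)
{
    int64_t diff = (int64_t)a - (int64_t)b;
    if (diff > INT32_MAX) return INT32_MAX;
    if (diff < INT32_MIN) return INT32_MIN;
    return (int32_t)diff;
}
```

For example, `sat_add32(INT32_MAX, 1)` stays at `INT32_MAX` instead of wrapping, and `sat_sub32(INT32_MIN, 1)` stays at `INT32_MIN`.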