When I use rayQueryProceedEXT(query), the reported work register count is very high.
More threads help with latency hiding, but half the maximum thread count is enough to keep the core busy, so running at full occupancy is not guaranteed to be beneficial. A lot of complex gaming content has shaders that use more than 32 registers, so I wouldn't worry too much.
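To make the trade-off concrete, here is a minimal Python sketch of the occupancy rule implied above. It assumes a shader using 32 work registers or fewer can run at the full thread count, while one using more runs at half. The function name and constant are hypothetical, chosen for illustration only, and the exact threshold should be checked against the documentation for your GPU.

```python
# Assumption: 32 work registers is the threshold for full thread occupancy,
# as implied by the reply above. Names here are illustrative, not a real API.
FULL_OCCUPANCY_MAX_REGISTERS = 32


def thread_occupancy(work_registers: int) -> float:
    """Return the assumed fraction of the maximum thread count available."""
    if work_registers < 0:
        raise ValueError("work_registers must be non-negative")
    # At or below the threshold the core can hold its full thread count;
    # above it, each thread needs more register space, so only half fit.
    if work_registers <= FULL_OCCUPANCY_MAX_REGISTERS:
        return 1.0
    return 0.5


if __name__ == "__main__":
    for regs in (24, 32, 33, 48):
        print(f"{regs} work registers -> {thread_occupancy(regs):.0%} occupancy")
```

Under this model, a ray query shader that lands above 32 work registers would report half occupancy, which, as noted, is still enough to keep the core busy.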