When I query the binary, I get an opaque binary blob with nothing human-readable in it. I was expecting to see the generated assembly code, the way Nvidia returns it. It's really difficult to write a maxFLOPS test without seeing this assembly. Moreover, the Midgard architecture is a mix of old-school VLIW and scalar, so I never know whether my code is being compiled to scalar or vector MULs.
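For anyone who wants to check what a driver actually hands back, here is a minimal sketch of a heuristic that guesses whether a program binary is text (like PTX) or an opaque blob. It assumes you have already dumped the bytes you got from the binary query into a Python `bytes` object; the function name `looks_like_text` and the 0.95 threshold are my own choices for illustration, not part of any vendor tool.

```python
import string

# Byte values that count as printable ASCII (letters, digits, punctuation, whitespace).
PRINTABLE = set(string.printable.encode("ascii"))

def looks_like_text(blob: bytes, threshold: float = 0.95) -> bool:
    """Guess whether a program binary is human-readable text.

    Hypothetical helper: the threshold is an arbitrary assumption,
    chosen so that a stray non-ASCII byte does not flip the result.
    """
    if not blob:
        return False
    printable = sum(1 for b in blob if b in PRINTABLE)
    return printable / len(blob) >= threshold

if __name__ == "__main__":
    # Illustrative inputs only: a PTX-style text snippet and raw bytes.
    print(looks_like_text(b".version 3.1\n.target sm_20\n"))
    print(looks_like_text(bytes(range(256))))
```

This only tells you whether the blob is text at all; it says nothing about whether the text is final machine assembly or an intermediate form.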
Honestly, I wasn't expecting human-readable output from either vendor, especially Nvidia. But still, PTX code isn't really useful for my purposes.
I agree with your sentiment that a tool like AMD's ShaderAnalyzer is very useful. Using that tool, I was able to achieve close to peak FP performance in my matrix multiplication code. Without it, it's like trying to throw a coin from the top of a pond into a bucket down below. There's a lot of guesswork going on.
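To put a number on "close to peak", a rough sketch like the following can help. It uses the usual count of 2*M*N*K floating-point operations for a dense matrix multiplication (one multiply and one add per inner-product term). The function names are made up for this example, and the peak figure must come from your own device's specs; the 50.0 GFLOPS used below is a placeholder, not a real Mali or AMD number.

```python
def matmul_flops(m: int, n: int, k: int) -> int:
    """FLOP count for an (m x k) times (k x n) dense matrix multiply."""
    return 2 * m * n * k

def achieved_gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved throughput in GFLOPS for a measured kernel time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return matmul_flops(m, n, k) / seconds / 1e9

def fraction_of_peak(achieved: float, peak: float) -> float:
    """Ratio of achieved to theoretical peak throughput (same units)."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return achieved / peak

if __name__ == "__main__":
    # Hypothetical measurement: 1024^3 matmul in 0.1 s on a device
    # whose peak is assumed to be 50 GFLOPS (placeholder value).
    g = achieved_gflops(1024, 1024, 1024, 0.1)
    print(f"{g:.2f} GFLOPS, {fraction_of_peak(g, 50.0):.1%} of assumed peak")
```

A low fraction doesn't say why the kernel is slow, which is exactly where a disassembly view would save the guesswork.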