Dear all,
- Is it possible to force the compiler, through a special flag (in either ARM CC or the ARM GNU toolchain), to use predication instead of branch instructions throughout the whole binary?
- Suppose that, in an out-of-order processor (such as the A15), a predicated instruction and the instruction on which its predicate depends are both in flight (in the pipeline), and the predicated instruction finishes its computation before the predicate result is forwarded to it. Does the predicated instruction then wait in the reorder buffer until it receives the predicate result, and only then either commit to the architected state or be discarded accordingly?
Thank you so much.
Only albanrampon is the great predicator !
Hi,
Thanks for your question. Some interesting topics!
- I am not aware of any compiler option which would remove the use of branch instructions. I am interested to know why you might want to do this.
- We do not discuss this kind of microarchitectural detail in a public forum, I'm afraid. You can rest assured that, whatever re-ordering or scheduling the pipeline carries out, correct program operation will be preserved. As a programmer, you don't really need to know the details of how that happens in terms of the treatment of individual instructions or sequences of instructions.
Hope this helps.
Chris
I will add to the comments above that it is good practice to use the --cpu option on your compiler command line (-mcpu with the GNU toolchain). This ensures that the compiler takes the micro-architecture of your particular target into account. You may find that the assembly code generated for one device differs from the code generated for another. For example, conditional execution of instructions may be less efficient than using branch instructions in complex superscalar out-of-order pipelines, like that of the Cortex-A15, that have advanced branch prediction logic. Nonetheless, the functional behavior of the code will be correct.
Thanks so much for the reply.
I am actually a graduate student doing research on micro-architectural attacks (branch predictor, cache, simultaneous multithreading) against cryptographic functions, mainly on ARM microprocessors, considering both cloud and local computing models. For instance, a textbook square-and-multiply RSA implementation has a key-dependent branch: the extra multiplication is performed only when the current key bit is 1. An adversary can exploit the timing information, as well as branch target buffer collisions, to deduce the key bits. I believe that even a TrustZone-like solution cannot protect against such adversary models. So I was trying to investigate whether removing key-dependent branches and using predication instead, through some compiler flag, might cancel this threat or at least make the effort required to recover the key much greater.
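To make the idea concrete, here is a minimal sketch in C of the kind of source-level transformation I have in mind. It is an illustration only, not a hardened implementation: the names ct_select, modexp_branchy and modexp_branchless are hypothetical, the operands are toy-sized (modulus below 2^32 so the products fit in 64 bits), and nothing guarantees that the compiler keeps the masked select branch-free or that the multiply and modulo operations themselves run in constant time on a given core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a when bit == 1 and b when bit == 0,
 * using a mask instead of a branch on the (secret) bit. */
static uint64_t ct_select(uint64_t bit, uint64_t a, uint64_t b)
{
    uint64_t mask = (uint64_t)0 - bit;   /* all ones if bit == 1, else zero */
    return (a & mask) | (b & ~mask);
}

/* Textbook left-to-right square-and-multiply.
 * The "if" below is the key-dependent branch an attacker can observe. */
static uint64_t modexp_branchy(uint64_t base, uint64_t exp, uint64_t mod)
{
    assert(mod != 0 && mod <= UINT32_MAX);   /* keeps r * r within 64 bits */
    uint64_t r = 1 % mod;
    base %= mod;
    for (int i = 63; i >= 0; i--) {
        r = (r * r) % mod;                   /* square on every bit */
        if ((exp >> i) & 1)
            r = (r * base) % mod;            /* multiply only when bit is 1 */
    }
    return r;
}

/* Same algorithm, but the multiplication is always computed and the
 * result is chosen with a mask, so no branch depends on the key bit
 * at the source level. */
static uint64_t modexp_branchless(uint64_t base, uint64_t exp, uint64_t mod)
{
    assert(mod != 0 && mod <= UINT32_MAX);
    uint64_t r = 1 % mod;
    base %= mod;
    for (int i = 63; i >= 0; i--) {
        uint64_t bit = (exp >> i) & 1;
        r = (r * r) % mod;
        uint64_t m = (r * base) % mod;       /* computed regardless of bit */
        r = ct_select(bit, m, r);            /* keep m if bit == 1, else r */
    }
    return r;
}
```

Both functions should return the same value for any inputs that satisfy the assertion, for example base 4, exponent 13 and modulus 497. Whether the second version actually compiles to conditional instructions or selects rather than branches would have to be checked in the generated assembly for each compiler, optimization level and target.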
Best wishes.