Cortex A code / function alignment

Hi!

I am writing assembly code for some ARMv7-A and ARMv8-A CPUs. I know that code has to be 4-byte aligned, but in several places (U-Boot/Linux) I saw the GNU assembler directive ".align 4", which aligns to 2**4 = 16 bytes.

When writing code that will be called from C, what alignment should I give my assembly functions in AArch32? And in AArch64, if it is different?

I can't find a clear answer to that.

Best regards,

V.

  • Hi vsiles,

    It should be 4 bytes for both A32 (AArch32 ARM state) and A64 (AArch64), unless you're running T32 (Thumb), in which case it's 2 bytes.

    Aligning functions on a larger boundary can measurably help performance and can just as measurably hurt it. The code you saw in U-Boot/Linux has presumably been benchmarked as actually faster with that alignment, but the circumstances of the branch and the state of the caches, predictors and prefetchers all come into play here.

    Stick with the natural instruction alignment for the ISA, unless you are in a phenomenally sensitive piece of code which requires shaving individual cycles. Any padding you put between functions is, up front, a waste of memory - and instruction cache.
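
    To put a number on that padding cost, here is a minimal C sketch. It assumes the ARM behaviour described in the question, where the argument of ".align n" is a power of two (so ".align 2" is 4 bytes and ".align 4" is 16 bytes). The function name and the example address are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of padding needed to move `addr` up to the next 2**log2_align
 * boundary. This mirrors what the assembler has to insert before a
 * function when a directive such as ".align 4" (2**4 = 16 bytes on ARM)
 * precedes it. Hypothetical helper, not part of any toolchain.
 */
uint32_t padding_for(uint32_t addr, unsigned log2_align)
{
    assert(log2_align < 32);               /* keep the shift well defined */
    uint32_t align = (uint32_t)1 << log2_align;
    uint32_t mask  = align - 1;
    /* Distance to the next boundary, or 0 if addr is already on one. */
    return (align - (addr & mask)) & mask;
}
```

    If the previous function ends at the hypothetical address 0x8004, `padding_for(0x8004, 2)` is 0 while `padding_for(0x8004, 4)` is 12, so the 16-byte request costs 12 bytes of padding in front of that one function.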

    If you're concerned then you can always benchmark it.
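
    A rough sketch of such a benchmark, written in C rather than assembly so it is easy to try. It assumes GCC or Clang, whose `aligned` function attribute can raise (but not lower) a function's alignment. `work`, `FUNC_ALIGN`, `entry_is_aligned` and `time_calls` are hypothetical names, and `clock()` is a coarse timer, so treat the numbers only as a starting point.

```c
#include <stdint.h>
#include <time.h>

/* Build once with -DFUNC_ALIGN=4 and once with -DFUNC_ALIGN=16 to compare. */
#ifndef FUNC_ALIGN
#define FUNC_ALIGN 16
#endif

/* GCC/Clang extension: minimum alignment, in bytes, of the function entry. */
__attribute__((aligned(FUNC_ALIGN), noinline))
uint32_t work(uint32_t x)
{
    return x * 2654435761u + 1u;   /* arbitrary cheap computation */
}

/* Returns 1 if the entry address of `fn` is a multiple of `bytes`. */
int entry_is_aligned(uint32_t (*fn)(uint32_t), uintptr_t bytes)
{
    /* Clear bit 0: on Thumb targets it marks the instruction set state. */
    uintptr_t addr = (uintptr_t)fn & ~(uintptr_t)1;
    return (addr % bytes) == 0;
}

/* Calls `fn` `iters` times and returns the CPU time used, in seconds. */
double time_calls(uint32_t (*fn)(uint32_t), unsigned long iters)
{
    volatile uint32_t sink = 0;    /* volatile keeps the loop from being removed */
    clock_t start = clock();
    for (unsigned long i = 0; i < iters; i++)
        sink = fn(sink);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

    Comparing `time_calls(work, N)` between the two builds, on the real target and with realistic cache state, is the only way to know whether the extra alignment pays for its padding.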

    Ta,

    Matt