Note: Armv8 deprecates the use of the it instruction to make anything other than a single 16-bit instruction conditional. This affects many of the examples in this post. Refer to the Armv8-A Architecture Reference Manual for details.
Thumb-2 can make use of the same conditional execution features that the Arm instruction set provides. For conditionally executing one or two instructions, this mechanism can provide code-size and performance benefits over the (more conventional) conditional branching mechanism.
I noted at the end of the last post in this series that this mechanism is not directly available to Thumb. Instead, Thumb-2 has the it instruction, which can provide the same functionality as Arm conditional execution. In this article, I will describe the it instruction, and I will also explain a few caveats of condition-setting instructions in Thumb-2. Note that the it instruction is only available to Thumb-2, and so most of this article will not be relevant to the old Thumb instruction set [1].
it
With the exception of simple conditional branches, Thumb-2 instructions do not have the 4-bit condition code field that most Arm instructions have. Instead, Thumb-2 has the it instruction, which conditionally executes up to four subsequent instructions. The instructions affected by an it instruction are said to be in an it block.
The mnemonic it represents an if-then construct. If the condition code (given as an argument to the instruction) evaluates to true, then the next instruction is executed. Up to three additional t (then) or e (else) codes can be added to control the execution of the subsequent instructions. For example, read ite as if-then-else, and ittee as if-then-then-else-else. The following code either increments r0, or resets it to 0 if it is greater than or equal to 10:
    .syntax unified         @ Remember this!
    .thumb
    [...]
    cmp     r0, #10
    ite     lo              @ if r0 is lower than 10 ...
    addlo   r0, #1          @ ... then r0 = r0 + 1
    movhs   r0, #0          @ ... else r0 = 0
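To show how the longer mnemonics read, here is a minimal sketch of an ittee block. The registers and constants are chosen purely for illustration and do not come from any particular program: the two "then" instructions carry the condition given to it, and the two "else" instructions carry its opposite.

```asm
    .syntax unified
    .thumb
    cmp     r0, #0
    ittee   eq              @ if r0 == 0 ...
    moveq   r1, #1          @ ... then r1 = 1
    moveq   r2, #2          @ ... then r2 = 2
    movne   r1, #0          @ ... else r1 = 0
    movne   r2, #0          @ ... else r2 = 0
```

The first instruction after it is always a "then" instruction, so the letters following "it" in the mnemonic describe only the second, third and fourth instructions in the block.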
Note that the conditionally-executed instructions inside the it block must still be given condition codes, as they would be in Arm assembly. Assemblers will check that the condition you gave to it is consistent with those on the individual instructions. The then conditions must match the condition code, and any else conditions must be the opposite condition. In the example, the else condition was hs (higher or same), which is the opposite of lo (lower). The table below shows the condition codes and their opposites:
    Condition code    Opposite
    eq                ne
    cs (or hs)        cc (or lo)
    mi                pl
    vs                vc
    hi                ls
    ge                lt
    gt                le
    al                (none)
Whilst it is valid to give the condition code al to an it instruction, al has no opposite because there is no never code. It is therefore not valid to specify the al condition code in an it instruction that uses an else clause.
Just like other instructions, Thumb-2's branches can be conditionally executed using it. Indeed, some branches cannot be conditionally executed without using an it block. However, any branches that exist in an it block must be the last instruction in the block. The following, for example, is unpredictable:
    ite     eq
    blxeq   some_label      @ UNPREDICTABLE during an IT block.
    movne   r0, #0
The correct way to implement the above would be to put the mov before the blx, as follows:
    ite     ne
    movne   r0, #0
    blxeq   some_label      @ Ok at the end of an IT block.
The it instruction is valid in Arm assembly, though it will not generate any code. This is done for compatibility with Thumb-2 assembly, and allows most assembly sequences to be assembled for both Arm and Thumb-2.
Just like Arm code, a simple Thumb b instruction can be made conditional by adding a suitable condition code suffix. Indeed, the if/else example provided in my last post will assemble for Thumb just as it will for Arm.
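For comparison, the increment-or-reset logic from the ite example earlier in this article could be written with conditional branches instead of an it block. This is only an illustrative sketch, not the if/else example from the previous post, and the labels reset and done are hypothetical names introduced here.

```asm
    .syntax unified
    .thumb
    cmp     r0, #10
    bhs     reset           @ if r0 >= 10, skip the increment
    adds    r0, r0, #1      @ r0 = r0 + 1 (16-bit form; updates the flags)
    b       done
reset:
    movs    r0, #0          @ r0 = 0 (16-bit form; updates the flags)
done:
```

Unlike the it version, this sequence needs two branches and two labels, and the flag-setting adds and movs overwrite the result of the cmp.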
16-bit forms of Thumb arithmetic instructions usually set the condition flags. When inside an it block, however, the 16-bit forms do not set the flags. This property can be useful in combination with condition code al. Consider the following code sequence:
    @ Instruction             Size
    add     r0, r0, #1      @ 4 bytes
    add     r1, r1, #1      @ 4 bytes
    add     r2, r2, #1      @ 4 bytes
    add     r3, r3, #1      @ 4 bytes
    @                Total: 16 bytes
Writing an equivalent code sequence using an it block can result in smaller code size:
    @ Instruction             Size
    itttt   al              @ 2 bytes
    addal   r0, r0, #1      @ 2 bytes
    addal   r1, r1, #1      @ 2 bytes
    addal   r2, r2, #1      @ 2 bytes
    addal   r3, r3, #1      @ 2 bytes
    @                Total: 10 bytes
It should be noted that the 16-bit forms have additional limitations, so the it trick used above may not always be applicable. The restrictions vary from instruction to instruction, but typically the 16-bit forms can only access r0-r7 and have a very restricted range of immediate constants. For details, refer to the Architecture Reference Manual.
Because (outside of it blocks) most arithmetic instructions that set the flags have 16-bit forms, code size can be dramatically improved by setting the flags even when not necessary. This will provide the best (smallest) code size possible. However, depending on your target processor, this technique may have a small negative performance impact. It is perhaps advisable to use the al condition trick or 32-bit instructions in performance-critical code.
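As a sketch of the flag-setting approach, the four increments from the earlier size comparison can be written with adds. This assumes that nothing after the sequence depends on the previous flag values, since each instruction overwrites them.

```asm
    .syntax unified
    .thumb
    @ Not in an IT block.
    @ Instruction             Size
    adds    r0, r0, #1      @ 2 bytes
    adds    r1, r1, #1      @ 2 bytes
    adds    r2, r2, #1      @ 2 bytes
    adds    r3, r3, #1      @ 2 bytes
    @                Total:  8 bytes
```

This saves the two bytes that the it instruction itself would occupy, at the cost of clobbering the condition flags.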
You can force the assembler to produce a 16-bit instruction by adding a .n suffix. Assemblers will choose the 16-bit form anyway when one is available, but if you specify .n and the instruction cannot be encoded in 16 bits, the assembler will report an error.
    [...]                   @ Not in an IT block.
    adds.n  r1, r2, r3      @ Generates a 16-bit instruction.
    add.n   r1, r2, r3      @ Error: No 16-bit form for this.
Refer to the Architecture Reference Manual for details of each instruction, and information about the constraints of the 16-bit forms. There are many exceptions and special cases so I won't describe them here in detail.
[1] Thumb-2 is available on the Armv6T2 architecture and above (including Armv7-A). Processors based on these architecture versions include Arm1156 and all of the Cortex series, but not older processors or the others in the Arm11 series.