Every practical general-purpose computing architecture has a mechanism for conditionally executing code. Such mechanisms are used to implement the if construct in C, for example, as well as several less obvious cases.
Arm, like many other architectures, implements conditional execution using a set of flags which store state information about a previous operation. I intend, in this post, to shed some light on the operation of these flags. Of course, the Architecture Reference Manual is the definitive source of information, so if you need to know about a specific corner-case that I do not cover here, that is where you need to look.
Consider a simple fragment of C code:
for (i = 10; i != 0; i--) { do_something(); }
A compiler might implement that structure as follows:
    mov   r4, #10
loop_label:
    bl    do_something
    sub   r4, r4, #1
    cmp   r4, #0
    bne   loop_label
The last two instructions are of particular interest. The cmp (compare) instruction compares r4 with 0 and sets some global flags indicating various properties of the result. The bne instruction, which is really just a b (branch) with an ne condition code suffix, reads these flags and branches only if the comparison found the operands "not equal" [1].
The following code implements a more efficient solution:
    mov   r4, #10
loop_label:
    bl    do_something
    subs  r4, r4, #1
    bne   loop_label
Adding the s suffix to sub causes it to update the flags itself, based on the result of the operation. This suffix can be added to many (but not all) arithmetic and logical operations [2].
In the rest of the article, I will explain what the condition flags are, where they are stored, and how to test them using condition codes.
If you have an Arm platform (or emulator) handy, the attached ccdemo application can be used to experiment with the operations discussed in the article. The application allows you to pick an operation and two operands, and shows the resulting flags and a list of which condition codes will match. When writing assembly code, it can also be a rather useful development tool.
The simplest way to set the condition flags is to use a comparison operation, such as cmp. This mechanism is common to many processor architectures, and the semantics (if not the details) of cmp will likely be familiar. In addition, we have already seen that many instructions (such as sub in the example) can be modified to update the condition flags based on the result by adding an s suffix. That's all well and good, but what information is stored, and how can we access it?
The additional information is stored in four condition flag bits in the APSR (Application Processor Status Register), or the CPSR (Current Processor Status Register) if you are used to pre-Armv7 terminology [3][4]. The flags indicate simple properties such as whether or not the result was negative, and are used in various combinations to detect higher-level relationships such as "greater than". Once I have described the flags, I will explain how they map onto condition codes (such as ne in the previous example).
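As a rough sketch of where the flags live: N, Z, C and V occupy bits 31, 30, 29 and 28 of the APSR respectively. The bit positions come from the architecture documentation rather than from the text above, and the helper name decode_nzcv and the sample register value are made up for illustration.

```python
def decode_nzcv(apsr: int) -> dict:
    """Extract the four condition flags from a 32-bit APSR/CPSR value."""
    return {
        "N": (apsr >> 31) & 1,  # negative
        "Z": (apsr >> 30) & 1,  # zero
        "C": (apsr >> 29) & 1,  # carry
        "V": (apsr >> 28) & 1,  # signed overflow
    }


# Hypothetical register value with only Z and C set (bits 30 and 29).
flags = decode_nzcv(0x60000000)
print(flags)
```

On real hardware the register value would come from an mrs instruction; here it is simply passed in as an integer.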
N
The N flag is set by an instruction if the result is negative. In practice, N is set to the two's complement sign bit of the result (bit 31).
Z
The Z flag is set if the result of the flag-setting instruction is zero.
C
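The C flag is set if an unsigned addition carries out of the 32-bit result register. For subtractions (including cmp), C is instead set when no borrow occurs, that is, when the first operand is greater than or equal to the second as an unsigned value. This bit can be used to implement 64-bit unsigned arithmetic, for example.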
V
The V flag works the same as the C flag, but for signed operations. For example, 0x7fffffff is the largest positive two's complement integer that can be represented in 32 bits, so 0x7fffffff + 0x7fffffff triggers a signed overflow, but not an unsigned overflow (or carry): the result, 0xfffffffe, is correct if interpreted as an unsigned quantity, but represents a negative value (-2) if interpreted as a signed quantity.
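To illustrate the remark about 64-bit unsigned arithmetic, here is a minimal Python model of adding two 64-bit values held as pairs of 32-bit words: the low words are added with a flag-setting add, and the carry is fed into the addition of the high words. The function names are hypothetical, and adc (add with carry) is a real Arm instruction that this article does not otherwise cover.

```python
MASK32 = 0xFFFFFFFF


def adds32(a: int, b: int):
    """Model of `adds`: return the 32-bit result and the carry flag."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


def adc32(a: int, b: int, carry: int) -> int:
    """Model of `adc`: add two words plus the incoming carry."""
    return ((a & MASK32) + (b & MASK32) + carry) & MASK32


def add64(a_lo: int, a_hi: int, b_lo: int, b_hi: int):
    """Add two 64-bit values given as (low word, high word) pairs."""
    lo, carry = adds32(a_lo, b_lo)   # low words; C records the carry out
    hi = adc32(a_hi, b_hi, carry)    # high words plus that carry
    return lo, hi


lo, hi = add64(0xFFFFFFFF, 0x00000000, 0x00000001, 0x00000000)
print(hex(hi), hex(lo))  # 0x1 0x0
```

The same example also shows the distinction drawn above: adds32(0x7fffffff, 0x7fffffff) reports no carry, even though the addition overflows when the operands are read as signed values.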
Consider the following example:
    ldr   r1, =0xffffffff
    ldr   r2, =0x00000001
    adds  r0, r1, r2
The result of the operation would be 0x100000000, but the top bit is lost because it does not fit into the 32-bit destination register and so the real result is 0x00000000. In this case, the flags will be set as follows:
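N = 0: the result is 0, which is not negative.
Z = 1: the result is 0.
C = 1: the addition carried out of bit 31, so the full result did not fit into 32 bits.
V = 0: read as a two's complement signed value, 0xffffffff is -1, so the operation was really (-1) + 1 = 0, which does not overflow.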
If you fancy it, you can check this with the ccdemo application. The output looks like this:
$ ./ccdemo adds 0xffffffff 0x1
The results (in various formats):
    Signed:      -1 adds 1 = 0
    Unsigned:    4294967295 adds 1 = 0
    Hexadecimal: 0xffffffff adds 0x00000001 = 0x00000000
Flags:
    N (negative): 0
    Z (zero)    : 1
    C (carry)   : 1
    V (overflow): 0
Condition Codes:
    EQ: 1  NE: 0
    CS: 1  CC: 0
    MI: 0  PL: 1
    VS: 0  VC: 1
    HI: 0  LS: 1
    GE: 1  LT: 0
    GT: 0  LE: 1
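If you do not have an Arm platform to hand, the flag values can be approximated with a small Python model of a 32-bit flag-setting addition. This is only a sketch of the flag definitions given earlier, not the way ccdemo works (ccdemo reads the real flags from the processor), and the names adds_flags and sign_bit are invented for the example.

```python
MASK32 = 0xFFFFFFFF


def sign_bit(x: int) -> int:
    """Bit 31 of a 32-bit value."""
    return (x >> 31) & 1


def adds_flags(a: int, b: int):
    """Return (result, flags) for a 32-bit flag-setting addition."""
    a &= MASK32
    b &= MASK32
    total = a + b
    result = total & MASK32
    flags = {
        "N": sign_bit(result),
        "Z": int(result == 0),
        "C": int(total > MASK32),  # carry out of bit 31
        # Signed overflow: both operands share a sign that the result lacks.
        "V": int(sign_bit(a) == sign_bit(b) and sign_bit(result) != sign_bit(a)),
    }
    return result, flags


print(adds_flags(0xFFFFFFFF, 0x00000001))
print(adds_flags(0x7FFFFFFF, 0x7FFFFFFF))
```

The first call reproduces the example above (Z and C set), and the second reproduces the signed overflow case from the description of the V flag (N and V set).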
We have worked out how to set the flags, but how does that result in the ability to conditionally execute some code? Being able to set the flags is pointless if you cannot then react to them.
The most common method of testing the flags is to use conditional execution codes. This mechanism is similar to mechanisms used in other architectures, so if you are familiar with other machines you might recognize the following pattern, which maps cleanly onto C's if/else construct:
    cmp   r0, #20
    bhi   do_something_else
do_something:
    @ This code runs if (r0 <= 20).
    b     continue            @ Prevent do_something_else from executing.
do_something_else:
    @ This code runs if (r0 > 20).
continue:
    @ Other code.
In effect, attaching one of the condition codes to an instruction causes it to execute if the condition is true. Otherwise, it does nothing, and is essentially a nop.
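The cmp/bhi pattern can be modelled in a few lines of Python. The sketch below assumes that cmp behaves as a 32-bit subtraction whose result is discarded, with C set when the subtraction does not borrow, and that hi tests (C==1) && (Z==0). The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a: int, b: int) -> dict:
    """Flags produced by `cmp a, b`, i.e. a - b with the result discarded."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    sa, sb, sr = a >> 31, b >> 31, result >> 31
    return {
        "N": sr,
        "Z": int(result == 0),
        "C": int(a >= b),                 # set when there is NO borrow
        "V": int(sa != sb and sr != sa),  # signed overflow on subtraction
    }


def hi(flags: dict) -> bool:
    """Unsigned higher: (C==1) && (Z==0)."""
    return flags["C"] == 1 and flags["Z"] == 0


def classify(r0: int) -> str:
    flags = cmp_flags(r0, 20)   # cmp r0, #20
    if hi(flags):               # bhi do_something_else
        return "do_something_else"
    return "do_something"


for value in (0, 20, 21, 0xFFFFFFFF):
    print(hex(value), classify(value))
```

Note that hi is an unsigned test, so 0xffffffff counts as higher than 20 even though the same bit pattern is -1 when read as a signed value.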
The following table lists the available condition codes, their meanings (where the flags were set by a cmp or subs instruction), and the flags that are tested:
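Code       Meaning (after cmp or subs)               Flags tested
eq         Equal                                     Z==1
ne         Not equal                                 Z==0
cs or hs   Unsigned higher or same (or carry set)    C==1
cc or lo   Unsigned lower (or carry clear)           C==0
mi         Negative ("minus")                        N==1
pl         Positive or zero ("plus")                 N==0
vs         Signed overflow ("V set")                 V==1
vc         No signed overflow ("V clear")            V==0
hi         Unsigned higher                           (C==1) && (Z==0)
ls         Unsigned lower or same                    (C==0) || (Z==1)
ge         Signed greater than or equal              N==V
lt         Signed less than                          N!=V
gt         Signed greater than                       (Z==0) && (N==V)
le         Signed less than or equal                 (Z==1) || (N!=V)
al         Always executed (the default)             None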
It is fairly obvious how the first few work because they test individual flags, but the others rely on specific combinations of flags. In practice, you very rarely need to know exactly what is happening; the mnemonics hide the complexity of the comparisons.
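The flag tests listed above translate directly into boolean expressions. The following sketch evaluates every condition code for a given set of flags; the function name condition_codes is invented for the example, and the sample flags are the ones from the earlier adds example (N=0, Z=1, C=1, V=0).

```python
def condition_codes(n: int, z: int, c: int, v: int) -> dict:
    """Evaluate each condition code against the N, Z, C and V flags."""
    return {
        "EQ": z == 1, "NE": z == 0,
        "CS": c == 1, "CC": c == 0,
        "MI": n == 1, "PL": n == 0,
        "VS": v == 1, "VC": v == 0,
        "HI": c == 1 and z == 0, "LS": c == 0 or z == 1,
        "GE": n == v, "LT": n != v,
        "GT": z == 0 and n == v, "LE": z == 1 or n != v,
        "AL": True,
    }


codes = condition_codes(n=0, z=1, c=1, v=0)
print(" ".join(f"{name}: {int(hit)}" for name, hit in codes.items()))
```

Apart from the extra AL entry, the printed values agree with the condition code list in the ccdemo output shown earlier.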
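Here, once again, is the example for-loop code I gave earlier:
    mov   r4, #10
loop_label:
    bl    do_something
    subs  r4, r4, #1
    bne   loop_label
It should now be easy enough to work out exactly what is happening here:
The subs instruction calculates r4-1, writes the result back to r4, and sets the flags from it. In particular, Z is set only when the result is zero.
The bne instruction branches back to loop_label while Z is clear, so the loop exits once r4 reaches zero.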
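The behaviour of the subs/bne loop can be traced with a short Python simulation. This is a sketch only: it models r4 as a 32-bit value and tracks just the Z flag, and run_loop is a made-up name.

```python
MASK32 = 0xFFFFFFFF


def run_loop(start: int = 10) -> int:
    """Count how many times the loop body (do_something) runs."""
    r4 = start & MASK32             # mov  r4, #10
    calls = 0
    while True:
        calls += 1                  # bl   do_something
        r4 = (r4 - 1) & MASK32      # subs r4, r4, #1
        z = int(r4 == 0)            #      ...which also updates Z
        if z == 1:                  # bne  loop_label (taken only if Z == 0)
            break
    return calls


print(run_loop())  # 10
```

As in the assembly, the test happens after the body, so the body always runs at least once. A starting value of 0 would wrap around to 0xffffffff rather than skip the loop.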
The cmp instruction (that we saw in the first example) can be thought of as a subs instruction that doesn't store its result: if the two operands are equal, the result of the subtraction will be zero, hence the mapping between eq and the Z flag. Of course, we could just use a subs instruction with a dummy destination register, but you can only do that if you have a register to spare. Dedicated comparison instructions are therefore quite commonly used.
There are actually four dedicated comparison instructions available, and they perform operations as described in the following table:
Instruction   Flag-setting equivalent (result discarded)
cmp           subs
cmn           adds
tst           ands
teq           eors
Note that the dedicated comparison operations do not require the s suffix; they only update the flags, so the suffix would be redundant.
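The correspondence between the comparison instructions and their flag-setting equivalents can be sketched as follows. To keep the example short it models only the N and Z flags; C and V are left out, and they are not updated in the same way by the logical operations (tst and teq) as by the arithmetic ones. The function name compare_nz is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def compare_nz(op: str, a: int, b: int) -> dict:
    """N and Z as set by cmp, cmn, tst or teq; the result is discarded."""
    a &= MASK32
    b &= MASK32
    if op == "cmp":      # like subs
        result = a - b
    elif op == "cmn":    # like adds
        result = a + b
    elif op == "tst":    # like ands
        result = a & b
    elif op == "teq":    # like eors
        result = a ^ b
    else:
        raise ValueError(f"unknown comparison: {op}")
    result &= MASK32
    return {"N": result >> 31, "Z": int(result == 0)}


print(compare_nz("cmp", 5, 5))            # equal operands set Z
print(compare_nz("tst", 0b1000, 0b0100))  # no common bits set Z
```

A typical use of tst is to check whether a particular bit is set without disturbing any register, and teq checks two values for equality without affecting the V flag.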
Whilst the condition flag mechanism is fairly simple in principle, there are a lot of details to take in, and seeing some real examples will probably be useful! I will make a point of presenting some examples of realistic usage in a future blog post.
[1] Technically, most instructions can be executed conditionally, not just branches. However, I will discuss such conditional execution in more detail in another article.
[2] The Instruction Set Quick Reference Card summarises the flag-setting abilities of each instruction. The Architecture Reference Manual contains detailed information about exactly how the flags are updated for each instruction.
[3] The APSR and CPSR are actually the same on Armv7, despite having separate names, but only the condition codes and one or two other bits are defined for the APSR. The other bits should not really be accessed directly anyway, so the renaming is essentially a clean-up of the old mixed-access CPSR. Note, however, that GCC (4.3.3 at least) does not accept APSR, so you have to use CPSR in your assembly source if you want to access it.
[4] In general, you will very rarely need to directly access the APSR because the condition codes give you the functionality you usually need from them anyway. However, if you really want to see what is in there, you can access it using the msr and mrs instructions. Indeed, this is the method that the ccdemo application uses to give information about the specified operation.