Regarding the documentation on the T1 encoding of the MOV instruction on ARMv6-M architecture

While reading the documentation on the MOV instruction (section A6.7.40) on the ARMv6-M architecture, I stumbled upon the following in the "Encoding T1" description: "ARMv6-M, ARMv7-M, if <Rd> and <Rm> both from R0-R7. Otherwise all versions of the Thumb instruction set." I have trouble interpreting this.

My first thought was that the T1 encoding can only be used on ARMv6-M, ARMv7-M if the Rd and Rm registers are both from R0-R7, but this doesn't hold as it is perfectly possible to assemble a MOV instruction for (at least) the ARMv6-M architecture using the T1 encoding with Rm and Rd being greater than R7.

I have tried contacting ARM support, but that wasn't very helpful.

0 a.surati over 4 years ago

The encoding T1 of A6.7.40 in doc#1 is the same encoding mov(3) of A7.1.44 in doc#2.

Call this encoding "mov3-T1-encoding". Furthermore, when the encoding is used with both Rd and Rm as low registers (R0-R7), call it "mov3-T1-low-encoding".
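To make the two names concrete, here is a minimal Python sketch of the 16-bit halfword layout. It assumes the published Thumb field layout for this encoding (bits 15:8 = 0x46, D at bit 7, Rm at bits 6:3, the low three bits of Rd at bits 2:0). The function names are hypothetical and only for illustration.

```python
def encode_mov3_t1(rd: int, rm: int) -> int:
    """Return the 16-bit halfword for the non-flag-setting MOV Rd, Rm (mov3-T1-encoding)."""
    if not (0 <= rd <= 15 and 0 <= rm <= 15):
        raise ValueError("register numbers must be in 0..15")
    d = rd >> 3  # the top bit of Rd goes into the D field (bit 7)
    return 0x4600 | (d << 7) | (rm << 3) | (rd & 0x7)


def is_low_form(rd: int, rm: int) -> bool:
    """True when both registers are low (R0-R7), i.e. the mov3-T1-low-encoding case."""
    return rd < 8 and rm < 8


print(hex(encode_mov3_t1(1, 6)), is_low_form(1, 6))  # 0x4631 True
print(hex(encode_mov3_t1(8, 9)), is_low_form(8, 9))  # 0x46c8 False
```

The bit pattern is the same in both cases; "low" only describes which register numbers happen to be in the fields.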

Summary:

The comment of A6.7.40 in doc#1 is not about the ability of the mov3-T1-encoding to receive high registers; it can receive them even on armv4t. It is about whether mov3-T1-low-encoding is valid: it is not on architectures < armv6, and it is on architectures >= armv6.

Some details:

For architectures < armv6, the usage of mov3-T1-low-encoding is declared unpredictable by doc#2.

For architectures >= armv6, doc#1 says that only armv6-m and armv7-m can utilize mov3-T1-low-encoding, which is consistent with the above comment about the unpredictability on architectures < armv6.

However, doc#1 is silent about how to generate mov3-T1-low-encoding. The Operation section for mov3-T1-encoding in doc#1 /seems/ to imply that one can write "mov r1, r6" and the mov3-T1-low-encoding will be generated. But that reading is wrong, at least in the case of Arm's gcc toolchain.

doc#2 has more details on how to generate mov3-T1-low-encoding. It says that if "mov Rd, Rm" is given as source, with both Rd and Rm as low registers, the assembler should generate a flag-setting copy by emitting the encoding for "adds Rd, Rm, #0", which is a different encoding from mov3-T1-encoding/mov3-T1-low-encoding.
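As a rough illustration of how different the two halfwords are, the sketch below builds both for the same register pair. It assumes the usual Thumb layouts: "adds Rd, Rn, #imm3" as 0001110 imm3 Rn Rd, and the mov3-T1-encoding as 01000110 D Rm Rd. The helper names are made up for this example.

```python
def encode_adds_imm3(rd: int, rn: int, imm3: int = 0) -> int:
    """Halfword for the flag-setting ADDS Rd, Rn, #imm3 (low registers only)."""
    if not (0 <= rd <= 7 and 0 <= rn <= 7 and 0 <= imm3 <= 7):
        raise ValueError("adds with a 3-bit immediate takes low registers and imm3 in 0..7")
    return 0x1C00 | (imm3 << 6) | (rn << 3) | rd


def encode_mov3_t1(rd: int, rm: int) -> int:
    """Halfword for the non-flag-setting MOV Rd, Rm (mov3-T1-encoding)."""
    return 0x4600 | ((rd >> 3) << 7) | (rm << 3) | (rd & 0x7)


# Same operands (r1 <- r6), two different instructions:
print(hex(encode_adds_imm3(1, 6)))  # 0x1c31, sets the flags
print(hex(encode_mov3_t1(1, 6)))    # 0x4631, leaves the flags alone
```

Both copy r6 into r1; they differ in the opcode bits and in whether the condition flags are updated.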

To actually generate mov3-T1-low-encoding as output, the source must use "cpy Rd, Rm", where cpy is a mnemonic specifically intended to be used for the purpose of generating a non-flag-setting copy between low registers. The mnemonic "cpy" is available on architectures >= armv6.

So, there are two sides to the issue: The behaviour of the cpu when encountering a mov3-T1-low-encoding, and the generation of the said encoding.

If a cpu adhering to an architecture < armv6 encountered the mov3-T1-low-encoding, the results are declared unpredictable. Thus, a toolchain is not supposed to generate mov3-T1-low-encoding when building for architectures < armv6.
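If you want to check a binary built for an older core, a simple mask test is enough to spot the low form. This is only a sketch: it assumes you already have the code as a list of 16-bit halfwords, and it cannot tell instructions from literal-pool data or from the second halfword of a BL pair, so treat hits as candidates to inspect by hand. The function names are hypothetical.

```python
def is_mov3_t1_low(halfword: int) -> bool:
    """True if the halfword is the mov3-T1-encoding with both registers low.

    Bits 15:8 must be 0x46, D (bit 7) must be 0 and the top bit of Rm (bit 6) must be 0.
    """
    return (halfword & 0xFFC0) == 0x4600


def find_low_forms(halfwords):
    """Return the indices of halfwords that match the low form."""
    return [i for i, hw in enumerate(halfwords) if is_mov3_t1_low(hw)]


# mov r1, r6 (low form), mov r8, r9 (high registers), adds r1, r6, #0
sample = [0x4631, 0x46C8, 0x1C31]
print(find_low_forms(sample))  # [0]
```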

Evidently, Arm's gcc toolchain adheres to this rule. It emits encoding-for-"adds r1, r6, #0" when assembling "mov r1, r6" as thumb not only for armv4 and armv5, but also for armv6, armv7 (and possibly more that I did not test). That is, mov-between-low-registers in assembler source code is always treated as flag-setting.

It emits the mov3-T1-low-encoding (which is non-flag-setting) when assembling "cpy Rd, Rm" as thumb, for any high-low combination of legal register-arguments of cpy.

For the "mov Rd, Rm" thumb instruction in the source code:

If [architecture < armv6] and [both registers are low], then generate the [flag-setting encoding for "adds Rd, Rm, #0"]

If [architecture < armv6] and [at least one register is high], then generate the [non-flag-setting mov3-T1-encoding]

If [architecture >= armv6] and [both registers are low], then generate the [flag-setting encoding for "adds Rd, Rm, #0"]

If [architecture >= armv6] and [at least one register is high], then generate the [non-flag-setting mov3-T1-encoding]

Edit: All of the above in context of thumb instructions only.
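The rules above can be written down as a small Python model. This only mirrors the behaviour described in this post for Arm's gcc toolchain; it is not the assembler's real logic, and the function names and the integer "arch" parameter (4 for armv4t, 6 for armv6, and so on) are assumptions made for the example.

```python
def encode_mov3_t1(rd: int, rm: int) -> int:
    """Non-flag-setting MOV Rd, Rm (mov3-T1-encoding)."""
    return 0x4600 | ((rd >> 3) << 7) | (rm << 3) | (rd & 0x7)


def encode_adds_imm0(rd: int, rn: int) -> int:
    """Flag-setting ADDS Rd, Rn, #0 (low registers only)."""
    return 0x1C00 | (rn << 3) | rd


def assemble_thumb_copy(mnemonic: str, rd: int, rm: int, arch: int) -> int:
    """Model of which halfword comes out for 'mov'/'cpy' in Thumb source."""
    if mnemonic == "cpy":
        if arch < 6:
            raise ValueError("cpy is only available on architectures >= armv6")
        return encode_mov3_t1(rd, rm)       # any high/low combination
    if mnemonic == "mov":
        if rd < 8 and rm < 8:
            return encode_adds_imm0(rd, rm)  # both low: flag-setting, on every architecture
        return encode_mov3_t1(rd, rm)        # at least one high register
    raise ValueError("unsupported mnemonic: " + mnemonic)


print(hex(assemble_thumb_copy("mov", 1, 6, arch=4)))  # 0x1c31
print(hex(assemble_thumb_copy("mov", 1, 6, arch=7)))  # 0x1c31
print(hex(assemble_thumb_copy("mov", 8, 1, arch=4)))  # 0x4688
print(hex(assemble_thumb_copy("cpy", 1, 6, arch=6)))  # 0x4631
```

Note that the architecture only matters for "cpy"; for "mov" the four rules collapse into a single test on whether both registers are low.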
0 B. Robertson over 4 years ago in reply to a.surati

Thank you very much! It's all clear now. I interpreted the implication the other way around, i.e. if you use the T1 encoding on ARMv6-M, both Rd and Rm have to be low registers. Thanks again for your quick and detailed reply.