ARM/THUMB instructions that change execution path?

Has anybody come across a list of ARM & THUMB instructions that cause deviation from the linear instruction stream?

I've been trying to figure out gdb-stub single stepping using software interrupts, and in single stepping you need to find the next instruction(s) where the next breakpoint instruction needs to be set.

There are three cases:

1) current instruction doesn't change the execution path. Next instruction is the next word.

2) current instruction is a jump. The operand defines the next instruction address

3) current instruction is a conditional branch. One possible next instruction is the next word, the other possible instruction address is defined by the operand. (That includes conditional add with PC as the target, and the like).

To implement single stepping, I need to tell those cases apart and figure out how to find out the possible branching address.
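As an illustration of the three cases, here is a minimal C sketch that handles only the ARM-state B/BL encoding (bits 27..25 == 101). The type `next_addrs` and the function `arm_next_b_bl` are hypothetical names, and every other instruction is reported as case 1, which is wrong for things like BX or loads into PC. It is meant to show the shape of the problem, not a complete stepper.

```c
#include <stdint.h>
#include <stdbool.h>

/* Possible successors of one ARM-state instruction (hypothetical type). */
typedef struct {
    uint32_t fallthrough;   /* address of the next word (case 1) */
    uint32_t target;        /* branch target, valid only if is_branch */
    bool     is_branch;     /* case 2 or case 3 */
    bool     conditional;   /* case 3: both addresses need a breakpoint */
} next_addrs;

/* Handles only ARM-state B/BL; anything else is treated as case 1. */
static next_addrs arm_next_b_bl(uint32_t pc, uint32_t insn)
{
    next_addrs r = { pc + 4u, 0u, false, false };
    uint32_t cond = insn >> 28;

    if (cond != 0xFu && ((insn >> 25) & 7u) == 5u) {
        /* imm24 is a signed word offset relative to pc + 8 */
        uint32_t off = (insn & 0x00FFFFFFu) << 2;
        if (off & 0x02000000u)
            off |= 0xFC000000u;          /* sign-extend 26 -> 32 bits */
        r.target      = pc + 8u + off;
        r.is_branch   = true;
        r.conditional = (cond != 0xEu);  /* 0xE = AL (always) */
    }
    return r;
}
```

For a conditional branch the stub would plant a breakpoint at both `fallthrough` and `target`; for an unconditional one only at `target`.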

I could go through manuals of numerous processors instruction by instruction and maybe I'd be done within the next couple of years, or I could find a list of instructions to check, or a paper that explains how to "decode" the instructions in a useful way.

Also, there don't seem to be many sources for ARM gdb servers or stubs that use software breakpoints.

Parents
  • That's no fun at all. I hope you've got them all now.

    When I wrote my disassembler/debugger, I recall how I went through each page of the book; actually I took all the integer instructions first.

    When I was done, I had a break and worked on other things, then it struck me that I could make a floating point emulator, so I got the FFP library and added the entire FPU instruction set.

    It probably took a few months before I had all instructions.

    Hopefully the script will take some of the burden off.

    ... When you arrange the table entries and there are instructions where one has a known bit in one place and the other has a known bit in another place, it will be necessary to find out which states are valid and which are invalid. The one with invalid states should go after the one that does not have invalid states.

    For instance, this instruction is invalid:

         add pc,pc,pc

    So ARM decided to recycle the opcode space (there isn't much opcode space left, so this is a good thing, though it makes writing tools more complex).

    The above instruction contains an invalid combination of registers (I believe PC is not allowed in operand2, but I might be wrong; it might be only when PC is the destination), so the instruction that takes the seat of add pc,pc,pc should go before the add instruction.

    I think it might be a good idea to modify the script for the following checks:

    1: is PC in operand2.

    2: is PC the destination register.

    3: is SP in operand2.

    4: is SP the destination register.

    I've forgotten other rules, but the above seem to be used a few times.

    Also ... some bitfield instructions are not allowed.

    Rule: BFI and BFC: Start+Length must be 32 or less.

    Eg. BFI r3,r6,#23,#16 is illegal
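A rough C sketch of that rule follows. The function names are made up for illustration. The second function relies on the ARM (A1) encoding of BFI/BFC, where the width is not stored directly: bits 20..16 hold msb = start + length - 1 and bits 11..7 hold the start (lsb), so an impossible combination shows up as msb < lsb.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assembler-level rule from the text: start + length must be 32 or less. */
static bool bitfield_operands_ok(unsigned lsb, unsigned width)
{
    return width >= 1u && lsb <= 31u && lsb + width <= 32u;
}

/* Same check on an already encoded ARM-state BFI/BFC word (hypothetical
   helper): the field pair is only meaningful when msb >= lsb. */
static bool arm_bfi_fields_ok(uint32_t insn)
{
    unsigned msb = (insn >> 16) & 0x1Fu;   /* bits 20..16 */
    unsigned lsb = (insn >> 7) & 0x1Fu;    /* bits 11..7  */
    return msb >= lsb;
}
```

With these, `bitfield_operands_ok(23, 16)` is false, matching the BFI r3,r6,#23,#16 example.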

    When you reach the thumb instruction set and thumb2, it's important to read about the "restrictions" for each instruction.

    The 16-bit thumb instructions only allow operations on r0...r7, except for very few instructions:

    ADD r7,r7,r10  /* note: destination must be the same as operand1 (the opcode actually only has room for 2 registers) */

    MOV r3,r11

    CMP r2,r9

    The rest of them do not allow operations on r8...r15, except for ADD and SUB with SP and PC (but that's a special case. ADD and SUB #imm also allow a different range on those two registers).
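For single stepping, the high-register forms matter because they are the 16-bit Thumb instructions that can name PC as the destination. A small sketch, assuming the 16-bit Thumb group with bits 15..10 == 010001 (ADD, CMP, MOV with high registers, plus BX/BLX); the function name is hypothetical:

```c
#include <stdint.h>
#include <stdbool.h>

/* 16-bit Thumb high-register group: bits 15..10 == 010001,
   bits 9..8 select ADD (00), CMP (01), MOV (10) or BX/BLX (11). */
static bool thumb16_hireg_changes_pc(uint16_t hw)
{
    if ((hw & 0xFC00u) != 0x4400u)
        return false;                            /* not in this group */

    unsigned op = (hw >> 8) & 3u;
    unsigned rd = ((hw >> 4) & 8u) | (hw & 7u);  /* DN:Rdn, 4-bit register */

    if (op == 3u)
        return true;                             /* BX / BLX always branch */
    if (op == 1u)
        return false;                            /* CMP writes no register */
    return rd == 15u;                            /* ADD pc,Rm or MOV pc,Rm */
}
```

For example, 0x46F7 (MOV pc,lr) and 0x4770 (BX lr) are reported as changing the execution path, while 0x4457 (ADD r7,r10) is not.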

    I don't know everything about the instruction sets, but I'll try and write whatever I remember.

    I really feel like writing a disassembler, but unfortunately, I do not have the time. :/

Children
  • Have to check "add pc,pc,pc", but basically add with PC as a destination is a special instruction:

    The SUBS PC, LR, #<const> instruction provides an exception return without the use of the stack. It subtracts the

    immediate constant from LR, branches to the resulting address, and also copies the SPSR to the CPSR.

    ...

    Encoding A2 ARMv4*, ARMv5T*, ARMv6*, ARMv7

    <opc1>S<c> PC, <Rn>, <Rm>{, <shift>}

    <opc2>S<c> PC, <Rm>{, <shift>}

    <opc3>S<c> PC, <Rn>, #<const>

    RRXS<c> PC, <Rn>

    ...

    SUBS{<c>}{<q>} PC, LR, #<const> Encoding A1

    <opc1>S{<c>}{<q>} PC, <Rn>, #<const> Encoding A1

    <opc1>S{<c>}{<q>} PC, <Rn>, <Rm> {, <shift>} Encoding A2, deprecated

    <opc2>S{<c>}{<q>} PC, #<const> Encoding A1, deprecated

    <opc2>S{<c>}{<q>} PC, <Rm> {, <shift>} Encoding A2

    <opc3>S{<c>}{<q>} PC, <Rn>, #<const> Encoding A2, deprecated

    RRXS{<c>}{<q>} PC, <Rn> Encoding A2, deprecated

    ...

    <opc1> The operation. <opc1> is one of ADC, ADD, AND, BIC, EOR, ORR, RSB, RSC, SBC, and SUB. ARM deprecates

    the use of all of these operations except SUB.

    <opc2> The operation. <opc2> is MOV or MVN. ARM deprecates the use of MVN.

    <opc3> The operation. <opc3> is ASR, LSL, LSR, or ROR. ARM deprecates the use of all of these operations.

    Also, I'm not sure if the assembler lets instructions with lsb + length > 32 through.

    Also, I'm allowing more than just user level. That drops quite a few restrictions.

    In the "assembly" group I got a hint that I should also decode UNPREDICTABLE instructions, though warning about them would be nice.

    That, I heard, is the convention with debuggers.
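To make the "PC as destination" case concrete, here is a hedged C sketch restricted to the ARM-state data-processing (immediate) class, bits 27..25 == 001. It deliberately ignores opcodes 8..11, which are either TST/TEQ/CMP/CMN (S = 1) or belong to a different instruction class (S = 0). The register forms are left out because that part of the opcode space is shared with multiplies, extra loads/stores and miscellaneous instructions. Function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if an ARM-state data-processing (immediate) instruction writes PC. */
static bool arm_dp_imm_writes_pc(uint32_t insn)
{
    if ((insn >> 28) == 0xFu)
        return false;                     /* unconditional space, not handled */
    if (((insn >> 25) & 7u) != 1u)
        return false;                     /* not bits 27..25 == 001 */

    unsigned opcode = (insn >> 21) & 0xFu;
    if ((opcode & 0xCu) == 0x8u)
        return false;                     /* opcodes 8..11: ignored here */

    return ((insn >> 12) & 0xFu) == 15u;  /* Rd == PC */
}

/* With the S bit set, the same form is the exception return quoted above
   (SUBS PC, LR, #<const> and friends): it also copies SPSR to CPSR. */
static bool arm_dp_imm_is_exception_return(uint32_t insn)
{
    return arm_dp_imm_writes_pc(insn) && ((insn >> 20) & 1u) != 0u;
}
```

As an example, 0xE25EF004 is SUBS PC, LR, #4, so both functions return true for it; the new PC is then LR - 4 rather than an address encoded in the instruction.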

  • I'd like to explain a little better what I mean by the BFI and BFC:

    If you come across an opcode, where the start + length is > 32, then it's not a BFI or BFC instruction.

    That means that any opcodes with those values, must be handled before the BFI and BFC opcodes.

    In other words: The mask+data for those opcodes must be preceding the mask+data for the BFI and BFC.
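The ordering idea can be sketched as a first-match table in C. The BFI/BFC case needs an arithmetic check rather than a plain mask, so this sketch uses a different, simpler pair as a stand-in: BX occupies a corner of what would otherwise look like a data-processing (register-shifted register) encoding, so its entry has to come first. The table contents and names are illustrative only and not taken from the instruction list discussed here.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint32_t    mask;   /* which bits are fixed for this entry */
    uint32_t    data;   /* required values of those bits */
    const char *name;
} decode_entry;

/* Order matters: the more specific entry must precede the general one. */
static const decode_entry decode_table[] = {
    { 0x0FFFFFF0u, 0x012FFF10u, "BX" },
    { 0x0E000090u, 0x00000010u, "data-processing (register-shifted register)" },
};

/* Returns the index of the first matching entry, or -1 (treated as UNDEFINED). */
static int decode_index(uint32_t insn)
{
    for (size_t i = 0; i < sizeof decode_table / sizeof decode_table[0]; i++) {
        if ((insn & decode_table[i].mask) == decode_table[i].data)
            return (int)i;
    }
    return -1;
}
```

If the two entries were swapped, BX lr (0xE12FFF1E) would be picked up by the general entry, which is exactly the kind of false positive the ordering is meant to avoid.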

  • turboscrew wrote:

    These need to be recognized as not UNDEFINED.

    If they're all valid, then just handle them before you check for UNDEFINED.

  • No, not again!

    It'll take another day or two to figure these out!

    1 1 1 1 0 0 1 0 0 D f z n n n n d d d d 1 1 0 1 N Q M 1 m m m m     V<op><c>.F32_<Qd>,_<Qn>,_<Qm>_V<op><c>.F32_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.337

    1 1 1 1 0 0 1 0 0 D f z n n n n d d d d 1 1 1 1 N Q M 0 m m m m     V<op><c>.F32_<Qd>,_<Qn>,_<Qm>_V<op><c>.F32_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.335

    1 1 1 1 0 0 1 0 1 D z z n n n n d d d d 0 f 1 1 N 1 M 0 m m m m     VQD<op><c>.<dt>_<Qd>,_<Dn>,_<Dm[x]> T2/A2 A8.8.371

    1 1 1 1 0 0 1 0 1 D z z n n n n d d d d 1 0 f 1 N 0 M 0 m m m m     VQD<op><c>.<dt>_<Qd>,_<Dn>,_<Dm> T1/A1 A8.8.371

    1 1 1 1 0 0 1 1 0 D f f n n n n d d d d 0 0 0 1 N Q M 1 m m m m     V<op><c>_<Qd>,_<Qn>,_<Qm>_V<op><c>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.290

    1 1 1 1 0 0 1 1 0 D f z n n n n d d d d 1 1 1 0 N Q M 1 m m m m     V<op><c>.F32_<Qd>,_<Qn>,_<Qm>_V<op><c>.F32_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.281

    1 1 1 1 0 0 1 1 0 D f z n n n n d d d d 1 1 1 1 N Q M 0 m m m m     VP<op><c>.F32_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.366

    1 1 1 1 0 0 1 1 1 D 1 1 n n n n d d d d 1 0 z z N f M 0 m m m m     V<op><c>.8_<Dd>,_<list>,_<Dm> T1/A1 A8.8.419

    1 1 1 1 0 0 1 Q 1 D z z n n n n d d d d 0 f 0 F N 1 M 0 m m m m     V<op><c>.<dt>_<Qd>,_<Qn>,_<Dm[x]>_V<op><c>.<dt>_<Dd>,_<Dn>,_<Dm[x]> T1/A1 A8.8.338

    1 1 1 1 0 0 1 U 0 D z z n n n n d d d d 0 0 f 0 N Q M 0 m m m m     VH<op><c>_<Qd>,_<Qn>,_<Qm>_VH<op><c>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.319

    1 1 1 1 0 0 1 U 0 D z z n n n n d d d d 0 1 1 0 N Q M f m m m m     V<op><c>.<dt>_<Qd>,_<Qn>,_<Qm>_V<op><c>.<dt>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.334

    1 1 1 1 0 0 1 U 0 D z z n n n n d d d d 1 0 1 0 N Q M f m m m m     VP<op><c>.<dt>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.365

    1 1 1 1 0 0 1 U 1 D z z n n n n d d d d 0 f 1 0 N 1 M 0 m m m m     V<op>L<c>.<dt>_<Qd>,_<Dn>,_<Dm[x]> T2/A2 A8.8.338

    1 1 1 1 0 0 1 U 1 D z z n n n n d d d d 1 0 f 0 N 0 M 0 m m m m     V<op>L<c>.<dt>_<Qd>,_<Dn>,_<Dm> T2/A2 A8.8.336

    1 1 1 1 0 0 1 f 0 D z z n n n n d d d d 1 0 0 1 N Q M 0 m m m m     V<op><c>.<dt>_<Qd>,_<Qn>,_<Qm>_V<op><c>.<dt>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.336

    These need to be recognized as not UNDEFINED.

    I'm very close to losing my mind, and I'm surely getting very tired of fighting this.

    I really had to do some work to find out the right 'MOV' in:

    Encoding A2 ARMv4*, ARMv5T*, ARMv6*, ARMv7

    <opc1>S<c> PC, <Rn>, <Rm>{, <shift>}

    <opc2>S<c> PC, <Rm>{, <shift>}

    <opc3>S<c> PC, <Rn>, #<const>

    RRXS<c> PC, <Rn>

    <opc2> The operation. <opc2> is MOV or MVN. ARM deprecates the use of MVN.

    c c c c 0 0 0 1 1 0 1 S (0) (0) (0) (0) 1 1 1 1 0 0 0 0 0 0 0 0 m m m m     MOV{S}<c>_PC,_<Rm>_(=_LSL{S}<c>_PC,_<Rm>,_#0) A2 B9.3.20

    (That is LSL (reg) with Rd=PC and immediate = 0).

    [EDIT]

    I frigging knew it (bolding is mine):

    A8.8.290 VBIF, VBIT, VBSL

    Encoding T1/A1 Advanced SIMD

    V<op><c> <Qd>, <Qn>, <Qm>

    V<op><c> <Dd>, <Dn>, <Dm>

    if op == ‘00’ then SEE VEOR;

    if op == ‘01’ then operation = VBitOps_VBSL;

    if op == ‘10’ then operation = VBitOps_VBIT;

    if op == ‘11’ then operation = VBitOps_VBIF;

    and

    A8.8.281 VACGE, VACGT, VACLE, VACLT

    Encoding T1/A1 Advanced SIMD (UNDEFINED in integer-only variant)

    V<op><c>.F32 <Qd>, <Qn>, <Qm>

    V<op><c>.F32 <Dd>, <Dn>, <Dm>

    Assembler syntax

    where:

    <op> The operation. It must be one of:

    ACGE Absolute Compare Greater than or Equal, encoded as op = 0.

    ACGT Absolute Compare Greater Than, encoded as op = 1.

    What!

    What happened to ACLE and ACLT?

    What's the phone number of Sherlock Holmes?

    Aha:

    VACLE (Vector Absolute Compare Less Than or Equal) is a pseudo-instruction, equivalent to a VACGE instruction with

    the operands reversed. Disassembly produces the VACGE instruction.

    VACLT (Vector Absolute Compare Less Than) is a pseudo-instruction, equivalent to a VACGT instruction with the

    operands reversed. Disassembly produces the VACGT instruction.

    [/EDIT]

  • Yep. All instructions not matching the table are considered UNDEFINED.

    It means that all those 'new' instructions must be added to the table too. (Sigh.)

    Oh well, it's just a couple of hundred instructions more...

  • When I wrote my disassembler, there were illegal instructions, which occupied parts of legal instruction space.

    In some cases, I had to make a special "illegal instruction" handling; eg. place that before the actual decoded instruction.

    ...

    I'm quite impressed with all your work. You've absolutely done a lot in very little time!

    I just took a look at the ARM_instructions.txt ...

    If ignoring the condition-codes, you got 8 bits, which are almost always known.

    Speed-wise it might be a real good idea to do this:

    index = 0xff & (opcode >> 20);          /* isolate instruction group */

    handleGroup[index](opcode);          /* jump directly to group handler */

    That means you'll shave several clock cycles off your execution time, without really sacrificing anything.
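A self-contained version of that idea might look like the sketch below. The handler names, the `int` return value and the `init_groups` helper are assumptions made for illustration; only the index computation comes from the two lines above.

```c
#include <stdint.h>

typedef int (*group_handler)(uint32_t opcode);

/* Placeholder handlers (hypothetical): return a tag instead of decoding. */
static int handle_unknown(uint32_t opcode) { (void)opcode; return -1; }
static int handle_branch(uint32_t opcode)  { (void)opcode; return 1; }

static group_handler handleGroup[256];

static void init_groups(void)
{
    for (unsigned i = 0; i < 256u; i++)
        handleGroup[i] = handle_unknown;
    /* bits 27..25 == 101 (B/BL) covers indices 0xA0..0xBF */
    for (unsigned i = 0xA0u; i <= 0xBFu; i++)
        handleGroup[i] = handle_branch;
}

static int dispatch(uint32_t opcode)
{
    uint32_t index = 0xFFu & (opcode >> 20);   /* isolate bits 27..20 */
    return handleGroup[index](opcode);         /* jump to group handler */
}
```

After `init_groups()`, `dispatch(0xEAFFFFFE)` lands in `handle_branch` because bits 27..20 of that word are 0xAF.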

    In assembly language it could of course be just a simple jump-table; r0 = opcode:

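    handle_group:

         ubfx r1,r0,#20,#8          /* r1 = opcode bits 27...20 */

         adr r2,table               /* r2 = address of the handler table */

         ldr pc,[r2,r1,lsl#2]       /* jump through the 4-byte table entry */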

    table:

         .4byte     MUL_AND_Group

         .4byte     MUL_AND_Group

         .4byte     MLA_EOR_Group

         .4byte     MLA_EOR_Group

         .4byte     UMAAL_SUB_Group

         .4byte     SUB_Group

         .4byte     MLS_RSB_Group

         .4byte     RSB_Group

         ...

         ...

    A 256-entry table is fairly small on a RasPi.

    You can do this for "bits which are always known", but you can even extend it to include "bits which are often known".

    Bits which are often known could include bit 4 and perhaps bits 8...11, but it might be a good idea to wait until you have the complete table before deciding which bits to include.

    The above assembly code can then be declared as a function like this ...

    void handle_group(uint32_t aOpcode);

    and called that way; it'll indirectly jump to a C function, spending just a few clock cycles in total.

    After that, you can probably focus on the low 20 bits, but in 16-bit thumb, there might be needs for modification, because 16-bit thumb does not have the 4-bit condition code field.

  • Unfortunately it's not so simple - this kind of thing will mess it up, causing "false positives":

    cccc00010B00nnnntttt(0)(0)(0)(0)1001TTTT     SWP{B}<c>_<Rt>,_<Rt2>,_[<Rn>] A1 A8.8.229
    cccc00010R00(1)(1)(1)(1)dddd(0)(0)0(0)0000(0)(0)(0)(0)     MRS<c>_<Rd>,_<spec_reg> A1 B9.3.8
    cccc00010R00MMMMdddd(0)(0)1M0000(0)(0)(0)(0)     MRS<c>_<Rd>,_<banked_reg> A1 B9.3.9
    cccc00010R10MMMM(1)(1)(1)(1)(0)(0)1M0000nnnn     MSR<c>_<banked_reg>,_<Rn> A1 B9.3.10
    cccc00010R10mmmm(1)(1)(1)(1)(0)(0)0(0)0000nnnn     MSR<c>_<spec_reg>,_<Rn> A1 B9.3.12
    cccc000PU0W0nnnntttt(0)(0)(0)(0)1011mmmm     STRH<c>_<Rt>,_[<Rn>,+/-<Rm>]{!}_STRH<c>_<Rt>,_[<Rn>],+/-<Rm> A1 A8.8.218
    cccc000PU0W0nnnntttt(0)(0)(0)(0)1101mmmm     LDRD<c>_<Rt>,_<Rt2>,_[<Rn>,+/-<Rm>]{!}_LDRD<c>_<Rt>,_<Rt2>,_[<Rn>],+/-<Rm> A1 A8.8.74
    cccc000PU0W0nnnntttt(0)(0)(0)(0)1111mmmm     STRD<c>_<Rt>,_<Rt2>,_[<Rn>,+/-<Rm>]{!}_STRD<c>_<Rt>,_<Rt2>,_[<Rn>],+/-<Rm> A1 A8.8.211
    cccc000PU0W1nnnntttt(0)(0)(0)(0)1011mmmm     LDRH<c>_<Rt>,_[<Rn>,+/-<Rm>]{!}_LDRH<c>_<Rt>,_[<Rn>],+/-<Rm> A1 A8.8.82
    cccc000PU0W1nnnntttt(0)(0)(0)(0)1101mmmm     LDRSB<c>_<Rt>,_[<Rn>,+/-<Rm>]{!}_LDRSB<c>_<Rt>,_[<Rn>],+/-<Rm> A1 A8.8.86
    cccc000PU0W1nnnntttt(0)(0)(0)(0)1111mmmm     LDRSH<c>_<Rt>,_[<Rn>,+/-<Rm>]{!}_LDRSH<c>_<Rt>,_[<Rn>],+/-<Rm> A1 A8.8.90
    cccc000PU1W0nnnnttttxxxx1011xxxx     STRH<c>_<Rt>,_[<Rn>{,_#+/-<imm8>}]_STRH<c>_<Rt>,_[<Rn>],_#+/-<imm8>_STRH<c>_<Rt>,_[<Rn>,_#+/-<imm8>]! A1 A8.8.217
    cccc000PU1W0nnnnttttxxxx1101xxxx     LDRD<c>_<Rt>,_<Rt2>,_[<Rn>{,_#+/-<imm8>}]_LDRD<c>_<Rt>,_<Rt2>,_[<Rn>],_#+/-<imm8>_LDRD<c>_<Rt>,_<Rt2>,_[<Rn>,_#+/-<imm8>]! A1 A8.8.72
    cccc000PU1W0nnnnttttxxxx1111xxxx     STRD<c>_<Rt>,_<Rt2>,_[<Rn>{,_#+/-<imm8>}]_STRD<c>_<Rt>,_<Rt2>,_[<Rn>],_#+/-<imm8>_STRD<c>_<Rt>,_<Rt2>,_[<Rn>,_#+/-<imm8>]! A1 A8.8.210
    cccc000PU1W11111ttttxxxx1011xxxx     LDRH<c>_<Rt>,_<label>_LDRH<c>_<Rt>,_[PC,_#-0]_Special_case A1 A8.8.81
    cccc000PU1W11111ttttxxxx1101xxxx     LDRSB<c>_<Rt>,_<label>_LDRSB<c>_<Rt>,_[PC,_#-0]_Special_case A1 A8.8.85
    cccc000PU1W11111ttttxxxx1111xxxx     LDRSH<c>_<Rt>,_<label>_LDRSH<c>_<Rt>,_[PC,_#-0]_Special_case A1 A8.8.89
    cccc000PU1W1nnnnttttxxxx1011xxxx     LDRH<c>_<Rt>,_[<Rn>{,_#+/-<imm8>}]_LDRH<c>_<Rt>,_[<Rn>],_#+/-<imm8>_LDRH<c>_<Rt>,_[<Rn>,_#+/-<imm8>]! A1 A8.8.80
    cccc000PU1W1nnnnttttxxxx1101xxxx     LDRSB<c>_<Rt>,_[<Rn>{,_#+/-<imm8>}]_LDRSB<c>_<Rt>,_[<Rn>],_#+/-<imm8>_LDRSB<c>_<Rt>,_[<Rn>,_#+/-<imm8>]! A1 A8.8.84
    cccc000PU1W1nnnnttttxxxx1111xxxx     LDRSH<c>_<Rt>,_[<Rn>{,_#+/-<imm8>}]_LDRSH<c>_<Rt>,_[<Rn>],_#+/-<imm8>_LDRSH<c>_<Rt>,_[<Rn>,_#+/-<imm8>]! A1 A8.8.88

    When the first three bits after the condition code field (27, 26, 25) are 0 0 0, the only bit that is a fixed, instruction-specific '0' or '1' is bit 4.

    If bit 4 is '0', you can use bits 24 and 23 next, but if bit 4 is '1', you have to check bit 7 to find out which bits come next.

    If bit 7 is '0', the next bits are 24 and 23; if bit 7 is '1', the next bits are 6 and 5.

    And so on.

    If bits 27, 26 and 25 are 0 1 0, there is only one instruction, the single data transfer.

    In the list it's listed as in the manual, but in reality it's:

    cccc010PUBWLnnnnttttxxxxxxxxxxxx     A1 A8.8.204

    P = 1, pre-indexing, otherwise post-indexing or offset

    U = 1 offset is added, otherwise offset is subtracted

    B = 1 byte access, else word access

    W = 1 writeback, else no writeback

    L = 1 load, else store
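Pulling those fields out of the word is mechanical. A minimal C sketch, assuming the bit positions P = 24, U = 23, B = 22, W = 21, L = 20, Rn = 19..16 and Rt = 15..12 from the pattern above; the struct and function names are made up:

```c
#include <stdint.h>
#include <stdbool.h>

/* Fields of the ARM single data transfer pattern cccc010PUBWLnnnntttt... */
typedef struct {
    bool     pre_indexed;   /* P */
    bool     add_offset;    /* U */
    bool     byte_access;   /* B */
    bool     writeback;     /* W */
    bool     load;          /* L */
    unsigned rn;            /* base register */
    unsigned rt;            /* transfer register */
    unsigned imm12;         /* immediate offset, bits 11..0 */
} sdt_fields;

static sdt_fields decode_sdt(uint32_t insn)
{
    sdt_fields f;
    f.pre_indexed = (insn >> 24) & 1u;
    f.add_offset  = (insn >> 23) & 1u;
    f.byte_access = (insn >> 22) & 1u;
    f.writeback   = (insn >> 21) & 1u;
    f.load        = (insn >> 20) & 1u;
    f.rn          = (insn >> 16) & 0xFu;
    f.rt          = (insn >> 12) & 0xFu;
    f.imm12       = insn & 0xFFFu;
    return f;
}
```

For single stepping, the interesting combination is `load` with `rt == 15`: the loaded word becomes the next PC, so the stub has to read memory to find the next address.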

    It's a bit different with the media or special LD/ST instructions.

    Sometimes the fields above are not used as such but are just part of the opcode; sometimes B = 1 means register, else immediate(?).

    BTW, there will be another update to the ARM instruction list, and to the spreadsheet. I won't look into Thumb until I've learned to do this better by playing with the ARM instructions first.

  • I think a good way might be:

    check condition code

        if it's 1 1 1 1 then specials

        else normal

    check bits 27-25 (with both specials and normal)

    then apply table if instruction subset contains several instructions

    This way the huge number of instructions is split into 16 subsets, some of them having only a couple of instructions.

    (NOTE: the floating point and vector instructions are in the special instructions - and there are lots of them.)
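The split into 16 subsets can be written as a tiny index function. This is only a sketch of the scheme described above; the function name and the numbering (0..7 for normal, 8..15 for specials) are my own choices.

```c
#include <stdint.h>

/* Subset index: bit 3 = "special" (condition field == 1111),
   bits 2..0 = instruction bits 27..25. */
static unsigned arm_subset(uint32_t insn)
{
    unsigned special = ((insn >> 28) == 0xFu) ? 1u : 0u;
    unsigned group   = (insn >> 25) & 7u;
    return (special << 3) | group;
}
```

Each of the 16 values would then select its own (much smaller) mask+data table, or a single handler where the subset holds only one instruction.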

    And if that's still far too much, I'll put them in a hash-table! That should, frigging, do it!

  • It would really be excellent, if ARM provided the instruction set as an XML file.

    No matter which path you take, make sure you create some kind of automation script; it's tedious to do the actual code and table by hand.

    Perl is very good for processing text-files (because of the excellent RegEx). It's currently my preferred choice, especially because you don't have to wait forever for it to compile.

  • I've been using awk, sed, sort and geany's regex + hand editing first. Then I load the file as CSV into LibreOffice Calc and do some editing there too. It's also tedious to go through the instructions (499 ARM instructions, I don't dare to think about Thumb instructions yet) and assign a handler to them - what handlers do I need and which handler to which instruction.

    I'm not eager to crash-learn Perl at this point.

    Oh, and I committed new versions of the text file and spreadsheet of the ARM instructions.

    All the instructions are (I really hope) there.

  • The 16-bit thumb would be fairly short. I don't know how many instructions the 32-bit thumb provides.

    There's a picture in this document, which gives you a quick overview: http://community.arm.com/docs/DOC-7034

    Perl is very much C-like, but you don't have to learn it if the other tools can do what you need.

    The hard part in Perl is probably the RegEx. The rest looks very much like C.

  • Generated a 'raw' thumb instruction file the same way I created the ARM instruction file (all thumb instructions should be there - both 16- and 32-bit) and it has 521 instructions.

    And there are still 13 'V<op>'-kind of instructions. When they are expanded to real instructions, I guess there will be 20 - 30 more, giving about 550 Thumb instructions.

    Wish me luck and long life.

  • That's a lot more than I expected; but don't the thumb2 instructions share space with the instructions you already processed?

    (I always had the impression that the Cortex-A7 was using the Thumb2 architecture, but I most likely need some correction here).

    Oh, and ... of course: "Good Health, Strong Body, Clear Mind and Many Years".

  • I'm not sure what you mean by "...share space with the instructions you already processed".

    They are re-using the instruction bits. You can't tell if it's ARM instruction or thumb instruction without checking the 'T'-bit in the CPSR.

    Encoding T1/A1   Advanced SIMD

    VST1<c>.<size> <list>, [<Rn>{:<align>}]{!}

    VST1<c>.<size> <list>, [<Rn>{:<align>}], <Rm>

    Thumb (T1), two halfwords:

        first halfword, bits 15-0:    1 1 1 1 1 0 0 1 0 D 0 0 Rn(3:0)

        second halfword, bits 15-0:   Vd(15:12) type(11:8) sz(7:6) algn(5:4) Rm(3:0)

    ARM (A1), bits 31-0:

        1 1 1 1 0 1 0 0 0 D 0 0 Rn(19:16) Vd(15:12) type(11:8) sz(7:6) algn(5:4) Rm(3:0)

    (The site does something funny to the hand-formatted text. The editor seems to treat some stuff as tables - although misformatted.)

    I think, for a bit pattern of the thumb instruction, there is an ARM instruction to match it that has nothing to do with the thumb instruction, and vice versa.

    That is: a separate table is needed for Thumbs.

    From the manual:

    ARMv7 contains two main instruction sets, the ARM and Thumb instruction sets.

    The two instruction sets differ in how instructions are encoded:

        • Thumb instructions are either 16-bit or 32-bit, and are aligned on a two-byte boundary. 16-bit and 32-bit

          instructions can be intermixed freely.
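In a stepper this boils down to two small decisions: which table to use (T bit of the CPSR, bit 5) and, in Thumb state, whether the instruction is one or two halfwords. A hedged C sketch, assuming ARMv7 Thumb where a first halfword whose bits 15..11 are 11101, 11110 or 11111 starts a 32-bit instruction; the names are hypothetical:

```c
#include <stdint.h>

typedef enum { ISET_ARM, ISET_THUMB } iset;

/* CPSR bit 5 is the T bit: set means Thumb state. */
static iset iset_from_cpsr(uint32_t cpsr)
{
    return (cpsr & (1u << 5)) ? ISET_THUMB : ISET_ARM;
}

/* Size in bytes of the Thumb instruction starting with halfword hw1. */
static unsigned thumb_insn_size(uint16_t hw1)
{
    unsigned top5 = (unsigned)hw1 >> 11;      /* bits 15..11 */
    return (top5 >= 0x1Du) ? 4u : 2u;         /* 11101, 11110, 11111 */
}
```

For the VST1 example, the Thumb first halfword starts with 1 1 1 1 1, so `thumb_insn_size` reports 4 bytes, and the "next word" of case 1 is at PC + 4 rather than PC + 2.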

  • Alright, you gave me excellent news today.

    I did not expect the Cortex-A7 to support the ARM instruction set. I was only expecting it to support Thumb2.

    So this day just got better.

    ... Yes, I know that there's a table-bug, but I've found that if I paste the text into my text-editor (eg. text-only, no formatting), then re-copy and finally paste it into one of the JIVE-editors, it works better.

    Each of the JIVE editors has several bugs. Some won't let the caret move past the # symbol, some won't let the caret move past an empty line, some don't recognize the Delete key, and some find it amusing to remove a space now and then.

    I know this has been reported to the authors, but I'm not sure they're able to fix it, so I've chosen to live with it and re-edit until my documents look the way I want them. Hey, we've got more than 80 characters per line.