ARM/THUMB instructions that change execution path?

Has anybody come across a list of ARM & THUMB instructions that cause deviation from the linear instruction stream?

I've been trying to figure out gdb-stub single stepping using software interrupts, and in single stepping you need to find the next instruction(s) where the next breakpoint instruction needs to be set.

There are three cases:

1) The current instruction doesn't change the execution path. The next instruction is the next word.

2) The current instruction is a jump. The operand defines the next instruction address.

3) The current instruction is a conditional branch. One possible next instruction is the next word; the other possible instruction address is defined by the operand. (That includes a conditional add with PC as the target, and the like.)

To implement single stepping, I need to tell those cases apart and figure out how to find out the possible branching address.
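As a rough illustration of the three cases, here is a minimal C sketch that handles only the ARM-state B/BL encoding (bits 27..25 = 101, signed 24-bit word offset relative to PC + 8). The type `next_addrs_t` and the function `arm_next_addrs` are made-up names for this example; everything that is not B/BL is treated as falling through, which is not true for BX, LDR to PC, data processing with PC as destination, and so on.

```c
#include <stdint.h>

/* Hypothetical result type: the addresses where a temporary
   breakpoint would have to be planted (one or two of them). */
typedef struct {
    int      count;     /* number of valid entries in addr[] */
    uint32_t addr[2];
} next_addrs_t;

/* Sketch only: recognizes ARM-state B/BL and nothing else. */
static next_addrs_t arm_next_addrs(uint32_t insn, uint32_t pc)
{
    next_addrs_t r = { 1, { pc + 4u, 0u } };   /* case 1: next word */
    uint32_t cond = insn >> 28;

    if (cond != 0xFu && (insn & 0x0E000000u) == 0x0A000000u) {
        /* imm24 is a signed word offset relative to pc + 8 */
        uint32_t off = (insn & 0x00FFFFFFu) << 2;
        if (off & 0x02000000u)
            off |= 0xFC000000u;                /* sign-extend from bit 25 */
        uint32_t target = pc + 8u + off;

        if (cond == 0xEu) {                    /* case 2: always taken */
            r.addr[0] = target;
        } else {                               /* case 3: both paths possible */
            r.count = 2;
            r.addr[1] = target;
        }
    }
    return r;
}
```

A stub could instead evaluate the condition against the saved CPSR and plant a single breakpoint; planting two avoids having to do that.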

I could go through manuals of numerous processors instruction by instruction and maybe I'd be done within the next couple of years, or I could find a list of instructions to check, or a paper that explains how to "decode" the instructions in a useful way.

Also, there don't seem to be many sources for ARM gdb servers or stubs that use software breakpoints.

0 Juha Aaltonen over 9 years ago in reply to Juha Aaltonen

No, not again!
It'll take another day or two to figure these out!

1 1 1 1 0 0 1 0 0 D f z n n n n d d d d 1 1 0 1 N Q M 1 m m m m     V<op><c>.F32_<Qd>,_<Qn>,_<Qm>_V<op><c>.F32_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.337

1 1 1 1 0 0 1 0 0 D f z n n n n d d d d 1 1 1 1 N Q M 0 m m m m     V<op><c>.F32_<Qd>,_<Qn>,_<Qm>_V<op><c>.F32_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.335

1 1 1 1 0 0 1 0 1 D z z n n n n d d d d 0 f 1 1 N 1 M 0 m m m m     VQD<op><c>.<dt>_<Qd>,_<Dn>,_<Dm[x]> T2/A2 A8.8.371

1 1 1 1 0 0 1 0 1 D z z n n n n d d d d 1 0 f 1 N 0 M 0 m m m m     VQD<op><c>.<dt>_<Qd>,_<Dn>,_<Dm> T1/A1 A8.8.371

1 1 1 1 0 0 1 1 0 D f f n n n n d d d d 0 0 0 1 N Q M 1 m m m m     V<op><c>_<Qd>,_<Qn>,_<Qm>_V<op><c>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.290

1 1 1 1 0 0 1 1 0 D f z n n n n d d d d 1 1 1 0 N Q M 1 m m m m     V<op><c>.F32_<Qd>,_<Qn>,_<Qm>_V<op><c>.F32_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.281

1 1 1 1 0 0 1 1 0 D f z n n n n d d d d 1 1 1 1 N Q M 0 m m m m     VP<op><c>.F32_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.366

1 1 1 1 0 0 1 1 1 D 1 1 n n n n d d d d 1 0 z z N f M 0 m m m m     V<op><c>.8_<Dd>,_<list>,_<Dm> T1/A1 A8.8.419

1 1 1 1 0 0 1 Q 1 D z z n n n n d d d d 0 f 0 F N 1 M 0 m m m m     V<op><c>.<dt>_<Qd>,_<Qn>,_<Dm[x]>_V<op><c>.<dt>_<Dd>,_<Dn>,_<Dm[x]> T1/A1 A8.8.338

1 1 1 1 0 0 1 U 0 D z z n n n n d d d d 0 0 f 0 N Q M 0 m m m m     VH<op><c>_<Qd>,_<Qn>,_<Qm>_VH<op><c>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.319

1 1 1 1 0 0 1 U 0 D z z n n n n d d d d 0 1 1 0 N Q M f m m m m     V<op><c>.<dt>_<Qd>,_<Qn>,_<Qm>_V<op><c>.<dt>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.334

1 1 1 1 0 0 1 U 0 D z z n n n n d d d d 1 0 1 0 N Q M f m m m m     VP<op><c>.<dt>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.365

1 1 1 1 0 0 1 U 1 D z z n n n n d d d d 0 f 1 0 N 1 M 0 m m m m     V<op>L<c>.<dt>_<Qd>,_<Dn>,_<Dm[x]> T2/A2 A8.8.338

1 1 1 1 0 0 1 U 1 D z z n n n n d d d d 1 0 f 0 N 0 M 0 m m m m     V<op>L<c>.<dt>_<Qd>,_<Dn>,_<Dm> T2/A2 A8.8.336

1 1 1 1 0 0 1 f 0 D z z n n n n d d d d 1 0 0 1 N Q M 0 m m m m     V<op><c>.<dt>_<Qd>,_<Qn>,_<Qm>_V<op><c>.<dt>_<Dd>,_<Dn>,_<Dm> T1/A1 A8.8.336

These need to be recognized as not UNDEFINED.
I'm very close to losing my mind, and I'm surely getting very tired of fighting this.
I really had to do some work to find out the right 'MOV' in:

Encoding A2 ARMv4*, ARMv5T*, ARMv6*, ARMv7

<opc1>S<c> PC, <Rn>, <Rm>{, <shift>}

<opc2>S<c> PC, <Rm>{, <shift>}

<opc3>S<c> PC, <Rn>, #<const>

RRXS<c> PC, <Rn>

<opc2> The operation. <opc2> is MOV or MVN. ARM deprecates the use of MVN.

c c c c 0 0 0 1 1 0 1 S (0) (0) (0) (0) 1 1 1 1 0 0 0 0 0 0 0 0 m m m m     MOV{S}<c>_PC,_<Rm>_(=_LSL{S}<c>_PC,_<Rm>,_#0) A2 B9.3.20
(That is LSL (reg) with Rd=PC and immediate = 0).
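For single stepping, this encoding matters because the next address is simply the value of Rm. A minimal sketch of recognizing it, assuming a saved register file is available as `regs[16]` (the function name and that array are hypothetical, not from any existing stub). It ignores the condition code, the case Rm = PC, and the exception-return behaviour of the S variant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Matches MOV{S}<c> PC, <Rm>: bits 27..21 = 0001101, Rd = 1111,
   shift amount and shift type all zero. The S bit and the
   should-be-zero Rn field are deliberately left out of the mask. */
static bool arm_mov_pc_target(uint32_t insn, const uint32_t regs[16],
                              uint32_t *target)
{
    if ((insn & 0x0FE0FFF0u) != 0x01A0F000u)
        return false;
    *target = regs[insn & 0xFu];   /* Rm holds the branch target */
    return true;
}
```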
[EDIT]
I frigging knew it (bolding is mine):

A8.8.290 VBIF, VBIT, VBSL

Encoding T1/A1 Advanced SIMD

V<op><c> <Qd>, <Qn>, <Qm>

V<op><c> <Dd>, <Dn>, <Dm>

if op == ‘00’ then SEE VEOR;

if op == ‘01’ then operation = VBitOps_VBSL;

if op == ‘10’ then operation = VBitOps_VBIT;

if op == ‘11’ then operation = VBitOps_VBIF;

and

A8.8.281 VACGE, VACGT, VACLE, VACLT

Encoding T1/A1 Advanced SIMD (UNDEFINED in integer-only variant)

V<op><c>.F32 <Qd>, <Qn>, <Qm>

V<op><c>.F32 <Dd>, <Dn>, <Dm>

Assembler syntax

where:

<op> The operation. It must be one of:

ACGE Absolute Compare Greater than or Equal, encoded as op = 0.

ACGT Absolute Compare Greater Than, encoded as op = 1.

What!
What happened to ACLE and ACLT?
What's the phone number of Sherlock Holmes?
Aha:

VACLE (Vector Absolute Compare Less Than or Equal) is a pseudo-instruction, equivalent to a VACGE instruction with

the operands reversed. Disassembly produces the VACGE instruction.

VACLT (Vector Absolute Compare Less Than) is a pseudo-instruction, equivalent to a VACGT instruction with the

operands reversed. Disassembly produces the VACGT instruction.

[/EDIT]

0 Jens Bauer over 9 years ago in reply to Juha Aaltonen

turboscrew wrote:

These need to be recognized as not UNDEFINED.

If they're all valid, then just handle them before you check for UNDEFINED.

0 Juha Aaltonen over 9 years ago in reply to Jens Bauer

Yep. All instructions not matching the table are considered UNDEFINED.
It means that all those 'new' instructions must be added to the table too. (Sigh.)
Oh well, it's just a couple of hundred instructions more...
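The "anything not in the table is UNDEFINED" approach can be sketched as a mask/value table. This is only an illustration with three entries whose encodings are well known (BX, SWP/SWPB, B/BL); the names `arm_kind_t`, `arm_patterns` and `arm_lookup` are invented for the example, and the cond = 1111 (unconditional) space is not handled. More specific patterns are listed first so they win over broader ones.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { ARM_UNDEFINED, ARM_BX, ARM_SWP, ARM_B_BL } arm_kind_t;

typedef struct {
    uint32_t   mask;    /* bits that are fixed for this instruction */
    uint32_t   value;   /* required values of those bits */
    arm_kind_t kind;
} arm_pattern_t;

static const arm_pattern_t arm_patterns[] = {
    { 0x0FFFFFF0u, 0x012FFF10u, ARM_BX   },   /* BX<c> <Rm> */
    { 0x0FB00FF0u, 0x01000090u, ARM_SWP  },   /* SWP{B}<c> <Rt>, <Rt2>, [<Rn>] */
    { 0x0E000000u, 0x0A000000u, ARM_B_BL },   /* B<c> / BL<c> <label> */
};

/* Returns the first matching entry, or ARM_UNDEFINED if none matches. */
static arm_kind_t arm_lookup(uint32_t insn)
{
    size_t n = sizeof arm_patterns / sizeof arm_patterns[0];
    for (size_t i = 0; i < n; i++) {
        if ((insn & arm_patterns[i].mask) == arm_patterns[i].value)
            return arm_patterns[i].kind;
    }
    return ARM_UNDEFINED;
}
```

With a table this small, perfectly valid instructions such as MOV also come back as UNDEFINED, which is exactly why every 'new' instruction has to be added.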

0 Jens Bauer over 9 years ago in reply to Juha Aaltonen

When I wrote my disassembler, there were illegal instructions that occupied parts of the legal instruction space.
In some cases I had to add special "illegal instruction" handling, e.g. placed before the actual instruction decoding.
...
I'm quite impressed with all your work. You've absolutely done a lot in very little time!
I just took a look at the ARM_instructions.txt ...
If ignoring the condition-codes, you got 8 bits, which are almost always known.
Speed-wise it might be a real good idea to do this:
index = 0xff & (opcode >> 20);          /* isolate instruction group */
handleGroup[index](opcode);          /* jump directly to group handler */
That means you'll shave several clock cycles off your execution time, without really sacrificing anything.
In assembly language it could of course be just a simple jump-table; r0 = opcode:
handle_group:
     ubfx r1,r0,#20,#8          @ r1 = opcode bits 27..20
     adr r2,table
     ldr pc,[r2,r1,lsl#2]       @ jump via the table of 4-byte handler addresses
table:
     .4byte     MUL_AND_Group
     .4byte     MUL_AND_Group
     .4byte     MLA_EOR_Group
     .4byte     MLA_EOR_Group
     .4byte     UMAAL_SUB_Group
     .4byte     SUB_Group
     .4byte     MLS_RSB_Group
     .4byte     RSB_Group
     ...
     ...
A 256-entry table is fairly small on a RasPi.
You can do this for "bits which are always known", but you can even extend it to include "bits which are often known".
Bits which are often known could include bit 4 and perhaps bits 8...11, but it might be a good idea to wait until you have the complete table before deciding which bits to include.
The above assembly code can then be declared as a function like this ...
void handle_group(uint32_t aOpcode);
and called that way; it'll indirectly jump to a C function, spending just a few clock cycles in total.
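For comparison, the same dispatch written entirely in C might look like the sketch below. The handler names, the return values and `init_groups` are placeholders made up for the example; only the B/BL range (bits 27..25 = 101, i.e. indices 0xA0..0xBF) is filled in, and the last function shows one possible way of folding bit 4 into a 512-entry index.

```c
#include <stdint.h>

typedef int (*group_handler_t)(uint32_t opcode);

/* Placeholder handlers; real ones would decode the low 20 bits. */
static int handle_unknown(uint32_t opcode) { (void)opcode; return -1; }
static int handle_branch(uint32_t opcode)  { (void)opcode; return 1; }

static group_handler_t handleGroup[256];

static void init_groups(void)
{
    for (unsigned i = 0; i < 256; i++)
        handleGroup[i] = handle_unknown;
    for (unsigned i = 0xA0; i <= 0xBF; i++)   /* B/BL groups */
        handleGroup[i] = handle_branch;
}

static int dispatch(uint32_t opcode)
{
    uint32_t index = 0xff & (opcode >> 20);   /* isolate instruction group */
    return handleGroup[index](opcode);        /* jump directly to group handler */
}

/* Variant index that appends bit 4 below bits 27..20 (0..511). */
static uint32_t group_index_with_bit4(uint32_t opcode)
{
    return ((0xff & (opcode >> 20)) << 1) | (1u & (opcode >> 4));
}
```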
After that, you can probably focus on the low 20 bits, but 16-bit Thumb may need a modified approach, because it does not have the 4-bit condition code field.

0 Juha Aaltonen over 9 years ago in reply to Jens Bauer

Unfortunately it's not so simple: this kind of thing will mess it up, causing "false positives":

c	c	c	c	1	B	0	n	n	n	n	t	t	t	t	(0)	(0)	(0)	(0)	1	1	T	T	T	T	SWP{B}<c>_<Rt>,_<Rt2>,_[<Rn>]	A1	A8.8.229
c	c	c	c	1	R	0	(1)	(1)	(1)	(1)	d	d	d	d	(0)	(0)	0	(0)	0	0	(0)	(0)	(0)	(0)	MRS<c>_<Rd>,_<spec_reg>	A1	B9.3.8
c	c	c	c	1	R	0	M	M	M	M	d	d	d	d	(0)	(0)	1	M	0	0	(0)	(0)	(0)	(0)	MRS<c>_<Rd>,_<banked_reg>	A1	B9.3.9
c	c	c	c	1	R	1	M	M	M	M	(1)	(1)	(1)	(1)	(0)	(0)	1	M	0	0	n	n	n	n	MSR<c>_<banked_reg>,_<Rn>	A1	B9.3.10
c	c	c	c	1	R	1	m	m	m	m	(1)	(1)	(1)	(1)	(0)	(0)	0	(0)	0	0	n	n	n	n	MSR<c>_<spec_reg>,_<Rn>	A1	B9.3.12

(0)	STRH<c>_<Rt>,_[<Rn>,+/-<Rm>]{!}_STRH<c>_<Rt>,_[<Rn>],+/-<Rm>	A8.8.218
(0)	LDRD<c>_<Rt>,_<Rt2>,_[<Rn>,+/-<Rm>]{!}_LDRD<c>_<Rt>,_<Rt2>,_[<Rn>],+/-<Rm>	A8.8.74
(0)	STRD<c>_<Rt>,_<Rt2>,_[<Rn>,+/-<Rm>]{!}_STRD<c>_<Rt>,_<Rt2>,_[<Rn>],+/-<Rm>	A8.8.211
(0)	LDRH<c>_<Rt>,_[<Rn>,+/-<Rm>]{!}_LDRH<c>_<Rt>,_[<Rn>],+/-<Rm>	A8.8.82
(0)	LDRSB<c>_<Rt>,_[<Rn>,+/-<Rm>]{!}_LDRSB<c>_<Rt>,_[<Rn>],+/-<Rm>	A8.8.86
(0)	LDRSH<c>_<Rt>,_[<Rn>,+/-<Rm>]{!}_LDRSH<c>_<Rt>,_[<Rn>],+/-<Rm>	A8.8.90
STRH<c>_<Rt>,_[<Rn>{,_#+/-<imm8>}]_STRH<c>_<Rt>,_[<Rn>],_#+/-<imm8>_STRH<c>_<Rt>,_[<Rn>,_#+/-<imm8>]!	A8.8.217
LDRD<c>_<Rt>,_<Rt2>,_[<Rn>{,_#+/-<imm8>}]_LDRD<c>_<Rt>,_<Rt2>,_[<Rn>],_#+/-<imm8>_LDRD<c>_<Rt>,_<Rt2>,_[<Rn>,_#+/-<imm8>]!	A8.8.72
STRD<c>_<Rt>,_<Rt2>,_[<Rn>{,_#+/-<imm8>}]_STRD<c>_<Rt>,_<Rt2>,_[<Rn>],_#+/-<imm8>_STRD<c>_<Rt>,_<Rt2>,_[<Rn>,_#+/-<imm8>]!	A8.8.210
LDRH<c>_<Rt>,_<label>_LDRH<c>_<Rt>,_[PC,_#-0]_Special_case	A8.8.81
LDRSB<c>_<Rt>,_<label>_LDRSB<c>_<Rt>,_[PC,_#-0]_Special_case	A8.8.85
LDRSH<c>_<Rt>,_<label>_LDRSH<c>_<Rt>,_[PC,_#-0]_Special_case	A8.8.89
LDRH<c>_<Rt>,_[<Rn>{,_#+/-<imm8>}]_LDRH<c>_<Rt>,_[<Rn>],_#+/-<imm8>_LDRH<c>_<Rt>,_[<Rn>,_#+/-<imm8>]!	A8.8.80
LDRSB<c>_<Rt>,_[<Rn>{,_#+/-<imm8>}]_LDRSB<c>_<Rt>,_[<Rn>],_#+/-<imm8>_LDRSB<c>_<Rt>,_[<Rn>,_#+/-<imm8>]!	A8.8.84
LDRSH<c>_<Rt>,_[<Rn>{,_#+/-<imm8>}]_LDRSH<c>_<Rt>,_[<Rn>],_#+/-<imm8>_LDRSH<c>_<Rt>,_[<Rn>,_#+/-<imm8>]!	A8.8.88

When the first 3 bits after the condition code field (27, 26, 25) are 0 0 0, the only bit that is always a fixed, instruction-specific '0' or '1' is bit 4.

If bit 4 is '0', you can use bits 24 and 23 next, but if bit 4 is '1', you have to check bit 7 to find out which bits come next.

If bit 7 is '0', the next bits are 24 and 23; if bit 7 is '1', the next bits are 6 and 5.

And so on.
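The first levels of that decision tree can be written down directly. This is just a sketch of the rule as described above, not a complete decoder; the enum and function names are invented for the example.

```c
#include <stdint.h>

/* Which bit field should be examined next for a given instruction. */
typedef enum {
    NEXT_NOT_GROUP_000,   /* bits 27..25 are not 000 */
    NEXT_BITS_24_23,
    NEXT_BITS_6_5
} next_field_t;

static next_field_t next_field_for_000(uint32_t insn)
{
    if (((insn >> 25) & 7u) != 0u)
        return NEXT_NOT_GROUP_000;
    if (((insn >> 4) & 1u) == 0u)      /* bit 4 = 0 */
        return NEXT_BITS_24_23;
    if (((insn >> 7) & 1u) == 0u)      /* bit 4 = 1, bit 7 = 0 */
        return NEXT_BITS_24_23;
    return NEXT_BITS_6_5;              /* bit 4 = 1, bit 7 = 1 */
}
```

For instance, BX (bits 7..4 = 0001) continues with bits 24 and 23, while SWP (bits 7..4 = 1001) continues with bits 6 and 5.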

If bits 27, 26 and 25 are 0 1 0, then there is only one instruction: single data transfer.

In the list it's listed as in the manual, but in reality it's:

A8.8.204

P = 1 offset or pre-indexed addressing (W decides which), otherwise post-indexed

U = 1 offset is added, otherwise offset is subtracted

B = 1 byte access, else word access

W = 1 writeback, else no writeback

L = 1 load, else store
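Pulling those flags out of an ARM-state single data transfer word is a matter of testing bits 24 down to 20. The sketch below uses made-up names (`ldst_flags_t`, `ldst_flags`, `ldst_loads_pc`) and assumes the caller has already established that bits 27..25 are 0 1 0. The last helper is the part that matters for single stepping: a load whose Rt is PC changes the execution path.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool p;   /* bit 24: 1 = offset/pre-indexed, 0 = post-indexed */
    bool u;   /* bit 23: 1 = add offset, 0 = subtract */
    bool b;   /* bit 22: 1 = byte access, 0 = word access */
    bool w;   /* bit 21: 1 = writeback */
    bool l;   /* bit 20: 1 = load, 0 = store */
} ldst_flags_t;

static ldst_flags_t ldst_flags(uint32_t insn)
{
    ldst_flags_t f;
    f.p = ((insn >> 24) & 1u) != 0u;
    f.u = ((insn >> 23) & 1u) != 0u;
    f.b = ((insn >> 22) & 1u) != 0u;
    f.w = ((insn >> 21) & 1u) != 0u;
    f.l = ((insn >> 20) & 1u) != 0u;
    return f;
}

/* True for a load with Rt (bits 15..12) equal to PC. */
static bool ldst_loads_pc(uint32_t insn)
{
    return ldst_flags(insn).l && ((insn >> 12) & 0xFu) == 15u;
}
```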

It's a bit different with the media or special LD/ST instructions.

Sometimes the bits above are not used as flags at all but are just part of the opcode; sometimes B = 1 means register, else immediate(?).

BTW, there will be another update to the ARM instruction list and to the spreadsheet. I won't look into Thumb until I've learned to do this better by playing with ARM instructions first.