ARM/THUMB instructions that change execution path?

Has anybody come across a list of ARM & THUMB instructions that cause deviation from the linear instruction stream?

I've been trying to figure out gdb-stub single stepping using software interrupts. To single-step, you need to find the next instruction(s) where the next breakpoint instruction needs to be set.

There are three cases:

1) The current instruction doesn't change the execution path. The next instruction is the next word.

2) The current instruction is a jump. The operand defines the next instruction address.

3) The current instruction is a conditional branch. One possible next instruction is the next word; the other possible instruction address is defined by the operand. (That includes a conditional add with PC as the target, and the like.)
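For case 3, a stub that has the saved CPSR at hand could evaluate the condition itself and so know which of the two addresses will actually be taken. The following is a minimal sketch of that check for the ARM (A32) condition field, assuming the flags sit in CPSR bits 31..28 (N, Z, C, V); the function name `arm_cond_passed` is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if an A32 instruction whose condition field (bits 31..28)
 * is 'cond' would execute with the given saved CPSR.
 * Assumption: the 1111 condition space is simply treated as "always". */
bool arm_cond_passed(uint32_t cond, uint32_t cpsr)
{
    bool n = (cpsr >> 31) & 1u;
    bool z = (cpsr >> 30) & 1u;
    bool c = (cpsr >> 29) & 1u;
    bool v = (cpsr >> 28) & 1u;
    bool result;

    switch ((cond >> 1) & 7u) {
    case 0: result = z;               break;  /* EQ / NE */
    case 1: result = c;               break;  /* CS / CC */
    case 2: result = n;               break;  /* MI / PL */
    case 3: result = v;               break;  /* VS / VC */
    case 4: result = c && !z;         break;  /* HI / LS */
    case 5: result = (n == v);        break;  /* GE / LT */
    case 6: result = (n == v) && !z;  break;  /* GT / LE */
    default: return true;                     /* AL and the 1111 space */
    }
    /* The low bit of the condition inverts the test (NE, CC, PL, ...). */
    return (cond & 1u) ? !result : result;
}
```

With this, a conditional instruction that fails its condition can be handled like case 1.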

To implement single stepping, I need to tell those cases apart and figure out how to find out the possible branching address.
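As a rough illustration of telling the cases apart, here is a sketch that handles only the ARM (A32) immediate branches B and BL. The type `arm_successors_t` and the function `arm_b_successors` are hypothetical names. Everything else that can write the PC (BX, BLX, data-processing instructions with PC as destination, loads and LDM/POP into PC, and all Thumb encodings) is deliberately left out and would be reported as sequential here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Possible successors of one A32 instruction (hypothetical helper type). */
typedef struct {
    bool     falls_through;  /* pc + 4 is a possible next address     */
    bool     branches;       /* 'target' is a possible next address   */
    uint32_t target;         /* valid only when 'branches' is true    */
} arm_successors_t;

/* 'pc' is the address of 'insn'. Only B/BL are decoded in this sketch. */
arm_successors_t arm_b_successors(uint32_t pc, uint32_t insn)
{
    arm_successors_t s = { true, false, 0 };
    uint32_t cond = insn >> 28;

    /* B/BL: bits 27..25 == 101 and condition field not 1111. */
    if (cond != 0xFu && ((insn >> 25) & 7u) == 5u) {
        uint32_t imm24  = insn & 0x00FFFFFFu;
        int32_t  offset = (int32_t)(imm24 << 2);

        if (imm24 & 0x00800000u)       /* sign-extend the 26-bit offset */
            offset -= 0x04000000;

        /* In ARM state the PC reads as the instruction address + 8. */
        s.target        = pc + 8u + (uint32_t)offset;
        s.branches      = true;
        s.falls_through = (cond != 0xEu);  /* AL never falls through */
    }
    return s;
}
```

For an unconditional `b` only `target` needs a breakpoint; for a conditional one, both `target` and `pc + 4` do.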

I could go through manuals of numerous processors instruction by instruction and maybe I'd be done within the next couple of years, or I could find a list of instructions to check, or a paper that explains how to "decode" the instructions in a useful way.

Also, there don't seem to be many sources for ARM gdb servers or stubs around that use software breakpoints.

0 Jens Bauer over 9 years ago in reply to Juha Aaltonen

When I wrote my 68xxx debugger, the table was fine (though these were only 16-bit words).
-But ARM's instruction set is not too complicated either. I have not had a look at Cortex-A yet, but the time will come.
In case the table gets too large, there is another approach: split each 32-bit word into two 16-bit halfwords, so that you first match the "main part", which then leads to a "sub-tree".
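A possible shape for such a two-level lookup, sketched in C: the high halfword selects a "main part", and a small sub-table then matches the low halfword. The table contents cover only four A32 instructions (MUL, MLA and the register forms of AND and EOR) and all type and function names are invented for this example; a real table would need many more entries.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t    mask, value;   /* matched against the low halfword  */
    const char *name;
} sub_entry_t;

typedef struct {
    uint16_t           mask, value;  /* matched against the high halfword */
    const sub_entry_t *sub;          /* "sub-tree" for this main part     */
    size_t             sub_len;
} main_entry_t;

static const sub_entry_t sub_and_mul[] = {
    { 0x00F0, 0x0090, "MUL" },                              /* bits 7..4 == 1001 */
    { 0x0090, 0x0010, "AND (register-shifted register)" },  /* bit 7 == 0, bit 4 == 1 */
    { 0x0010, 0x0000, "AND (register)" },                   /* bit 4 == 0 */
};

static const sub_entry_t sub_eor_mla[] = {
    { 0x00F0, 0x0090, "MLA" },
    { 0x0090, 0x0010, "EOR (register-shifted register)" },
    { 0x0010, 0x0000, "EOR (register)" },
};

static const main_entry_t main_table[] = {
    { 0x0FE0, 0x0000, sub_and_mul, 3 },  /* bits 27..21 == 0000000 */
    { 0x0FE0, 0x0020, sub_eor_mla, 3 },  /* bits 27..21 == 0000001 */
};

/* Returns the instruction name, or NULL if this small table has no match. */
const char *lookup_two_level(uint32_t insn)
{
    uint16_t hi = (uint16_t)(insn >> 16);
    uint16_t lo = (uint16_t)(insn & 0xFFFFu);

    if ((insn >> 28) == 0xFu)
        return NULL;  /* the 1111 condition space is not covered here */

    for (size_t i = 0; i < sizeof main_table / sizeof main_table[0]; i++) {
        const main_entry_t *m = &main_table[i];
        if ((hi & m->mask) != m->value)
            continue;
        for (size_t j = 0; j < m->sub_len; j++)
            if ((lo & m->sub[j].mask) == m->sub[j].value)
                return m->sub[j].name;
        return NULL;  /* main part matched, but no sub-entry did */
    }
    return NULL;
}
```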
Remember: the execution unit in the processor does this job very, very quickly, so I am convinced that ARM designed the instruction set to be easy to dispatch (even in code).
Yes, the hard part is to find out how.
The good thing about using pointers is that you can write your lookup routine in assembly language and have it jump directly to your C routine.
You can then call it as a C function; because it jumps "goto-style", it is completely transparent, and your C code behaves as with a normal subroutine call, just very quickly.
The above look-up example can be unrolled easily; this will save a few clock cycles on each iteration:
find_instr:
    ldmia   r4!,{r1-r3,r12}
    and     r2,r2,r0
    cmp     r3,r2
    .rept   7
    ldmiane r4!,{r1-r3,r12}
    andne   r2,r2,r0
    cmpne   r3,r2
    .endr
    bne     find_instr
    /* here r0 contains the instruction opcode, r1 contains the name and r12 contains the address of the handler */
    pop     { ... }          /* restore any saved registers */
    bx      r12              /* jump directly to the handler */

-Change the '.rept' count as you like... make it 15 or 31, adjust it to suit your needs. Perhaps a large number may start to cause longer execution time, but it's a question of balance.
If you're lucky, you can place instructions that are used often in the beginning of the table (I did that with my debugger, and it started to become quite quick at disassembling).
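For comparison, the same first-match search can be sketched in plain C. The entry layout (name, mask, value, handler) mirrors the four words the assembly loads per entry; the three table rows, the handler names and the `STEP_*` constants are only illustrative assumptions. As in the assembly version, a catch-all entry at the end guarantees the loop terminates, and frequently used instructions would go first.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*insn_handler_t)(uint32_t insn);

typedef struct {
    const char    *name;
    uint32_t       mask;     /* which bits of the opcode to compare */
    uint32_t       value;    /* what those bits must equal          */
    insn_handler_t handler;
} insn_entry_t;

enum { STEP_SEQUENTIAL, STEP_BRANCH_IMM, STEP_BRANCH_REG };

/* Placeholder handlers; real ones would compute the next address(es). */
static int handle_other(uint32_t insn) { (void)insn; return STEP_SEQUENTIAL; }
static int handle_b_bl(uint32_t insn)  { (void)insn; return STEP_BRANCH_IMM; }
static int handle_bx(uint32_t insn)    { (void)insn; return STEP_BRANCH_REG; }

/* Note: the condition field (bits 31..28) is ignored by these masks. */
static const insn_entry_t insn_table[] = {
    { "B/BL",  0x0E000000u, 0x0A000000u, handle_b_bl  },
    { "BX",    0x0FFFFFF0u, 0x012FFF10u, handle_bx    },
    { "other", 0x00000000u, 0x00000000u, handle_other },  /* catch-all, must be last */
};

/* Linear first-match search; the catch-all entry always matches. */
const insn_entry_t *find_instr(uint32_t insn)
{
    const insn_entry_t *e = insn_table;

    while ((insn & e->mask) != e->value)
        e++;
    return e;
}
```

A caller would then do something like `find_instr(insn)->handler(insn)`.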
This kind of code is something I really like. The table-lookup, masks and AND stuff - it brings out good memories too.
-But of course, sometimes it might be easier or shorter or quicker to write a switch-statement and use enumerations for handling each instruction type.
Some instruction types could be handled by the same handler, e.g. AND/ORR/EOR and ADD/SUB.
In many cases, it's useful to think of instructions as belonging to "instruction groups". LDR/STR is a good example, as are AND/ORR/EOR, ASL/ASR/LSL/LSR/ROR, etc.
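The switch-and-enumeration variant might look roughly like this for ARM (A32) code, grouping on bits 27..25 of the opcode. The enum and function names are invented for the sketch, and the groups are coarse: each one would still need its own finer decoding.

```c
#include <stdint.h>

/* Coarse A32 instruction groups (hypothetical names). */
typedef enum {
    GRP_DATA_PROC,    /* data-processing and miscellaneous (AND, ADD, MUL, BX, ...) */
    GRP_LOAD_STORE,   /* LDR/STR word and unsigned byte                             */
    GRP_MEDIA,        /* media instructions                                         */
    GRP_BLOCK_XFER,   /* LDM/STM, including PUSH/POP                                */
    GRP_BRANCH,       /* B, BL                                                      */
    GRP_COPROC_SVC,   /* coprocessor instructions and SVC                           */
    GRP_UNCOND        /* condition field 1111: unconditional instructions           */
} arm_group_t;

arm_group_t arm_group(uint32_t insn)
{
    if ((insn >> 28) == 0xFu)
        return GRP_UNCOND;

    switch ((insn >> 25) & 7u) {
    case 0:
    case 1:
        return GRP_DATA_PROC;
    case 2:
        return GRP_LOAD_STORE;
    case 3:
        /* bit 4 distinguishes media instructions from register-offset loads/stores */
        return (insn & (1u << 4)) ? GRP_MEDIA : GRP_LOAD_STORE;
    case 4:
        return GRP_BLOCK_XFER;
    case 5:
        return GRP_BRANCH;
    default:
        return GRP_COPROC_SVC;
    }
}
```

For single stepping, each group then needs only one question answered: can this instruction write the PC, and if so, with what value?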

0 Juha Aaltonen over 9 years ago in reply to Jens Bauer

> If you're lucky, you can place instructions that are used often in the beginning of the table

You read my mind.

> But ARM's instruction set is not too complicated either.

Assembly is not, but the encoding is.

MUL{S}<c> <Rd>,<Rn>,<Rm>	A8.8.114
AND{S}<c> <Rd>,<Rn>,<Rm>,<type>,<Rs>	A8.8.15
AND{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}	A8.8.14
MLA{S}<c> <Rd>,<Rn>,<Rm>,<Ra>	A8.8.100
EOR{S}<c> <Rd>,<Rn>,<Rm>,<type>,<Rs>	A8.8.48
EOR{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}	A8.8.47
UMAAL<c> <RdLo>,<RdHi>,<Rn>,<Rm>	A8.8.255
SUB{S}<c> <Rd>,SP,<Rm>{,<shift>}	A8.8.226
SUB{S}<c> <Rd>,<Rn>,<Rm>,<type>,<Rs>	A8.8.224

and:

c	c	c	c	1	1	0	S	n	n	n	n	d	d	d	d	s	s	s	s	0	T	T	1	m	m	m	m	ORR{S}<c> <Rd>,<Rn>,<Rm>,<type>,<Rs>	A1	A8.8.124
c	c	c	c	1	1	0	S	n	n	n	n	d	d	d	d	x	x	x	x	x	T	T	0	m	m	m	m	ORR{S}<c> <Rd>,<Rn>,<Rm>{,<shift>}	A1	A8.8.123
c	c	c	c	1	1	1	0	n	n	n	n	d	d	d	d	(1)	(1)	(1)	(1)	1	0	0	1	t	t	t	t	STREXD<c> <Rd>,<Rt>,<Rt2>,[<Rn>]	A1	A8.8.214
c	c	c	c	1	1	1	1	n	n	n	n	t	t	t	t	(1)	(1)	(1)	(1)	1	0	0	1	(1)	(1)	(1)	(1)	LDREXD<c> <Rt>,<Rt2>,[<Rn>]	A1	A8.8.77
c	c	c	c	1	1	1	S	(0)	(0)	(0)	(0)	d	d	d	d	0	0	0	0	0	0	0	0	m	m	m	m	MOV{S}<c> <Rd>,<Rm>	A1	A8.8.104
c	c	c	c	1	1	1	S	(0)	(0)	(0)	(0)	d	d	d	d	0	0	0	0	0	1	1	0	m	m	m	m	RRX{S}<c> <Rd>,<Rm>	A1	A8.8.151
c	c	c	c	1	1	1	S	(0)	(0)	(0)	(0)	d	d	d	d	m	m	m	m	0	0	0	1	n	n	n	n	LSL{S}<c> <Rd>,<Rn>,<Rm>	A1	A8.8.95
c	c	c	c	1	1	1	S	(0)	(0)	(0)	(0)	d	d	d	d	m	m	m	m	0	0	1	1	n	n	n	n	LSR{S}<c> <Rd>,<Rn>,<Rm>	A1	A8.8.97
c	c	c	c	1	1	1	S	(0)	(0)	(0)	(0)	d	d	d	d	m	m	m	m	0	1	0	1	n	n	n	n	ASR{S}<c> <Rd>,<Rn>,<Rm>	A1	A8.8.17
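Rows like these can be turned into mask/value pairs mechanically. Below is a small sketch that assumes each encoding is written out as a full 32-character pattern string in which '0' and '1' are fixed bits and any other character (a field letter such as c, n, d or m) is a don't-care bit; spaces and tabs are skipped. The string format and the names `bitpat_t` and `parse_pattern` are assumptions for this example, not something taken from the manual.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t mask;   /* 1 where the pattern has a fixed bit */
    uint32_t value;  /* the required value of those bits    */
} bitpat_t;

/* Parses a 32-bit pattern, most significant bit first.
 * Returns false if the pattern does not describe exactly 32 bits. */
bool parse_pattern(const char *s, bitpat_t *out)
{
    uint32_t mask = 0, value = 0;
    int bits = 0;

    for (; *s != '\0'; s++) {
        if (*s == ' ' || *s == '\t')
            continue;               /* separators are ignored */
        if (bits == 32)
            return false;           /* too many bits */
        mask  <<= 1;
        value <<= 1;
        if (*s == '0' || *s == '1') {
            mask  |= 1u;
            value |= (uint32_t)(*s - '0');
        }
        bits++;
    }
    if (bits != 32)
        return false;               /* too few bits */

    out->mask  = mask;
    out->value = value;
    return true;
}
```

An opcode `insn` then matches a row when `(insn & p.mask) == p.value`.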

Oh, and for small assembly routines I've been using inline asm, like

void rpi2_trap_handler()
{
    // IRQs need to be enabled for serial I/O
    asm volatile (
        "push {r0}\n\t"
        "mrs r0, cpsr\n\t"
        "bic r0, #128 @ enable irqs\n\t"
        "msr cpsr, r0\n\t"
        "pop {r0}\n\t"
    );
    gdb_trap_handler();
}