ARM/THUMB instructions that change execution path?

Has anybody come across a list of ARM & THUMB instructions that cause deviation from the linear instruction stream?

I've been trying to figure out gdb-stub single stepping using software interrupts, and for single stepping you need to find the next instruction(s) where the next breakpoint instruction needs to be set.

There are three cases:

1) The current instruction doesn't change the execution path. The next instruction is the next word.

2) The current instruction is a jump. The operand defines the next instruction address.

3) The current instruction is a conditional branch. One possible next instruction is the next word; the other possible instruction address is defined by the operand. (That includes a conditional add with PC as the target, and the like.)

To implement single stepping, I need to tell those cases apart and figure out how to find out the possible branching address.
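
To make the three cases concrete, here is a minimal sketch in C for one encoding only: the ARM-state immediate branch (B, BL, BLX with bits 27-25 = 101). The type `next_addrs_t` and the function `arm_next_addrs` are hypothetical names made up for this illustration. Everything that is not an immediate branch is treated as linear here, which is deliberately incomplete: data-processing instructions with PC as the destination, loads into PC, BX and so on would each need their own decoding.

```c
#include <stdint.h>

/* Hypothetical result type: up to two candidate addresses for the next breakpoint. */
typedef struct {
    int      count;    /* 1 = single successor, 2 = conditional (both possible) */
    uint32_t addr[2];
} next_addrs_t;

/* Sketch only: decodes ARM-state B/BL/BLX(immediate); all else is assumed linear. */
static next_addrs_t arm_next_addrs(uint32_t instr, uint32_t pc)
{
    next_addrs_t n = { 1, { pc + 4u, 0u } };      /* case 1: next word */
    uint32_t cond = instr >> 28;
    uint32_t off, target;

    if (((instr >> 25) & 7u) != 5u)
        return n;                                 /* not an immediate branch */

    off = (instr & 0x00FFFFFFu) << 2;             /* imm24 * 4 */
    if (off & 0x02000000u)
        off |= 0xFC000000u;                       /* sign-extend from bit 25 */
    target = pc + 8u + off;                       /* PC reads as instruction address + 8 */

    if (cond == 0xFu) {
        /* BLX (immediate): always taken, H bit (bit 24) adds a halfword,
           and the target executes in Thumb state. */
        n.addr[0] = target + (((instr >> 24) & 1u) << 1);
    } else if (cond == 0xEu) {
        n.addr[0] = target;                       /* case 2: unconditional (AL) */
    } else {
        n.addr[1] = target;                       /* case 3: either successor */
        n.count = 2;
    }
    return n;
}
```

A stepper built this way would place a breakpoint at every address in `addr[0..count-1]`, so it does not need to evaluate the condition flags itself.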

I could go through manuals of numerous processors instruction by instruction and maybe I'd be done within the next couple of years, or I could find a list of instructions to check, or a paper that explains how to "decode" the instructions in a useful way.

Also, there doesn't seem to be lots of sources of ARM gdb servers or stubs around that use software breakpoints.

  • Yes, my lazy nature...

    If I do a full decoding and then still handle instructions in groups, I'll have a decoder "skeleton" that I can use for something else if I ever need to.

    case arm_xtra_hint:

        // WFE,WFI

        // neither changes the program flow

        retval = set_addr_lin();

        break;

    Then again:

    case arm_xtra_cmode:

        // Check cmode to see if it's VBIC (imm) or VMVN (imm)

        if (bitrng(instr, 11,9) != 7)

        {

            // either VBIC or VMVN

            if (bitrng(instr, 11, 10) == 3)

            {

                // VMVN

            }

            else if (bit(instr, 8))

            {

                // VBIC

            }

            else

            {

                // VMVN

            }

        }

        // else UNDEFINED

        break;

    Well, haven't got too far in this yet.

    (And I need to change the enums used for the switch-cases here.)
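
The snippets above call `bit()` and `bitrng()` without showing them. A plausible sketch of such helpers follows; the signatures are an assumption inferred from the calls in the post (high bit first, then low bit, both inclusive), not the author's actual code.

```c
#include <stdint.h>

/* Value (0 or 1) of a single bit of the instruction word. */
static inline uint32_t bit(uint32_t instr, unsigned pos)
{
    return (instr >> pos) & 1u;
}

/* Bits hi..lo inclusive, right-aligned. Assumes 31 >= hi >= lo. */
static inline uint32_t bitrng(uint32_t instr, unsigned hi, unsigned lo)
{
    unsigned width = hi - lo + 1u;
    /* A shift by 32 is undefined in C, so the full-width case is handled separately. */
    uint32_t mask = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (instr >> lo) & mask;
}
```

With these, `bitrng(instr, 11, 9) != 7` in the cmode case reads as "cmode is not 111x".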

  • The LD/ST encodings are "interesting":

    There are basically 4 main groups of them, selected by bits 27 - 25:

    0 0 0:

         if bit 6 = 0

              bit 20 = LD/ST: 0=ST, 1=LD

              bit 5 = EX/H: 0=EX, 1=H (STREX/STRH)

                   EX: bits 22 - 21: 00=REX, 01=REXD, 10=REXB, 11=REXH

                   H: bit22: 1=imm, 0=reg

                   bit21=writeback

         if bit 6 = 1

              bit 20 = 0: LDRD/STRD

                   bit 5: 0=LD, 1=ST

                   bit 21=writeback

                   bit 22: 0=reg, 1=imm

              bit 20 = 1:

                   bit 21=writeback

                   bit 22: 0=reg, 1=imm

                   if bit 5 = 0 LDRSB

                   if bit 5 = 1 LDRSH

                   (there is no STRSB or STRSH)

    0 1 0: LD/ST imm

         bits 24 - 20: PUBWL

              P=indexing (0=post-indexed, 1=offset or pre-indexed)

              U=immediate sign (1=added, 0=subtracted)

              B=access (0=word, 1=byte)

              W=writeback (1=writeback, 0=no writeback)

              L: 1=load, 0=store

              (special case: P=0, W=1 => unprivileged)

    0 1 1: LD/ST reg

         same as LD/ST imm

    1 0 0: LDM/STM

         bits 24 - 20: BIMWL

              B= before (0=after,1=before)

              I=increment (0=decrement, 1=increment)

              M=mode (0=current mode, 1=user mode)

              W=writeback (1=writeback, 0=no writeback)

              L: 1=load, 0=store
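
For the single-stepping question, what matters in these groups is whether a load writes the PC. Below is a rough sketch in C using the bit positions listed above; `arm_load_writes_pc` is a hypothetical helper name. It only covers the 0 1 0, 0 1 1 and 1 0 0 groups in ARM state, and it does not decode the unconditional space (cond = 1111), which would need separate handling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch only: true if an ARM-state LDR (imm/reg) or LDM loads into the PC. */
static bool arm_load_writes_pc(uint32_t instr)
{
    uint32_t group = (instr >> 25) & 7u;          /* bits 27-25 */
    bool load = ((instr >> 20) & 1u) != 0u;       /* L bit: 1=load, 0=store */

    if ((instr >> 28) == 0xFu)
        return false;   /* assumption: cond=1111 encodings are decoded elsewhere */
    if (!load)
        return false;

    switch (group) {
    case 2:                                       /* 0 1 0: LD/ST imm */
        return ((instr >> 12) & 0xFu) == 15u;     /* Rt (bits 15-12) is PC */
    case 3:                                       /* 0 1 1: LD/ST reg */
        if (instr & (1u << 4))
            return false;                         /* bit 4 set: not a LD/ST reg encoding */
        return ((instr >> 12) & 0xFu) == 15u;
    case 4:                                       /* 1 0 0: LDM/STM */
        return ((instr >> 15) & 1u) != 0u;        /* PC is in the register list */
    default:
        return false;
    }
}
```

When this returns true, the next address comes from memory, so the stub would have to compute the effective address from the P/U/W bits (or the B/I bits for LDM) and read the word that ends up in the PC.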