ARM/THUMB instructions that change execution path?

Has anybody come across a list of ARM & THUMB instructions that cause deviation from the linear instruction stream?

I've been trying to figure out gdb-stub single stepping using software interrupts, and in single stepping you need to find the next instruction(s) where the next breakpoint instruction needs to be set.

There are three cases:

1) The current instruction doesn't change the execution path. The next instruction is the next word.

2) The current instruction is an unconditional jump. The operand defines the next instruction address.

3) The current instruction is a conditional branch. One possible next instruction is the next word; the other possible instruction address is defined by the operand. (That includes a conditional add with PC as the target, and the like.)

To implement single stepping, I need to tell those cases apart and figure out how to find out the possible branching address.
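To make the three cases concrete, here is a minimal sketch in C for one instruction class only: the immediate B/BL encoding in ARM (32-bit) state. The type next_pc_t and the function arm_next_pc are hypothetical names made up for this illustration, not taken from any existing stub. Everything that is not B/BL is treated as case 1, which is wrong in general (BX, LDR/LDM/POP into PC, data processing with PC as destination, and all of Thumb still need their own decoding).

```c
#include <stdint.h>

/* Hypothetical result type: the candidate addresses for the next breakpoint. */
typedef struct {
    int      count;    /* 1 = cases 1 and 2, 2 = case 3 (conditional branch) */
    uint32_t addr[2];
} next_pc_t;

/* Classify one ARM-state instruction word fetched from address pc.
 * Only B/BL with an immediate offset is decoded here (assumption: the
 * caller already knows the core is in ARM state, not Thumb). */
static next_pc_t arm_next_pc(uint32_t pc, uint32_t insn)
{
    next_pc_t r = { 1, { pc + 4u, 0u } };   /* case 1: fall through */
    uint32_t cond = insn >> 28;             /* condition field, bits 31:28 */

    /* Bits 27:25 == 101 is B/BL; cond == 0xF is a different encoding
     * (BLX immediate on ARMv5 and later) and is left undecoded here. */
    if ((insn & 0x0E000000u) == 0x0A000000u && cond != 0xFu) {
        /* imm24 is a signed word offset relative to pc + 8. */
        uint32_t offset = (insn & 0x00FFFFFFu) << 2;
        if (offset & 0x02000000u)
            offset |= 0xFC000000u;          /* sign-extend from bit 25 */
        uint32_t target = pc + 8u + offset;

        if (cond == 0xEu) {
            r.addr[0] = target;             /* case 2: always taken */
        } else {
            r.count = 2;                    /* case 3: either path */
            r.addr[1] = target;
        }
    }
    return r;
}
```

A stub built this way would place a breakpoint at every address in addr[0..count-1] before resuming, then remove them all when one is hit.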

I could go through the manuals of numerous processors instruction by instruction, and maybe I'd be done within the next couple of years, or I could find a list of instructions to check, or a paper that explains how to "decode" the instructions in a useful way.

Also, there don't seem to be many sources available for ARM gdb servers or stubs that use software breakpoints.

Parents
  • This is getting "interesting".

    I generated the masks and data from the Excel sheet, then exported the data (only the mask data, no instruction names) as .csv.

    Then I ran sort and then uniq.

    The result was 475 lines. The data before uniq was 498 lines.

    => there are 23 lines (instructions) dropped.

    I ran uniq -cd and looked up the non-unique instruction data (I added the instruction names at the ends of the lines):

      2 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 00 00 00 mrs reg norm/priv
      2 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 20 00 00 msr reg norm/priv
      2 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 A0 00 00 LSL (imm)/MOV
      2 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 01 A0 00 60 ROR (imm)/RRX
      2 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 01 A0 F0 00 LSL (imm)/MOV PC
      4 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 03 20 00 00 MSR (imm) norm/priv/NOP/YIELD
      2 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 03 20 00 02 WFE/WFI
      2 1 1 1 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 F2 20 01 10 VMOV/VORR
      3 1 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 F2 80 00 10 VMOV/VORR/VSHR (imm)
      2 1 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 F2 80 00 30 VBIC/VMVN (imm)
      2 1 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 F2 80 08 10 VQSHR{U}N/VSHRN (imm)
      2 1 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 F2 80 08 50 VQRSHR{U}N/VRSHRN (imm)
      2 1 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 F2 80 0A 10 VMOVL/VSHLL(imm != size)
      2 1 1 1 1 0 0 1 1 1 0 1 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 F3 B2 02 00 VMOVN/VQMOV{U}N
      4 1 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 F4 00 00 00 VST1-4
      4 1 1 1 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 F4 20 00 00 VLD1-4

    I hope it's not very complicated to tell apart the instructions with the same data.
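For at least some of these pairs the split looks simple: in the ARM encodings, MOV (register) is the LSL (imm) pattern with a zero shift amount, and RRX is the ROR (imm) pattern with a zero shift amount. Below is a rough sketch in C of such a second-stage check. The mask value 0x0FE00070 is my assumption (the rows above only show the data words 01 A0 00 00 and 01 A0 00 60), and the names op_t, arm_split_shift and arm_writes_pc are hypothetical.

```c
#include <stdint.h>

typedef enum { OP_MOV_REG, OP_LSL_IMM, OP_RRX, OP_ROR_IMM, OP_OTHER } op_t;

/* Assumed mask: opcode bits 27:21 plus shift-type bits 6:4.
 * The S bit (20), Rd, Rm and the shift amount are left as "don't care". */
#define SHIFT_MASK    0x0FE00070u
#define LSL_MOV_DATA  0x01A00000u   /* row "01 A0 00 00" */
#define ROR_RRX_DATA  0x01A00060u   /* row "01 A0 00 60" */

/* Second-stage decode: rows with identical mask/data are told apart
 * by the 5-bit shift amount in bits 11:7. */
op_t arm_split_shift(uint32_t insn)
{
    uint32_t imm5 = (insn >> 7) & 0x1Fu;

    switch (insn & SHIFT_MASK) {
    case LSL_MOV_DATA: return imm5 ? OP_LSL_IMM : OP_MOV_REG;
    case ROR_RRX_DATA: return imm5 ? OP_ROR_IMM : OP_RRX;
    default:           return OP_OTHER;
    }
}

/* Only meaningful when arm_split_shift() did not return OP_OTHER:
 * Rd (bits 15:12) == 15 means the result is written to PC, i.e. the
 * "MOV PC" row, which matters for single stepping. */
int arm_writes_pc(uint32_t insn)
{
    return ((insn >> 12) & 0xFu) == 0xFu;
}
```

For example, 0xE1A0F00E (mov pc, lr) comes out as OP_MOV_REG with arm_writes_pc() returning 1, so the next address has to be read from LR rather than taken from the next word. The VFP/NEON pairs in the list would need their own discriminating fields, which this sketch does not attempt.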
