ARM/THUMB instructions that change execution path?

Has anybody come across a list of ARM & THUMB instructions that cause deviation from the linear instruction stream?

I've been trying to figure out gdb-stub single stepping using software interrupts, and in single stepping you need to find the next instruction(s) where the next breakpoint instruction needs to be set.

There are three cases:

1) The current instruction doesn't change the execution path. The next instruction is the next word.

2) The current instruction is a jump. The operand defines the next instruction address.

3) The current instruction is a conditional branch. One possible next instruction is the next word; the other possible instruction address is defined by the operand. (That includes a conditional add with PC as the target, and the like.)
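The three cases can be sketched for the simplest ARM-state encoding, B/BL with an immediate offset. This is only a minimal sketch, not a complete decoder: `next_addr_t` and `arm_next_addresses` are hypothetical names, and instructions that write the PC in other ways (data processing with PC as the destination, loads into PC, BX, BLX and so on) are simply reported as case 1 here.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical result type: up to two candidate next addresses. */
typedef struct {
    uint32_t fallthrough;   /* address of the next sequential instruction */
    uint32_t target;        /* branch target, valid only if is_branch */
    bool is_branch;         /* instruction is B/BL with an immediate offset */
    bool is_conditional;    /* both fallthrough and target are possible */
} next_addr_t;

/* Hypothetical helper: classify one ARM-state (32-bit) instruction at pc. */
next_addr_t arm_next_addresses(uint32_t pc, uint32_t instr)
{
    next_addr_t r = { pc + 4u, 0u, false, false };
    uint32_t cond = instr >> 28;

    /* B/BL immediate: bits [27:25] == 0b101.
       cond == 0xF in this space is BLX (immediate), not handled here. */
    if (((instr >> 25) & 0x7u) == 0x5u && cond != 0xFu) {
        /* scale the 24-bit word offset to bytes, then sign-extend it */
        uint32_t off = (instr & 0x00FFFFFFu) << 2;
        if (off & 0x02000000u)
            off |= 0xFC000000u;
        r.target = pc + 8u + off;   /* PC reads as instruction address + 8 */
        r.is_branch = true;
        r.is_conditional = (cond != 0xEu);  /* 0xE is AL (always) */
    }
    return r;
}
```

With this shape, a stub would place a breakpoint at `fallthrough` for case 1, at `target` for case 2, and at both for case 3.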

To implement single stepping, I need to tell those cases apart and work out the possible branch addresses.
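Since a stub normally has the stopped program's CPSR at hand, one way to reduce a conditional instruction to a single next address is to evaluate its condition field against the saved flags. The following is a sketch under that assumption; `arm_cond_passed` is a hypothetical name, `cond` is the 4-bit condition field (bits 31..28 of an ARM-state instruction) and `cpsr` is the saved status register value.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: would an instruction with this condition field
   execute, given the N, Z, C and V flags in the saved CPSR? */
bool arm_cond_passed(uint32_t cond, uint32_t cpsr)
{
    bool n = (cpsr >> 31) & 1u;
    bool z = (cpsr >> 30) & 1u;
    bool c = (cpsr >> 29) & 1u;
    bool v = (cpsr >> 28) & 1u;
    bool result;

    switch ((cond >> 1) & 0x7u) {
    case 0:  result = z;               break;  /* EQ / NE */
    case 1:  result = c;               break;  /* CS / CC */
    case 2:  result = n;               break;  /* MI / PL */
    case 3:  result = v;               break;  /* VS / VC */
    case 4:  result = c && !z;         break;  /* HI / LS */
    case 5:  result = (n == v);        break;  /* GE / LT */
    case 6:  result = !z && (n == v);  break;  /* GT / LE */
    default: result = true;            break;  /* AL */
    }
    /* odd condition numbers invert the even ones, except 0xF */
    if ((cond & 1u) && (cond & 0xFu) != 0xFu)
        result = !result;
    return result;
}
```

If the condition fails, the next address is simply the next word, so only one breakpoint is needed.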

I could go through manuals of numerous processors instruction by instruction and maybe I'd be done within the next couple of years, or I could find a list of instructions to check, or a paper that explains how to "decode" the instructions in a useful way.
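As one small instance of such decoding on the Thumb side, the two 16-bit immediate branch encodings can be recognised from their top bits. This is a sketch only: `thumb16_branch_target` is a hypothetical name, and other 16-bit flow-changing instructions (BX, BLX, POP with PC, CBZ/CBNZ and so on) as well as all 32-bit Thumb-2 encodings are deliberately left out.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: if instr is a 16-bit Thumb B<cond> or B with an
   immediate offset, store the target and return true; otherwise return false. */
bool thumb16_branch_target(uint32_t pc, uint16_t instr,
                           uint32_t *target, bool *conditional)
{
    if ((instr & 0xF000u) == 0xD000u) {
        /* 1101 cond imm8: conditional branch */
        uint32_t cond = (instr >> 8) & 0xFu;
        if (cond >= 0xEu)
            return false;   /* 0xE is UDF, 0xF is SVC: not branches */
        /* sign-extend imm8 and scale to halfwords */
        int32_t off = ((int32_t)((instr & 0xFFu) ^ 0x80u) - 0x80) * 2;
        *target = pc + 4u + (uint32_t)off;  /* PC reads as address + 4 */
        *conditional = true;
        return true;
    }
    if ((instr & 0xF800u) == 0xE000u) {
        /* 11100 imm11: unconditional branch */
        int32_t off = ((int32_t)((instr & 0x7FFu) ^ 0x400u) - 0x400) * 2;
        *target = pc + 4u + (uint32_t)off;
        *conditional = false;
        return true;
    }
    return false;
}
```

For the conditional form the other candidate address is `pc + 2`, the next halfword.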

Also, there don't seem to be many sources for ARM gdb servers or stubs that use software breakpoints.

0 Juha Aaltonen over 9 years ago in reply to Jens Bauer

A table approach may not work very well with ARM, because the bits that define the instruction are not in fixed positions, not even mostly. The only exception is the 3 bits after the condition code, and sometimes a special register value turns the encoding into a different instruction.
Also, in this program I don't care about the instruction as such, only about the 'next address' after the instruction.
Sometimes it's easier to execute the code than to 'simulate' it:
unsigned int check_msr_reg(uint32_t instr)
{
    unsigned int new_pc = rpi2_reg_context.reg.r15;
    // tmp1, tmp2 and iptr are globals, so the asm below can reach them by name
    // if user mode, then can't even guess
    tmp1 = (uint32_t) rpi2_reg_context.reg.cpsr;
    if ((tmp1 & 0x1f) == 0x10) // user mode (CPSR.M == 0b10000)
    {
        // UNPREDICTABLE - whether banked or not
        // the bits 15 - 0 are UNKNOWN
        new_pc = INSTR_ADDR_UNDEF;
    }
    else
    {
        // privileged mode - both reg and banked reg
        tmp2 = (instr & 0xffff0fff) | (1 << 12); // edit Rd = r1
        iptr = (uint32_t *) mrs_regb;
        *iptr = tmp2; // patch the instruction slot in the asm below
        // note: with caches enabled, the patched word must be cleaned from the
        // D-cache and the I-cache invalidated before the slot is executed
        asm(
            "push {r0, r1, r2}\n\t"
            "mrs r2, cpsr @ save cpsr in r2, not on the stack: sp is banked per mode\n\t"
            "ldr r0, =tmp1\n\t"
            "ldr r0, [r0] @ load the value of tmp1, not its address\n\t"
            "msr cpsr, r0 @ set cpsr; note: user mode is already excluded\n\t"
            "mrs_regb: .word 0 @ execute instr with our registers\n\t"
            "ldr r0, =tmp2\n\t"
            "str r1, [r0] @ store result to tmp2\n\t"
            "msr cpsr, r2 @ restore cpsr (and with it the original sp)\n\t"
            "pop {r0, r1, r2}\n\t"
            );
        new_pc = (unsigned int) tmp2;
    }
    return new_pc;
}
So far in this project I've learned about ARM (which I had never really used before), awk (which I had never used before) and inline assembly (accessing C variables). For some unknown reason, I haven't been keen to use the inline assembly extensions, though.

0 Jens Bauer over 9 years ago in reply to Juha Aaltonen

Ah, yes. Things are coming back to me.
My debugger usually ran on a 68000, but my MegaSTE had a 68010, so I wrote an instruction emulator. It could emulate 68010, 20, 30, 40 and CPU32 instructions (the latter was never tested, though).
Sometimes it's also a lot faster to simulate instruction execution.