Why are there different encodings of instructions?
What's the idea/background/etc for their co-existence?
Can different encodings be mixed in the code? (I don't mean ARM encodings with Thumb encodings without an ARM/Thumb mode change, but, say, A1 with A2 or T1 with T2.)
I'm trying to put together a gdb stub, and for single stepping the machine code needs to be partially decoded.
How can one tell apart the encoding of an instruction in a machine code program (binary)?
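On telling encodings apart in a binary: the labels (A1, A2, T1, T2, ...) only name distinct bit patterns within one instruction set, so a decoder identifies the encoding from the bits themselves. In Thumb state, 16-bit and 32-bit encodings can be mixed freely, and the first halfword tells them apart: bits [15:11] equal to 0b11101, 0b11110 or 0b11111 mark the first halfword of a 32-bit instruction. A minimal sketch, assuming a Thumb-2 capable core (ARMv6T2 or ARMv7); the function name is made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: size in bytes (2 or 4) of the Thumb instruction
 * whose first halfword is hw1. Assumes a Thumb-2 capable core. */
int thumb_insn_size(uint16_t hw1)
{
    unsigned top5 = hw1 >> 11;   /* bits [15:11] of the first halfword */

    /* 0b11101, 0b11110 and 0b11111 start a 32-bit instruction;
     * every other value is a complete 16-bit instruction. */
    return (top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu) ? 4 : 2;
}
```

For example, 0xE850 (the first halfword of the 32-bit LDREX encoding) gives 4, while 0x4770 (BX LR) gives 2.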
Oh, and an additional question: what do the bit values in parentheses mean?
Encoding T1 ARMv6T2, ARMv7
LDREX<c> <Rt>, [<Rn>{, #<imm>}]
First halfword:   bits 15-4: 1 1 1 0 1 0 0 0 0 1 0 1   bits 3-0: Rn
Second halfword:  bits 15-12: Rt   bits 11-8: (1) (1) (1) (1)   bits 7-0: imm8

Encoding A1 ARMv6*, ARMv7
LDREX<c> <Rt>, [<Rn>]
bits 31-28: cond   bits 27-20: 0 0 0 1 1 0 0 1   bits 19-16: Rn   bits 15-12: Rt
bits 11-8: (1) (1) (1) (1)   bits 7-4: 1 0 0 1   bits 3-0: (1) (1) (1) (1)
For the case when cond is 0b1111, see Unconditional instructions on page A5-216.
t = UInt(Rt); n = UInt(Rn); imm32 = Zeros(32); // Zero offset
if t == 15 || n == 15 then UNPREDICTABLE;
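On the parentheses: in the Architecture Reference Manual, (1) and (0) mark bits that should be one or should be zero. They are not part of the pattern that identifies the instruction; if they hold other values, the instruction is UNPREDICTABLE. A decoder can therefore match only the fixed bits with a mask/value pair and check the should-be-one bits separately. A minimal sketch for the two LDREX encodings quoted above; the macro and function names are assumptions for illustration, not taken from any existing tool:

```c
#include <stdbool.h>
#include <stdint.h>

/* LDREX, encoding A1: fixed bits are 27-20 (0001 1001) and 7-4 (1001). */
#define LDREX_A1_MASK   0x0FF000F0u
#define LDREX_A1_VALUE  0x01900090u
/* Should-be-one bits 11-8 and 3-0, shown as (1) in the manual. */
#define LDREX_A1_SBO    0x00000F0Fu

/* True if insn is LDREX (A1). cond == 0b1111 selects the
 * unconditional instruction space and is rejected here. */
bool is_ldrex_a1(uint32_t insn)
{
    if ((insn >> 28) == 0xFu)
        return false;
    return (insn & LDREX_A1_MASK) == LDREX_A1_VALUE;
}

/* True if all should-be-one bits are set (otherwise UNPREDICTABLE). */
bool ldrex_a1_sbo_ok(uint32_t insn)
{
    return (insn & LDREX_A1_SBO) == LDREX_A1_SBO;
}

/* LDREX, encoding T1: bits 15-4 of the first halfword are fixed
 * (1110 1000 0101); Rn sits in bits 3-0. The second halfword holds
 * Rt, four should-be-one bits and imm8, so it carries no fixed bits. */
bool is_ldrex_t1(uint16_t hw1)
{
    return (hw1 & 0xFFF0u) == 0xE850u;
}

/* Register fields of the A1 encoding. */
unsigned ldrex_a1_rn(uint32_t insn) { return (insn >> 16) & 0xFu; }
unsigned ldrex_a1_rt(uint32_t insn) { return (insn >> 12) & 0xFu; }
```

For instance, 0xE1901F9F (LDREX r1, [r0]) matches the A1 pattern with Rn = 0 and Rt = 1, while 0xE1801F90 (a STREX) does not.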
At least QRC0001_UAL.pdf doesn't contain the binary representation, and I need to be sure that all possible (working) instructions of ARMv7-A are handled.
Your starting point is the Quick Reference Card - that gives you your target instruction checklist. Then, for all its flaws, you need the Architecture Reference Manual, because it has all of the binary patterns IN CONTEXT. Finally, you might want to download the official GDB Source Code - it's free! The truth is, whichever way you slice it, you've got a Magnum Opus on your hands here! Best of luck, and please let us know how you get on.
"The truth is, whichever way you slice it, you've got a Magnum Opus on your hands here!"
I've come to realize that. It's a long, laborious and frustrating task.
I did get the gdb sources (gdb_7.9.0), but without some cross-referencing tool it's very hard to find things in there.
The same goes for OpenOCD: function pointers everywhere, and no clue where they are set.
[edit]
I found a pretty good example of how to do the decoding far enough to handle single stepping.
gdbserver sources: arm-tdep.c, arm_get_next_pc_raw() and thumb_get_next_pc_raw().
The decoding is really done with style! The experience with ARM instruction sets shows.
I think I'll go with it and return to the full instruction list later.
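To give an idea of what "decoding far enough for single stepping" involves, here is a minimal sketch for ARM state that handles only B and BL; it is not the gdb code, and the function names are made up for illustration. It assumes the stub already has the instruction word, its address and the current CPSR. A real stub also has to cover every other way of writing the PC (BX, BLX, loads into PC, data-processing instructions with PC as destination, and so on).

```c
#include <stdbool.h>
#include <stdint.h>

#define CPSR_N (1u << 31)
#define CPSR_Z (1u << 30)
#define CPSR_C (1u << 29)
#define CPSR_V (1u << 28)

/* Evaluate a 4-bit ARM condition code against the CPSR flags. */
bool arm_cond_passed(uint32_t cond, uint32_t cpsr)
{
    bool n = (cpsr & CPSR_N) != 0, z = (cpsr & CPSR_Z) != 0;
    bool c = (cpsr & CPSR_C) != 0, v = (cpsr & CPSR_V) != 0;
    bool r;

    switch (cond >> 1) {
    case 0: r = z; break;                 /* EQ / NE */
    case 1: r = c; break;                 /* CS / CC */
    case 2: r = n; break;                 /* MI / PL */
    case 3: r = v; break;                 /* VS / VC */
    case 4: r = c && !z; break;           /* HI / LS */
    case 5: r = (n == v); break;          /* GE / LT */
    case 6: r = (n == v) && !z; break;    /* GT / LE */
    default: return true;                 /* AL */
    }
    /* The low bit of cond inverts the base condition. */
    return (cond & 1u) ? !r : r;
}

/* Address of the instruction that executes after insn at pc.
 * Handles only B/BL (bits 27-25 == 101); anything else is
 * treated as falling through to pc + 4. */
uint32_t arm_next_pc_branch(uint32_t pc, uint32_t insn, uint32_t cpsr)
{
    uint32_t cond = insn >> 28;

    if (cond != 0xFu && (insn & 0x0E000000u) == 0x0A000000u
        && arm_cond_passed(cond, cpsr)) {
        uint32_t imm24 = insn & 0x00FFFFFFu;
        uint32_t offset = imm24 << 2;      /* word offset to byte offset */
        if (imm24 & 0x00800000u)
            offset |= 0xFC000000u;         /* sign-extend the 26-bit value */
        return pc + 8u + offset;           /* PC reads as insn address + 8 */
    }
    return pc + 4u;
}
```

With this, 0xEAFFFFFE (a branch to itself) at address 0x8000 yields 0x8000, and a BEQ is followed only when the Z flag is set; the stub would place its temporary breakpoint at the returned address.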