I wonder whether bit 22 has some function in instructions like LDRH, STRH, LDRSBT, LDRD, ... (bits 27, 26, 25 = 0, 0, 0)?
In some instructions, like LDR, STR, LDRB (bits 27, 26, 25 = 0, 1, 1), I understand that it chooses between byte and word access.
Seems a fine answer to me.
The ARM ARM (Architecture Reference Manual) has the decoding rules in it, and the ones you are interested in come under 'Extra load/store instructions'. They weren't there from the start but came in with ARMv4, or with ARMv5 for the doubleword ones. They had to fit into the space left over by the earlier instructions, so I wouldn't expect a fully consistent and easy decoding into bit meanings. Also, if you are unsure about something, you could try putting in a hexadecimal word and seeing how it disassembles.
Hello daith,
I tried disassembling the hexadecimal words.
The results are below.
code      mnemonic               bit22 inverted  mnemonic               comment
------------------------------------------------------------------------------------------
e5901000  ldr r1, [r0]           e5d01000        ldrb r1, [r0]          byte access
e5801000  str r1, [r0]           e5c01000        strb r1, [r0]          byte access
e4b01000  ldrt r1, [r0], #0      e4f01000        ldrbt r1, [r0], #0     byte access
e4a01000  strt r1, [r0], #0      e4e01000        strbt r1, [r0], #0     byte access
e1901f9f  ldrex r1, [r0]         e1d01f9f        ldrexb r1, [r0]        byte access
e1801f92  strex r1, r2, [r0]     e1c01f92        strexb r1, r2, [r0]    byte access
e1d010b0  ldrh r1, [r0]          e19010b0        ldrh r1, [r0, r0]      immediate
e1c010b0  strh r1, [r0]          e18010b0        strh r1, [r0, r0]      immediate
e0f010b0  ldrht r1, [r0], #0     e0b010b0        ldrht r1, [r0], r0     immediate
e0e010b0  strht r1, [r0], #0     e0a010b0        strht r1, [r0], r0     immediate
e1f01f9f  ldrexh r1, [r0]        e1b01f9f        ldrexd r1, [r0]        half word access
e1e01f92  strexh r1, r2, [r0]    e1a01f92        strexd r1, r2, [r0]    half word access
e5d01000  ldrb r1, [r0]          e5901000        ldr r1, [r0]           byte access
e5c01000  strb r1, [r0]          e5801000        str r1, [r0]           byte access
e4f01000  ldrbt r1, [r0], #0     e4b01000        ldrt r1, [r0], #0      byte access
e4e01000  strbt r1, [r0], #0     e4a01000        strt r1, [r0], #0      byte access
e1d01f9f  ldrexb r1, [r0]        e1901f9f        ldrex r1, [r0]         byte access
e1c01f92  strexb r1, r2, [r0]    e1801f92        strex r1, r2, [r0]     byte access
e1d010d0  ldrsb r1, [r0]         e19010d0        ldrsb r1, [r0, r0]     immediate
e1d010f0  ldrsh r1, [r0]         e19010f0        ldrsh r1, [r0, r0]     immediate
e1c000d0  ldrd r0, [r0]          e18000d0        ldrd r0, [r0, r0]      immediate
e1c000f0  strd r0, [r0]          e18000f0        strd r0, [r0, r0]      immediate
e1b02f9f  ldrexd r2, [r0]        e1f02f9f        ldrexh r2, [r0]        half word access
e1a04f92  strexd r4, r2, [r0]    e1e04f92        strexh r4, r2, [r0]    half word access
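For anyone who wants to reproduce the "bit22 inverted" column before feeding the words to a disassembler, here is a minimal Python sketch. The helper names flip_bit22 and bits_27_25 are made up for this illustration; the script only manipulates the 32-bit words and does not disassemble anything.

```python
BIT22 = 1 << 22  # mask selecting bit 22 of a 32-bit ARM encoding


def flip_bit22(word: int) -> int:
    """Return the encoding with bit 22 inverted (hypothetical helper)."""
    return word ^ BIT22


def bits_27_25(word: int) -> int:
    """Return bits 27..25 of the encoding as a 3-bit integer (hypothetical helper)."""
    return (word >> 25) & 0b111


if __name__ == "__main__":
    # A few encodings taken from the table above.
    for code in (0xE5901000, 0xE1D010B0, 0xE1F01F9F):
        print(f"{code:08x} -> {flip_bit22(code):08x}  "
              f"bits[27:25] = {bits_27_25(code):03b}")
```

The flipped words still have to go through a real disassembler to get the mnemonics.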
Best regards,
Yasuhiko Koumoto.
I've seen that. It looks like new command groups are added in a best-fit manner to keep the instruction set fragmentation low.
Thanks!
There is something for me to chew on for a while.
BTW, what do you mean by "immediate"? I would expect to see some constant value in immediates.
Also, funny to see that it's not byte/word access, but more like full size/half size access.
Hello,
"immediate" means basically "base register + displacement" addressing such as "LDR r0, [r1,#immd]".
By contrast, "index" means "base register + index" addressing such as "LDR r0,[r1,r2]".
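To make the "immediate" rows of the table concrete: in the extra load/store encodings (LDRH, STRH, LDRSB, LDRSH, LDRD, STRD), bit 22 selects between an 8-bit immediate split across bits 11..8 and 3..0, and an index register in bits 3..0. A rough Python sketch of that one field follows. The function name extra_ldst_offset is invented for this example, it ignores the U (add/subtract) bit, and it does not check that the word really is an extra load/store instruction.

```python
def extra_ldst_offset(word: int) -> str:
    """Describe the offset field of an ARM extra load/store encoding.

    Assumes the caller already knows the word is LDRH/STRH/LDRSB/LDRSH/
    LDRD/STRD; the sign of the offset (U bit) is deliberately ignored.
    """
    if (word >> 22) & 1:
        # Bit 22 set: immediate form, imm8 = bits[11:8] : bits[3:0]
        imm8 = (((word >> 8) & 0xF) << 4) | (word & 0xF)
        return f"#{imm8}"
    # Bit 22 clear: register form, index register Rm in bits[3:0]
    return f"r{word & 0xF}"


if __name__ == "__main__":
    print(extra_ldst_offset(0xE1D010B0))  # ldrh r1, [r0] from the table
    print(extra_ldst_offset(0xE19010B0))  # same word with bit 22 cleared
```

With an offset of #0 the disassembler simply prints "[r0]", which is why the immediate form in the table shows no visible constant.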
I always thought of [r1,#imm] as being indexed, just as a constant index.
But it would be more correct to call it immediate, and to reserve "indexed" for base register + index register.
There are a few more modes:
pre-increment / pre-decrement (aka pre-update):
ldr r0,[r1,#10]!
post-increment / post-decrement (aka post-update):
ldr r0,[r1],#10
The following two are allowed in the older ARM architectures, but unfortunately not in Thumb or Thumb-2:
ldr r0,[r1,r2]!
ldr r0,[r1],r2
The post-update would have been particularly useful for me, especially when being able to use it with LSL, LSR and ASR #16.
But as the Cortex is clocked higher, it will still outperform the older architectures even when adding the extra add instruction (which is "for free" on the Cortex-M7 by the way).
I, myself, am accustomed to the convention used with many other processors:
(6502, 68xx, 68xxx, VAX11, PDP11, 8085, x86, PPC (I recall), ...)
stc; set carry - no operands => implied
ldr r, #val; the value 'val' is loaded into r - immediate
ldr r, address; the value in address 'address' is loaded into r - absolute
ldr r1, r2; the value of r2 is loaded into r1 - register(-direct)
ldr r1, [ r2]; the value pointed to by r2 is loaded into r1 - register indirect (or indexed)
ldr r1, [r2, #val]; the value pointed to by (r2 + #val) is loaded into r1 - indexed
ldr r1, [PC, #val]; the value pointed to by (PC + #val) is loaded into r1 - (PC-)relative
ldr r1, [r2++]; the value pointed to by r2 is loaded into r1, then r2 is incremented - indirect (or indexed) post-increment
...
The '[ ]' (sometimes '( )') usually means indirection, so
ldr r1, [r2], #imm would mean that imm would be added to the number pointed to by r2 and the result is loaded into r1.
(= add r1, [r2], #imm)
I don't recall encountering such an instruction (whereas the 'add' version is very familiar).
The ARM instruction set is, however, quite different in philosophy as well, so a different naming convention is quite understandable.
Hello jensbauer,
thank you for your comments.
If we consider pre-indexed or post-indexed addressing, it would be reasonable to treat the immediate as the index.
Therefore, we should call them immediate indexing and register indexing, shouldn't we?
In legacy (so-called CISC) processors, the load or move instruction covered all addressing modes.
In RISC processors, however, load and move have come to be regarded as different functions.
The move is used for immediate or register addressing.
The load (or store) is used for memory addressing.
This comes from the load/store architecture.
Memory addressing takes the form "base register + immediate index" or "base register + (index register with shift)".
From the RISC viewpoint, this is not strange; it is even normal.
The ARM-specific feature would be that memory addressing has the pre-indexed and post-indexed modes.
I am sorry, but I cannot understand the meaning of "add r1, [r2], #imm".
ldr r1, [r2], #imm
would mean the following sequence, because it is post-indexed.
1) load the memory contents at [r2] into r1
2) add #imm to r2.
This would be similar to the legacy processors' notation "ldr r1,[r2++]".
In the "[r2++]" case, the increment is only the access size (i.e. 1 for byte, 2 for half-word and 4 for word).
In the ARM case, the indexing is generalized so that any value specified by #imm can be added.
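As a rough illustration of the difference between the offset, pre-indexed and post-indexed forms, here is a small Python model. It is not ARM code: the register dictionary, the word-addressed memory dictionary and the three function names are all made up for this sketch, and the case where the destination and base register are the same is not handled.

```python
def load_offset(mem, regs, rt, rn, imm):
    """Model of: ldr rt, [rn, #imm]   (access rn + imm, rn unchanged)."""
    regs[rt] = mem[regs[rn] + imm]


def load_pre_indexed(mem, regs, rt, rn, imm):
    """Model of: ldr rt, [rn, #imm]!  (update rn first, then access)."""
    regs[rn] += imm
    regs[rt] = mem[regs[rn]]


def load_post_indexed(mem, regs, rt, rn, imm):
    """Model of: ldr rt, [rn], #imm   (access at rn, then add imm to rn)."""
    address = regs[rn]
    regs[rt] = mem[address]      # step 1: load from the old address
    regs[rn] = address + imm     # step 2: add #imm to the base register


if __name__ == "__main__":
    mem = {0x100: 11, 0x104: 22, 0x10C: 44}   # toy memory: address -> word
    regs = {"r1": 0, "r2": 0x100}
    # imm does not have to equal the access size, unlike "[r2++]".
    load_post_indexed(mem, regs, "r1", "r2", 12)
    print(regs["r1"], hex(regs["r2"]))
```

Passing 4 as the immediate to load_post_indexed gives the behaviour of the "[r2++]" notation for a word access.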
yasuhikokoumoto wrote: Therefore, we should call them immediate indexing and register indexing, shouldn't we?
I agree. If we do that, then there will be no doubt what we mean. I will start practicing this.
By "ldr r1, [r2], #imm = add r1, [r2], #imm" I mean how it would read in the architectures I'm more familiar with, not in "ARM language".