I am using movw and movt to load a label address into a register on the 32-bit Arm architecture (AArch32), but this is not position-independent code.
movw r1, #:lower16:ASM_NAME(forkx)
movt r1, #:upper16:ASM_NAME(forkx)
The manual also specifies that these will be resolved at link time.
I need position-independent code. According to the manual, adr or adrl can be used, but I am getting the errors below:
../asm-arm/unix_arm.S:115:1: error: unsupported relocation on symbol
adr r1, __be_forkx
../asm-arm/unix_arm.S:60:1: error: invalid instruction, did you mean: adr?
adrl r1, __be_forkx
It seems a label cannot be used this way in AArch32; in AArch64 it is fine and works as intended.
Is my usage of the adr instruction improper? Is there a way to achieve this in AArch32, or an equivalent instruction that can be used?
Thanks a lot for all your time and help.
So the suggestion is to add or subtract an offset relative to PC to get the real address. I got the suggestion and am trying to implement it with add and sub.
../asm-arm/unix_arm.S:116:9: error: expected relocatable expression
add r1, pc, __be_forkx
../asm-arm/unix_arm.S:116:9: error: expected relocatable expression
add r1, pc, #__be_forkx
I have used a similar add instruction in A64, as below, and it works fine to get the label offset.
adrp x4, __be_forkx
add x4,x4, :lo12:__be_forkx
Does A32 have some specific way to do this? I checked the add instruction in the manual below:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100069_0609_00_en/pge1425889876961.html&_ga=2.3850608.1165552664.1585541156-1811185655.1581583184
It specifies that the operand needs to be a constant, as below:
ADD Rd, pc, #imm
imm range 0-1020, word aligned. Rd must be a Lo register. Bits[1:0] of the PC are read as 0 in this instruction.
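As a rough illustration of the restriction quoted above, the C sketch below models what this form of ADD computes. It assumes the quoted line describes the 16-bit Thumb encoding, where PC reads as the instruction address plus 4; the function names are made up for this example and are not part of any toolchain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if imm fits the quoted form "ADD Rd, pc, #imm":
   a multiple of 4 in the range 0..1020. */
bool thumb_add_pc_imm_ok(uint32_t imm)
{
    return imm <= 1020u && (imm & 3u) == 0u;
}

/* Value written to Rd by "ADD Rd, pc, #imm" placed at insn_addr.
   Assumption: Thumb state, so PC reads as insn_addr + 4, and
   bits [1:0] of PC are treated as zero for this instruction. */
uint32_t thumb_add_pc_imm(uint32_t insn_addr, uint32_t imm)
{
    assert(thumb_add_pc_imm_ok(imm));
    uint32_t pc = insn_addr + 4u;
    return (pc & ~3u) + imm;
}
```

The point of the sketch is that imm has to be a small constant known when the instruction is assembled, which is why a symbol such as __be_forkx cannot simply be written in its place.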
DeepakHegde said:
I have used a similar add instruction in A64, as below, and it works fine to get the label offset.
adrp x4, __be_forkx
add x4,x4, :lo12:__be_forkx
I tested the above snippet; it generates two relocations. So, it cannot be said to have worked.
Edit: GNU assembler's info on adrp relocations.
It is working for me and gets the proper address of forkx; I verified this with gdb.
adrp loads into x4 the start address of the 4 KB page that contains the symbol, that is, the address with its lower 12 bits cleared.
The add instruction above then adds the lower 12 bits to x4, which makes up the full address.
But I cannot do the same in A32, as adrp is not present, so I am looking for an alternative.
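The page/offset split described above can be checked with a few lines of C. This is only a sketch of the arithmetic, not of how the assembler or linker implements adrp; the helper names are invented for illustration, and the sample address in the usage note is the one gdb reports for __be_forkx later in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* The part adrp leaves in the register: the 4 KB page base,
   i.e. the address with its low 12 bits cleared. */
uint64_t page_base(uint64_t addr)
{
    return addr & ~(uint64_t)0xFFF;
}

/* The part supplied by :lo12: in the following add. */
uint64_t lo12(uint64_t addr)
{
    return addr & (uint64_t)0xFFF;
}

/* adrp followed by add: the two parts add back up to the address. */
uint64_t adrp_then_add(uint64_t addr)
{
    uint64_t result = page_base(addr) + lo12(addr);
    assert(result == addr);
    return result;
}
```

For example, page_base(0x55583dd5c0) is 0x55583dd000 and lo12(0x55583dd5c0) is 0x5c0.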
DeepakHegde said:
It is working for me and gets the proper address of forkx; I verified this with gdb.
Do you mean to say that you already handle the adrp relocations? If not, the next time your binary is run, the address of forkx will change, and there won't be anybody to fix it.
In A64 I have done this, and in gdb I can see that the proper address is getting loaded.
1. After the adrp:
(gdb) info registers
x0 0x30a0 12448
x1 0x0 0
x2 0x7fdf1bf6b0 549203998384
x3 0x55ad57f898 367980443800
x4 0x55583dd000 367089537024
x5 0x0 0
x6 0x1 1
2. After the add:
x4 0x55583dd5c0 367089537024
3. This is the address obtained starting from the instruction adrp x4, ASM_NAME(forkx). Here we are loading the address of the forkx structure into x4. Printing the address of forkx shows the same value:
(gdb) print &forkx
$13 = (sprocess * __be *) 0x55583dd5c0 <__be_forkx>
Every time the image is loaded I can see that this address is correct and works. The checks for PIC/PIE and TEXTREL are also fine on the image created for A64.
But I need the same for A32.
I think that the offsets calculated by adrp are being forced into the resulting binary without the help of relocations. This would then mean that the location of forkx with respect to the adrp instruction must be kept fixed.
You can do this with A32, assuming that the following distances do not change across different runs of the same binary:
/* 1.s */
        .text
        .global forkx
        .global _start
_start:
        nop
        nop
load_dist:
        ldr r1, dist_forkx
load_addr:
        ldr r0, [pc, r1]
dist_forkx:
        .word forkx-load_addr-8

/* 2.s */
        .text
        .fill 0x41020
        .global forkx
forkx:
        nop

as 1.s -o 1.o
as 2.s -o 2.o
ld 1.o 2.o
Edit: 1.s tries to read from the address of forkx. Modified 1.s below:
/* 1.s v2 */
        .text
        .global forkx
        .global _start
_start:
        nop
        nop
load_dist:
        ldr r1, dist_forkx
        /* save lr if necessary */
        bl load_addr
load_addr:
        add r0,lr,r1
dist_forkx:
        .word forkx-load_addr
Edit2: An even simpler version. All code untested.
/* 1.s v3 */
        .text
        .global forkx
        .global _start
_start:
        nop
        nop
load_dist:
        ldr r1, dist_forkx
load_addr:
        adr r0, load_addr
        add r0,r0,r1
dist_forkx:
        .word forkx-load_addr
I have used the ldr instruction as below. With that, compilation goes fine, but it will not work with ASLR; the address will be fixed.
ldr r1, =__be_forkx
I will try this as well. I do not understand it 100%: with this, will I have the address of forkx in r0?
DeepakHegde said:
ldr r1, =__be_forkx
That will cause a relocation to be emitted.
DeepakHegde said:
I will try this as well. I do not understand it 100%: with this, will I have the address of forkx in r0?
Yes. Instead of storing the absolute address of forkx inside a literal, it now stores the distance between the instruction that wants the address of forkx and the forkx itself. That distance must remain constant, however, across multiple runs of the same binary.
The 1.s pasted earlier tries to read from the address of forkx. That read is not needed. I have updated the post with a modified 1.s.
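The idea can be simulated in C to show why the result follows the load address. This is a sketch under assumptions, not output from the toolchain: the two offsets are hypothetical link-time positions of load_addr and forkx within the image, and forkx_runtime_addr is a made-up helper that mirrors the three instructions of 1.s v3.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical offsets from the start of the loaded image. */
#define LOAD_ADDR_OFF 0x0000000Cu  /* label load_addr */
#define FORKX_OFF     0x00041038u  /* symbol forkx    */

/* The constant stored at dist_forkx: fixed once the binary is linked. */
static const uint32_t DIST_FORKX = FORKX_OFF - LOAD_ADDR_OFF;

/* Address of forkx as computed at run time when the image is
   loaded at `base`. Only r0 depends on the load address. */
uint32_t forkx_runtime_addr(uint32_t base)
{
    assert((base & 3u) == 0u);           /* A32 code is word-aligned  */
    uint32_t r1 = DIST_FORKX;            /* ldr r1, dist_forkx        */
    uint32_t r0 = base + LOAD_ADDR_OFF;  /* adr r0, load_addr         */
    return r0 + r1;                      /* add r0, r0, r1            */
}
```

With these assumed offsets, loading the image at 0x00010000 yields 0x00051038 and loading it at 0x7f000000 yields 0x7f041038: the stored word is the same in both cases, and only the PC-relative part moves.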
I am trying this; with it, compiling the individual file is fine. Now I have to check it on an arm32 platform.
In parallel, I started looking into the GCC compiler.
With gcc the adrl instruction is also supported, but for the same adr or adrl instruction I am getting the error below:
adrl r1, __be_forkx
Error:
../asm-arm/unix_arm.S: Assembler messages:
../asm-arm/unix_arm.S:61: Error: undefined symbol __be_forkx used as an immediate value
GLOBL_REF(__be_forkx)#.global __be_forkx
is added at the top as a global reference. Is anything going wrong?
DeepakHegde said:
I am trying this; with it, compiling the individual file is fine. Now I have to check it on an arm32 platform.
In addition, does the binary also use a global offset table (GOT)? If so, it might be easier to just keep track of the location of the GOT and patch its entries. If a GOT is present and is being used, it is likely that the address of the global/external symbol forkx is in one of its entries. There might be a different assembler syntax to refer to a GOT entry for a global.
DeepakHegde said:
../asm-arm/unix_arm.S:61: Error: undefined symbol __be_forkx used as an immediate value
adr/adrl won't work with external symbols.
1.s v3 is working fine. Thanks a lot for all the help.
I even tried a 1.c file with an extern variable; with the -fPIC compiler option it generates similar instructions.
I need some help understanding this assembly code:
dist_forkx:
.word forkx-load_addr
How are the above two lines able to map to the external forkx at run time?
DeepakHegde said:
dist_forkx:
.word forkx-load_addr
How are the above two lines able to map to the external forkx at run time?
They don't need to, because the need for directly knowing/mapping/fixing/reading the absolute address of forkx at runtime was removed. That's the basic concept of PIC - take advantage of the fact that relative distances between certain sections/symbols do not change once the binary is built.
The general concept isn't new either. You can see it, for example, in the code generated when you call a function within the same binary. When bl calls into a function within the same binary, the assembler/linker performs such distance calculations and emits appropriate PC-relative branches, insofar as allowed by the ISA and the ABI.
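To make the bl case concrete, here is a small C sketch of the distance calculation for an A32 bl. It is an illustration only: a32_bl_imm24 is a made-up helper, the addresses in the usage note are arbitrary, and it covers just the 24-bit offset field, not the whole instruction word.

```c
#include <assert.h>
#include <stdint.h>

/* 24-bit offset field of an A32 bl located at insn_addr that
   branches to target. In ARM state PC reads as insn_addr + 8,
   and the byte distance is stored divided by 4. */
uint32_t a32_bl_imm24(uint32_t insn_addr, uint32_t target)
{
    int32_t offset = (int32_t)(target - (insn_addr + 8u));
    assert((offset & 3) == 0);                           /* word-aligned */
    assert(offset >= -(1 << 25) && offset < (1 << 25));  /* about +/-32 MB */
    return ((uint32_t)offset >> 2) & 0x00FFFFFFu;
}
```

For instance, a32_bl_imm24(0x8000, 0x9000) and a32_bl_imm24(0x108000, 0x109000) give the same field, 0x3FE. Moving caller and callee together by 0x100000 changes nothing in the encoding, which is the same property the dist_forkx word relies on.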
Got it.. Thanks a lot surati.