How do I add pthread to the examples in an SVE project?

 

#include <pthread.h>
 

I added this line to the sve_array_subtract example built into Arm Development Studio, and the build fails with the error that the "pthread.h" file is not found. How can I solve this? I want to use both multithreading and SVE intrinsics in the project.

  • Hi again

    I suspect that the issue here relates to unaligned accesses.  By default, the compiler assumes that the target platform supports unaligned accesses, so it may generate code that performs them, for example using LDUR and STUR instructions.  To run such code reliably, the target platform (the FVP_Base_AEMvA model in this case) must have unaligned accesses enabled.  To ensure this, configure the core in the FVP model to allow unaligned accesses by clearing the A bit (alignment check enable) in the SCTLR, either using assembler code or by using the Debugger.  Without this, I suspect that your original code would have crashed before reaching memcpy.
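
As a minimal sketch of the SCTLR change, assuming the code runs at EL3 (as the EL3: addresses in the disassembly later in this reply suggest): in the Armv8-A architecture, SCTLR_ELx.A is bit 1, and a value of 1 enables alignment checking, so allowing unaligned accesses corresponds to that bit being 0. The helper names and the RUNNING_AT_EL3 guard macro are hypothetical and not part of any Arm example project; the register write is compiled only when that macro is defined, because it is valid only at EL3.

```cpp
#include <cstdint>

// SCTLR_ELx.A (alignment check enable) is bit 1 in the Armv8-A architecture.
constexpr std::uint64_t kSctlrABit = std::uint64_t{1} << 1;

// Returns the given SCTLR value with alignment checking disabled (A = 0).
constexpr std::uint64_t sctlr_allow_unaligned(std::uint64_t sctlr) {
    return sctlr & ~kSctlrABit;
}

// Returns true if the given SCTLR value has alignment checking enabled (A = 1).
constexpr bool sctlr_alignment_check_enabled(std::uint64_t sctlr) {
    return (sctlr & kSctlrABit) != 0;
}

#if defined(__aarch64__) && defined(RUNNING_AT_EL3)  // hypothetical guard macro
// Read-modify-write of SCTLR_EL3. Assumption: the caller executes at EL3.
inline void allow_unaligned_accesses_el3() {
    std::uint64_t sctlr;
    __asm__ volatile("mrs %0, sctlr_el3" : "=r"(sctlr));
    sctlr = sctlr_allow_unaligned(sctlr);
    // ISB ensures the new setting takes effect before later instructions.
    __asm__ volatile("msr sctlr_el3, %0\n\tisb" : : "r"(sctlr) : "memory");
}
#endif
```

The bit manipulation is kept separate from the register access so that it can be checked on any host before the register write is tried on the model.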

    As an alternative temporary workaround, you can prevent the compiler from generating code that depends on unaligned accesses by compiling with "-mno-unaligned-access".  There may be a performance and/or code size impact for this.  See:

    developer.arm.com/.../-munaligned-access---mno-unaligned-access

    I was able to build and run your test case, and it worked for me - see screenshot.

    I used Arm DS 2025.1 to compile/link your code with:

    armclang --target=aarch64-arm-none-eabi -march=armv8-a -xc++ -fno-exceptions -O0 -g -mno-unaligned-access -c -o test.o test.cpp
    armlink --scatter=scatter.scat -o test.axf  test.o 

    Then ran/debugged the executable via Arm Debugger on the FVP_Base_AEMvA model with these parameters:
    -C cluster0.NUM_CORES=0x1 -C bp.secure_memory=false

    I could see the struct data being copied correctly from param22 at 0x9FFFFFCC (on the stack) to pHandle->param at 0x8006EEE0 (in calloc'd memory).
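
The two addresses quoted above can be checked for alignment with a small helper. This is only a sketch: the helper and constant names are hypothetical, and the interpretation is an assumption rather than something the reply states. The stack address is 4-byte aligned but not 8-byte aligned, so a single 64-bit access to it would be an unaligned access, whereas two 32-bit accesses would not.

```cpp
#include <cstdint>

// True if addr is a multiple of alignment (alignment must be a power of two).
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Addresses quoted in the reply (hypothetical constant names).
constexpr std::uint64_t kParam22Addr = 0x9FFFFFCCu;      // source, on the stack
constexpr std::uint64_t kHandleParamAddr = 0x8006EEE0u;  // destination, calloc'd

static_assert(is_aligned(kParam22Addr, 4), "source is 4-byte aligned");
static_assert(!is_aligned(kParam22Addr, 8), "source is not 8-byte aligned");
static_assert(is_aligned(kHandleParamAddr, 8), "destination is 8-byte aligned");
```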

    Note that sizeof(init_param) is 8.  Because the struct is so small, the Arm Compiler 6 code generator optimizes away the call to memcpy and replaces it with a simple load/store sequence:

    EL3:0x00000000800203D8 : LDR      w8,[sp,#0x3c]
    EL3:0x00000000800203DC : LDR      w10,[sp,#0x40]
    EL3:0x00000000800203E0 : STR      w10,[x9,#4]
    EL3:0x00000000800203E4 : STR      w8,[x9,#0]
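
For reference, the kind of source that can lead to such a sequence might look like the sketch below. The original test case is not shown in this thread, so the struct layout (two 32-bit fields, guessed from the two 32-bit loads and stores at offsets 0 and 4), the field names, the handle type, and the function name are all assumptions. Only init_param, param22, pHandle->param, and the size of 8 come from the reply.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical layout: the thread only states that sizeof(init_param) is 8.
struct init_param {
    std::uint32_t first;
    std::uint32_t second;
};
static_assert(sizeof(init_param) == 8, "matches the size quoted in the reply");

// Hypothetical handle type that stores a copy of the parameters.
struct handle {
    init_param param;
};

// Copies a stack-resident init_param into calloc'd memory and reports
// whether the heap copy matches the source.
bool copy_param_roundtrip() {
    init_param param22{0x11223344u, 0x55667788u};  // lives on the stack
    handle* pHandle = static_cast<handle*>(std::calloc(1, sizeof(handle)));
    assert(pHandle != nullptr && "calloc failed");
    if (pHandle == nullptr) {
        return false;
    }

    // For an 8-byte copy, a compiler may replace this call with plain
    // loads and stores instead of calling the library memcpy.
    std::memcpy(&pHandle->param, &param22, sizeof(init_param));

    const bool ok = pHandle->param.first == param22.first &&
                    pHandle->param.second == param22.second;
    std::free(pHandle);
    return ok;
}
```

Setting a breakpoint just after the copy and comparing the source and destination in two Memory views is one way to confirm the result on the model.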

    Note that you can have multiple Memory views (and other views), side-by-side, open simultaneously in the Arm DS GUI.

    Hope this helps

    Stephen
