We are running a survey to help us improve the experience for all of our members. If you see the survey appear, please take the time to tell us about your experience if you can.
Hi all,
It is a well known fact that performing an aligned vector load with an unaligned memory address should lead to segmentation fault.
However, when I do try to run code segment below using the same, i do not see any segmentation fault.
------------------------------------------------------------------------------------------------
#include <stdio.h>#include <stdlib.h>#include <time.h>#include "arm_neon.h"void add (uint32x4_t *data_a,uint32x4_t *data_b) { /* Set each sixteen values of the vector to 3. * * Remark: a 'q' suffix to intrinsics indicates * the instruction run for 128 bits registers. */ *data_a = vaddq_u32 (*data_a, *data_b);}int main (int argc,char** argv) { unsigned int n = atoi(argv[1]); /* Create custom arbitrary data. */ uint32_t uint32_data_a[n]; uint32_t uint32_data_b[n]; uint32_t uint32_data_c[n]; struct timespec start,end; for(uint32_t i = 1; i <= n ; i+=1) { uint32_data_a[i-1] = i; uint32_data_b[i-1] = i; uint32_data_c[i-1] = i; } /* Create the vector with our data. */ uint32x4_t data_a; uint32x4_t data_b; clock_gettime(CLOCK_MONOTONIC,&start); for(int count = 0; count < 10; count++) { for(int i = 0; i < n ; i+=4) { /* Load our custom data into the vector register. */ data_a = vld1q_u32 (uint32_data_a + i); data_b = vld1q_u32 (uint32_data_b + i); /* Call of the add3 function. */ add(&data_a,&data_b); vst1q_u32(uint32_data_c + i,data_a); } } clock_gettime(CLOCK_MONOTONIC,&end); double time_usec=(((double)end.tv_sec * 1000000 + (double)end.tv_nsec/1000) - ((double)start.tv_sec *1000000 + (double)start.tv_nsec/1000)); printf("Time taken for aligned load is : %fus and count is %d \n", time_usec/10,n ); for(uint32_t i = 0; i < n ; i++) printf("%2d ",uint32_data_c[i]); printf("\n"); return 0;}
----------------------------------------------------------------------------------------------------
Clearly almost every access to memory in this case is unaligned? is there any reason for this inconsistent behavior? Thanks in advance.
Aketh TM
Hi,
I recommend to narrow the focus to a specific instruction and load address you think should be causing segfault.
Use objdump -S to see the source code mixed with the disassembly, gdb to single step by source and assembly and print registers, or just printf() statements to print pointer values.
I didn't see any unaligned accesses. The stride of 1 is used for pointer arithmetic which moves to the next vector.
Thanks,
Jason