
Calling cl_arm_import_memory fails with error code -6 on RK3399

Hi, I am new to OpenCL. I ran into an issue while trying to use the GPU for some math operations (basically matrix multiplication).

Here is my sample code:

char *allocptr = malloc(WIDTH*HEIGHT*2);

cl_mem buffer = clImportMemoryARM(context,CL_MEM_READ_WRITE, NULL,allocp$

if (error == CL_SUCCESS)
{
    printf("success\n");
}
else
{
    printf("error %d.\n", error);
}

The code compiles without any issue, but it returns error -6 at run time. Can anyone shed some light on this?

The clinfo output is pasted below as well.
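For readers decoding the number: in the standard OpenCL headers, -6 is CL_OUT_OF_HOST_MEMORY. The sketch below maps a few common codes to their names so the printf above can report something more readable. The function name cl_error_name and the EX_ constants are hypothetical, introduced only so the snippet compiles without the OpenCL SDK; the numeric values are the ones defined in CL/cl.h.

```c
#include <stddef.h>

/* Numeric values as defined in CL/cl.h (OpenCL 1.2). They are redefined
   here with an EX_ prefix only so this sketch builds without the SDK. */
enum {
    EX_CL_SUCCESS                       = 0,
    EX_CL_DEVICE_NOT_FOUND              = -1,
    EX_CL_MEM_OBJECT_ALLOCATION_FAILURE = -4,
    EX_CL_OUT_OF_RESOURCES              = -5,
    EX_CL_OUT_OF_HOST_MEMORY            = -6,
    EX_CL_INVALID_VALUE                 = -30,
    EX_CL_INVALID_CONTEXT               = -34,
    EX_CL_INVALID_OPERATION             = -59
};

/* Hypothetical helper: returns the symbolic name for a handful of OpenCL
   error codes, or "UNKNOWN_CL_ERROR" for anything not listed. */
const char *cl_error_name(int err)
{
    switch (err) {
    case EX_CL_SUCCESS:                       return "CL_SUCCESS";
    case EX_CL_DEVICE_NOT_FOUND:              return "CL_DEVICE_NOT_FOUND";
    case EX_CL_MEM_OBJECT_ALLOCATION_FAILURE: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case EX_CL_OUT_OF_RESOURCES:              return "CL_OUT_OF_RESOURCES";
    case EX_CL_OUT_OF_HOST_MEMORY:            return "CL_OUT_OF_HOST_MEMORY";
    case EX_CL_INVALID_VALUE:                 return "CL_INVALID_VALUE";
    case EX_CL_INVALID_CONTEXT:               return "CL_INVALID_CONTEXT";
    case EX_CL_INVALID_OPERATION:             return "CL_INVALID_OPERATION";
    default:                                  return "UNKNOWN_CL_ERROR";
    }
}
```

In the sample code this could be used as printf("error %d (%s)\n", error, cl_error_name(error)); in a real build you would include CL/cl.h and use its constants instead of the EX_ copies.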

firefly@firefly:~$ sudo clinfo -a
Platform #0
  Name:                                  ARM Platform
  Vendor:                                ARM
  Version:                               OpenCL 1.2 v1.r14p0-01rel0-git(966ed26).f44c85cb3d2ceb87e8be88e7592755c3
  Profile:                               FULL_PROFILE
  Extensions:                            cl_khr_global_int32_base_atomics
                                         cl_khr_global_int32_extended_atomics
                                         cl_khr_local_int32_base_atomics
                                         cl_khr_local_int32_extended_atomics
                                         cl_khr_byte_addressable_store
                                         cl_khr_3d_image_writes
                                         cl_khr_fp64
                                         cl_khr_int64_base_atomics
                                         cl_khr_int64_extended_atomics
                                         cl_khr_fp16
                                         cl_khr_gl_sharing
                                         cl_khr_icd
                                         cl_khr_egl_event
                                         cl_khr_egl_image
                                         cl_khr_image2d_from_buffer
                                         cl_arm_core_id
                                         cl_arm_printf
                                         cl_arm_thread_limit_hint
                                         cl_arm_non_uniform_work_group_size
                                         cl_arm_import_memory

  Device #0
    Name:                                Mali-T860
    Type:                                GPU
    Vendor:                              ARM
    Vendor ID:                           140517376
    Profile:                             FULL_PROFILE
    Available:                           Yes
    Version:                             OpenCL 1.2 v1.r14p0-01rel0-git(966ed26).f44c85cb3d2ceb87e8be88e7592755c3
    Driver version:                      1.2
    Compiler available:                  Yes
    Address space size:                  64
    Little endian:                       Yes
    Error correction support:            No
    Address alignment (bits):            1024
    Smallest alignment (bytes):          128
    Resolution of timer (ns):            1000
    Max clock frequency (MHz):           200
    Max compute units:                   4
    Max constant args:                   8
    Max constant buffer size:            64 kB 
    Max mem alloc size:                  489 MB 942 kB 
    Max parameter size:                  1024
    Command-queue supported props:       Out of order execution
                                         Profiling
    Execution capabilities:              OpenCL kernels
    Global memory size:                  1 GB 935 MB 696 kB 
    Global memory cache size:            256 kB 
    Global memory line cache size:       64
    Local memory size:                   32 kB 
    Local memory type:                   Global
    Global memory cache type:            Read write
    Max work group size:                 256
    Max work item dimensions:            3
    Max work item sizes:                 (256, 256, 256)
    Image support:                       Yes
    Max 2D image height:                 65536
    Max 2D image width:                  65536
    Max 3D image depth:                  65536
    Max 3D image height:                 65536
    Max 3D image width:                  65536
    Max read image args:                 128
    Max write image args:                8
    Max samplers:                        16
    Preferred vector width char:         16
    Preferred vector width short:        8
    Preferred vector width int:          4
    Preferred vector width long:         2
    Preferred vector width float:        4
    Preferred vector width double:       2
    Half precision float capability:     Denorms
                                         Inf and NaNs
                                         Round to nearest even rounding mode
                                         Round to zero rounding mode
                                         Round to +ve and -ve infinity rounding modes
                                         IEEE754-2008 fused multiply-add
    Single precision float capability:   Denorms
                                         Inf and NaNs
                                         Round to nearest even rounding mode
                                         Round to zero rounding mode
                                         Round to +ve and -ve infinity rounding modes
                                         IEEE754-2008 fused multiply-add
    Double precision float capability:   Denorms
                                         Inf and NaNs
                                         Round to nearest even rounding mode
                                         Round to zero rounding mode
                                         Round to +ve and -ve infinity rounding modes
                                         IEEE754-2008 fused multiply-add
    Extensions:                          cl_khr_global_int32_base_atomics
                                         cl_khr_global_int32_extended_atomics
                                         cl_khr_local_int32_base_atomics
                                         cl_khr_local_int32_extended_atomics
                                         cl_khr_byte_addressable_store
                                         cl_khr_3d_image_writes
                                         cl_khr_fp64
                                         cl_khr_int64_base_atomics
                                         cl_khr_int64_extended_atomics
                                         cl_khr_fp16
                                         cl_khr_gl_sharing
                                         cl_khr_icd
                                         cl_khr_egl_event
                                         cl_khr_egl_image
                                         cl_khr_image2d_from_buffer
                                         cl_arm_core_id
                                         cl_arm_printf
                                         cl_arm_thread_limit_hint
                                         cl_arm_non_uniform_work_group_size
                                         cl_arm_import_memory

  • Hi,

    Your code got mangled on the way to the forum (the call to clImportMemoryARM looks incomplete). What size are you passing to clImportMemoryARM?

    One suspicious thing is that your device doesn't seem to report cl_arm_import_memory_host, which should be present when host imports are supported. That said, some older driver versions had a bug where it was not reported, and the feature was introduced for host imports in the first place, so I'm fairly confident it's there on your device.

    Taking a step back, do you strictly need zero-copy imports for your application, or would standard buffers created with clCreateBuffer suffice?

    Regards,

    Kevin
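The reply's point about cl_arm_import_memory_host can be checked programmatically rather than by reading clinfo. A plain substring search is not enough, because "cl_arm_import_memory" is a prefix of "cl_arm_import_memory_host". The sketch below does an exact, whole-token match against a space-separated extension string. The function name has_extension is hypothetical; in a real program the string would normally come from clGetDeviceInfo with CL_DEVICE_EXTENSIONS.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true only if `name` appears as a complete
   whitespace-separated token in `ext_list`, so a longer extension name
   that merely starts with `name` does not count as a match. */
bool has_extension(const char *ext_list, const char *name)
{
    static const char seps[] = " \t\r\n";

    if (ext_list == NULL || name == NULL)
        return false;

    size_t name_len = strlen(name);
    if (name_len == 0)
        return false;

    const char *p = ext_list;
    while (*p != '\0') {
        p += strspn(p, seps);                 /* skip leading separators */
        size_t tok_len = strcspn(p, seps);    /* length of the next token */
        if (tok_len == name_len && strncmp(p, name, name_len) == 0)
            return true;
        p += tok_len;                         /* move past this token */
    }
    return false;
}
```

With the extension list shown in the clinfo output above, has_extension(list, "cl_arm_import_memory") would be true and has_extension(list, "cl_arm_import_memory_host") would be false, which matches what the reply observed.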
