Skip to content

Latest commit

 

History

History
156 lines (105 loc) · 5.59 KB

clGetKernelSubGroupInfo.adoc

File metadata and controls

156 lines (105 loc) · 5.59 KB

clGetKernelSubGroupInfo

Returns information about the kernel object.

cl_int clGetKernelSubGroupInfo(cl_kernel kernel,
                               cl_device_id device,
                               cl_kernel_sub_group_info param_name,
                               size_t input_value_size,
                               const void *input_value,
                               size_t param_value_size,
                               void *param_value,
                               size_t *param_value_size_ret)

Parameters

kernel

Specifies the kernel object being queried.

device

Identifies a specific device in the list of devices associated with kernel. The list of devices is the list of devices in the OpenCL context that is associated with kernel. If the list of devices associated with kernel is a single device, device can be a NULL value.

param_name

Specifies the information to query. The list of supported param_name types and the information returned in param_value by clGetKernelSubGroupInfo is described in the table below.

input_value_size

Specifies the size in bytes of memory pointed to by input_value. This size must be == size of input type as described in table below.

input_value

A pointer to memory where the appropriate parameterization of the query is passed from. If input_value is NULL, it is ignored.

param_value

A pointer to memory where the appropriate result being queried is returned. If param_value is NULL, it is ignored.

param_value_size

Used to specify the size in bytes of memory pointed to by param_value. This size must be ≥ size of return type as described in the table below.

param_value_size_ret

Returns the actual size in bytes of data copied to param_value. If param_value_size_ret is NULL, it is ignored.

cl_kernel_sub_group_info Input Type Return Type Info. returned in param_value

CL_KERNEL_MAX_SUB_- GROUP_SIZE_FOR_NDRANGE

size_t *

size_t

Returns the maximum sub-group size for this kernel. All subgroups must be the same size, while the last sub-group in any work-group (i.e. the sub-group with the maximum index) could be the same or smaller size.

The input_value must be an array of size_t values corresponding to the local work size parameter of the intended dispatch. The number of dimensions in the ND-range will be inferred from the value specified for input_value_size.

CL_KERNEL_SUB_GROUP_- COUNT_FOR_NDRANGE

size_t *

size_t

Returns the number of sub-groups that will be present in each work-group for a given local work size. All workgroups, apart from the last work-group in each dimension in the presence of non-uniform work-group sizes, will have the same number of subgroups.

The input_value must be an array of size_t values corresponding to the local work size parameter of the intended dispatch. The number of dimensions in the ND-range will be inferred from the value specified for input_value_size.

CL_KERNEL_LOCAL_SIZE_ FOR_SUB_GROUP_COUNT

size_t

size_t []

Returns the local size that will generate the requested number of sub- groups for the kernel. The output array must be an array of size_t values corresponding to the local size parameter. Any returned work-group will have one dimension. Other dimensions inferred from the value specified for param_value_size will be filled with the value 1. The returned value will produce an exact number of sub-groups and result in no partial groups for an executing kernel except in the case where the last work- group in a dimension has a size different from that of the other groups. If no work-group size can accommodate the requested number of sub-groups, 0 will be returned in each element of the return array.

CL_KERNEL_MAX_NUM_SUB_GROUPS

ignored

size_t

This provides a mechanism for the application to query the maximum number of sub-groups that may make up each work-group to execute a kernel on a specific device given by device. The OpenCL implementation uses the resource requirements of the kernel (register usage etc.) to determine what this work-group size should be. The returned value may be used to compute a work-group size to enqueue the kernel with to give a round number of sub-groups for an enqueue.

CL_KERNEL_COMPILE_NUM_SUB_GROUPS

ignored

size_t

Returns the number of sub-groups specified in the kernel source or IL. If the sub-group count is not specified using the above attribute then 0 is returned.

Errors

Returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns one of the following errors:

  • CL_INVALID_DEVICE if device is not in the list of devices associated with kernel or if device is NULL but there is more than one device associated with kernel.

  • CL_INVALID_VALUE if param_name is not valid, or if size in bytes specified by param_value_size is < size of return type as described in the table above and param_value is not NULL.

  • CL_INVALID_VALUE if param_name is CL_KERNEL_SUB_GROUP_SIZE_FOR_NDRANGE and the size in bytes specified by input_value_size is not valid or if input_value is NULL.

  • CL_INVALID_KERNEL if kernel is a not a valid kernel object.

  • CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.

  • CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.

Also see