Returns information about the kernel object.
cl_int clGetKernelSubGroupInfo(cl_kernel kernel,
cl_device_id device,
cl_kernel_sub_group_info param_name,
size_t input_value_size,
const void *input_value,
size_t param_value_size,
void *param_value,
size_t *param_value_size_ret)
kernel
-
Specifies the kernel object being queried.
device
-
Identifies a specific device in the list of devices associated with kernel. The list of devices is the list of devices in the OpenCL context that is associated with kernel. If the list of devices associated with kernel is a single device, device can be a NULL value.
param_name
-
Specifies the information to query. The list of supported param_name types and the information returned in param_value by clGetKernelSubGroupInfo is described in the table below.
input_value_size
-
Specifies the size in bytes of memory pointed to by input_value. This size must equal the size of the input type as described in the table below.
input_value
-
A pointer to memory from which the appropriate parameterization of the query is passed. If input_value is NULL, it is ignored.
param_value
-
A pointer to memory where the appropriate result being queried is returned. If param_value is NULL, it is ignored.
param_value_size
-
Specifies the size in bytes of memory pointed to by param_value. This size must be ≥ the size of the return type as described in the table below.
param_value_size_ret
-
Returns the actual size in bytes of data copied to param_value. If param_value_size_ret is NULL, it is ignored.

cl_kernel_sub_group_info | Input Type | Return Type | Info. returned in param_value
CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE
size_t *
size_t
Returns the maximum sub-group size for this kernel. All sub-groups must be the same size, except that the last sub-group in any work-group (i.e. the sub-group with the maximum index) may be the same size or smaller.
The input_value must be an array of size_t values corresponding to the local work size parameter of the intended dispatch. The number of dimensions in the ND-range will be inferred from the value specified for input_value_size.
CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE
size_t *
size_t
Returns the number of sub-groups that will be present in each work-group for a given local work size. All work-groups, apart from the last work-group in each dimension in the presence of non-uniform work-group sizes, will have the same number of sub-groups.
The input_value must be an array of size_t values corresponding to the local work size parameter of the intended dispatch. The number of dimensions in the ND-range will be inferred from the value specified for input_value_size.
CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT
size_t
size_t []
Returns the local size that will generate the requested number of sub-groups for the kernel. The output array must be an array of size_t values corresponding to the local size parameter. Any returned work-group will have one dimension. Other dimensions inferred from the value specified for param_value_size will be filled with the value 1. The returned value will produce an exact number of sub-groups and result in no partial groups for an executing kernel except in the case where the last work-group in a dimension has a size different from that of the other groups. If no work-group size can accommodate the requested number of sub-groups, 0 will be returned in each element of the return array.
CL_KERNEL_MAX_NUM_SUB_GROUPS
ignored
size_t
Returns the maximum number of sub-groups that may make up each work-group when executing the kernel on the device given by device. The OpenCL implementation uses the resource requirements of the kernel (register usage etc.) to determine what this work-group size should be. The returned value may be used to compute a work-group size for the enqueue that yields a whole number of sub-groups.
CL_KERNEL_COMPILE_NUM_SUB_GROUPS
ignored
size_t
Returns the number of sub-groups specified in the kernel source or IL. If the sub-group count is not specified there, 0 is returned.
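
The sizing rules in the table can be checked on the host before the call is made. The following is a minimal sketch in C and is not part of the OpenCL API: the helper names dims_from_input_value_size and local_size_is_usable are hypothetical, and the limit of three dimensions is an assumption made for the sketch (the actual limit for a device is reported by CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS). It shows how the number of ND-range dimensions follows from input_value_size, and how a zero-filled result from CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT might be detected.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed upper bound for this sketch only; query
 * CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS for the real limit of a device. */
#define SKETCH_MAX_DIMS 3

/* Hypothetical helper: infer the ND-range dimension count from the byte
 * size of the size_t input array, as the table describes.
 * Returns 0 if the size is not a whole number of size_t values or is
 * outside the range 1..SKETCH_MAX_DIMS. */
static size_t dims_from_input_value_size(size_t input_value_size)
{
    if (input_value_size == 0 || input_value_size % sizeof(size_t) != 0)
        return 0;
    size_t dims = input_value_size / sizeof(size_t);
    return dims <= SKETCH_MAX_DIMS ? dims : 0;
}

/* Hypothetical helper: interpret the array returned for
 * CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT. The table says a failed query
 * sets every element to 0, so any zero element marks the result unusable. */
static int local_size_is_usable(const size_t *local_size, size_t dims)
{
    assert(local_size != NULL && dims > 0);
    for (size_t i = 0; i < dims; ++i)
        if (local_size[i] == 0)
            return 0;
    return 1;
}
```

A caller would pass dims * sizeof(size_t) as input_value_size for the two FOR_NDRANGE queries, and as param_value_size for CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT, where the dimension count is inferred from the output size instead.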
Returns CL_SUCCESS if the function is executed successfully.
Otherwise, it returns one of the following errors:
- CL_INVALID_DEVICE if device is not in the list of devices associated with kernel or if device is NULL but there is more than one device associated with kernel.
- CL_INVALID_VALUE if param_name is not valid, or if the size in bytes specified by param_value_size is < the size of the return type as described in the table above and param_value is not NULL.
- CL_INVALID_VALUE if param_name is CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE and the size in bytes specified by input_value_size is not valid or if input_value is NULL.
- CL_INVALID_KERNEL if kernel is not a valid kernel object.
- CL_OUT_OF_RESOURCES if there is a failure to allocate resources required by the OpenCL implementation on the device.
- CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources required by the OpenCL implementation on the host.
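
When logging a failed call, it can help to turn the returned cl_int into one of the names listed above. The sketch below is one possible way to do that. The helper sub_group_info_error_name is hypothetical, and the fallback definitions are assumptions that mirror the values in the Khronos CL/cl.h header so that the fragment compiles on its own; a real program should include that header instead.

```c
#include <stdint.h>

/* Fallback definitions so this sketch compiles without the OpenCL headers.
 * Assumption: these mirror the values in the Khronos CL/cl.h header.
 * They are skipped when that header has already been included. */
#ifndef CL_SUCCESS
typedef int32_t cl_int;
#define CL_SUCCESS              0
#define CL_OUT_OF_RESOURCES    -5
#define CL_OUT_OF_HOST_MEMORY  -6
#define CL_INVALID_VALUE      -30
#define CL_INVALID_DEVICE     -33
#define CL_INVALID_KERNEL     -48
#endif

/* Hypothetical helper: map the result codes documented for
 * clGetKernelSubGroupInfo to their names, for log messages. */
static const char *sub_group_info_error_name(cl_int err)
{
    switch (err) {
    case CL_SUCCESS:            return "CL_SUCCESS";
    case CL_INVALID_DEVICE:     return "CL_INVALID_DEVICE";
    case CL_INVALID_VALUE:      return "CL_INVALID_VALUE";
    case CL_INVALID_KERNEL:     return "CL_INVALID_KERNEL";
    case CL_OUT_OF_RESOURCES:   return "CL_OUT_OF_RESOURCES";
    case CL_OUT_OF_HOST_MEMORY: return "CL_OUT_OF_HOST_MEMORY";
    default:                    return "unrecognized error";
    }
}
```

Note that CL_INVALID_VALUE covers two distinct conditions in the list above, so the name alone does not say whether param_name, param_value_size, or the input arguments were at fault.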