Name

    NV_device_attribute_query

Name Strings

    cl_nv_device_attribute_query

Contributors

    Cyril Zeller, NVIDIA Corporation
    Yogesh Kini, NVIDIA Corporation
    Kedar Patil, NVIDIA Corporation
    
Notice

    Copyright NVIDIA Corporation, 2009.

IP Status

    NVIDIA Proprietary.

Version

    October 5, 2009 (version 1.0)

Dependencies
  
    OpenCL 1.0 is required.

Overview
    
    This extension provides a mechanism to query device attributes specific to
    NVIDIA hardware, so that the programmer can optimize OpenCL kernels based
    on the specifics of the hardware.
     
Details

    The OpenCL 1.0 specification allows the programmer to query various device
    attributes; the complete list is given in table 4.3. However, there is no
    way to query vendor-specific information. This extension extends table 4.3
    of the OpenCL 1.0 specification with the following NVIDIA-specific device
    attribute queries:
    
  |------------------------------------------------------------------------------------------------------------------|
  | CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV  |  cl_uint  |  Returns the major revision number that defines the CUDA    |
  |                                        |           |  compute capability of the device.                          |
  |------------------------------------------------------------------------------------------------------------------|
  | CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV  |  cl_uint  |  Returns the minor revision number that defines the CUDA    |
  |                                        |           |  compute capability of the device.                          |
  |------------------------------------------------------------------------------------------------------------------|
  | CL_DEVICE_REGISTERS_PER_BLOCK_NV       |  cl_uint  |  Maximum number of 32-bit registers available to a          |
  |                                        |           |  work-group; this number is shared by all work-groups       |
  |                                        |           |  simultaneously resident on a multiprocessor.               |
  |------------------------------------------------------------------------------------------------------------------|
  | CL_DEVICE_WARP_SIZE_NV                 |  cl_uint  |  Warp size in work-items.                                   |
  |                                        |           |                                                             |
  |------------------------------------------------------------------------------------------------------------------|
  | CL_DEVICE_GPU_OVERLAP_NV               |  cl_bool  |  Returns CL_TRUE if the device can concurrently copy memory |
  |                                        |           |  between host and device while executing a kernel, or       |  
  |                                        |           |  CL_FALSE if not.                                           |
  |------------------------------------------------------------------------------------------------------------------|           
  | CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV       |  cl_bool  |  CL_TRUE if there is a run time limit for kernels executed  |
  |                                        |           |  on the device, or CL_FALSE if not.                         |
  |------------------------------------------------------------------------------------------------------------------|
  | CL_DEVICE_INTEGRATED_MEMORY_NV         |  cl_bool  |  CL_TRUE if the device is integrated with the memory        |
  |                                        |           |  subsystem, or CL_FALSE if not.                             |  
  |------------------------------------------------------------------------------------------------------------------|        
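
    As an illustration of how values from the table might feed into tuning
    decisions, the sketch below rounds a requested work-group size up to a
    multiple of the reported warp size and compares a reported compute
    capability against a required one. The function names
    round_up_to_warp and compute_capability_at_least are hypothetical
    helpers, not part of this extension, and rounding to a warp multiple is
    only a common heuristic; whether it helps depends on the kernel.

```c
#include <stddef.h>

/* Hypothetical helper: round a requested work-group size up to the next
   multiple of the warp size. warp_size is assumed to be the value reported
   for CL_DEVICE_WARP_SIZE_NV. If warp_size is 0, the request is returned
   unchanged. */
size_t round_up_to_warp(size_t requested, unsigned int warp_size)
{
    if (warp_size == 0)
        return requested;
    return ((requested + warp_size - 1) / warp_size) * warp_size;
}

/* Hypothetical helper: return 1 if the compute capability major.minor is at
   least req_major.req_minor, 0 otherwise. major and minor are assumed to be
   the values reported for CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV and
   CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV. */
int compute_capability_at_least(unsigned int major, unsigned int minor,
                                unsigned int req_major, unsigned int req_minor)
{
    if (major != req_major)
        return major > req_major;
    return minor >= req_minor;
}
```

    For example, with a warp size of 32, a requested work-group size of 100
    would be rounded up to 128.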

    To query these device attributes, call the function clGetDeviceInfo with
    one of the constants above as its param_name argument.
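
    Before issuing one of these queries, an application would normally
    confirm that the device advertises the extension. A minimal sketch,
    assuming the application has already fetched the device's
    space-separated extension list with clGetDeviceInfo and
    CL_DEVICE_EXTENSIONS; the helper name has_extension is hypothetical and
    not part of this extension:

```c
#include <string.h>

/* Hypothetical helper: return 1 if `name` appears as a complete
   space-separated token in `extensions`, 0 otherwise. A plain strstr() is
   not enough, because it would also match a longer extension name that
   merely contains `name` as a substring. */
int has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    const char *p = extensions;

    if (len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning of the list or after a space */
        int starts_token = (p == extensions) || (p[-1] == ' ');
        /* and must end at the end of the list or before a space. */
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return 1;
        p += len;
    }
    return 0;
}
```

    If has_extension(extensions, "cl_nv_device_attribute_query") returns 1,
    a query might then look like the following, assuming device is a valid
    cl_device_id and the OpenCL headers in use declare the token:

        cl_uint warp_size = 0;
        cl_int err = clGetDeviceInfo(device, CL_DEVICE_WARP_SIZE_NV,
                                     sizeof(warp_size), &warp_size, NULL);

    The attributes typed cl_bool in the table would be read into a cl_bool
    variable in the same way.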

