Name

    NV_compiler_options

Name Strings

    cl_nv_compiler_options

Dependencies

    OpenCL 1.0 is required

Contributors

    Cyril Zeller
    Joshua Newman

Overview

    This extension allows the programmer to pass options to the PTX
    assembler (ptxas), giving greater control over code generation.

Details

    Section 5.4.3 of the OpenCL 1.0 specification lists compiler options that
    can be passed to clBuildProgram. This extension adds the following
    options:

    -cl-nv-maxrregcount=<N>
        Passed on to ptxas as --maxrregcount <N>
            N is a positive integer.
        Specify the maximum number of registers that GPU functions can use.
        Up to a function-specific limit, a higher value will generally increase
        the performance of individual GPU threads that execute this function.
        However, because thread registers are allocated from a global register
        pool on each GPU, a higher value of this option will also reduce the
        maximum thread block size, thereby reducing the amount of thread
        parallelism. Hence, a good maxrregcount value is the result of a
        trade-off.
        If this option is not specified, then no maximum is assumed. Otherwise,
        the specified value is rounded to the next multiple of 4 registers, up
        to the GPU-specific maximum of 128 registers.
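
        The rounding rule above can be sketched in C as follows. This is only
        an illustration: the helper name effective_maxrregcount is
        hypothetical, and the sketch assumes that "next multiple" means
        rounding up and that a value already divisible by 4 is left unchanged.

```c
#include <assert.h>

/* Values taken from the text above: granularity of 4, ceiling of 128. */
static const unsigned REG_GRANULARITY = 4;
static const unsigned REG_MAX = 128;

/* Hypothetical helper: the register limit assumed to be in effect for a
 * requested -cl-nv-maxrregcount value. Assumes rounding up. */
unsigned effective_maxrregcount(unsigned requested)
{
    assert(requested > 0);  /* N is a positive integer */

    /* Clamp first so the rounding below cannot overflow. */
    if (requested >= REG_MAX)
        return REG_MAX;

    /* Round up to the next multiple of the granularity. */
    return (requested + REG_GRANULARITY - 1) / REG_GRANULARITY * REG_GRANULARITY;
}
```

        Under these assumptions a request of 30 yields 32, and any request of
        128 or more yields 128.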

    -cl-nv-opt-level=<N>
        Passed on to ptxas as --opt-level <N>
            N is a non-negative integer; 0 means no optimization.
        Specify optimization level.
        Default value:  3.

    -cl-nv-verbose
        Passed on to ptxas as --verbose
        Enable verbose mode.
        Output is reported in the build log, which can be retrieved with
        clGetProgramBuildInfo (CL_PROGRAM_BUILD_LOG), for example from the
        callback passed to clBuildProgram.
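
    As an illustration of how these options might be combined, the sketch
    below formats them into a single options string. The helper name
    format_nv_options and the convention that a maxrregcount of 0 means
    "omit the option" are assumptions made for this example; they are not
    part of the extension.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the extension's options into buf.
 * A maxrregcount of 0 omits -cl-nv-maxrregcount (no maximum assumed).
 * Returns the length of the string, or -1 if buf is too small. */
int format_nv_options(char *buf, size_t size,
                      unsigned maxrregcount, unsigned opt_level, int verbose)
{
    assert(buf != NULL && size > 0);

    const char *verbose_opt = verbose ? " -cl-nv-verbose" : "";
    int n;

    if (maxrregcount > 0)
        n = snprintf(buf, size, "-cl-nv-maxrregcount=%u -cl-nv-opt-level=%u%s",
                     maxrregcount, opt_level, verbose_opt);
    else
        n = snprintf(buf, size, "-cl-nv-opt-level=%u%s",
                     opt_level, verbose_opt);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

    The resulting string would be passed as the options argument of
    clBuildProgram. With -cl-nv-verbose set, the ptxas output can then be
    read back by querying CL_PROGRAM_BUILD_LOG through
    clGetProgramBuildInfo.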
