Name

    Loop unroll pragma extension

Name Strings

    cl_nv_pragma_unroll

Dependencies
  
    OpenCL 1.0 is required

Contributors

    Jian-Zhong Wang
    Bastiaan Aarts
    Vinod Grover

Overview
    
    This extension extends the OpenCL C language with a hint that
    allows loops to be unrolled. This pragma must be used for a loop
    and can be used to specify full unrolling or partial unrolling by
    a certain amount. This is a hint and the compiler may ignore this
    pragma for any reason.

Goals

    The principal goal of the pragma unroll is to improve the
    performance of loops via unrolling. Typically this enables other
    optimizations or improves instruction level parallelism of a
    thread.

Details

    A user may specify that a loop in the source program be
    unrolled. This is done via a pragma. The syntax of this pragma is
    as follows

        #pragma unroll [unroll-factor]

    The pragma unroll may optionally specify an unroll factor. The
    pragma must be placed immediately before the loop and only applies
    to that loop.

    If unroll factor is not specified then the compiler will try to do
    complete or full unrolling of the loop. If a loop unroll factor is
    specified the compiler will perform partial loop unrolling. The
    loop factor, if specified, must be a compile time non negative
    integer constant.

    A loop unroll factor of 1 means that the compiler should not
    unroll the loop.

    A complete unroll specification has no effect if the trip count of
    the loop is not compile-time computable.

Examples    

    This sections lists a few examples illustrating valid and invalid
    uses.

    - Complete unrolling example

         #pragma unroll
	 for (int i = 0; i < 32; i++) {
	    ...
	 }
	
      This example full unrolling is requested from the compiler. Note
      that, since the trip count is known to be 32 the compiler will
      most likely honor this request. In the following example, the
      trip count is not known so unrolling pragma will be ignored.
      
         #pragma unroll
	 for (int i = 0; i < n; i++) {
	    ...
	 }
	
    - no unrolling example

         #pragma unroll 1
	 for (int i = 0; i < 64; i++) {
	     ...
	 }


    - partial unrolling example

        #pragma unroll 4
	for (int i = 0; i < n; i++) {
	   ...
	}

      Note that, in this example the trip count is not knownt at
      compile time, but a partial unroll factor of 4 is valid.

    - invalid unroll pragma usage

       The following examples describe some invalid uses of loop
       unrolling pragmas.

        #pragma unroll -1
	for (...) {
	  ...
	}

        This is invalid because the loop unroll factor is negative.

	#pragma unroll
	if (...) {
	   ...
        }

	This is invalid because the pragma is used on a non loop
	construct

	#pragma unroll x+1
	for (...) {
	   ...
	}

	This is invalid since the loop unroll factor is not a
	compile-time known value.
