I was trying to compiled the following code:
#pragma omp parallel shared (j)
{
#pragma omp for schedule(dynamic)
for(i = 0; i != j; i++)
{
// do
If I were to see the statement
for(i = 0; i != j; i++)
used instead of the statement
for(i = 0; i < j; i++)
I would be left wondering why the programmer had made that choice, never mind that it can mean the same thing. It may be that OpenMP is making a hard syntactic choice in order to force a certain clarity of code.
Here's code which raises challenges for the use of !=
and may help explain why it isn't allowed.
#include <cstdio>
int main(){
int j=10;
#pragma omp parallel for
for(int i = 0; i < j; i++){
printf("%d\n",i++);
}
}
notice that i
is incremented in both the for
statement as well as within the loop itself leading to the possibility (but not the guarantee) of an infinite loop.
If the predicate is <
then the loop's behavior can still be well-defined in a parallel context without the compiler having to check within the loop for changes to i
and determining how those changes will affect the loop's bounds.
If the predicate is !=
then the loop's behavior is no longer well-defined and it may be infinite in extent, preventing easy parallel subdivision.
I think there is perhaps no good reason other than having extended existing functionality to get this far.
IIRC originally these had to be static so that it could determine at compile time how to generate the loop code... it could just be a hangover from that.
Contrary to what it may look like, schedule(dynamic)
does not work with dynamic number of elements. Rather the assignment of iteration blocks to threads is what is dynamic. With static scheduling this assignment is precomputed at the beginning of the worksharing construct. With dynamic scheduling iteration blocks are given out to threads on the first come, first served basis.
The OpenMP standard is pretty clear that the amount of iteratons is precomputed once the workshare construct is encountered, hence the loop counter may not be modified inside the body of the loop (OpenMP 3.1 specification, §2.5.1 - Loop Construct):
The iteration count for each associated loop is computed before entry to the outermost loop. If execution of any associated loop changes any of the values used to compute any of the iteration counts, then the behavior is unspecified.
The integer type (or kind, for Fortran) used to compute the iteration count for the collapsed loop is implementation defined.
A worksharing loop has logical iterations numbered 0,1,...,N-1 where N is the number of loop iterations, and the logical numbering denotes the sequence in which the iterations would be executed if the associated loop(s) were executed by a single thread. The
schedule
clause specifies how iterations of the associated loops are divided into contiguous non-empty subsets, called chunks, and how these chunks are distributed among threads of the team. Each thread executes its assigned chunk(s) in the context of its implicit task. The chunk_size expression is evaluated using the original list items of any variables that are made private in the loop construct. It is unspecified whether, in what order, or how many times, any side-effects of the evaluation of this expression occur. The use of a variable in aschedule
clause expression of a loop construct causes an implicit reference to the variable in all enclosing constructs.
The rationale behind these relational operator restriction is quite simple - it provides clear indication on what is the direction of the loop, it alows easy computation of the number of iterations, and it provides similar semantics of the OpenMP worksharing directive in C/C++ and Fortran. Also other relational operations would require close inspection of the loop body in order to understand how the loop goes which would be unaceptable in many cases and would make the implementation cumbersome.
OpenMP 3.0 introduced the explicit task
construct which allows for parallelisation of loops with unknown number of iterations. There is a catch though: tasks introduce some severe overhead and the one task per loop iteration only makes sense if these iterations take quite some time to be executed. Otherwise the overhead would dominate the execution time.
The answer is simple. OpenMP does not allow premature termination of a team of threads. With == or !=, OpenMP has no way of determining when the loop stops. 1. One or more threads could hit the termination condition, which might not be unique. 2. OpenMP has no way to shut down the other threads that might never detect the condition.
.
For signed int, the wrap around behavior is undefined. If we allow !=
, programmers may get unexpected tripcount. The problem is whether the compiler can generate code to compute a trip count for the loop.
For a simple loop, like:
for( i = 0; i < n; ++i )
the compiler can determine that there are 'n' iterations, if n>=0, and zero iterations if n < 0.
For a loop like:
for( i = 0; i != n; ++i )
again, a compiler should be able to determine that there are 'n' iterations, if n >= 0; if n < 0, we don't know how many iterations it has.
For a loop like:
for( i = 0; i < n; i += 2 )
the compiler can generate code to compute the trip count (loop iteration count) as floor((n+1)/2) if n >= 0, and 0 if n < 0.
For a loop like:
for( i = 0; i != n; i += 2 )
the compiler can't determine whether 'i' will ever hit 'n'. What if 'n' is an odd number?
For a loop like:
for( i = 0; i < n; i += k )
the compiler can generate code to compute the trip count as floor((n+k-1)/k) if n >= 0, and 0 if n < 0, because the compiler knows that the loop must count up; in this case, if k < 0, it's not a legal OpenMP program.
For a loop like:
for( i = 0; i != n; i += k )
the compiler doesn't even know if i is counting up or down. It doesn't know if 'i' will ever hit 'n'. It may be an infinite loop.
Credits: OpenMP ARB