Variable list of dependencies for OpenMP 4.5 tasks

Question

I am writing Fortran code using a task-based paradigm, with a DAG to express the dependencies. With OpenMP 4.5, I can use the depend clause, which takes a dependence type and a list of dependencies.

This mechanism works well when the number of dependencies is known explicitly. In my case, however, I need to create tasks whose list of dependencies varies from 1 to n elements.

Reading the OpenMP 4.5 documentation, I have not found any mechanism that allows a variable-length list of dependencies to be provided.

Let us take an example: computing road traffic. A road depends on the computed state of its predecessor road(s), so its traffic can only be computed once the traffic of all predecessor roads has been computed.

Using Fortran style, we have the following sketch of code:

! road is a derived type with the components
!   type(road) :: dep(:)    ! predecessor roads
!   integer    :: traffic

type(road) :: road

! Desired usage: pass the whole %dep component as the dependency list
!$omp task shared(road) &
!$omp& depend(in: road%dep) depend(inout: road)
  call compute_traffic(road)
!$omp end task

What I am trying to do is use the %dep component as the list of dependencies for OpenMP. Alternatively, %dep could have a different type, such as a list of pointers to the roads concerned.

Beyond this illustration, I work on a sparse direct solver, more precisely on the Cholesky factorization and its application. With the multifrontal approach, you get many small dense blocks. The factorization and the solve are each split into two subroutines: first the factorization (or the solve) of the diagonal block, then the update of the off-diagonal blocks. The update of a dense block needs the updates of all previous dense blocks that share the same rows.

So the task that updates an off-diagonal block can depend on more than one block, and the number of dependencies is determined by the pattern (the structure) of the input matrix to factor. It is therefore not possible to determine the number of dependencies statically, which is why I am trying to pass a list of blocks to the depend clause.

Answer 1:

The feature you are looking for has been proposed under the name multiple dependency by Vidal et al. in the International Workshop on OpenMP, 2015 (see here for an open access version).

As far as I know, this feature has not found its way into OpenMP tasks (yet?), but you could use OmpSs, the OpenMP forerunner where this proposal (and many more) was implemented.

Otherwise, since the number of dependencies has to be fixed at compile time, the ugly workaround is to write (or generate) a switch (or rather SELECT CASE in Fortran) on the number of dependencies, where each case has its own separate pragma.
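
To make the idea concrete, here is a minimal hand-written sketch of such a switch in C, applied to the road example from the question. The road struct, submit_road, run_roads and the update rule in compute_traffic are all hypothetical names invented for this illustration, and only 0 to 2 predecessors are handled. It is a sketch of the workaround under those assumptions, not a general solution.

```c
#include <stddef.h>

/* Hypothetical road record (names are illustrative, not from the question's code). */
typedef struct road {
    int traffic;          /* own traffic before predecessors are added */
    int ndep;             /* number of predecessor roads: 0, 1 or 2 in this sketch */
    struct road *dep[2];  /* predecessor roads */
} road;

/* Assumed update rule: add the traffic of every predecessor. */
static void compute_traffic(road *r)
{
    for (int i = 0; i < r->ndep; i++)
        r->traffic += r->dep[i]->traffic;
}

/* One case per dependency count, each with its own fixed-length depend list. */
static void submit_road(road *r)
{
    switch (r->ndep) {
    case 0: {
        #pragma omp task depend(inout: r[0])
        compute_traffic(r);
    } break;
    case 1: {
        road *d0 = r->dep[0];
        #pragma omp task depend(in: d0[0]) depend(inout: r[0])
        compute_traffic(r);
    } break;
    case 2: {
        road *d0 = r->dep[0], *d1 = r->dep[1];
        #pragma omp task depend(in: d0[0], d1[0]) depend(inout: r[0])
        compute_traffic(r);
    } break;
    }
}

/* Builds a tiny network (a and b feed c, c feeds d) and returns the traffic of d. */
int run_roads(void)
{
    road a = {3, 0, {NULL, NULL}};
    road b = {4, 0, {NULL, NULL}};
    road c = {1, 2, {&a, &b}};
    road d = {10, 1, {&c, NULL}};

    #pragma omp parallel
    #pragma omp single
    {
        submit_road(&a);
        submit_road(&b);
        submit_road(&c);
        submit_road(&d);
    }   /* implicit barrier: all tasks have finished here */
    return d.traffic;
}
```

Built with OpenMP enabled (for example with GCC's -fopenmp), the depend clauses order the four tasks. Without it, the pragmas are ignored and the calls simply run in submission order, which gives the same result here. Every extra dependency count needs another case, which is exactly the repetition the macros are meant to generate.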

I don't know a whole lot about Fortran, I'm afraid, but in C you can get a long way with X-macros and _Pragma(). I think GNU Fortran uses the C preprocessor, so hopefully you can transpose this code I once used (otherwise you will probably have to write all your cases by hand):

// L(n, X) = Ln(X) is a list of n macro expansions of X
#define L_EVALN(N, X) L ## N(X)
#define L(N, X) L_EVALN(N, X)

#define L1(X)         X(1,  b)
#define L2(X)  L1(X)  X(2,  c)
#define L3(X)  L2(X)  X(3,  d)
#define L4(X)  L3(X)  X(4,  e)
#define L5(X)  L4(X)  X(5,  f)
#define L6(X)  L5(X)  X(6,  g)
#define L7(X)  L6(X)  X(7,  h)
#define L8(X)  L7(X)  X(8,  i)
#define L9(X)  L8(X)  X(9,  j)
#define L10(X) L9(X)  X(10, k)
#define L11(X) L10(X) X(11, l)
#define L12(X) L11(X) X(12, m)
#define L13(X) L12(X) X(13, n)


// Expand x, stringify, and put inside _Pragma()
#define EVAL_PRAGMA(x) _Pragma (#x)
#define DO_PRAGMA(x) EVAL_PRAGMA(x)

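// X-macro to define dependencies on b_{id} (size n_{id})
// Note: [n]ptr looks like OmpSs shaping syntax; a plain OpenMP 4.5 compiler
// would likely want the array section ptr[0:n] instead
#define OMP_DEPS(num, id) , [n_ ## id]b_ ## id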

// X-macro to define symbols b{id} n{id} for neighbour #num
#define DEFINE_DEPS(num, id)               \
            double *b_ ## id =  b[num];    \
            int     n_ ## id = nb[num];

// Expand each X-macro N times
#define N_OMP_DEPS(N)       L(N, OMP_DEPS)
#define N_DEFINE_DEPS(N)    L(N, DEFINE_DEPS)

// defines the base task with 1 dependency on b_a == *b,
// to which we can add any number of supplementary dependencies
#define OMP_TASK(EXTRA) DO_PRAGMA(omp task depend(in: [n_a]b_a EXTRA))

// if there are N neighbours, define N deps and depend on them
#define CASE(N) case N:                                                          \
                {                                                                \
                    N_DEFINE_DEPS(N)                                             \
                    OMP_TASK(N_OMP_DEPS(N))                                      \
                    {                                                            \
                        for (int i = 0; i < n; i++) b[i] = ... ;                 \
                    }                                                            \
                } break;

void task(int n, int *nb, double **b)
{
    double *b_a = b[0];
    int n_a = nb[0];   // size of b_a; the name must match [n_a] in OMP_TASK
    switch(n)
    {
        CASE(1)
        CASE(2)
        CASE(3)
        CASE(4)
    }
}

That would generate the following code (if you prettify it):

void task(int n, int *nb, double **b)
{
    double *b_a = b[0];
    int n_a = nb[0];
    switch (n)
    {
        case 1:
        {
            double *b_b = b[1];
            int n_b = nb[1];
            #pragma omp task depend(in: [n_a]b_a , [n_b]b_b)
            {
                for (int i = 0; i < n; i++)
                    b[i] = ... ;
            }
        } break;
        case 2:
        {
            double *b_b = b[1];
            int n_b = nb[1];
            double *b_c = b[2];
            int n_c = nb[2];
            #pragma omp task depend(in: [n_a]b_a , [n_b]b_b , [n_c]b_c)
            {
                for (int i = 0; i < n; i++)
                    b[i] = ... ;
            }
        } break;
        case 3:
        {
            double *b_b = b[1];
            int n_b = nb[1];
            double *b_c = b[2];
            int n_c = nb[2];
            double *b_d = b[3];
            int n_d = nb[3];
            #pragma omp task depend(in: [n_a]b_a , [n_b]b_b , [n_c]b_c , [n_d]b_d)
            {
                for (int i = 0; i < n; i++)
                    b[i] = ... ;
            }
        } break;
        case 4:
        {
            double *b_b = b[1];
            int n_b = nb[1];
            double *b_c = b[2];
            int n_c = nb[2];
            double *b_d = b[3];
            int n_d = nb[3];
            double *b_e = b[4];
            int n_e = nb[4];
            #pragma omp task depend(in: [n_a]b_a , [n_b]b_b , [n_c]b_c , [n_d]b_d , [n_e]b_e)
            {
                for (int i = 0; i < n; i++)
                    b[i] = ... ;
            }
        } break;
    }
}

As horrendous as this is, it's a workaround and its main perk is: it works.
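
If the macro machinery above is hard to follow, the list-expansion part can be tried in isolation, without any OpenMP. The following is a small sketch of that part only: DEP_NAME, COUNT_ONE, NDEPS, dep_names and extra_deps are hypothetical names made up for the illustration, and the list is cut down to three entries.

```c
/* L(n, X) expands to n invocations of the X-macro X, as in the code above. */
#define L_EVALN(N, X) L ## N(X)
#define L(N, X) L_EVALN(N, X)   /* extra level so that N is expanded before pasting */

#define L1(X)        X(1, b)
#define L2(X) L1(X)  X(2, c)
#define L3(X) L2(X)  X(3, d)

/* X-macro turning each (num, id) pair into a string literal fragment. */
#define DEP_NAME(num, id) " b_" #id

/* X-macro contributing 1 per list entry. */
#define COUNT_ONE(num, id) + 1

/* A macro can be used as the count because L() expands it before L_EVALN pastes it. */
#define NDEPS 3

/* Adjacent string literals are concatenated: "b_a" " b_b" " b_c" " b_d". */
const char *dep_names(void)
{
    return "b_a" L(NDEPS, DEP_NAME);
}

/* Expands to 0 + 1 + 1 + 1. */
int extra_deps(void)
{
    return 0 L(NDEPS, COUNT_ONE);
}
```

Here dep_names() returns "b_a b_b b_c b_d" and extra_deps() returns 3. In the workaround, the same expansion builds the text of the depend list, which is then stringified and handed to _Pragma().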

Source: https://stackoverflow.com/questions/48911320/variable-list-of-dependencies-for-openmp-4-5-tasks

Tags: fortran, task, scheduled-tasks, openmp