SWI Prolog CLP(FD) scheduling

问题

I am solving a scheduling task in SWI Prolog using the CLPFD library. Since it is the first time I solve something more serious than was the sendmory I probably need some good advices from more experienced users. Let me briefly describe the domain/the task.

Domain

I have a "calendar" for a month. Everyday there are 2 for the whole day, 2 for the whole night (long 12h service). There are also, only Mon-Fri 10 more workers for 8 hours (short service).

The domain constraints are, obviously:

There are no consecutive services (no day after night and vice versa, no short day service after night)
On worker can serve up to 2 consecutive night services in a row
Each worker has a limited amount of hours for a month
There are 19 workers available

My approach is as following:

Variables

For every field in the calendar I have a variable defined:

DxD_y where x is number of the day and y is 1 or 2 for the long day service
DxN_y where x is number of the day and y is 1 or 2 for the long night service
DxA_y where x is number of the day and y is 0 .. 9 for the short day service
SUM_x where x is a worker number (1..19) denoting sum of hours for a worker

Each of the D variables has a domain 1..19. To simplify it for now, SUM_X #=< 200 for each X.

Constraints

all_distinct() for each variable for the same day - each worker can serve only one service/day
global_cardinality() to count number of occurrences for each number 1..19 for list with short services and long services - this defines variables LSUM_X and SSUM_X - number of occurrences of worker X in Long or Short services
SUM_X #= 12*LSUM_X + 8*SSUM_X for each worker
DxN_y #\= Dx+1D_z - to avoid long day service after a night one
- bunch of similar constraints like the one above to cover all the domain constraints
DxNy #= Dx+1Ny #==> DxNy #\= Dx+2Ny - to avoid three consecutive night services, there are constraints each combination of x and y

Notes

All the variables and constraints are directly stated in the pl script. I do not use prolog predicates to generate constraint - because I have a model in a .NET application (frontend) and I can easily generate all the stuff from the .NET code into a prolog code.

I think my approach is overall good. Running the scheduler on some smaller example works well (7 days, 4 long services, 1 short service, 8 workers). Also I was able to get some valid results on the full blown case - 30 days, 19 workers, 4 long and 10 short services per day.

However, I am not completely satisfied with the current status. Let me explain why.

Questions

I read some articles about modelling scheduling problems and some of them uses a bit different approach - introducing only boolean variables for each combination of my variable (calendar field) and worker to flag if the worker is assigned to a particular calendar field. Is this a better approach?
If you calculate the overall amount-of-work limits and overall hour in calendar you find out that the workers are not 100% utilized. But the solver creates the solution most likely in this way: utilize the first worker for 100% and then grab the next one. So the SUMs in the solution appears like [200,200,200...200,160,140,80,50,0,]. I would be glad if the workers will be more or less equally utilized. Is there some simple/efficient way how to achieve that? I considered defining somewhat like define the differences between workers and minimize it, but it sound very complex for me and I am afraid I would take ages to compute that. I use labeling([random_variable(29)], Vars), but it only reorders the variables, so there are still these gaps, only in different order. Probably I want the labeling procedure will take the values in some other order than up or down (in some pseudo-random way).
How should I order the constraints? I think the order of constraints matters with respect to efficiency of the labeling.
How to debug/optimize performance of labeling? I hoped solving this type of task will take some seconds or maximally a couple of minutes in case very tight conditions for sums. For example labeling with the the bisect option took ages.

I can provide some more code examples if needed.

回答1:

That's a lot of questions, let me try to address some.

... introducing only boolean variables for each combination of my variable (calendar field) and worker to flag if the worker is assigned to a particular calendar field. Is this a better approach?

This is typically done when a MILP (Mixed Integer Linear Programming) solver is used, where higher-level concepts (such as alldifferent etc) have to be expressed as linear inequalities. Such formulations then usually require lots of boolean variables. Constraint Programming is more flexible here and offers more modelling choices, but unfortunately there is no simple answer, it depends on the problem. Your choice of variables influences both how hard it is to express your problem constraints, and how efficiently it solves.

So the SUMs in the solution appears like [200,200,200...200,160,140,80,50,0,]. I would be glad if the workers will be more or less equally utilized. Is there some simple/efficient way how to achieve that?

You already mention the idea of minimizing differences, and this is how such a balancing requirement would usually be implemented. It does not need to be complicated. If originally we have this unbalanced first solution:

?- length(Xs,5), Xs#::0..9, sum(Xs)#=20, labeling(Xs).
Xs = [0, 0, 2, 9, 9]

then simply minimizing the maximum of list elements will already give you (in combination with the sum-constraint) a balanced solution:

?- length(Xs,5), Xs#::0..9, sum(Xs)#=20, Cost#=max(Xs), minimize(labeling(Xs),Cost).
Xs = [4, 4, 4, 4, 4]
Cost = 4

You could also minimize the difference between minimum and maximum:

?- length(Xs,5), Xs#::0..9, sum(Xs)#=20, Cost#=max(Xs)-min(Xs), minimize(labeling(Xs),Cost).
Xs = [4, 4, 4, 4, 4]
Cost = 0

or even the sum of squares. [Sorry, my examples are for ECLiPSe rather than SWI/clpfd, but should show the general idea.]

How should I order the constraints? I think the order of constraints matters with respect to efficiency of the labeling.

You should not worry about this. While it might have some influence, it is too unpredictable and depends too much on implementation details to allow for any general recommendations. This is really the job of the solver implementer.

How to debug/optimize performance of labeling?

For realistic problems, you will often need (a) a problem-specific labeling heuristic, and (b) some variety of incomplete search. Visualization of the search tree or the search progress can help with tailoring heuristics. You can find some discussion of these issues in chapter 6 of this online course.

来源：https://stackoverflow.com/questions/55593598/swi-prolog-clpfd-scheduling

标签

prolog

scheduling

swi-prolog

clpfd

clp