Currently, I have a bunch of luigi tasks queued together, with a simple dependency chain (a -> b -> c -> d). d gets executed first, and
d6tflow lets you reset tasks and force them to rerun; see https://d6tflow.readthedocs.io/en/latest/control.html#manually-forcing-task-reset-and-rerun for details.
import d6tflow
# TaskGetData and TaskTrain are d6tflow tasks defined elsewhere in your project
# force execution including downstream tasks
d6tflow.run([TaskTrain()], force=[TaskGetData()])
# reset single task
TaskGetData().invalidate()
# reset all downstream task output
d6tflow.invalidate_downstream(TaskGetData(), TaskTrain())
# reset all upstream task input
d6tflow.invalidate_upstream(TaskTrain())
Caveat: this only works for d6tflow tasks and targets, which are modified local targets, not for all luigi targets. It should still take you a long way, and it is optimized for data science workflows. It works well with the local worker; I haven't tested it on a central server.
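
To see why resetting a single upstream task is not enough on its own, here is a minimal toy sketch in plain Python. It is not luigi's or d6tflow's actual code: `Task`, `run`, `invalidate_downstream`, `store` and `executed` are all hypothetical names made up for illustration. It only assumes the luigi behaviour that the requested task's completeness is checked first, and that a task counts as complete once its output exists.

```python
# Toy model of luigi-style completeness checks; NOT d6tflow's or luigi's real code.
# All names here (Task, run, invalidate_downstream, store, executed) are hypothetical.

store = {}      # stands in for task outputs on disk
executed = []   # records which tasks actually ran


class Task:
    def __init__(self, name, requires=None):
        self.name = name
        self.requires = requires  # single upstream task, or None

    def complete(self):
        # A task counts as done as soon as its output exists.
        return self.name in store

    def invalidate(self):
        # Deleting the output is what makes the task eligible to run again.
        store.pop(self.name, None)


def run(task):
    # The requested task is checked first; if it is complete,
    # its upstream tasks are never inspected.
    if task.complete():
        return
    if task.requires is not None:
        run(task.requires)
    store[task.name] = "output of " + task.name
    executed.append(task.name)


def invalidate_downstream(task, task_downstream):
    # Walk from the final task back to `task`, resetting every output on the way.
    current = task_downstream
    while current is not None:
        current.invalidate()
        if current is task:
            break
        current = current.requires


a = Task("a")
b = Task("b", requires=a)
c = Task("c", requires=b)
d = Task("d", requires=c)

run(d)                       # first run executes the whole chain
first_run = list(executed)

executed.clear()
a.invalidate()               # reset only the upstream task
run(d)                       # d is still complete, so nothing runs
after_single_reset = list(executed)

executed.clear()
invalidate_downstream(a, d)  # reset a and everything that depends on it
run(d)                       # now the whole chain runs again
after_downstream_reset = list(executed)
```

In this sketch, resetting only `a` leaves `d` looking complete, so a second `run(d)` does nothing; the chain reruns only once the outputs between `a` and `d` are reset as well, which is the gap that forcing a rerun or invalidating downstream outputs is meant to close.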