How best to run one-off migration tasks in a kubernetes cluster

生来不讨喜 2021-01-31 09:01

I have database migrations which I'd like to run before deploying a new version of my app into a Kubernetes cluster. I want these migrations to be run automatically as part of

3 Answers
  • 2021-01-31 09:50

    You could try to make both the migration jobs and app independent of each other by doing the following:

    • Have the migration job return successfully even when the migration failed. Keep a machine-consumable record somewhere of what the outcome of the migration was. This could be done either explicitly (by, say, writing the latest schema version into some database table field) or implicitly (by, say, assuming that a specific field must have been created by a successful migration job). The migration job would only return an error code if it failed for technical reasons (such as unavailability of the database that the migration should be applied to). This way, you can do the migrations via Kubernetes Jobs and rely on their ability to run to completion eventually.
    • Build the new app version such that it can work with the database in both the pre- and post-migration phases. What this means depends on your business requirements: the app could either stay idle until the migration has completed successfully, or it could return different results to its clients depending on the current phase. The key point here is that the app reads the migration outcome that the migration jobs recorded previously and acts accordingly without terminating erroneously.
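    The first point can be sketched roughly as follows. This is a minimal illustration, not a real migration runner: `run_migration` and `record_outcome` are hypothetical stand-ins for whatever your migration tool and outcome store actually are.

    ```python
    import sys


    class TechnicalError(Exception):
        """e.g. the database is unreachable -- the Job should retry."""


    class MigrationError(Exception):
        """the migration itself failed -- record it, don't retry."""


    def run_migration() -> None:
        # hypothetical: apply the pending schema changes here
        pass


    def record_outcome(status: str) -> None:
        # hypothetical: write the status into a schema-version table
        # that the app checks on startup
        print(f"migration outcome: {status}")


    def main() -> int:
        try:
            run_migration()
        except MigrationError:
            record_outcome("failed")  # the app reads this and acts accordingly
            return 0                  # the Job still counts as complete
        except TechnicalError:
            return 1                  # Kubernetes will re-run the Job
        record_outcome("succeeded")
        return 0


    if __name__ == "__main__":
        sys.exit(main())
    ```

    The exit code is the only thing Kubernetes sees, so only technical failures make it retry; the app learns about migration failures from the recorded outcome instead.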

    Combining these two design approaches, you should be able to develop and execute the migration jobs and app independently of each other and not have to introduce any temporal coupling.

    Whether this idea is reasonable to implement depends on the specifics of your case, such as the complexity of your database migrations. The alternative, as you mentioned, is to simply deploy unmanaged pods into the cluster that do the migration. This requires a bit more wiring, as you will need to poll the pods' status yourself and distinguish between successful and failed outcomes.

  • 2021-01-31 09:54

    blocking while waiting on the result of a queued-up job seems to require hand-rolled scripts

    This isn't necessary anymore thanks to the kubectl wait command.

    Here's how I'm running db migrations in CI:

    kubectl apply -f migration-job.yml
    kubectl wait --for=condition=complete --timeout=60s job/migration
    kubectl delete job/migration
    

    In case of failure or timeout, one of the first two commands exits with a non-zero code, which causes the rest of the CI pipeline to terminate.

    migration-job.yml describes a kubernetes Job resource configured with restartPolicy: Never and a reasonably low activeDeadlineSeconds.
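    A migration-job.yml along those lines might look like this (the image name and command are placeholders for your own):

    ```yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: migration
    spec:
      activeDeadlineSeconds: 60    # keep this reasonably low
      backoffLimit: 0
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: migrate
              image: registry.example.com/myapp:latest   # placeholder
              command: ["python", "manage.py", "migrate"]  # placeholder
    ```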

    You could also use the spec.ttlSecondsAfterFinished attribute instead of manually running kubectl delete but that's still in alpha at the time of writing and not supported by Google Kubernetes Engine at least.

  • 2021-01-31 09:59

    Considering the age of this question, I'm not sure if initContainers were available at the time, but they are super helpful now.

    https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

    The way I recently set this up was to have a postgres pod and our django application running in the same namespace; the django pod, however, has 3 initContainers:

    1. init-migrations
    2. init-fixtures
    3. init-createsuperUser

    What this will do is start the django pod and the postgres pod in parallel; Kubernetes keeps retrying each initContainer until it succeeds, so once the postgres pod comes up the migrations run, and only then does the main django container start.
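    The setup above might look roughly like this excerpt from the django Deployment's pod template (image names and commands are illustrative; container names must be lowercase in Kubernetes):

    ```yaml
    spec:
      initContainers:
        - name: init-migrations
          image: registry.example.com/django-app:latest   # placeholder
          command: ["python", "manage.py", "migrate"]
        - name: init-fixtures
          image: registry.example.com/django-app:latest
          command: ["python", "manage.py", "loaddata", "fixtures.json"]
        - name: init-createsuperuser
          image: registry.example.com/django-app:latest
          command: ["python", "manage.py", "createsuperuser", "--noinput"]
      containers:
        - name: django
          image: registry.example.com/django-app:latest
    ```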

    As for the pods perpetually restarting, maybe they've fixed the restartPolicy by now. I'm currently pretty new to kubernetes but this is what I've found works for me.
