I have a Django script that should be run at a specified time every day. I am trying to achieve this using crontab. The script is supposed to dump the database,
I’m not very good at reading strace output, but I think the one you posted indicates that your program has invoked git and is awaiting its termination. You mention uploading to BitBucket, so here’s a shot in the dark: git tries to push to an ssh remote; when you run it as yourself, ssh-agent authenticates you transparently; but when you run it as root, there’s no ssh-agent, so git prompts for the ssh password and awaits your input.
Try doing the git invocation manually under sudo su and check.
If this does not help, you need to get at the output of git (or whatever it is you’re actually invoking there). Check the documentation for the sh package for details on how to redirect the standard output and standard error.
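One way to get at that output is sketched below. It uses the standard-library subprocess module rather than the sh package (a substitution on my part, not what the script in the question uses), and the function name and timeout are hypothetical. Standard input is closed so that anything trying to read from it sees end-of-file instead of waiting, and both output streams are captured.

```python
import subprocess
import sys


def run_and_capture(cmd, timeout=60):
    """Run cmd (a list of arguments) and return (exit code, stdout, stderr)."""
    result = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,  # reads from stdin see EOF instead of blocking
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if the command hangs
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in command; replace it with the real git invocation from your script.
    code, out, err = run_and_capture([sys.executable, "-c", "print('hello')"])
    print(code, repr(out), repr(err))
```

Writing the returned stdout and stderr to a log file from the cron job should show what git is actually complaining about.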
This is also a shot in the dark: our team has had issues running management commands through cron. We never bothered to track down why they were flaky, but after much hair-pulling we reverted to invoking the Python functions directly rather than going through manage.py, and things have been humming along fine ever since.
Since some version of python /my_django_project_path/manage.py database_bu works for you, the problem is with your cron environment, or with the way you have set up your cron, and not with the script itself (that is, neither the size of the file to be uploaded nor network connectivity is causing the issue).
Firstly, you are running the script as
47 16 * * * root python /my_django_project_path/manage.py database_bu
You are providing it the username root, which is not the same user as your current user, while the shell command worked for your current user. The fact that the same command doesn't run as the root user under sudo su suggests that your root account is not properly configured anyway. FWIW, scheduling something as root should almost always be avoided because it can lead to weird file permission issues.
So try scheduling your cron job as follows from that current user.
47 16 * * * cd /my_django_project_path/ && python manage.py database_bu
This may still not run the cron job completely. In that case, the problem could be in one of two places: your shell environment has some variables that are missing from your cron environment, or your .netrc file is not being read properly for credentials, or both.
In my experience, the PATH variable causes the most trouble, so run echo $PATH in your shell, and if the value you get is /some/path:/some/other/path:/more/path/values, run your cron job like
47 16 * * * export PATH="/some/path:/some/other/path:/more/path/values" && cd /my_django_project_path/ && python manage.py database_bu
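To check in advance whether PATH is the culprit, here is a minimal sketch that resolves command names against a given PATH string using shutil.which. The function name is hypothetical, and CRON_LIKE_PATH is only an assumption about what a minimal cron PATH tends to look like; check the value your cron actually uses.

```python
import os
import shutil

# Assumption: a minimal PATH of the kind cron often uses; yours may differ.
CRON_LIKE_PATH = "/usr/bin:/bin"


def resolve_commands(commands, path):
    """Map each command name to the executable found on `path`, or None."""
    return {name: shutil.which(name, path=path) for name in commands}


if __name__ == "__main__":
    shell_path = os.environ.get("PATH", "")
    for label, path in (("shell", shell_path), ("cron-like", CRON_LIKE_PATH)):
        print(label, resolve_commands(["python", "git"], path))
```

If a command resolves to a different file, or to None, under the cron-like PATH, that would explain why the same line behaves differently from cron.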
If this doesn't work out, check all the environment variables next.
Use printenv > ~/environment.txt from a normal shell to get all the environment variables set in the shell. Then use the following cron entry
* * * * * printenv > ~/cron_environment.txt
to identify the variables missing from the cron environment. Alternatively, you can use the following snippet in a script to print the environment from within the script
import os
os.system("printenv")
Compare the two, figure out any other relevant variables which are different (like HOME), and try setting the same values within the script/cron entry to check whether they work.
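The comparison can be done by eye, but a small sketch like the following makes the differences explicit. The function names are hypothetical, and the parser assumes each variable occupies a single NAME=value line, which does not hold for values containing newlines.

```python
def parse_printenv(text):
    """Parse `printenv` output into a dict (assumes one NAME=value per line)."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # skip lines without '=' (e.g. continuation of a multi-line value)
            env[name] = value
    return env


def diff_env(shell_env, cron_env):
    """Return (names missing from cron, names whose values differ)."""
    missing = sorted(set(shell_env) - set(cron_env))
    changed = sorted(
        name
        for name in shell_env.keys() & cron_env.keys()
        if shell_env[name] != cron_env[name]
    )
    return missing, changed


def compare_files(shell_file, cron_file):
    """Read two printenv dumps from disk and diff them."""
    with open(shell_file) as f:
        shell_env = parse_printenv(f.read())
    with open(cron_file) as f:
        cron_env = parse_printenv(f.read())
    return diff_env(shell_env, cron_env)
```

Calling compare_files on the full paths of environment.txt and cron_environment.txt returns the variables that exist only in the shell and those that exist in both with different values.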
If things still don't work out, then I think the remaining problem is with your bitbucket credentials in .netrc, in which you are saving the username and password. The contents of .netrc might not be available in the cron environment.
Instead, create and set up an ssh keypair for your account and let the backup happen over ssh instead of https. (It's better if you generate an ssh key without a passphrase in this step, to avoid ssh-keys' gotchas.)
Once you have set up the ssh keys, you will have to edit the existing origin url in the .git/config file of your project root accordingly (or add a new remote origin_ssh for the ssh protocol using git remote add origin_ssh url).
Note that the https url for the repo looks like https://user@bitbucket.org/user/repo.git while the ssh one looks like git@bitbucket.org:user/repo.git.
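The mapping between the two url forms is mechanical, so it can be sketched as a small helper. The function name is hypothetical, and the sketch only handles the two shapes shown above; it also assumes the ssh user is git, as in the example.

```python
from urllib.parse import urlsplit


def https_to_ssh(url):
    """Convert https://user@host/owner/repo.git to git@host:owner/repo.git."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"not an https remote: {url!r}")
    # Drop the optional "user@" prefix and the leading slash of the path.
    return f"git@{parts.hostname}:{parts.path.lstrip('/')}"
```

The result is the value to use in place of url when adding the origin_ssh remote.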
PS: bitbucket, or rather git, is not the ideal solution for backups; there are tonnes of threads around on better backup strategies. Also, while debugging, run your crons every minute (* * * * *), or at a similarly short interval, so that you get feedback faster.
Edit
OP says in the comment that setting the PWD variable worked for him, by adding
PWD=/my_django_project_path/helpers/management/commands
to /etc/environment.
This is what I had suggested earlier: one of the environment variables available in the shell was not present in the cron environment.
In general, cron always runs with a reduced set of environment variables and permissions, and setting the right variables will make cron work.
Also, since you are using a .netrc file for credentials, it is specific to your account and therefore won't work with any other account (including the root account you reach through sudo), unless you configure the same settings in that other account as well.
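To confirm which credentials a given account can actually see, a rough check with the standard-library netrc module might look like this. The function name is hypothetical; pass the path of the .netrc file you expect that account to read.

```python
import netrc


def netrc_login_for(host, netrc_path):
    """Return (login, password) stored for host in netrc_path, or None."""
    try:
        auth = netrc.netrc(netrc_path).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        return None  # file missing, unreadable, or malformed
    if auth is None:
        return None  # no entry for this host and no default entry
    login, _account, password = auth
    return login, password
```

Running it from the cron job with os.path.expanduser("~/.netrc") as the path shows whether the account cron runs under resolves to a .netrc that has an entry for bitbucket.org.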
That reminds me of a very frustrating gotcha. Do you have a newline at the end of your crontab file? From man crontab:
...cron requires that each entry in a crontab end in a newline character. If the last entry in a crontab is missing the newline, cron will consider the crontab (at least partially) broken and refuse to install it.
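If you keep the crontab in a file before installing it, the trailing newline is easy to check. A minimal sketch, with a hypothetical function name:

```python
def ends_with_newline(path):
    """Return True if the file is non-empty and its last byte is a newline."""
    with open(path, "rb") as handle:
        data = handle.read()
    return data.endswith(b"\n")
```

A False result means the last entry is missing its newline and the file should be fixed before it is installed.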