I'm trying to launch a background process from a CGI script. Basically, when a form is submitted the CGI script will indicate to the user that his or her request is being
There are situations where passing work off to a daemon or cron is not appropriate. Sometimes you really DO need to fork, let the parent exit (to keep Apache happy) and let something slow happen in the child.
What worked for me: When done generating web output, and before the fork:
fflush(stdout), close(0), close(1), close(2); // in the process BEFORE YOU FORK
Then fork() and have the parent immediately exit(0);
The child then AGAIN does close(0), close(1), close(2); and also a setsid(); ...and then gets on with whatever it needs to do.
Why you need to close them in the child even though they were already closed in the original process before the fork is confusing to me, but this is what worked. Without the second set of closes it did not work. This was on Linux (on a Raspberry Pi).
I needed to break the stdout as well as the stderr like this:
import os
import sys

sys.stdout.flush()
os.close(sys.stdout.fileno())  # Break web pipe
sys.stderr.flush()
os.close(sys.stderr.fileno())  # Break web pipe
if os.fork():  # Get out of the parent process
    sys.exit()
# background processing follows here
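One caveat with closing descriptors outright is that the next file the process opens may reuse descriptor 1 or 2, so a stray print could end up in an unrelated file. A common alternative, not what the poster used, is to point the descriptors at /dev/null instead. The following is a minimal sketch in Python 3 syntax (the snippets in this thread use Python 2); `redirect_to_devnull` is a hypothetical helper name.

```python
import os


def redirect_to_devnull(fds=(0, 1, 2)):
    """Point each descriptor in fds at /dev/null instead of closing it."""
    devnull = os.open(os.devnull, os.O_RDWR)
    try:
        for fd in fds:
            # dup2 silently closes whatever fd referred to before
            # (for a CGI script, the pipe to the web server).
            os.dup2(devnull, fd)
    finally:
        if devnull not in fds:
            os.close(devnull)
```

In a CGI script this would be called after flushing sys.stdout and sys.stderr, either just before the fork or in the child, in place of the os.close calls.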
My head is still hurting from this one. I tried every way I could find to use your code with fork and closing or nulling stdout, but nothing worked. Whether the output of a not-yet-finished process is displayed depends on the web server (Apache or other) configuration, and in my case changing it wasn't an option, so attempts with "Transfer-Encoding: chunked;chunk=CRLF" and "sys.stdout.flush()" didn't work either. Here is the solution that finally worked.
In short, use something like:
import subprocess
import sys
import time

if len(sys.argv) == 1:  # I'm in the parent process
    childProcess = subprocess.Popen('./myScript.py X', bufsize=0, stdin=open("/dev/null", "r"), stdout=open("/dev/null", "w"), stderr=open("/dev/null", "w"), shell=True)
    print "My HTML message that says to wait a long time"
else:  # Here comes the child and his long process
    # From here I cannot print to the web server, but I can write to files that will be refreshed in my web page.
    time.sleep(15)  # To verify the parent completes rapidly.
I use the "X" parameter to distinguish between parent and child because I call the same script for both, but you could do it more simply by calling a separate script. If a complete example would be useful, please ask.
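For readers on Python 3, the same idea can be sketched without a shell and without opening /dev/null by hand. This is only an illustration under that assumption, not the poster's code: `launch_detached` is a hypothetical helper name, while `subprocess.DEVNULL` and `start_new_session` are standard `subprocess.Popen` features.

```python
import subprocess


def launch_detached(argv):
    """Start argv as a child that shares no stdio with the CGI process."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # the child calls setsid() before exec
        close_fds=True,
    )
```

A call such as launch_detached(["./myScript.py", "X"]) would replace the Popen line above; the parent can then print its HTML message and exit without waiting for the child.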
For those who get "sh: 1: Syntax error: redirection unexpected" with the at/batch solution: make sure that the at command is installed and that the user running the application isn't in /etc/at.deny, then try something like this:
os.system("echo sudo /srv/scripts/myapp.py | /usr/bin/at now")
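Since at reads the commands to run from its standard input, the shell can be left out entirely, which sidesteps shell-specific redirection syntax. Below is a hedged sketch in Python 3 syntax; `submit_job` and its `scheduler` parameter are hypothetical names, and the /usr/bin/at path is taken from the line above.

```python
import subprocess


def submit_job(command, scheduler=("/usr/bin/at", "now")):
    """Feed command to the scheduler's standard input without using a shell."""
    return subprocess.run(
        list(scheduler),
        input=command + "\n",
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the scheduler fails
    )
```

For example, submit_job("sudo /srv/scripts/myapp.py") would queue the same command as the os.system call; the returned object's stderr attribute holds whatever the scheduler printed there.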
I think there are two issues: setsid is in the wrong place, and buffered IO operations are being done in one of the transient children:
if os.fork():
    print "success"
    sys.exit(0)
if os.fork():
    os.setsid()
    sys.exit()
You've got the original process (grandparent, prints "success"), the middle parent, and the grandchild ("lol.txt").
The os.setsid() call is being performed in the middle parent after the grandchild has been spawned. The middle parent can't influence the grandchild's session after the grandchild has been created. Try this:
print "success"
sys.stdout.flush()
if os.fork():
    sys.exit(0)
os.setsid()
if os.fork():
    sys.exit(0)
This creates a new session before spawning the grandchild. Then the middle parent, which is the session leader, exits. Because the grandchild is not a session leader, it cannot acquire a controlling terminal, so it never blocks on terminal input or output and never receives unexpected terminal-generated signals.
Note that I've also moved the success message to the grandparent; there's no guarantee of which process will run first after calling fork(2), and you run the risk that the child would be spawned, and potentially try to write output to standard out or standard error, before the middle parent has had a chance to write success to the remote client.
In this case, the streams are closed quickly, but still, mixing standard IO streams among multiple processes is bound to give difficulty: keep it all in one process, if you can.
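Putting the pieces of this answer together, the whole sequence could be wrapped in one function. This is a minimal sketch in Python 3 syntax, assuming a POSIX system; `run_in_background` and `job` are hypothetical names, and redirecting the grandchild's stdio to /dev/null is an addition beyond the snippet above.

```python
import os
import sys


def run_in_background(job):
    """Run job() in a grandchild that lives in its own session.

    Returns in the original process once the middle process has exited.
    """
    # Flush first so buffered output is not duplicated by fork.
    sys.stdout.flush()
    sys.stderr.flush()

    pid = os.fork()
    if pid > 0:
        os.waitpid(pid, 0)  # reap the short-lived middle process
        return              # original process: finish the response

    # Middle process: create the new session BEFORE the second fork.
    os.setsid()
    if os.fork() > 0:
        os._exit(0)         # middle process exits at once

    # Grandchild: stop sharing the web server's pipes.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)

    status = 0
    try:
        job()
    except Exception:
        status = 1
    finally:
        os._exit(status)    # never fall back into the caller's code
```

The CGI script would print its complete response, call run_in_background(slow_function), and then exit normally, so all output to the client stays in one process.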
Edit: I've found a strange behavior I can't explain:
#!/usr/bin/python
import os
import sys
import time
print "Content-type: text/plain\r\n\r\npid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
sys.stdout.flush()
if os.fork():
    print "\nfirst fork pid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
    sys.exit(0)
os.setsid()
print "\nafter setsid pid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
sys.stdout.flush()
if os.fork():
    print "\nsecond fork pid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
    sys.exit(0)
#os.sleep(1) # comment me out, uncomment me, notice following line appear and disappear
print "\nafter second fork pid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
The last line, after second fork pid, only appears when the os.sleep(1) call is commented out. When the call is left in place, the last line never appears in the browser. (But otherwise all the content is printed to the browser.)
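One detail about the script may be relevant here: Python's os module has no sleep function; the call lives in the time module, which the script already imports. With the line uncommented, the grandchild would raise AttributeError before reaching the final print, which would be consistent with the last line disappearing. A quick check (Python 3 syntax; the variable names are just for illustration):

```python
import os
import time

# The os module does not define sleep; time does.
os_has_sleep = hasattr(os, "sleep")
time_has_sleep = hasattr(time, "sleep")

print("os.sleep exists:", os_has_sleep)
print("time.sleep exists:", time_has_sleep)
```

Replacing the call with time.sleep(1) would be the way to test whether the delay itself, rather than the exception, affects what reaches the browser.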
This double-forking approach is something of a hack, which to me is an indication that it shouldn't be done :). For CGI, anyway, on the general principle that if something is too hard to accomplish, you are probably approaching it the wrong way.
Luckily you give the background info on what you need: a CGI call that initiates some processing that happens independently, and returns to the caller. Well, sure, there are Unix commands that do just that: schedule a command to run at a specific time (at) or whenever the CPU is free (batch). So do this instead:
import os
# Note: <<< is a bash here-string; os.system runs /bin/sh, which may not support it.
os.system("batch <<< '/home/some_user/do_the_due.py'")
# or if you don't want to wait for system idle,
# os.system("at now <<< '/home/some_user/do_the_due.py'")
print 'Content-type: text/html\n'
print 'Done!'
And there you have it. Keep in mind that if the job writes anything to stdout/stderr, that output will be mailed to the user (which is good for debugging, but otherwise the script should probably keep quiet).
PS. I just remembered that Windows also has a version of at, so with a minor modification of the invocation you can have that work under Apache on Windows too (unlike the fork trick, which won't work on Windows).
PPS. Make sure the user running the CGI process is not listed in /etc/at.deny, which would prevent it from scheduling batch jobs.