running a script multiple times simultaneously in python 2.7

长发绾君心 2021-01-24 04:44

Hello, I am trying to run a script multiple times, but would like this to take place at the same time. From what I understood, I was to use subprocess and threading together; howev

3 Answers
  •  北恋
    2021-01-24 05:42

    Ran some quick tests. Using the framework of your script:

    #!/usr/bin/env python
    
    import os
    import threading
    from subprocess import Popen
    
    class myThread(threading.Thread):
        def run(self):
            # launch one background process per .htm file;
            # Popen returns immediately, so nothing here waits
            for filename in os.listdir("./newscript/"):
                if '.htm' in filename:
                    Popen("./busy.sh")
    
    myThread().start()
    
    

    I then populated the "newscript" folder with a bunch of ".htm" files against which to run the script.

    Where "busy.sh" is basically:

    #!/usr/bin/env bash
    # append the load average to a file named after this
    # process's PID ($$), once a second, forever
    while :
    do
        uptime >> $$
        sleep 1
    done
    

    The code you have does indeed fire off multiple processes running in the background. I did this with a "newscript" folder containing 200 files, and I saw 200 processes all running in the background.

    You noted that you want them all to run in the background at the same time.

    For the most part, parallel processes run in the background "roughly" in parallel, but because of the way most common operating systems are set up, "parallel" is really "nearly parallel", more commonly described as concurrent. If you look at the access times VERY closely, the various processes spawned this way each take a turn on the CPU; they are interleaved rather than all doing something at exactly the same instant.

    That is something to be aware of, especially since you are accessing files controlled by the OS and the underlying filesystem.
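    For illustration (this sketch is mine, not from the original answer), you can watch that turn-taking directly: the script below spawns three copies of itself, and each child stamps the wall clock. The timestamps interleave rather than coincide.

    #!/usr/bin/env python
    # Illustrative sketch: children take turns on the CPU, so
    # their timestamps interleave instead of matching exactly.
    import sys
    import time
    from subprocess import Popen
    
    if len(sys.argv) > 1:                      # child mode
        for _ in range(5):
            print "%s %.6f" % (sys.argv[1], time.time())
            time.sleep(0.1)
        sys.exit(0)
    
    # parent mode: spawn three copies of this same script
    children = [Popen([sys.executable, __file__, "child%d" % i])
                for i in range(3)]
    for c in children:
        c.wait()
    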

    For what you are trying to do, processing a batch of inbound files, your current approach is basically to spawn a background process for each file that appears.

    There are a couple of issues with the logic as presented:

    1. High risk of a fork-bomb situation: the spawning is unbounded and nothing tracks how many processes are still alive (a bounded variant is sketched after this list).
    2. Spawning by calling out to and executing another program creates an OS-level process each time, which is more resource intensive than a thread.
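
    As a minimal sketch of how to bound that spawning (the MAX_JOBS value is my assumption, and it presumes the per-file script eventually exits; the deliberately endless busy.sh above would hold its slot forever), a semaphore can cap how many processes are alive at once:

    #!/usr/bin/env python
    # Sketch: cap concurrent child processes with a semaphore.
    # MAX_JOBS is an assumed tuning knob, not from the original.
    import os
    import threading
    from subprocess import Popen
    
    MAX_JOBS = 4
    slots = threading.BoundedSemaphore(MAX_JOBS)
    
    def run_one():
        try:
            Popen("./busy.sh").wait()  # hold the slot until the child exits
        finally:
            slots.release()
    
    threads = []
    for filename in os.listdir("./newscript/"):
        if '.htm' in filename:
            slots.acquire()            # blocks while MAX_JOBS children are live
            t = threading.Thread(target=run_one)
            t.start()
            threads.append(t)
    
    for t in threads:
        t.join()
    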

    Suggestion:

    Instead of spawning off jobs, you would be better off taking the file-processing code you would be spawning and turning it into a Python function. Then rewrite your code as a daemonized process that watches the folder and keeps track of how many workers it has spawned, so that the number of background workers handling file conversion stays managed.

    When processing a file, you would spin off a Python thread to handle it, which is a lighter-weight alternative to spawning an OS-level process.
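
    A minimal sketch of that design (convert() is a hypothetical stand-in for your per-file logic; the worker cap and one-second poll interval are assumptions):

    #!/usr/bin/env python
    # Sketch of the suggested watcher: poll the folder, track
    # what has been seen, and bound the number of worker threads.
    import os
    import time
    import threading
    
    MAX_WORKERS = 4
    slots = threading.BoundedSemaphore(MAX_WORKERS)
    seen = set()
    
    def convert(path):
        pass  # hypothetical placeholder for the real per-file code
    
    def worker(path):
        try:
            convert(path)
        finally:
            slots.release()
    
    while True:                  # daemon-style watch loop
        for filename in os.listdir("./newscript/"):
            if '.htm' in filename and filename not in seen:
                seen.add(filename)
                slots.acquire()  # bounds the number of live workers
                threading.Thread(target=worker,
                                 args=(filename,)).start()
        time.sleep(1)            # poll interval
    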
