Multiple threads writing to a log file at the same time in Python

独厮守ぢ 2020-12-10 04:57

I am writing a script to retrieve WMI info from many computers at the same time and then write this info to a text file:

f = open(\"results.txt\", \'w+\') ## to          


        
3 Answers
  • 2020-12-10 05:34

    You can simply create your own locking mechanism to ensure that only one thread is ever writing to a file.

    import threading
    import wmi  # third-party WMI query module used throughout this question

    lock = threading.Lock()

    def write_to_file(f, text, file_size):
        # Threads block here until they can acquire the lock; only one
        # thread at a time runs the body, so writes never interleave.
        with lock:
            print(text, file_size, file=f)

    def filesize(asset):
        f = open("results.txt", 'a+')
        c = wmi.WMI(asset)
        wql = 'SELECT FileSize,Name FROM CIM_DataFile where (Drive="D:" OR Drive="E:") and Caption like "%file%"'
        for item in c.query(wql):
            write_to_file(f, item.Name.split("\\")[2].strip().upper(), str(item.FileSize))
    

    You may want to consider placing the lock around the entire for item in c.query(wql): loop, so that each thread does a larger chunk of work before releasing the lock; a sketch of that variant follows.
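
    A minimal sketch of that coarser-grained variant, assuming the same lock and wmi import as above:

    def filesize(asset):
        c = wmi.WMI(asset)
        wql = 'SELECT FileSize,Name FROM CIM_DataFile where (Drive="D:" OR Drive="E:") and Caption like "%file%"'
        # One acquire/release per asset instead of per row: fewer lock
        # round-trips, but other threads wait until the whole loop finishes.
        with lock, open("results.txt", 'a+') as f:
            for item in c.query(wql):
                print(item.Name.split("\\")[2].strip().upper(), item.FileSize, file=f)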

  • 2020-12-10 05:44

    For another solution, use a Pool to calculate the data, returning it to the parent process. The parent then writes all the data to the file. Since only one process ever writes to the file, there's no need for additional locking.

    Note that the following uses a pool of processes, not threads, which makes the code much simpler than assembling the equivalent with the threading module. (There is also a multiprocessing.pool.ThreadPool object with the same interface, long undocumented; a sketch of it follows the sample output below.)


    import glob, os, time
    from multiprocessing import Pool

    def filesize(path):
        time.sleep(0.1)  # simulate a slow per-file lookup
        return (path, os.path.getsize(path))

    if __name__ == '__main__':        # required for multiprocessing on Windows
        paths = glob.glob('*.py')
        pool = Pool()                 # default: one worker process per CPU

        with open("results.txt", 'w+') as dataf:
            for (apath, asize) in pool.imap_unordered(filesize, paths):
                print(apath, asize, file=dataf)
    

    Output in results.txt:

    zwrap.py 122
    usercustomize.py 38
    tpending.py 2345
    msimple4.py 385
    parse2.py 499
    
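    A minimal sketch of the ThreadPool variant mentioned above, reusing the same filesize and paths; it is a drop-in replacement backed by threads instead of processes:

    from multiprocessing.pool import ThreadPool

    # Same interface as Pool, but the workers are threads, so no
    # __main__ guard is needed and arguments are not pickled.
    pool = ThreadPool()
    with open("results.txt", 'w+') as dataf:
        for (apath, asize) in pool.imap_unordered(filesize, paths):
            print(apath, asize, file=dataf)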
  • 2020-12-10 05:50

    print is not thread safe. Use the logging module instead (which is):

    import logging
    import threading
    import time
    
    
    FORMAT = '[%(levelname)s] (%(threadName)-10s) %(message)s'
    
    logging.basicConfig(level=logging.DEBUG,
                        format=FORMAT)
    
    file_handler = logging.FileHandler('results.log')
    file_handler.setFormatter(logging.Formatter(FORMAT))
    logging.getLogger().addHandler(file_handler)
    
    
    def worker():
        logging.info('Starting')
        time.sleep(2)
        logging.info('Exiting')
    
    
    t1 = threading.Thread(target=worker)
    t2 = threading.Thread(target=worker)

    t1.start()
    t2.start()

    # wait for both workers to finish
    t1.join()
    t2.join()
    

    Output (and contents of results.log):

    [INFO] (Thread-1  ) Starting
    [INFO] (Thread-2  ) Starting
    [INFO] (Thread-1  ) Exiting
    [INFO] (Thread-2  ) Exiting
    

    Instead of using the default name (Thread-n), you can set your own name with the name keyword argument, which the %(threadName)s formatting directive will then use:

    t = threading.Thread(name="My worker thread", target=worker)
    

    (This example was adapted from Doug Hellmann's excellent article about the threading module.)
