Break the function after a certain time

长情又很酷 2020-11-27 16:34

In python, for a toy example:

for x in range(0, 3):
    A(x)  # call function A(x)

I want to continue the for loop if function A takes more than

5 Answers
  • 2020-11-27 16:44

    The comments are right that the function should check the elapsed time itself. Here is one potential solution. Note that it is different from an asynchronous approach (for example, running the function in a thread): it is synchronous, so the calls still run one after another.

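    import time
    
    def someFunction():
        start = time.time()
        # Keep working until 5 seconds have elapsed, then give up.
        while time.time() - start < 5:
            pass  # do one small step of your normal function here
        return
    
    # Define the function before the loop that calls it.
    for x in range(0, 3):
        someFunction()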
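
    As a minimal sketch of the same cooperative idea, the function below checks a deadline between units of work and returns whatever it finished in time. The name `process_items`, the `budget_seconds` parameter, and the `time.sleep` placeholder are assumptions made for illustration, not part of the question.

```python
import time

def process_items(items, budget_seconds):
    """Process items until the time budget runs out; return the partial results."""
    # time.monotonic() is not affected by system clock adjustments.
    deadline = time.monotonic() + budget_seconds
    done = []
    for item in items:
        if time.monotonic() >= deadline:
            break  # out of time: stop cleanly between two units of work
        time.sleep(0.01)  # placeholder for one small unit of real work
        done.append(item * 2)
    return done

for x in range(0, 3):
    print(x, process_items(range(5), budget_seconds=1))
```

    This only works when the work can be split into steps that are each much shorter than the budget; a single blocking call inside one step is not interrupted.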
    
  • 2020-11-27 16:44

    This seems like a better idea (sorry, I'm not sure of the Python names of things yet):

    import signal
    
    def signal_handler(signum, frame):
        raise Exception("Timeout!")
    
    signal.signal(signal.SIGALRM, signal_handler)
    signal.alarm(3)   # Three seconds
    try:
        for x in range(0, 3):
            A(x)  # call function A(x)
    except Exception as msg:
        print("Timeout!")
    signal.alarm(0)    # reset
    
  • 2020-11-27 16:53

    If you can break your work up and check every so often, that's almost always the best solution. But sometimes that's not possible: for example, maybe you're reading a file off a slow file share that every once in a while just hangs for 30 seconds. To deal with that internally, you'd have to restructure your whole program around an async I/O loop.

    If you don't need to be cross-platform, you can use signals on *nix (including Mac and Linux), APCs on Windows, etc. But if you need to be cross-platform, that doesn't work.

    So, if you really need to do it concurrently, you can, and sometimes you have to. In that case, you probably want to use a process for this, not a thread. You can't really kill a thread safely, but you can kill a process, and it can be as safe as you want it to be. Also, if the thread is taking 5+ seconds because it's CPU-bound, you don't want to fight with it over the GIL.
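
    To see why a thread is awkward here, consider this minimal sketch (the `slow` function is a hypothetical stand-in for the real work). `join()` with a timeout only stops waiting; the thread itself keeps running, and the `threading` module offers no call to kill it.

```python
import threading
import time

def slow():
    # Hypothetical stand-in for a call that takes too long.
    time.sleep(1.0)

# daemon=True so the leftover thread does not keep the interpreter alive at exit.
t = threading.Thread(target=slow, daemon=True)
t.start()
t.join(timeout=0.2)            # give up waiting after 0.2 seconds
still_running = t.is_alive()   # the thread was not stopped, only abandoned
print("still running:", still_running)
```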

    There are two basic options here.


    First, you can put the code in another script and run it with subprocess:

    subprocess.check_call([sys.executable, 'other_script.py', arg, other_arg],
                          timeout=5)
    

    Since this is going through normal child-process channels, the only communication you can use is some argv strings, a success/failure return value (actually a small integer, but that's not much better), and optionally a hunk of text going in and a chunk of text coming out.
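
    A slightly fuller sketch of the subprocess route, assuming Python 3.7+ for `capture_output`. The child program is passed inline with `-c` only to keep the example self-contained; `CHILD` and `run_child` are made-up names, and in practice the code would live in something like the `other_script.py` above.

```python
import subprocess
import sys

# Hypothetical child program: sleeps for argv[1] seconds, then prints a result.
CHILD = "import sys, time; time.sleep(float(sys.argv[1])); print('done')"

def run_child(delay, timeout):
    """Run the child with a time limit; return its text output, or None on timeout."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD, str(delay)],
            capture_output=True,  # the "chunk of text coming out"
            text=True,
            timeout=timeout,
            check=True,           # raise CalledProcessError on a non-zero exit
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before re-raising.
        return None
    return proc.stdout.strip()

print(run_child(0, timeout=5))
print(run_child(5, timeout=0.5))
```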


    Alternatively, you can use multiprocessing to spawn a thread-like child process:

    p = multiprocessing.Process(target=func, args=args)
    p.start()
    p.join(5)
    if p.is_alive():
        p.terminate()
    

    As you can see, this is a little more complicated, but it's better in a few ways:

    • You can pass arbitrary Python objects (at least anything that can be pickled) rather than just strings.
    • Instead of having to put the target code in a completely independent script, you can leave it as a function in the same script.
    • It's more flexible—e.g., if you later need to, say, pass progress updates, it's very easy to add a queue in either or both directions.
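
    The points above can be sketched as follows: the child sends its return value back over a `multiprocessing.Queue`, and the parent terminates it if nothing arrives in time. The helper names `run_with_timeout` and `_worker`, and the `slow_square` example function, are hypothetical. This sketch does not handle a child that raises an exception; in that case the parent simply waits out the full timeout.

```python
import multiprocessing
import queue
import time

def slow_square(x, delay):
    # Hypothetical stand-in for the real work.
    time.sleep(delay)
    return x * x

def _worker(out, func, args):
    # Runs in the child: compute the result and send it to the parent.
    out.put(func(*args))

def run_with_timeout(func, args=(), timeout=5):
    """Return (True, result) if func finishes in time, else (False, None)."""
    out = multiprocessing.Queue()
    p = multiprocessing.Process(target=_worker, args=(out, func, args))
    p.start()
    try:
        # Read from the queue before joining, so the child can flush and exit.
        result = out.get(timeout=timeout)
        finished = True
    except queue.Empty:
        result, finished = None, False
        p.terminate()  # the child ran out of time
    p.join()
    return finished, result

if __name__ == "__main__":
    print(run_with_timeout(slow_square, (3, 0.1), timeout=5))
    print(run_with_timeout(slow_square, (3, 5), timeout=0.5))
```

    The `if __name__ == "__main__":` guard matters on platforms where child processes are started by re-importing the main module.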

    The big problem with any kind of parallelism is sharing mutable data, e.g., having a background task update a global dictionary as part of its work (which your comments say you're trying to do). With threads, you can sort of get away with it, but race conditions can lead to corrupted data, so you have to be very careful with locking. With child processes, you can't get away with it at all. (Yes, you can use shared memory, as the "Sharing state between processes" section of the multiprocessing documentation explains, but this is limited to simple types like numbers, fixed arrays, and types you know how to define as C structures, and it just gets you back to the same problems as threads.)


    Ideally, you arrange things so you don't need to share any data while the process is running—you pass in a dict as a parameter and get a dict back as a result. This is usually pretty easy to arrange when you have a previously-synchronous function that you want to put in the background.

    But what if, say, a partial result is better than no result? In that case, the simplest solution is to pass the results over a queue. You can do this with an explicit queue, as explained in the "Exchanging objects between processes" section of the multiprocessing documentation, but there's an easier way.

    If you can break the monolithic process into separate tasks, one for each value (or group of values) you wanted to stick in the dictionary, you can schedule them on a Pool—or, even better, a concurrent.futures.Executor. (If you're on Python 2.x or 3.1, see the backport futures on PyPI.)

    Let's say your slow function looked like this:

    def spam():
        global d
        for meat in get_all_meats():
            count = get_meat_count(meat)
            d[meat] = d.get(meat, 0) + count
    

    Instead, you'd do this:

    def spam_one(meat):
        count = get_meat_count(meat)
        return meat, count
    
    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
        results = executor.map(spam_one, get_all_meats(), timeout=5)
        for (meat, count) in results:
            d[meat] = d.get(meat, 0) + count
    

    As many results as you get within 5 seconds get added to the dict; if that isn't all of them, the rest are abandoned, and a TimeoutError is raised (which you can handle however you want—log it, do some quick fallback code, whatever).
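
    Here is a runnable sketch of that partial-results pattern. To keep it self-contained it swaps in `ThreadPoolExecutor` for `ProcessPoolExecutor`; `map()` and its `timeout` argument come from the shared `Executor` interface, so the handling is the same. The `DELAYS` table, the `collect` helper, and the use of `len(meat)` as the count are all made up for illustration.

```python
import concurrent.futures
import time

# Hypothetical per-item delays standing in for get_meat_count() being slow.
DELAYS = {"spam": 0.05, "ham": 0.05, "bacon": 1.0}

def spam_one(meat):
    time.sleep(DELAYS[meat])
    return meat, len(meat)  # made-up "count"

def collect(meats, timeout):
    """Gather as many results as arrive within `timeout` seconds."""
    d = {}
    timed_out = False
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        results = executor.map(spam_one, meats, timeout=timeout)
        try:
            for meat, count in results:
                d[meat] = d.get(meat, 0) + count
        except concurrent.futures.TimeoutError:
            timed_out = True  # keep what we already have
    return d, timed_out

print(collect(["spam", "ham", "bacon"], timeout=0.5))
```

    One caveat: leaving the `with` block calls `shutdown(wait=True)`, so a task that is already running is still waited for. Only tasks that have not started yet are cancelled.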

    And if the tasks really are independent (as they are in my stupid little example, but of course they may not be in your real code, at least not without a major redesign), you can parallelize the work for free just by removing that max_workers=1. Then, if you run it on an 8-core machine, it'll kick off 8 workers and give each of them 1/8th of the work to do, and things will get done faster. (Usually not 8x as fast, but often 3-6x as fast, which is still pretty nice.)

  • 2020-11-27 16:58

    I think creating a new process may be overkill. If you're on Mac or a Unix-based system, you should be able to use signal.SIGALRM to forcibly time out functions that take too long. This will work on functions that are idling for network or other issues that you absolutely can't handle by modifying your function. I have an example of using it in this answer:

    https://stackoverflow.com/a/24921763/3803152

    Editing my answer in here, though I'm not sure I'm supposed to do that:

    import signal
    
    class TimeoutException(Exception):   # Custom exception class
        pass
    
    def timeout_handler(signum, frame):   # Custom signal handler
        raise TimeoutException
    
    # Change the behavior of SIGALRM
    signal.signal(signal.SIGALRM, timeout_handler)
    
    for i in range(3):
        # Start the timer. Once 5 seconds are over, a SIGALRM signal is sent.
        signal.alarm(5)
        # This try/except block ensures that
        #   you'll catch TimeoutException when it's raised.
        try:
            A(i) # Whatever your function that might hang
        except TimeoutException:
            continue # continue the for loop if function A takes more than 5 seconds
        else:
            # Reset the alarm
            signal.alarm(0)
    

    This basically sets a timer for 5 seconds, then tries to execute your code. If it fails to complete before time runs out, a SIGALRM is sent, which we catch and turn into a TimeoutException. That forces you to the except block, where your program can continue.

    EDIT: whoops, TimeoutException is a class, not a function. Thanks, abarnert!
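
    A variation on the same idea, sketched as a context manager. It uses `signal.setitimer` so the limit can be a fraction of a second, and a `finally` clause so the timer is cleared and the previous handler restored even if the body raises some other exception. The name `time_limit` and the toy `A` function are assumptions for illustration. Like the code above, it works only on Unix-like systems and only in the main thread.

```python
import signal
import time
from contextlib import contextmanager

class TimeoutException(Exception):
    pass

@contextmanager
def time_limit(seconds):
    def handler(signum, frame):
        raise TimeoutException()
    previous = signal.signal(signal.SIGALRM, handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # accepts float seconds
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # cancel the timer
        signal.signal(signal.SIGALRM, previous)    # restore the old handler

def A(x):
    # Hypothetical function that hangs for one particular input.
    time.sleep(2 if x == 1 else 0.05)
    return x

finished = []
for i in range(3):
    try:
        with time_limit(0.5):
            finished.append(A(i))
    except TimeoutException:
        continue  # skip the call that took too long

print(finished)
```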

  • 2020-11-27 17:04

    Maybe someone will find this decorator useful; it is based on TheSoundDefense's answer:

    import time
    import signal
    
    class TimeoutException(Exception):   # Custom exception class
        pass
    
    
    def break_after(seconds=2):
        def timeout_handler(signum, frame):   # Custom signal handler
            raise TimeoutException
        def function(function):
            def wrapper(*args, **kwargs):
                signal.signal(signal.SIGALRM, timeout_handler)
                signal.alarm(seconds)
                try:
                    res = function(*args, **kwargs)
                    signal.alarm(0)      # Clear alarm
                    return res
                except TimeoutException:
                    print('Oops, timeout: %s sec reached.' % seconds, function.__name__, args, kwargs)
                return
            return wrapper
        return function
    

    Test:

    @break_after(3)
    def test(a,b,c):
        return time.sleep(10)
    
    >>> test(1,2,3)
    Oops, timeout: 3 sec reached. test (1, 2, 3) {}
    