Parallelize brute force generation

后端 未结 2 1424
深忆病人
深忆病人 2021-01-26 02:50

I\'ve searched a lot for this, but was unable to find something I could grasp my head around. What I\'m searching for is how to make this algorithm parallel. It doesn\'t matter

相关标签:
2条回答
  • 2021-01-26 03:31

    If I understand your generator expression correctly, you are trying to generate all the words with a length from 1 .. maxlength given a certain alphabet.

    I don't know exactly how you want to split your problem (do you have N workers?), but one obvious way would be to split the list of words just by the first letter, hand those out to the various parallel workers, which then all have to append all possible combinations of words with 0 .. maxlength - 1 letters from the alphabet.

    0 讨论(0)
  • 2021-01-26 03:45

    First thing you have to do is get disenchanted with 1-liner disease ;-) That is, parallel processing adds many complexities of its own, so you want to keep your code as simple and transparent as possible. Here's a way that generalizes @BasSwinckels's suggestion. It's not short! But it's very effective: it will nail your CPU meter to the wall no matter how many cores you have.

    CHARSET = "abcdefghijklmnopqrstuvwxyx"
    MAX_LENGTH = 6  # generate all strings from CHARSET with length 1 thru MAX_LENGTH
    NUM_PROCESSES = None # defaults to all available cores
    
    from itertools import product
    
    # string_gen is what the workers run.  Everything else
    # runs in the main program.
    def string_gen(prefix, suffix_len, length):
        # Generate all strings of length `length` starting with `prefix`.
        # If length > suffix_len, only the last suffix_len characters
        # need to be generated.
        num_done = 0
        if length <= suffix_len:
            assert prefix == ""
            for t in product(CHARSET, repeat=length):
                result = "".join(t)
                # do something with result
                num_done += 1
        else:
            assert len(prefix) + suffix_len == length
            for t in product(CHARSET, repeat=suffix_len):
                result = prefix + "".join(t)
                # do something with result
                num_done += 1
        return num_done
    
    def record_done(num):
        global num_done
        num_done += num
        print num_done, "done so far"
    
    def do_work(pool, strings_per_chunk=1000000):
        # What's the most chars we can cycle through without
        # exceeding strings_per_chunk?  Could do with this
        # logs, but I'm over-reacting to 1-liner disease ;-)
        N = len(CHARSET)
        suffix_len = 1
        while N**suffix_len <= strings_per_chunk:
            suffix_len += 1
        suffix_len -= 1
        print "workers will cycle through the last", suffix_len, "chars"
    
        # There's no point to splitting up very short strings.
        max_short_len = min(suffix_len, MAX_LENGTH)
        for length in range(1, max_short_len + 1):
            pool.apply_async(string_gen, args=("", suffix_len, length),
                             callback=record_done)
        # And now the longer strings.
        for length in range(max_short_len + 1, MAX_LENGTH + 1):
            for t in product(CHARSET, repeat=length-suffix_len):
                prefix = "".join(t)
                pool.apply_async(string_gen, args=(prefix, suffix_len, length),
                                 callback=record_done)
    
    if __name__ == "__main__":
        import multiprocessing
        pool = multiprocessing.Pool(NUM_PROCESSES)
        num_done = 0
        do_work(pool)
        pool.close()
        pool.join()
        expected = sum(len(CHARSET)**i
                       for i in range(1, MAX_LENGTH + 1))
        assert num_done == expected, (num_done, expected)
    

    There are multiple pieces to this because what you want is "lumpy": strings of a variety of sizes. Parallel gimmicks are usually easier the more utterly uniform the structure of the problem. But lumpiness can be handled - it just takes more code.

    Do note the assert statements, and the "useless" computations of num_done. Parallel code adds brand new dimensions of complexity, so do yourself a favor and code defensively from the start. You're going to try many things that simply won't work - that happens to everyone.

    Note too that breaking free of 1-liner disease also gives a more efficient approach, even without multiple cores: computing and joining the prefix just once for the longer strings will save countless billions of redundant joins over the course of longer runs.

    Have fun :-)

    0 讨论(0)
提交回复
热议问题