Increasing speed of Python code

野性不改 2021-02-05 12:37

I have some Python code that has many classes. I used cProfile to find that the total time to run the program is 68 seconds. I found that the following function in

8 answers
  • 2021-02-05 13:04

    Depending on how often you add new elements to self.people or change person.utility, you could consider sorting self.people by the utility field.

    Then you could use a bisect function to find the lowest index i_pivot at which the condition self.people[i_pivot].utility >= price is met. This has lower complexity, O(log N), than your exhaustive loop, O(N).

    With this information, you could then update your people list if needed.

    Do you really need to update the customer field each time? In the sorted case, you could easily deduce this value while iterating: for example, with your list sorted in increasing order, customer = (index >= i_pivot).

    Same question for the customers and nonCustomers lists. Why do you need them? They could be replaced by slices of the original sorted list: for example, with the list sorted in increasing order, customers = self.people[i_pivot:]

    All this would allow you to reduce the complexity of your algorithm and to use more built-in (fast) Python functions, which could speed up your implementation.
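
The sorted-list idea could be sketched roughly as follows. This is a minimal illustration, not the asker's code: the Person and Population classes, the split method, and the parallel utilities list are all assumptions made for the example, and it only pays off if the list is sorted much less often than it is queried.

```python
from bisect import bisect_left


class Person(object):
    def __init__(self, utility):
        self.utility = utility


class Population(object):
    def __init__(self, utilities):
        # Sort once, in increasing order of utility.
        self.people = sorted((Person(u) for u in utilities),
                             key=lambda p: p.utility)
        # Parallel list of plain numbers so bisect can search it directly.
        self.utilities = [p.utility for p in self.people]

    def split(self, price):
        """Return (customers, non_customers) for the given price."""
        # First index whose utility is >= price, found in O(log N).
        i_pivot = bisect_left(self.utilities, price)
        return self.people[i_pivot:], self.people[:i_pivot]


popn = Population([120.0, 30.0, 75.0, 250.0, 74.9])
customers, non_customers = popn.split(75.0)
print(len(customers), len(non_customers))
```

Note that the two slices still copy list entries, so if only the count is needed, len(self.people) - i_pivot avoids even that.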

  • 2021-02-05 13:05

    There are many things you can try after optimizing your Python code for speed. If this program doesn't need C extensions, you can run it under PyPy to benefit from its JIT compiler. You can try making a C extension for possibly huge speedups. Shed Skin will even allow you to convert your Python program to a standalone C++ binary.

    I'm willing to time your program under these different optimization scenarios if you can provide enough code for benchmarking.

    Edit: First of all, I have to agree with everyone else: are you sure you're measuring the time correctly? The example code runs 100 times in under 0.1 seconds here, so there is a good chance that either the time is wrong or you have a bottleneck (IO?) that isn't present in the code sample.
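
One way to check the measurement is to time the suspect function on its own, away from the rest of the program. The sketch below uses the standard timeit module; mark_customers is a hypothetical stand-in for the function from the question, and the population size and price are made up for the example.

```python
import random
import timeit


class Person(object):
    def __init__(self, utility):
        self.utility = utility
        self.customer = 0


def mark_customers(people, price):
    """Hypothetical stand-in for the function being measured."""
    customers = []
    for person in people:
        if person.utility >= price:
            person.customer = 1
            customers.append(person)
        else:
            person.customer = 0
    return len(customers)


random.seed(1)
people = [Person(random.uniform(0.0, 300.0)) for _ in range(300)]

# Three independent measurements of 100 calls each; the minimum is the
# least disturbed by other activity on the machine.
timings = timeit.repeat(lambda: mark_customers(people, 75),
                        repeat=3, number=100)
best = min(timings)
print("best of 3: %.4f s per 100 calls" % best)
```

If the isolated timing is far below what the profiler attributes to the function, the bottleneck is probably elsewhere.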

    That said, I made it 300000 people so times were consistent. Here's the adapted code, shared by CPython (2.5), PyPy and Shed Skin:

    from time import time
    import random
    import sys
    
    
    class person(object):
        def __init__(self, util):
            self.utility = util
            self.customer = 0
    
    
    class population(object):
        def __init__(self, numpeople, util):
            self.people = []
            self.cus = []
            self.noncus = []
            for u in util:
                per = person(u)
                self.people.append(per)
    
    
    def f_w_append(popn):
        '''Function with append'''
        P = 75
        cus = []
        noncus = []
        # Help CPython a bit
        # cus_append, noncus_append = cus.append, noncus.append
        for per in popn.people:
            if  per.utility >= P:
                per.customer = 1
                cus.append(per)
            else:
                per.customer = 0
                noncus.append(per)
        return len(cus)
    
    
    def f_wo_append(popn):
        '''Function without append'''
        P = 75
        for per in popn.people:
            if  per.utility >= P:
                per.customer = 1
            else:
                per.customer = 0
    
        numcustomers = 0
        for per in popn.people:
            if per.customer == 1:
                numcustomers += 1
        return numcustomers
    
    
    def main():
        try:
            numpeople = int(sys.argv[1])
        except (IndexError, ValueError):
            numpeople = 300000
    
        print "Running for %s people, 100 times." % numpeople
    
        begin = time()
        random.seed(1)
        # Help CPython a bit
        uniform = random.uniform
        util = [uniform(0.0, 300.0) for _ in xrange(numpeople)]
        # util = [random.uniform(0.0, 300.0) for _ in xrange(numpeople)]
    
        popn1 = population(numpeople, util)
        start = time()
        for _ in xrange(100):
            r = f_wo_append(popn1)
        print r
        print "Without append: %s" % (time() - start)
    
    
        popn2 = population(numpeople, util)
        start = time()
        for _ in xrange(100):
            r = f_w_append(popn2)
        print r
        print "With append: %s" % (time() - start)
    
        print "\n\nTotal time: %s" % (time() - begin)
    
    if __name__ == "__main__":
        main()
    

    Running with PyPy is as simple as running with CPython: you just type 'pypy' instead of 'python'. For Shed Skin, you must convert to C++, compile and run:

    shedskin -e makefaster.py && make 
    
    # Check that you're using the makefaster.so file and run test
    python -c "import makefaster; print makefaster.__file__; makefaster.main()" 
    

    And here is the Cython-ized code:

    from time import time
    import random
    import sys
    
    
    cdef class person:
        cdef readonly int utility
        cdef public int customer
    
        def __init__(self, util):
            self.utility = util
            self.customer = 0
    
    
    class population(object):
        def __init__(self, numpeople, util):
            self.people = []
            self.cus = []
            self.noncus = []
            for u in util:
                per = person(u)
                self.people.append(per)
    
    
    cdef int f_w_append(popn):
        '''Function with append'''
        cdef int P = 75
        cdef person per
        cus = []
        noncus = []
        # Help CPython a bit
        # cus_append, noncus_append = cus.append, noncus.append
    
        for per in popn.people:
            if  per.utility >= P:
                per.customer = 1
                cus.append(per)
            else:
                per.customer = 0
                noncus.append(per)
        cdef int lcus = len(cus)
        return lcus
    
    
    cdef int f_wo_append(popn):
        '''Function without append'''
        cdef int P = 75
        cdef person per
        for per in popn.people:
            if  per.utility >= P:
                per.customer = 1
            else:
                per.customer = 0
    
        cdef int numcustomers = 0
        for per in popn.people:
            if per.customer == 1:
                numcustomers += 1
        return numcustomers
    
    
    def main():
    
        cdef int i, r, numpeople
        cdef double _0, _300
        _0 = 0.0
        _300 = 300.0
    
        try:
            numpeople = int(sys.argv[1])
        except (IndexError, ValueError):
            numpeople = 300000
    
        print "Running for %s people, 100 times." % numpeople
    
        begin = time()
        random.seed(1)
        # Help CPython a bit
        uniform = random.uniform
        util = [uniform(_0, _300) for i in xrange(numpeople)]
        # util = [random.uniform(0.0, 300.0) for _ in xrange(numpeople)]
    
        popn1 = population(numpeople, util)
        start = time()
        for i in xrange(100):
            r = f_wo_append(popn1)
        print r
        print "Without append: %s" % (time() - start)
    
    
        popn2 = population(numpeople, util)
        start = time()
        for i in xrange(100):
            r = f_w_append(popn2)
        print r
        print "With append: %s" % (time() - start)
    
        print "\n\nTotal time: %s" % (time() - begin)
    
    if __name__ == "__main__":
        main()
    

    For building it, it's nice to have a setup.py like this one:

    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Distutils import build_ext
    
    ext_modules = [Extension("cymakefaster", ["makefaster.pyx"])]
    
    setup(
      name = 'Python code to speed up',
      cmdclass = {'build_ext': build_ext},
      ext_modules = ext_modules
    )
    

    You build it with: python setupfaster.py build_ext --inplace

    Then test: python -c "import cymakefaster; print cymakefaster.__file__; cymakefaster.main()"

    Timings were run five times for each version. Cython was the fastest and the easiest of the code generators to use (Shed Skin aims to be simpler, but cryptic error messages and implicit static typing made it harder here). As for best value, PyPy gives an impressive speedup in the counter version with no code changes.

    Results (time in seconds for 300000 people, 100 calls for each function):
                      Mean      Min  Times    
    CPython 2.5.2
    Without append: 35.037   34.518  35.124, 36.363, 34.518, 34.620, 34.559
    With append:    29.251   29.126  29.339, 29.257, 29.259, 29.126, 29.272
    Total time:     69.288   68.739  69.519, 70.614, 68.746, 68.739, 68.823
    
    PyPy 1.4.1
    Without append:  2.672    2.655   2.655,  2.670,  2.676,  2.690,  2.668
    With append:    13.030   12.672  12.680, 12.725, 14.319, 12.755, 12.672
    Total time:     16.551   16.194  16.196, 16.229, 17.840, 16.295, 16.194
    
    Shed Skin 0.7 (gcc -O2)
    Without append:  1.601    1.599   1.599,  1.605,  1.600,  1.602,  1.599
    With append:     3.811    3.786   3.839,  3.795,  3.798,  3.786,  3.839
    Total time:      5.704    5.677   5.715,  5.705,  5.699,  5.677,  5.726
    
    Cython 0.14 (gcc -O2)
    Without append:  1.692    1.673   1.673,  1.710,  1.678,  1.688,  1.711
    With append:     3.087    3.067   3.079,  3.080,  3.119,  3.090,  3.067
    Total time:      5.565    5.561   5.562,  5.561,  5.567,  5.562,  5.572
    

    Edit: Aaaand more meaningful timings, for 80000 calls with 300 people each:

    Results (time in seconds for 300 people, 80000 calls for each function):
                      Mean      Min  Times
    CPython 2.5.2
    Without append: 27.790   25.827  25.827, 27.315, 27.985, 28.211, 29.612
    With append:    26.449   24.721  24.721, 27.017, 27.653, 25.576, 27.277
    Total time:     54.243   50.550  50.550, 54.334, 55.652, 53.789, 56.892
    
    
    Cython 0.14 (gcc -O2)
    Without append:  1.819    1.760   1.760,  1.794,  1.843,  1.827,  1.871
    With append:     2.089    2.063   2.100,  2.063,  2.098,  2.104,  2.078
    Total time:      3.910    3.859   3.865,  3.859,  3.944,  3.934,  3.951
    
    PyPy 1.4.1
    Without append:  0.889    0.887   0.894,  0.888,  0.890,  0.888,  0.887
    With append:     1.671    1.665   1.665,  1.666,  1.671,  1.673,  1.681
    Total time:      2.561    2.555   2.560,  2.555,  2.561,  2.561,  2.569
    
    Shed Skin 0.7 (g++ -O2)
    Without append:  0.310    0.301   0.301,  0.308,  0.317,  0.320,  0.303
    With append:     1.712    1.690   1.733,  1.700,  1.735,  1.690,  1.702
    Total time:      2.027    2.008   2.035,  2.008,  2.052,  2.011,  2.029
    

    Shed Skin becomes fastest, PyPy surpasses Cython. All three speed things up a lot compared to CPython.

  • 2021-02-05 13:05

    It's surprising that the function shown is such a bottleneck because it's so relatively simple. For that reason, I'd double check my profiling procedure and results. However, if they're correct, the most time consuming part of your function has to be the for loop it contains, of course, so it makes sense to focus on speeding that up. One way to do this is by replacing the if/else with straight-line code. You can also reduce the attribute lookup for the append list method slightly. Here's how both of those things could be accomplished:

    def qtyDemanded(self, timePd, priceVector):
        '''Returns quantity demanded in period timePd. In addition,
        also updates the list of customers and non-customers.
    
        Inputs: timePd and priceVector
        Output: count of people for whom priceVector[-1] <= utility
        '''
    
        price = priceVector[-1] # last price
        kinds = [[], []] # initialize sublists of noncustomers and customers
        kindsAppend = [kinds[b].append for b in (False, True)] # append methods
    
        for person in self.people:
            person.customer = person.utility >= price  # customer test
            kindsAppend[person.customer](person)  # add to proper list
    
        self.nonCustomers = kinds[False]
        self.customers = kinds[True]
    
        return len(self.customers)
    

    That said, I must add that it seems a little redundant to have both a customer flag in each person object and also put each of them into a separate list depending on that attribute. Not creating these two lists would of course speed the loop up further.

  • 2021-02-05 13:07

    Some curious things I noted:

    timePd is passed as a parameter but never used

    price is an array but you only ever use the last entry - why not pass the value there instead of passing the list?

    count is initialized and never used

    self.people contains multiple person objects which are then copied to either self.customers or self.noncustomers as well as having their customer flag set. Why not skip the copy operation and, on return, just iterate over the list, looking at the customer flag? This would save the expensive append.
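
Skipping the copy could look something like this. It is only a sketch under assumed names (Person, Population, qty_demanded, customers, non_customers are not from the original code): the hot loop sets the flag and counts, and the lists are built only when a caller actually asks for them.

```python
class Person(object):
    def __init__(self, utility):
        self.utility = utility
        self.customer = False


class Population(object):
    def __init__(self, utilities):
        self.people = [Person(u) for u in utilities]

    def qty_demanded(self, price):
        """Set each customer flag and return the count; no lists are built."""
        count = 0
        for person in self.people:
            person.customer = person.utility >= price
            count += person.customer  # True adds 1, False adds 0
        return count

    def customers(self):
        """Build the customer list only when a caller asks for it."""
        return [p for p in self.people if p.customer]

    def non_customers(self):
        """Build the non-customer list only when a caller asks for it."""
        return [p for p in self.people if not p.customer]


popn = Population([120.0, 30.0, 75.0, 250.0, 74.9])
n = popn.qty_demanded(75.0)
```

Whether this wins depends on how often the lists are really needed compared with how often the count is computed.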

    Alternatively, try using psyco which can speed up pure Python, sometimes considerably.

  • 2021-02-05 13:08

    This comment rings alarm bells:

    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.
    

    Aside from the fact that timePd is not used in the function, if you really want just to return the quantity, do just that in the function. Do the "in addition" stuff in a separate function.

    Then profile again and see which of these two functions you are spending most of your time in.

    I like to apply SRP to methods as well as classes: it makes them easier to test.
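
As a rough sketch of that split, with the profiling step included: the function names, the Person class, and the sample data below are all assumptions for illustration, not the original code.

```python
import cProfile
import pstats


class Person(object):
    def __init__(self, utility):
        self.utility = utility
        self.customer = False


def quantity_demanded(people, price):
    """Only count the people whose utility is at least the price."""
    return sum(1 for person in people if person.utility >= price)


def update_customer_lists(people, price):
    """Only do the bookkeeping: set the flags and build both lists."""
    customers, non_customers = [], []
    for person in people:
        person.customer = person.utility >= price
        (customers if person.customer else non_customers).append(person)
    return customers, non_customers


def run(people, price, calls):
    """Call both functions repeatedly so the profiler has something to see."""
    for _ in range(calls):
        quantity_demanded(people, price)
        update_customer_lists(people, price)


people = [Person(float(u % 300)) for u in range(3000)]

profiler = cProfile.Profile()
profiler.enable()
run(people, 75.0, 20)
profiler.disable()

# Show the five most expensive entries by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```

With the two responsibilities separated, the profile report lists each function on its own line, so it is clear which one deserves the optimization effort.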

  • 2021-02-05 13:12

    You're asking for guesses, and mostly you're getting guesses.

    There's no need to guess. Here's an example.
