Python function slowing down for no apparent reason

小鲜肉 2021-01-22 16:56

I have a Python function, defined as follows, which I use to delete from list1 the items that are already in list2. I am using Python 2.6.2 on Windows XP.

def comp         


        
6 answers
  •  孤街浪徒
    2021-01-22 17:47

    Try a more Pythonic approach to the filtering, something like:

    list2_set = set(list2)  # build the set once, not once per item of list1
    [x for x in list1 if x not in list2_set]
    

    Converting both lists to sets is unnecessary, and will be very slow and memory-hungry on large amounts of data.

    Since your data is a list of lists, and lists are not hashable, you need to convert each inner list to a tuple before it can go into a set. Try:

    list2_set = set([tuple(x) for x in list2])
    diff = [x for x in list1 if tuple(x) not in list2_set]
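
For reuse, the two lines above can be wrapped in a small function. This is only a sketch: the name `remove_common` is made up for illustration, and it assumes every inner list contains hashable values (numbers, strings, and so on).

```python
def remove_common(list1, list2):
    """Return the items of list1 that do not appear in list2, keeping order.

    Assumes both arguments are lists of lists whose inner values are hashable.
    """
    # Build the lookup set once; tuples are hashable, lists are not.
    list2_set = set(tuple(item) for item in list2)
    # Each membership test against the set is O(1) on average.
    return [item for item in list1 if tuple(item) not in list2_set]


list1 = [[1, 0], [2, 2], [3, 4], [2, 2]]
list2 = [[2, 2], [9, 9]]
print(remove_common(list1, list2))  # [[1, 0], [3, 4]]
```

Unlike a plain set difference, this keeps the original order of list1 and returns the surviving items as lists rather than tuples.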
    

    I tested out your original function, and my approach, using the following test data:

    list1 = [[x+1, x*2] for x in range(38000)]
    list2 = [[x+1, x*2] for x in range(10000, 160000)]
    

    Timings (rough, not a rigorous benchmark):

     #Original function
     real    2m16.780s
     user    2m16.744s
     sys     0m0.017s
    
     #My function
     real    0m0.433s
     user    0m0.423s
     sys     0m0.007s
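
To reproduce the comparison without an external `time` command, something like the following could be used. It is a sketch under assumptions: the original function is not shown in full here, so `remove_common_slow` is a hypothetical stand-in that scans list2 for every item, and the data is scaled down so the slow version finishes quickly.

```python
import timeit


def remove_common_slow(list1, list2):
    # Hypothetical baseline, not the asker's code: `in` on a list scans it
    # item by item, so total work grows with len(list1) * len(list2).
    return [item for item in list1 if item not in list2]


def remove_common_fast(list1, list2):
    # Build the set once; each membership test is then O(1) on average.
    list2_set = set(tuple(item) for item in list2)
    return [item for item in list1 if tuple(item) not in list2_set]


# Same shape as the test data above, scaled down by a factor of 100.
list1 = [[x + 1, x * 2] for x in range(380)]
list2 = [[x + 1, x * 2] for x in range(100, 1600)]

# Both versions must agree before their timings are worth comparing.
assert remove_common_slow(list1, list2) == remove_common_fast(list1, list2)

slow = timeit.timeit(lambda: remove_common_slow(list1, list2), number=3)
fast = timeit.timeit(lambda: remove_common_fast(list1, list2), number=3)
print("list scan: %.4fs, set lookup: %.4fs" % (slow, fast))
```

The absolute numbers depend on the machine; the point is how quickly the gap widens as the lists grow.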
    
