Efficiency of len() and pop() in Python

野的像风 2020-12-29 14:07

Why is this significantly faster with comments? Shouldn't a pop, a comparison, and a length check each be O(1)? Would they significantly affect the speed?

#! /us         


        
2 Answers
  •  囚心锁ツ
    2020-12-29 14:53

    list.pop(index) is an O(n) operation, because after you remove the value from the list, every value after it has to be shifted over by one slot in memory. Only pop() with no index, which removes the last element, is O(1). Calling pop(index) repeatedly on large lists is a great way to waste computing cycles. If you absolutely must remove from the front of a large list over and over, use collections.deque, which gives you O(1) insertions and deletions at the front.
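
    To make the difference concrete, here is a minimal sketch of draining a sequence from the front both ways. The function names drain_front_list and drain_front_deque are made up for this illustration; only list.pop and collections.deque.popleft come from the standard library.

```python
from collections import deque


def drain_front_list(items):
    """Empty a list from the front; every pop(0) shifts the remaining items."""
    out = []
    while items:
        out.append(items.pop(0))  # O(n) per call, O(n^2) for the whole loop
    return out


def drain_front_deque(items):
    """Same result, but the deque removes from the left in constant time."""
    queue = deque(items)
    out = []
    while queue:
        out.append(queue.popleft())  # O(1) per call, O(n) for the whole loop
    return out


data = list(range(5))
list_result = drain_front_list(list(data))  # pass a copy, the function empties it
deque_result = drain_front_deque(data)
```

    Both functions return the elements in the same order; the gap only shows up in running time as the input grows.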

    len() is O(1) for the same reason deletions are O(n): if all the values in a list are stored right next to each other in memory, the length of the list is just the tail's memory location minus the head's memory location, with no traversal needed. (CPython goes a step further and stores the length in the list object itself.) If you don't care about the performance of len() and similar operations, you can use a linked list to get constant-time insertions and deletions - that just makes len() O(n) and pop() from the front O(1) (and you get some other funky stuff like O(n) lookups).
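
    The linked-list trade-off can be sketched in a few lines. This is a toy singly linked list written for illustration, not a standard library type; Node, LinkedList and pop_front are hypothetical names, and the sketch deliberately does not cache the length.

```python
class Node:
    """One cell of the list: a value plus a reference to the next cell."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    def __init__(self, values=()):
        self.head = None
        # Build back to front so the first value ends up at the head.
        for value in reversed(list(values)):
            self.head = Node(value, self.head)

    def pop_front(self):
        """O(1): unlink the head node, nothing else moves."""
        if self.head is None:
            raise IndexError("pop from empty list")
        node = self.head
        self.head = node.next
        return node.value

    def __len__(self):
        """O(n): without a stored counter, the only option is to walk the chain."""
        count = 0
        node = self.head
        while node is not None:
            count += 1
            node = node.next
        return count


linked = LinkedList([10, 20, 30])
first = linked.pop_front()
remaining = len(linked)
```

    A real implementation would usually keep a running count to make len() cheap again; the version above leaves it out to show the raw cost.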

    Everything I said about pop() also goes for insert(): inserting anywhere but the end has to shift every later element. The exception is append(), which takes amortized O(1).

    I recently worked on a problem that required deleting lots of elements from a very large list (around 10,000,000 integers), and my initial dumb implementation just used pop() every time I needed to delete something. That turned out not to work at all, because it took O(n) to do even one cycle of the algorithm, which itself needed to run n times, for O(n^2) overall.

    My solution was to create a set() called ignore in which I kept the indices of all "deleted" elements. I had little helper functions so that I didn't have to think about skipping these and my algorithm didn't get too ugly. What eventually did it was doing a single O(n) pass every 10,000 iterations to delete all the elements in ignore and make ignore empty again. That way I got the increased performance from a shrinking list while only having to do one 10,000th of the work for my deletions.
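
    The answer doesn't show its helper functions, so the following is only a rough sketch of the idea under some assumptions: the class name LazyDeleteList, its methods, and the compact_every parameter are all invented here, and compaction is triggered by the number of pending deletions rather than by counting iterations of an outer algorithm.

```python
class LazyDeleteList:
    """Mark deletions in a set and remove them in one batched O(n) pass."""

    def __init__(self, values, compact_every=10_000):
        self.values = list(values)
        self.ignore = set()  # indices into self.values that count as deleted
        self.compact_every = compact_every

    def delete(self, index):
        """O(1) on average: just remember the index instead of shifting."""
        self.ignore.add(index)
        if len(self.ignore) >= self.compact_every:
            self.compact()

    def compact(self):
        """Single O(n) pass that drops every ignored element at once."""
        self.values = [
            v for i, v in enumerate(self.values) if i not in self.ignore
        ]
        self.ignore.clear()

    def live_values(self):
        """Return the elements that have not been marked as deleted."""
        return [v for i, v in enumerate(self.values) if i not in self.ignore]


lazy = LazyDeleteList(range(10), compact_every=3)
lazy.delete(0)
lazy.delete(5)
pending_live = lazy.live_values()  # two deletions pending, list not yet rebuilt
lazy.delete(9)                     # third deletion triggers the compaction pass
```

    One thing to keep in mind with this sketch: indices refer to positions in the current self.values, so they shift after each compaction and callers need to account for that.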

    Also, yes, you should get a memory error, because you are trying to allocate a list that is definitely much larger than your hard drive, let alone your memory.
