Python: Memory usage and optimization when modifying lists

失恋的感觉 · 2021-02-04 03:28

The problem

My concern is the following: I am storing a relatively large dataset in a plain Python list, and in order to process the data I must iterate over the list …

7 Answers
  •  北海茫月 · 2021-02-04 04:07

    Without knowing the specifics of what you're doing with this list, it's hard to know exactly what would be best in this case. If your processing stage depends on the current index of the list element, this won't work, but if not, it appears you've left off the most Pythonic (and in many ways, easiest) approach: generators.

    If all you're doing is iterating over each element, processing it in some way, then either including that element in the list or not, use a generator. Then you never need to store the entire iterable in memory.

    def process_and_generate_data(source_iterable):
        for item in source_iterable:
            dosomestuff(item)           # process the item (placeholder)
            if not somecondition(item):
                yield item              # keep only items that pass the filter
    

    You would need a processing loop that deals with persisting the processed iterable (writing it back to a file, or whatever). If you have multiple processing stages that you'd prefer to separate into different generators, your processing loop can simply pass one generator to the next.
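    To make the chaining idea concrete, here is a minimal sketch. The stage names (`normalize`, `drop_empty`, `sink`) and the sample data are hypothetical, invented for illustration; only the pattern — each stage is a generator consuming the previous one, with a single loop at the end that persists the results — comes from the answer above.

    ```python
    def normalize(items):
        # First stage: strip whitespace from each raw record.
        for item in items:
            yield item.strip()

    def drop_empty(items):
        # Second stage: discard records that became empty after normalization.
        for item in items:
            if item:
                yield item

    def sink(items, out):
        # Processing loop: persist the final stream. Here it appends to a
        # list; in practice this could write lines back out to a file.
        for item in items:
            out.append(item)

    raw = ["  alpha ", "", " beta", "   "]   # hypothetical sample data
    result = []
    sink(drop_empty(normalize(raw)), result)
    # result is now ["alpha", "beta"]
    ```

    Because every stage is lazy, only one record is in flight at a time: no stage ever materializes the whole dataset, which is exactly the memory win generators buy you here.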
