Writing a list to a file with Python

孤街浪徒 2020-11-22 01:48

Is this the cleanest way to write a list to a file, since writelines() doesn't insert newline characters?

file.writelines(["%s\n" % item for item in list])


        
21 answers
  •  不思量自难忘°
    2020-11-22 02:30

    I thought it would be interesting to explore the benefits of using a genexp, so here's my take.

    The example in the question uses square brackets to create a temporary list, and so is equivalent to:

    file.writelines( list( "%s\n" % item for item in list ) )
    

    This needlessly constructs a temporary list of all the lines that will be written out. Depending on the size of your list and how verbose the output of str(item) is, that list may consume a significant amount of memory.

    Dropping the square brackets (equivalent to removing the wrapping list() call above) instead passes a temporary generator to file.writelines():

    file.writelines( "%s\n" % item for item in list )
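
To make the two forms concrete, here is a minimal sketch. It assumes Python 3 (the session further down uses Python 2.6) and writes to an in-memory io.StringIO instead of a real file. The sample data in `items` is made up for illustration.

```python
import io

items = [1, "two", 3.0]  # hypothetical sample data

# List comprehension: every line is built before writelines() sees any of them.
buf_list = io.StringIO()
buf_list.writelines(["%s\n" % item for item in items])

# Generator expression: lines are produced one at a time as writelines() iterates.
buf_gen = io.StringIO()
buf_gen.writelines("%s\n" % item for item in items)

# Both buffers end up with identical contents.
print(buf_gen.getvalue(), end="")
```

The output is the same either way. The only difference is whether all the lines exist in memory at once.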
    

    This generator creates the newline-terminated representation of each item object on demand (i.e. as it is written out). This is nice for a couple of reasons:

    • Memory overheads are small, even for very large lists
    • If str(item) is slow there's visible progress in the file as each item is processed
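
The on-demand behaviour can be observed directly. The sketch below assumes Python 3. `RecordingWriter` and `format_lines` are hypothetical helpers written for this illustration, not part of any library. They log the order in which items are formatted and written.

```python
import io

class RecordingWriter(io.TextIOBase):
    """Hypothetical file-like object that logs each write() call."""

    def __init__(self, log):
        super().__init__()
        self.log = log

    def write(self, text):
        self.log.append(("write", text))
        return len(text)

def format_lines(items, log):
    """Yield newline-terminated lines, logging when each one is formatted."""
    for item in items:
        log.append(("format", item))
        yield "%s\n" % item

# Generator passed straight through: formatting and writing interleave.
lazy_log = []
RecordingWriter(lazy_log).writelines(format_lines(range(3), lazy_log))

# Wrapped in list(): everything is formatted before the first write.
eager_log = []
RecordingWriter(eager_log).writelines(list(format_lines(range(3), eager_log)))

lazy_order = [kind for kind, _ in lazy_log]
eager_order = [kind for kind, _ in eager_log]
print(lazy_order)
print(eager_order)
```

With the generator, each "format" is immediately followed by its "write". With the list, all the "format" steps happen first.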

    Passing the generator avoids memory problems like the following, where the list version fails with a MemoryError:

    In [1]: import os
    
    In [2]: f = file(os.devnull, "w")
    
    In [3]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
    1 loops, best of 3: 385 ms per loop
    
    In [4]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
    ERROR: Internal Python error in the inspect module.
    Below is the traceback from this internal error.
    
    Traceback (most recent call last):
    ...
    MemoryError
    

    (I triggered this error by limiting Python's max. virtual memory to ~100MB with ulimit -v 102400).
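
If you would rather not cap memory with ulimit, a rough alternative is to compare peak allocations. The sketch below assumes Python 3 and uses the standard tracemalloc module, which is a different technique from the ulimit experiment above. `peak_bytes` is a hypothetical helper, and N is reduced from 2**20 only to keep the run short.

```python
import os
import tracemalloc

def peak_bytes(make_lines):
    """Return peak traced memory (bytes) while writing make_lines() to os.devnull."""
    with open(os.devnull, "w") as f:
        tracemalloc.start()
        f.writelines(make_lines())
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return peak

N = 2**16  # smaller than the 2**20 used above, to keep the sketch quick

# The lambdas defer construction so the allocation happens while tracing is on.
peak_list = peak_bytes(lambda: ["%s\n" % item for item in range(N)])
peak_gen = peak_bytes(lambda: ("%s\n" % item for item in range(N)))

print("list comprehension peak:   %d bytes" % peak_list)
print("generator expression peak: %d bytes" % peak_gen)
```

The exact figures depend on the interpreter and platform, but the list version should show a noticeably higher peak, because all the strings exist at the same time.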

    Putting memory usage to one side, this method isn't actually any faster than the original:

    In [4]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
    1 loops, best of 3: 370 ms per loop
    
    In [5]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
    1 loops, best of 3: 360 ms per loop
    

    (Python 2.6.2 on Linux)
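
To repeat the timing comparison on a current interpreter, something like the following should work. It assumes Python 3 and uses the standard timeit module in place of IPython's %timeit. N and the repeat count are arbitrary choices. The numbers will differ from those above.

```python
import os
import timeit

N = 2**14        # arbitrary size, kept small so the sketch runs quickly
REPEATS = 5      # arbitrary number of timed runs per variant

with open(os.devnull, "w") as f:
    # Generator expression: lines are produced lazily during the write.
    t_gen = timeit.timeit(
        lambda: f.writelines("%s\n" % item for item in range(N)),
        number=REPEATS,
    )
    # List comprehension: all lines are built first, then written.
    t_list = timeit.timeit(
        lambda: f.writelines(["%s\n" % item for item in range(N)]),
        number=REPEATS,
    )

print("generator: %.4f s for %d runs" % (t_gen, REPEATS))
print("list:      %.4f s for %d runs" % (t_list, REPEATS))
```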
