Is this the cleanest way to write a list to a file, since writelines()
doesn\'t insert newline characters?
file.writelines([\"%s\\n\" % item fo
I thought it would be interesting to explore the benefits of using a genexp, so here's my take.
The example in the question uses square brackets to create a temporary list, and so is equivalent to:
file.writelines( list( "%s\n" % item for item in list ) )
Which needlessly constructs a temporary list of all the lines that will be written out, this may consume significant amounts of memory depending on the size of your list and how verbose the output of str(item)
is.
Drop the square brackets (equivalent to removing the wrapping list()
call above) will instead pass a temporary generator to file.writelines()
:
file.writelines( "%s\n" % item for item in list )
This generator will create newline-terminated representation of your item
objects on-demand (i.e. as they are written out). This is nice for a couple of reasons:
str(item)
is slow there's visible progress in the file as each item is processedThis avoids memory issues, such as:
In [1]: import os
In [2]: f = file(os.devnull, "w")
In [3]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
1 loops, best of 3: 385 ms per loop
In [4]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
ERROR: Internal Python error in the inspect module.
Below is the traceback from this internal error.
Traceback (most recent call last):
...
MemoryError
(I triggered this error by limiting Python's max. virtual memory to ~100MB with ulimit -v 102400
).
Putting memory usage to one side, this method isn't actually any faster than the original:
In [4]: %timeit f.writelines( "%s\n" % item for item in xrange(2**20) )
1 loops, best of 3: 370 ms per loop
In [5]: %timeit f.writelines( ["%s\n" % item for item in xrange(2**20)] )
1 loops, best of 3: 360 ms per loop
(Python 2.6.2 on Linux)