Does the Python “open” function save its content in memory or in a temp file?

孤街浪徒 2021-01-18 01:42

For the following Python code:

fp = open('output.txt', 'wb')
# Very big file, writes a lot of lines, n is a very large number
for i in range(1, n):
    f         


        
6 answers
  • 2021-01-18 02:00

    If you are writing out a large file and the writes might fail, you are better off flushing the file yourself at regular intervals with fp.flush(). That way the data written so far ends up in the file you chose, where you can easily get to it, rather than being at the mercy of the OS:

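    fp = open('output.txt', 'wb')
    counter = 0
    for line in many_lines:
        fp.write(line)
        counter += 1
        if counter > 999:
            fp.flush()   # hand Python's buffered data to the OS
            counter = 0  # start counting the next batch of 1000
    fp.close()
    

    This flushes the file every 1000 lines. Note that fp.flush() only hands the data to the operating system; to force it onto the physical disk as well, follow it with os.fsync(fp.fileno()).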
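
A slightly fuller sketch of the same periodic-flush idea, written for Python 3. The helper name `write_with_periodic_flush`, its `flush_every` and `sync` parameters, and the generated sample lines are assumptions made up for illustration; they are not part of the answer above.

```python
import os
import tempfile


def write_with_periodic_flush(path, lines, flush_every=1000, sync=False):
    """Write byte lines to path, flushing after every flush_every lines.

    Returns the number of flushes performed inside the loop.
    """
    flushes = 0
    # The with block closes the file (flushing one last time) even on error.
    with open(path, 'wb') as fp:
        for count, line in enumerate(lines, start=1):
            fp.write(line)
            if count % flush_every == 0:
                fp.flush()                 # Python's buffer -> operating system
                if sync:
                    os.fsync(fp.fileno())  # operating system cache -> disk
                flushes += 1
    return flushes


# Hypothetical sample data: 2500 short lines produced lazily.
many_lines = (b'line %d\n' % i for i in range(2500))
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'output.txt')
    print(write_with_periodic_flush(target, many_lines))
```

Passing `sync=True` is slower, so it is probably only worth it when losing the last batch on a crash or power failure would really hurt.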

  • 2021-01-18 02:09

    Building on ataylor's comment to the question:

    You might want to nest your loop. Something like

    for i in range(1, n):
        for each in range(n):
            fp.write('something')
    fp.close()
    

    That way, the only thing that gets put into memory is the string "something", not "something" * n.
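
To see the difference this answer describes, here is a rough sketch that compares peak memory for the two approaches using the standard `tracemalloc` module. It uses a single loop rather than the nested one above to keep the run short, and the function names (`peak_bytes`, `write_one_big_string`, `write_in_a_loop`) and the value of `n` are assumptions for illustration only.

```python
import os
import tempfile
import tracemalloc


def peak_bytes(write_func, path, n):
    """Run write_func(path, n) and return the peak traced memory in bytes."""
    tracemalloc.start()
    try:
        write_func(path, n)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def write_one_big_string(path, n):
    with open(path, 'wb') as fp:
        fp.write(b'something' * n)  # the whole payload is built in memory first


def write_in_a_loop(path, n):
    with open(path, 'wb') as fp:
        for _ in range(n):
            fp.write(b'something')  # only a small string plus the file buffer


with tempfile.TemporaryDirectory() as tmp:
    n = 200_000
    big = peak_bytes(write_one_big_string, os.path.join(tmp, 'big.txt'), n)
    small = peak_bytes(write_in_a_loop, os.path.join(tmp, 'loop.txt'), n)
    print('one big string:', big, 'bytes at peak')
    print('loop of writes:', small, 'bytes at peak')
```

Both functions produce the same file; only the peak memory differs, and the exact numbers will vary by Python version and platform.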

  • 2021-01-18 02:10

    Writing to a file should never, by itself, cause a memory error; in all probability you have a bug somewhere else.

    If you have a loop and a memory error, I would check whether you are "leaking" references to objects.
    Something like:

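    def do_something(a, b=[]):
        # The default list is created once and shared by every call,
        # so it keeps growing for as long as the program runs.
        b.append(a)
        return b
    
    fp = open('output.txt', 'wb')
    
    for i in range(1, n):
        something = do_something(i)
        fp.write(b'%d\n' % something[-1])  # write only the newest item
    
    fp.close()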
    

    This is only an example; in your actual code the reference leak may be much harder to find. Here the memory leaks inside do_something because Python evaluates a default argument once, when the function is defined, so the same list b is reused and keeps growing across calls.
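
A small sketch that makes the shared default visible and shows the usual workaround, a `None` sentinel. The names `leaky` and `fixed` are made up for this illustration.

```python
def leaky(a, b=[]):
    # b is created once, when the function is defined, and reused on every call.
    b.append(a)
    return b


def fixed(a, b=None):
    # A None sentinel gives each call its own fresh list.
    if b is None:
        b = []
    b.append(a)
    return b


for i in range(3):
    leaky(i)
    fixed(i)

print(leaky.__defaults__)  # ([0, 1, 2],)  the shared list holds every value passed
print(fixed.__defaults__)  # (None,)       nothing accumulates between calls
```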

  • 2021-01-18 02:15

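    It's held in memory, first in Python's own file buffer and then in the operating system's disk cache, until it is flushed to disk. That happens either implicitly, due to timing or space pressure, or explicitly: fp.flush() pushes Python's buffer to the OS, and os.fsync(fp.fileno()) asks the OS to write its cache to disk.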
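
The buffering can be observed directly. The sketch below assumes CPython with the default buffering for a binary file, where a small write stays in Python's buffer until it is flushed; the temporary path and the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'output.txt')
    fp = open(path, 'wb')      # default buffering for a binary file
    fp.write(b'hello')         # small write: stays in Python's buffer
    size_before_flush = os.path.getsize(path)
    fp.flush()                 # Python's buffer -> operating system
    size_after_flush = os.path.getsize(path)
    os.fsync(fp.fileno())      # ask the OS to write its cache to disk
    fp.close()

print(size_before_flush, size_after_flush)
```

Under those assumptions the file typically reports a size of 0 before the flush and 5 after it, even though nothing has been forced onto the physical disk until `os.fsync()` runs.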

  • 2021-01-18 02:21

    The Linux kernel buffers writes, but at (ir)regular intervals those buffers are flushed to disk. Running out of such buffer space should never cause an application-level memory error; the buffers should be emptied before that happens, pausing the application while they drain.

  • 2021-01-18 02:21

    If you write line by line, it should not be a problem. You should show the code that runs before the write. For a start, you can try deleting objects you no longer need, calling fp.flush(), and so on.
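
One way to avoid holding unneeded objects in the first place is to produce the lines lazily with a generator, so that only one line exists at a time. This is a minimal sketch; `generate_lines`, `write_lines`, and the line contents are hypothetical stand-ins for whatever the real code does before the write.

```python
import os
import tempfile


def generate_lines(n):
    """Yield n byte lines one at a time instead of building a list."""
    for i in range(n):
        yield b'line %d\n' % i


def write_lines(path, lines):
    """Write an iterable of byte lines and return how many were written."""
    written = 0
    with open(path, 'wb') as fp:
        for line in lines:
            fp.write(line)  # each line can be freed once it is written
            written += 1
    return written


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'output.txt')
    print(write_lines(target, generate_lines(100_000)))
```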
