Unable to read huge (20GB) file from CPython

终归单人心 2021-01-22 11:58

I have a CPython issue that I cannot understand. It boils down to this: the same code reads a small text file without trouble, but it cannot even read a single line from

2 answers
  •  傲寒 (original poster)
    2021-01-22 12:52

    Although your "test" prints only one line, that does not mean it reads only one line from the file. With a \r-delimited test file, I also get only one line of output. Reading the file line by line in a for loop still prints only one line, and calling readline() a second time on a multi-line file returns no further lines.
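
If the file really is \r-delimited, one way to avoid pulling it into memory as a single giant "line" is to read it in fixed-size chunks and split on the carriage return yourself. The following is only a minimal sketch in Python 3 (the answer's own snippets are Python 2); the helper name `iter_cr_records`, the chunk size, and the in-memory sample data are assumptions made for illustration.

```python
import io


def iter_cr_records(stream, chunk_size=1 << 20):
    # Hypothetical helper: yield records separated by a carriage return
    # from a binary stream, reading chunk_size bytes at a time.
    pending = b''
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last separator is complete; the tail may
        # be a partial record that continues in the next chunk.
        *complete, pending = pending.split(b'\r')
        for record in complete:
            yield record
    if pending:
        # Final record when the data does not end with a carriage return.
        yield pending


# Assumed sample data: five records, each terminated by a carriage return.
sample = io.BytesIO(b'abcdef\r' * 5)
records = list(iter_cr_records(sample, chunk_size=4))
print(records)
```

For a real file, the stream would come from `open(path, 'rb')`, so no newline translation is applied before the split.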

    Try opening the same file in 'rU' (universal newlines) mode:

    f = open('filename', 'rU')
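
For readers on Python 3, the 'U' mode flag is not the way to get this behavior: it was deprecated and then removed in Python 3.11, and universal newlines are already the default for text mode (`newline=None`). The sketch below is an illustration under assumptions, not the answer's original test: the temporary file, its contents, and the helper names `make_cr_file` and `count_lines` are made up for the example. Passing `newline='\n'` is used here to approximate the Python 2 'r' behavior, where only \n ends a line.

```python
import os
import tempfile


def make_cr_file(n_lines=5):
    # Hypothetical helper: write n_lines of text terminated by a carriage
    # return only, in binary mode so the bytes are stored untouched.
    fd, path = tempfile.mkstemp(suffix='.txt')
    with os.fdopen(fd, 'wb') as f:
        f.write(b'abcdef\r' * n_lines)
    return path


def count_lines(path, newline):
    # Count the lines the file iterator produces for a given newline setting.
    with open(path, 'r', newline=newline) as f:
        return sum(1 for _ in f)


path = make_cr_file()
try:
    only_lf = count_lines(path, '\n')    # only \n terminates a line
    universal = count_lines(path, None)  # \r, \n and \r\n all terminate a line
finally:
    os.remove(path)

print(only_lf, universal)
```

With `newline='\n'` the whole file comes back as one line, while the default setting splits it at every carriage return.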
    

    My tests of a file with several lines of \r-delimited text give:

    f = open('test.txt','r')  # Opening the "wrong" way
    for line in f:
        print line
    

    Output:

    abcdef
    

    Then with rU:

    f = open('test.txt','rU')
    for line in f:
        print line
    

    Output:

    abcdef
    
    abcdef
    
    abcdef
    
    abcdef
    
    abcdef
    

    EDIT: In support of Joran's explanation, this test shows that the entire file is being read as a single line, and that the carriage return characters cause over-printing, which is why you see only one line of output:

    f = open('test.txt','r')     #  Opening the "wrong" way again
    for line in f:
        print "XXX{}YYY".format(line)
    

    Output gets overwritten...

    YYYdefdef
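
The over-printing can be reproduced without a terminal. The sketch below (Python 3) assumes the test file held five "abcdef" lines, each ending in \r, which matches the five lines printed in the 'rU' run; the function `render_with_carriage_returns` is a hypothetical stand-in for how a terminal treats \r, moving the cursor back to column 0 and overwriting what is already there.

```python
def render_with_carriage_returns(text):
    # Simulate one terminal row: a carriage return moves the cursor to
    # column 0, and later characters overwrite what was already printed.
    cells = []
    cursor = 0
    for ch in text:
        if ch == '\r':
            cursor = 0
        elif cursor < len(cells):
            cells[cursor] = ch
            cursor += 1
        else:
            cells.append(ch)
            cursor += 1
    return ''.join(cells)


# Assumed file contents, read back as one single "line".
line = 'abcdef\r' * 5
shown = render_with_carriage_returns('XXX{}YYY'.format(line))

print(repr(line))  # repr() makes the hidden \r characters visible
print(shown)
```

Printing `repr(line)` instead of `line` is a quick way to check for stray carriage returns when the output looks shorter than expected.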
    
