Is there a difference between “file.readlines()”, “list(file)” and “file.read().splitlines(True)”?

北海茫月 2021-02-20 02:03

What is the difference between:

with open("file.txt", "r") as f:
    data = list(f)

Or:

with open("file.txt", "r") as

5 answers
  •  一生所求
    2021-02-20 02:46

    TL;DR

    Assuming you need the lines in a list so you can manipulate them afterwards, your three proposed solutions are all syntactically valid. There is no better (or more pythonic) solution, especially since they are all recommended by the official Python documentation. So, choose the one you find the most readable and be consistent with it throughout your code. If performance is a deciding factor, see my timeit analysis below.


    Here is the timeit comparison (10000 loops, about 20 lines in test.txt):

    import timeit
    
    def foo():
        with open("test.txt", "r") as f:
            data = list(f)
    
    def foo1():
        with open("test.txt", "r") as f:
            data = f.read().splitlines(True)
    
    def foo2():
        with open("test.txt", "r") as f:
            data = f.readlines()
    
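    # Total time in seconds for 10000 calls of each function.
    print(timeit.timeit(stmt=foo, number=10000))
    print(timeit.timeit(stmt=foo1, number=10000))
    print(timeit.timeit(stmt=foo2, number=10000))

    # Output:
    # 1.6370758459997887   (foo:  list(f))
    # 1.410844805999659    (foo1: f.read().splitlines(True))
    # 1.8176437409965729   (foo2: f.readlines())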
    

    I tried it with different numbers of loops and lines, and f.read().splitlines(True) always seemed to perform a bit better than the other two.
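
    Single timeit runs can be noisy, so here is a minimal sketch of a slightly more robust comparison. It assumes a throwaway temporary file of 20 short lines standing in for test.txt; the function names and the repeat counts are made up for illustration. The numbers depend on your machine and file, so treat them as indicative only.

```python
import os
import tempfile
import timeit

# Assumption: a temporary file with 20 short lines stands in for test.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.writelines(f"line {i}\n" for i in range(20))

def read_list():
    with open(path, "r") as f:
        return list(f)

def read_splitlines():
    with open(path, "r") as f:
        return f.read().splitlines(True)

def read_readlines():
    with open(path, "r") as f:
        return f.readlines()

# Repeat each measurement and keep the minimum, which is the run
# least disturbed by other activity on the machine.
results = {}
for fn in (read_list, read_splitlines, read_readlines):
    results[fn.__name__] = min(timeit.repeat(fn, number=1000, repeat=5))

for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} s per 1000 calls")

os.remove(path)  # clean up the temporary file
```

    Taking the minimum of several repeats is the approach the timeit documentation suggests for comparing snippets; the differences between the three variants are usually small for files of this size.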

    Now, syntactically speaking, all of your examples seem to be valid. Refer to this documentation for more information.

    According to it, if your goal is to read lines from a file, you can loop over the file object:

    for line in f:
        ...
    

    The documentation states that this is memory efficient, fast, and leads to simple code. It would be another good alternative in your case if you don't need the lines in a list.
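
    As a small illustration of looping over the file object without building a list, here is a sketch that counts the non-empty lines and tracks the longest one. The temporary file and the variable names are assumptions made up for the example.

```python
import os
import tempfile

# Assumption: a small temporary file stands in for your real input.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("alpha\n\nbeta\ngamma\n")

non_empty = 0
longest = ""
with open(path, "r") as f:
    for line in f:                  # only one line is held in memory at a time
        text = line.rstrip("\n")    # drop the trailing newline for the checks
        if text:
            non_empty += 1
            if len(text) > len(longest):
                longest = text

print(non_empty, longest)
os.remove(path)  # clean up the temporary file
```

    Nothing here needs random access to the lines, so there is no reason to materialise them with list(f) or f.readlines() first.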

    EDIT

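    Note that the True you pass to splitlines is its keepends argument, which defaults to False. With True, the line endings are kept, which matches what list(f) and f.readlines() return; without it, they are stripped. Also note that str.splitlines splits on more line boundaries than file iteration does (for example form feed, "\x0c"), so the results can differ for files containing such characters.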
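
    To see what splitlines does with and without its keepends argument, and how it compares with list(f) and f.readlines(), here is a quick check. The file content is made up for the example and deliberately contains a form feed ("\x0c"), which str.splitlines treats as a line boundary while file iteration does not.

```python
import os
import tempfile

# Assumption: a temporary file with a form feed in the middle of line two.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", newline="") as f:   # newline="" writes "\n" unchanged
    f.write("one\ntwo\x0cstill two\nthree")

with open(path, "r") as f:
    as_list = list(f)
with open(path, "r") as f:
    as_readlines = f.readlines()
with open(path, "r") as f:
    text = f.read()
os.remove(path)  # clean up the temporary file

print(as_list == as_readlines)   # True: both split on newlines only
print(as_list)                   # ['one\n', 'two\x0cstill two\n', 'three']
print(text.splitlines())         # ['one', 'two', 'still two', 'three']
print(text.splitlines(True))     # ['one\n', 'two\x0c', 'still two\n', 'three']
```

    For an ordinary text file whose only line separators are newlines, f.read().splitlines(True) gives the same list as the other two; the difference only shows up with the extra boundary characters.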

    My personal recommendation

    I don't want to make this answer too opinion-based, but I think it is worth saying that performance should not be your deciding factor until it is actually a problem for you, especially since all three forms are allowed and recommended in the official Python documentation I linked.

    So, my advice is:

    First, pick the most logical one for your particular case; then, if several fit equally well, choose the one you find the most readable and be consistent with it throughout your code.
