How to cleanly loop over two files in parallel in Python

南笙 2020-12-17 19:58

I frequently write code like:

lines = open('wordprob.txt','r').readlines()
words = open('StdWord.txt','r').readlines()
i = 0
for line in lines:
    v         


        
4 Answers
  • 2020-12-17 20:05

    In general enumerate is a good solution. In this case, you could do something like:

    lines = open('wordprob.txt','r').readlines()
    words = open('StdWord.txt','r').readlines()
    for word, line in zip(words, lines):
        v = [eval(s) for s in line.split()]
        if v[0] > v[1]:
            print word.strip(),
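
One caveat with zip(): it stops at the shorter input, so a missing line in either file goes unnoticed. Below is a minimal Python 3 sketch of one way to detect that with itertools.zip_longest. The helper name pair_strict and the sample data are made up for illustration.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that cannot appear in real input


def pair_strict(words, lines):
    """Yield (word, line) pairs; raise ValueError if the inputs differ in length."""
    for word, line in zip_longest(words, lines, fillvalue=_MISSING):
        if word is _MISSING or line is _MISSING:
            raise ValueError("inputs have different lengths")
        yield word, line


words = ["apple\n", "pear\n", "plum\n"]
lines = ["0.9 0.1\n", "0.2 0.8\n"]

# Plain zip() silently drops the unmatched third word.
truncated = list(zip(words, lines))

# pair_strict() reports the mismatch instead.
try:
    list(pair_strict(words, lines))
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

On Python 3.10 and later, zip(words, lines, strict=True) performs the same length check without a helper.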
    
  • 2020-12-17 20:13

    It looks like you don't care what the value of i is. You are just using it as a way to pair up the lines and the words. Therefore, I recommend that you read one line at a time and, at the same time, read one word. Then they will match.

    Also, when you use .readlines(), you read the entire input into memory at once. For large inputs, this can use a lot of memory and delay the start of processing. For this simple code, one line at a time is all you need. The file object returned by open() is an iterator that yields one line at a time.
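
To illustrate the point about iteration, here is a small Python 3 sketch. It writes a throwaway temporary file (the contents are made up) and reads it back one line at a time without calling readlines().

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("0.9 0.1\n0.2 0.8\n0.7 0.3\n")
    path = tmp.name

seen = []
with open(path, "rt") as f:
    first = next(f)      # pulls exactly one line from the file
    for line in f:       # the loop continues with the remaining lines
        seen.append(line)

os.remove(path)
```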

    If you can, you should avoid eval(). In a simple exercise where you know what the input data will be, it is fairly safe, but eval() executes whatever text it is given, so data from outside sources could run arbitrary code on your computer. I will write my example code on the assumption that you are using eval() to turn text into a float value. float() works on an integer string too: float('3') returns 3.0.
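
A short Python 3 sketch of the difference, using made-up input strings: float() accepts numeric text only, so anything else fails with a ValueError instead of being executed. The helper name parse_number is hypothetical.

```python
def parse_number(text):
    """Convert numeric text to float; raises ValueError for anything else."""
    return float(text)


three = parse_number("3")        # integer text is accepted
small = parse_number(" 1e-3 ")   # surrounding whitespace and exponents are fine

# An expression that eval() would execute is simply rejected by float().
try:
    parse_number("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```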

    Also, it appears that each input line is expected to have exactly two values. If a line ever has extra values, your code will not detect it. We can change the code to explicitly unpack two values from the split line; then, if a line has more or fewer than two values, Python will raise an exception (a ValueError). Plus, the code will be slightly nicer to read.
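
The unpacking check can be sketched as follows in Python 3. The helper name parse_pair and the sample lines are assumptions for illustration.

```python
def parse_pair(line):
    """Return the two floats on a line; raise ValueError if there are not exactly two."""
    a, b = (float(s) for s in line.split())
    return a, b


good = parse_pair("0.9 0.1\n")

# Too many values, too few values, and an empty line all fail the unpacking.
errors = []
for bad_line in ("0.9 0.1 0.5\n", "0.9\n", "\n"):
    try:
        parse_pair(bad_line)
    except ValueError as exc:
        errors.append(str(exc))
```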

    So here is my suggested rewrite of this example:

    lines = open('wordprob.txt','rt')
    words = open('StdWord.txt','rt')
    
    for line in lines:
        word = words.next().strip()  # in Python 3: word = next(words).strip()
        a, b = [float(s) for s in line.split()]
        if a > b:
            print word,  # in Python 3: print(word + ' ', end='')
    

    EDIT: And here is the same solution, but using izip().

    import itertools
    lines = open('wordprob.txt','rt')
    words = open('StdWord.txt','rt')
    
    # in Python 3, just use zip() instead of izip()
    for line, word in itertools.izip(lines, words):
        word = word.strip()
        a, b = [float(s) for s in line.split()]
        if a > b:
            print word,  # in Python 3: print(word + ' ', end='')
    

    In Python 3, the built-in zip() returns an iterator, so you can use it directly and do not need to import itertools.
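
A small Python 3 sketch of what "returns an iterator" means in practice: zip() reads nothing until a pair is requested. The generator numbered and its log list are made up for this demonstration.

```python
def numbered(label, log):
    """Yield three labelled items, recording each one as it is produced."""
    for i in range(3):
        log.append(f"{label}{i}")
        yield f"{label}{i}"


log = []
pairs = zip(numbered("a", log), numbered("b", log))

produced_before = list(log)   # nothing has been read yet
first_pair = next(pairs)      # reading one pair pulls one item from each input
produced_after = list(log)
```

With two open files as inputs, this means only one line from each file is read per loop iteration.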

    EDIT: It is best practice to use a with statement to make sure the files are properly closed, no matter what. In Python 2.7 and 3.1 or later, a single with statement can manage several files at once, and I'll do that in my solution. Also, we can unpack a generator expression just as easily as we can unpack a list, so I've changed the line that sets a, b to use a generator expression; that avoids building a temporary list. And we don't need to strip word unless we are going to print it. Put the changes together to get:

    from itertools import izip
    
    with open('wordprob.txt','rt') as lines, open('StdWord.txt','rt') as words:
        # in Python 3, just use zip() instead of izip()
        for line, word in izip(lines, words):
            a, b = (float(s) for s in line.split())
            if a > b:
                print word.strip(),  # in Python 3: print(word.strip() + ' ', end='')
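
For reference, here is a Python 3 sketch of the same approach wrapped in a reusable function. The function name words_above and the sample file contents are assumptions; the file names wordprob.txt and StdWord.txt come from the question.

```python
import os
import tempfile


def words_above(prob_path, word_path):
    """Yield each word whose line in prob_path has a first value greater than the second."""
    with open(prob_path, "rt") as lines, open(word_path, "rt") as words:
        for line, word in zip(lines, words):
            a, b = (float(s) for s in line.split())
            if a > b:
                yield word.strip()


# Build two small sample files in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    prob_path = os.path.join(tmpdir, "wordprob.txt")
    word_path = os.path.join(tmpdir, "StdWord.txt")
    with open(prob_path, "w") as f:
        f.write("0.9 0.1\n0.2 0.8\n0.7 0.3\n")
    with open(word_path, "w") as f:
        f.write("apple\npear\nplum\n")

    selected = list(words_above(prob_path, word_path))

print(" ".join(selected))
```

Returning the words from a generator instead of printing inside the loop keeps the pairing logic separate from the output format.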
    
  • 2020-12-17 20:27

    Take a look at enumerate:

    >>> for i, season in enumerate(['Spring', 'Summer', 'Fall', 'Winter']):
    ...     print i, season
    0 Spring
    1 Summer
    2 Fall
    3 Winter
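
When the index is actually needed, for example to report which line is malformed, enumerate can wrap zip. This is a Python 3 sketch with made-up sample data, not the asker's real files.

```python
words = ["apple\n", "pear\n", "plum\n"]
lines = ["0.9 0.1\n", "0.2 0.8\n", "oops\n"]

selected = []
problems = []
# start=1 makes the counter match human-friendly line numbers.
for lineno, (word, line) in enumerate(zip(words, lines), start=1):
    try:
        a, b = (float(s) for s in line.split())
    except ValueError:
        problems.append(lineno)   # remember which line could not be parsed
        continue
    if a > b:
        selected.append(word.strip())
```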
    
  • 2020-12-17 20:28

    You can try enumerate; see the looping techniques section of the Python tutorial:

    http://docs.python.org/tutorial/datastructures.html#looping-techniques

    lines = open('wordprob.txt','r').readlines()
    words = open('StdWord.txt','r').readlines()
    for i, line in enumerate(lines):
        v = [eval(s) for s in line.split()]
        if v[0] > v[1]:
            print words[i].strip()
    