Reading a Line From File Without Advancing [Pythonic Approach]

日久生厌 2021-01-01 10:24

What's a Pythonic approach for reading a line from a file without advancing your position in the file?

For example, if you have a file of

cat1
cat2         


        
5 Answers
  • 2021-01-01 10:51

    Manually doing it is not that hard:

    f = open('file.txt', 'rb')  # binary mode: Python 3 only allows relative seeks on binary files
    line = f.readline()
    print(line)
    # b'cat1\n'
    # readline() keeps the trailing \n, so len(line) is the number of bytes just read
    # the second parameter (whence=1) makes the offset relative to the current position
    f.seek(-len(line), 1)
    line = f.readline()
    print(line)
    # b'cat1\n'
    

    You can wrap this in a method like this:

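    def lookahead_line(file):
        # file must be opened in binary mode ('rb') for the relative seek to work on Python 3
        line = file.readline()
        file.seek(-len(line), 1)  # step back over exactly what was read
        return file, line

    And use it like this:

    f = open('file.txt', 'rb')
    f, line = lookahead_line(f)
    print(line)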
    

    Hope this helps!
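
To see how far readline() really moves the file position, one can compare tell() before and after the call. The sketch below assumes Python 3 and a throwaway temporary file (its name and contents are made up for illustration). It opens the file in binary mode because Python 3 only permits nonzero relative seeks on binary files.

```python
import os
import tempfile

# Hypothetical sample file holding the two lines from the question
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"cat1\ncat2\n")
    path = tmp.name

with open(path, "rb") as f:
    start = f.tell()
    line = f.readline()
    consumed = f.tell() - start     # bytes readline() actually advanced
    f.seek(-consumed, os.SEEK_CUR)  # step back by exactly that amount
    again = f.readline()            # reads the same line a second time

os.remove(path)
print(line, consumed, again)
```

In binary mode the distance moved equals len(line), newline included, so no extra byte needs to be added to the offset.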

  • 2021-01-01 11:05

    Solutions based on tell()/seek() will not work with stdin and other non-seekable iterators. A more generic implementation can be as simple as this:

    class lookahead_iterator(object):
        __slots__ = ["_buffer", "_iterator", "_next"]
        def __init__(self, iterable):
            self._buffer = []  # items that were peeked at but not yet consumed
            self._iterator = iter(iterable)
            self._next = self._iterator.__next__  # Python 3 name (Python 2: .next)
        def __iter__(self):
            return self
        def _next_peeked(self):
            # Serve buffered items first, then switch back to the raw iterator
            v = self._buffer.pop(0)
            if not self._buffer:
                self._next = self._iterator.__next__
            return v
        def __next__(self):
            return self._next()
        def peek(self):
            # Each call looks one item further ahead; peeked items are replayed by next()
            v = next(self._iterator)
            self._buffer.append(v)
            self._next = self._next_peeked
            return v
    

    Usage:

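    def parse_source(path="source.txt"):
        # Wrapped in a function so that the return statements are valid;
        # parse_bash, parse_c and parse_cpp are placeholders for your own parsers
        with open(path, "r") as lines:
            lines = lookahead_iterator(lines)
            magic = lines.peek()
            if magic.startswith("#"):
                return parse_bash(lines)
            if magic.startswith("/*"):
                return parse_c(lines)
            if magic.startswith("//"):
                return parse_cpp(lines)
            raise ValueError("Unrecognized file")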
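
For a one-off look at the first line of a stream that cannot seek, a lighter alternative is to consume the line and then chain it back in front of the remaining lines with itertools.chain. This is a sketch rather than a replacement for the class above; peek_first is a hypothetical helper name, and the generator stands in for something like sys.stdin.

```python
from itertools import chain

def peek_first(iterable):
    """Return (first_item, iterator) where the iterator still yields first_item."""
    it = iter(iterable)
    first = next(it)  # raises StopIteration if the input is empty
    return first, chain([first], it)

def fake_stream():
    # Stand-in for a non-seekable source such as sys.stdin
    yield "#!/bin/bash\n"
    yield "echo hi\n"

magic, lines = peek_first(fake_stream())
rest = list(lines)  # the peeked line is still the first item
print(magic.startswith("#"), rest)
```

The caller has to keep using the returned iterator, since the original one has already moved past the first line.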
    
  • 2021-01-01 11:09

    As far as I know, there's no builtin functionality for this, but such a function is easy to write, since most Python file objects support seek and tell methods for jumping around within a file. So, the process is very simple:

    • Find the current position within the file using tell.
    • Perform a read (or write) operation of some kind.
    • seek back to the previous file pointer.

    This allows you to do nice things like read a chunk of data from the file, analyze it, and then potentially overwrite it with different data. A simple wrapper for the functionality might look like:

    def peek_line(f):
        pos = f.tell()
        line = f.readline()
        f.seek(pos)
        return line
    
    print(peek_line(f)) # cat1
    print(peek_line(f)) # cat1
    

    You could implement the same thing for other read methods just as easily. For instance, implementing the same thing for file.read:

    def peek(f, length=1):
        pos = f.tell()
        try:
            data = f.read(length)
        finally:
            f.seek(pos)  # restore the position even if the read fails
        return data

    print(peek(f, 4)) # cat1
    print(peek(f, 4)) # cat1
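
Since every variant repeats the same tell/seek bookkeeping, it can be factored into a context manager. A minimal sketch, assuming a seekable file object; restore_position is a hypothetical name, and io.StringIO is used only so the example runs without a file on disk.

```python
import io
from contextlib import contextmanager

@contextmanager
def restore_position(f):
    """Remember f's position on entry and seek back to it on exit."""
    pos = f.tell()
    try:
        yield f
    finally:
        f.seek(pos)  # runs even if the body raises

f = io.StringIO("cat1\ncat2\n")  # in-memory stand-in for an open file
with restore_position(f):
    line = f.readline()   # peek at a whole line
with restore_position(f):
    chunk = f.read(4)     # peek at a fixed-size chunk
after = f.readline()      # a normal read still starts at the first line
print(repr(line), repr(chunk), repr(after))
```

Any read method can be used inside the with block, so separate peek_line and peek wrappers are no longer needed.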
    
  • 2021-01-01 11:10

    The more_itertools library offers a peekable class that allows you to peek() ahead without advancing an iterable.

    import more_itertools as mit

    with open("file.txt", "r") as f:
        p = mit.peekable(f.readlines())

    p.peek()
    # 'cat1\n'

    next(p)
    # 'cat1\n'
    

    peek() shows the next line without consuming it, so the following call to next() still returns that same line and advances the iterable p. Calling peek() again now shows the line after it.

    p.peek()
    # 'cat2\n'
    

    See also the more_itertools docs, as peekable allows you to prepend() items to an iterable before advancing as well.

  • 2021-01-01 11:16

    You could wrap the file with itertools.tee and get back two independent iterators, bearing in mind the caveats stated in the documentation.

    For example

    from itertools import tee
    import contextlib
    from io import StringIO  # Python 3 location; Python 2 used the StringIO module
    s = '''\
    cat1
    cat2
    cat3
    '''

    with contextlib.closing(StringIO(s)) as f:
        handle1, handle2 = tee(f)
        print(next(handle1))  # cat1
        print(next(handle2))  # cat1
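
One way to turn this into a peek is to advance only one of the two tee copies and keep reading from the other. A minimal sketch; peek_with_tee is a hypothetical helper, and StringIO stands in for a real file.

```python
from io import StringIO
from itertools import tee

def peek_with_tee(iterable):
    """Return (first_item, iterator) using tee; the iterator starts at first_item."""
    main, probe = tee(iterable)
    first = next(probe)  # advances only the probe copy
    return first, main

f = StringIO("cat1\ncat2\ncat3\n")
first, lines = peek_with_tee(f)
remaining = list(lines)  # still begins with the peeked line
print(repr(first), remaining)
```

The underlying file object is itself advanced by the peek, so later reads should go through the returned iterator rather than through f directly.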
    