Python: rewinding one line in file when iterating with f.next()

無奈伤痛 2020-12-20 14:23

Python's f.tell() doesn't work as I expected when you iterate over a file with f.next():

>>> f = open(".bash_profile", "r")
>>> f.tell()
         


        
3 Answers
  • 2020-12-20 14:45

    Python's file iterator does a lot of buffering, so the position in the file runs far ahead of your iteration. If you want to use file.tell(), you must read "the old way", with readline(), which does not go through the iterator's read-ahead buffer:

    with open(filename) as fileob:
      line = fileob.readline()
      while line:
        # offset just past the line that was read
        print(fileob.tell())
        line = fileob.readline()
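
A minimal Python 3 sketch of the same idea, extended to actually rewind one line: it records the offset before each readline() call and seeks back to it. The helper name lines_with_offsets and the io.StringIO stand-in for a real file are assumptions for illustration, not part of the original answer. On a real text file the value returned by tell() is an opaque number, but passing it back to seek() is supported.

```python
import io


def lines_with_offsets(fileobj):
    """Yield (offset, line) pairs using readline(), so tell() stays accurate."""
    while True:
        offset = fileobj.tell()    # position of the line about to be read
        line = fileobj.readline()
        if not line:               # empty string means end of file
            break
        yield offset, line


f = io.StringIO("first\nsecond\nthird\n")  # stand-in for an open text file
pairs = lines_with_offsets(f)
offset, line = next(pairs)   # reads "first\n"
offset, line = next(pairs)   # reads "second\n"
f.seek(offset)               # rewind to the start of the line just read
reread = f.readline()        # the same line comes out again
```

Because the generator calls tell() again before every read, it keeps working after the seek and simply continues from wherever the file position is.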
    
  • 2020-12-20 14:47

    itertools.tee is probably the least-bad approach -- you can't "defeat" the buffering done by iterating on the file (nor would you want to: the performance effects would be terrible), so keeping two iterators, one "one step behind" the other, seems the soundest solution to me.

    import itertools as it
    
    with open('a.txt') as f:
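      f1, f2 = it.tee(f)
      # shift the second copy one step behind; prevline is None on the first pass
      f2 = it.chain([None], f2)
      # izip is Python 2 only; on Python 3 use the built-in zip() instead
      for thisline, prevline in it.izip(f1, f2):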
        ...
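
For readers on Python 3, the same pairing can be sketched with the built-in zip(). The function name with_previous and the io.StringIO stand-in for an open file are assumptions made for this example.

```python
import io
import itertools as it


def with_previous(iterable):
    """Return an iterator of (current, previous) pairs; previous starts as None."""
    current, previous = it.tee(iterable)
    previous = it.chain([None], previous)  # second copy lags one step behind
    return zip(current, previous)          # stops when `current` is exhausted


lines = io.StringIO("a\nb\nc\n")  # stand-in for an open file
pairs = list(with_previous(lines))
```

Since the two copies are never more than one item apart, tee only has to hold a single line in memory at a time.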
    
  • 2020-12-20 14:56

    No. I would write an adapter that forwards most calls but keeps a copy of the last line returned by next, plus a separate method that makes that line come out again on the following call.

    I would actually make the adapter wrap any iterable rather than only files, because that would be useful in plenty of other contexts.

    Alex's suggestion of using the itertools.tee adapter also works, but I think writing your own iterator adapter to handle this case in general would be cleaner.

    Here is an example:

    class rewindable_iterator(object):
        not_started = object()
    
        def __init__(self, iterator):
            self._iter = iter(iterator)
            self._use_save = False
            self._save = self.not_started
    
        def __iter__(self):
            return self
    
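        def next(self):
            if self._use_save:
                # replay the saved value instead of advancing
                self._use_save = False
            else:
                self._save = next(self._iter)
            return self._save

        __next__ = next  # Python 3 looks up __next__ instead of next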
    
        def backup(self):
            if self._use_save:
                raise RuntimeError("Tried to backup more than one step.")
            elif self._save is self.not_started:
                raise RuntimeError("Can't backup past the beginning.")
            self._use_save = True
    
    
    fiter = rewindable_iterator(open('file.txt', 'r'))
    for line in fiter:
        result = process_line(line)
        if result is DoOver:
            fiter.backup()
    

    This wouldn't be too hard to extend into something that lets you back up by more than one value.
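
One possible way to do that extension, sketched for Python 3: keep a bounded history of the items already handed out and replay them on request. The class name RewindableIterator, the history parameter, and its default of 3 are assumptions for this example, not something the answer specifies.

```python
from collections import deque


class RewindableIterator:
    """Iterator adapter that can step back by up to `history` items."""

    def __init__(self, iterable, history=3):
        self._iter = iter(iterable)
        self._past = deque(maxlen=history)  # items already returned, oldest first
        self._pending = []                  # items waiting to be replayed, next one last

    def __iter__(self):
        return self

    def __next__(self):
        if self._pending:
            item = self._pending.pop()      # replay before touching the source
        else:
            item = next(self._iter)
        self._past.append(item)
        return item

    def backup(self, steps=1):
        if steps > len(self._past):
            raise RuntimeError("Can't back up that far.")
        for _ in range(steps):
            self._pending.append(self._past.pop())


numbers = RewindableIterator(range(1, 6), history=2)
first_three = [next(numbers) for _ in range(3)]
numbers.backup(2)          # step back over the last two items
remaining = list(numbers)  # replays them, then continues with the source
```

The deque's maxlen bounds memory use: the adapter never remembers more than `history` items, and asking to go back further raises RuntimeError.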
