Python string replace in a file without touching the file if no substitution was made

暗喜 2021-01-05 22:56

What does Python's string.replace return if no string substitution was made? Does Python's file.open(f, 'w') always touch the file even if no changes were made?

3 answers
  •  走了就别回头了
    2021-01-05 23:23

    Yours is a special case: 'newtext' has exactly the same number of characters as 'oldtext'.

    Because the lengths match, the replacement can be written in place. Either of the following snippets overwrites each occurrence of 'oldtext' (or each line containing it) with 'newtext' directly inside the file, without rewriting the rest of the file.
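
    To see why the equal length matters, here is a minimal sketch. The file name demo.html and its content are made up for illustration. Overwriting bytes in place leaves every other byte where it was, so the file size does not change.

```python
import os
import tempfile

# Hypothetical sample file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'demo.html')
with open(path, 'wb') as f:
    f.write(b'<p>oldtext and more</p>')
size_before = os.path.getsize(path)

with open(path, 'rb+') as f:
    pos = f.read().find(b'oldtext')   # offset of the first occurrence, -1 if absent
    if pos >= 0:
        f.seek(pos)                   # jump to the occurrence
        f.write(b'newtext')           # overwrite exactly len(b'oldtext') bytes

with open(path, 'rb') as f:
    result = f.read()
size_after = os.path.getsize(path)
```

    If 'newtext' were longer or shorter than 'oldtext', the same write would clobber or leave behind neighbouring bytes, which is why this shortcut only fits the equal-length case.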

    If the files are not very large, the content of each file can be read entirely into memory:

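    from os import fsync      # code using find()
    
    # all_files() is not defined here; it is assumed to be the helper from the question
    count = 0                 # number of files in which a replacement was made
    for match in all_files('*.html', '.'):
        with open(match,'rb+') as thefile:
            diag = False
            fno = thefile.fileno()
            content = thefile.read()       # bytes, since the file is opened in binary mode
            thefile.seek(0,0)
            x = content.find(b'oldtext')
            while x>=0:
                diag = True
                thefile.seek(x,1)          # x is relative to the current position
                thefile.write(b'newtext')  # same length, so nothing after it shifts
                thefile.flush()
                fsync(fno)
                x = content[thefile.tell():].find(b'oldtext')
        if diag:
            count += 1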
    

    or

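    from os import fsync     # code using a regex
    import re
    pat = re.compile(b'oldtext')   # bytes pattern, because the files are opened in binary mode
    
    # all_files() is not defined here; it is assumed to be the helper from the question
    count = 0                      # number of files in which a replacement was made
    for match in all_files('*.html', '.'):
        with open(match,'rb+') as thefile:
            diag = False
            fno = thefile.fileno()
            content = thefile.read()
            thefile.seek(0,0)
            prec = 0                                # end of the previous match
            for mat in pat.finditer(content):
                diag = True
                thefile.seek(mat.start()-prec,1)    # move relative to the current position
                thefile.write(b'newtext')
                thefile.flush()
                fsync(fno)
                prec = mat.end()
        if diag:
            count += 1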
    

    For large files, reading and rewriting line by line is possible:

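    from os import fsync   # code for big files, using regex
    import re
    pat = re.compile(b'oldtext')   # bytes pattern, because the files are opened in binary mode
    
    # all_files() is not defined here; it is assumed to be the helper from the question
    count = 0                      # number of files in which a replacement was made
    for match in all_files('*.html', '.'):
        with open(match,'rb+') as thefile:
            diag = False
            fno = thefile.fileno()
            line = thefile.readline()
            while line:
                if b'oldtext' in line:
                    diag = True
                    thefile.seek(-len(line),1)                # back to the start of this line
                    thefile.write(pat.sub(b'newtext',line))   # rewrite it, same length
                    thefile.flush()
                    fsync(fno)
                line = thefile.readline()
        if diag:
            count += 1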
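
    The snippets above call all_files(pattern, root), which is not defined in this answer and presumably comes from the question. Its exact behaviour is therefore an assumption. A minimal stand-in, assuming it yields the paths of all files under root whose names match a glob pattern, could look like this:

```python
import fnmatch
import os

def all_files(pattern, root):
    """Hypothetical helper: yield paths under root (recursively)
    whose file name matches the glob pattern."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)
```

    With such a helper in place, all_files('*.html', '.') walks the current directory and its subdirectories and yields every .html file it finds.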
    

    The calls thefile.flush() and fsync(fno) are needed after each write so that the file object thefile always points at the exact position in the file. They make sure that each write() is actually carried out at the moment it is issued.

    flush() does not necessarily write the file’s data to disk. Use flush() followed by os.fsync() to ensure this behavior. http://docs.python.org/library/stdtypes.html#file.flush
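
    The two steps named in that quote can be sketched as follows. The helper name write_through and the file demo.bin are made up for illustration.

```python
import os
import tempfile

def write_through(fileobj, data):
    """Write data, then push it through both buffering layers."""
    fileobj.write(data)
    fileobj.flush()              # Python's buffer -> operating system
    os.fsync(fileobj.fileno())   # operating system cache -> storage device

path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
with open(path, 'wb') as f:
    write_through(f, b'newtext')
    # The data can be read through a second handle before f is closed.
    with open(path, 'rb') as g:
        seen = g.read()
```

    Calling fsync after every small write is costly, so whether it is worth doing on each replacement depends on how much durability matters for the files in question.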

    These programs do the minimum amount of work, so I expect them to be fast.

    Nota bene: a file opened in mode 'rb+' keeps its last-modification time unchanged if nothing is written to it.
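
    This can be checked with a small experiment, which also contrasts mode 'rb+' with mode 'w' from the question. The file name and the timestamp below are arbitrary values chosen for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'demo.html')
with open(path, 'wb') as f:
    f.write(b'some content')

# Set a recognisable, old modification time (arbitrary value for the demo).
old_time = 1000000000
os.utime(path, (old_time, old_time))

with open(path, 'rb+') as f:    # open for update, but only read
    f.read()
mtime_after_rb_plus = os.path.getmtime(path)

with open(path, 'w') as f:      # mode 'w' truncates the file on open
    pass
mtime_after_w = os.path.getmtime(path)
```

    After the 'rb+' pass the modification time should still equal old_time. Opening with 'w' truncates the file even when nothing is written afterwards, so the modification time changes and the previous content is lost.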
