Fastest Way to Delete a Line from a Large File in Python

轮回少年 2020-11-30 03:58

I am working with a very large (~11GB) text file on a Linux system. I am running it through a program that checks the file for errors. Once an error is found, I need to delete the offending line from the file as quickly as possible.

9 Answers
  • 2020-11-30 04:51

    Here are two alternatives, depending on how the line to delete is identified (by line number or by a search string):

    Line number

    def removeLine2(filename, lineNumber):
        # Rewrite the file in place, dropping the line at the given 0-based index.
        with open(filename, 'r+') as outputFile, open(filename, 'r') as inputFile:
            # Advance the reader to the start of the line to delete.
            currentLineNumber = 0
            while currentLineNumber < lineNumber:
                inputFile.readline()
                currentLineNumber += 1

            # Position the writer where the doomed line begins.
            seekPosition = inputFile.tell()
            outputFile.seek(seekPosition, 0)

            # Skip the line to delete, then copy everything after it back over it.
            inputFile.readline()
            currentLine = inputFile.readline()
            while currentLine:
                outputFile.write(currentLine)
                currentLine = inputFile.readline()

            # Cut off the now-duplicated tail of the file.
            outputFile.truncate()

    String

    def removeLine(filename, key):
        # Rewrite the file in place, dropping the first line whose content starts
        # with the key wrapped in double quotes, e.g. "someKey": ...
        with open(filename, 'r+') as outputFile, open(filename, 'r') as inputFile:
            # Scan for the line to delete, remembering where it starts.
            seekPosition = 0
            currentLine = inputFile.readline()
            while currentLine and not currentLine.strip().startswith('"%s"' % key):
                seekPosition = inputFile.tell()
                currentLine = inputFile.readline()

            # Position the writer where the matched line begins (if the key was
            # never found, this is end of file and the file is left unchanged).
            outputFile.seek(seekPosition, 0)

            # Copy everything after the matched line back over it.
            currentLine = inputFile.readline()
            while currentLine:
                outputFile.write(currentLine)
                currentLine = inputFile.readline()

            # Cut off the now-duplicated tail of the file.
            outputFile.truncate()
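
    For illustration, a hypothetical call of the two helpers; the file name and the arguments below are placeholders, not part of the original answer:

    removeLine2('data.txt', 4)         # drop the fifth line (0-based index 4)
    removeLine('data.txt', 'someKey')  # drop the first line starting with "someKey" (with the quotes)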
    
  • 2020-11-30 04:56

    Update: a solution using sed, as requested by the poster in a comment.

    For example, to delete the second line of the file:

    sed '2d' input.txt
    

    Use the -i switch to edit the file in place. Warning: this is a destructive operation. Read the command's documentation for how to have it create a backup automatically.
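
    With GNU sed, for example, running "sed -i.bak '2d' input.txt" edits input.txt in place and keeps a copy of the original as input.txt.bak; the .bak suffix here is just an example.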

  • 2020-11-30 04:57

    I think a somewhat similar, if not identical, question has been asked here before. Reading (and writing) line by line is slow, but you can read a bigger chunk into memory at once, go through it line by line while skipping the lines you don't want, and then write the result as a single chunk to a new file. Repeat until done, and finally replace the original file with the new file.

    The thing to watch out for is that when you read a chunk, the last line you get may be incomplete; you need to hold on to it and prepend it to the next chunk you read.
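
    A minimal sketch of that approach, assuming the line to drop is identified by its 0-based index; the function name, the temporary-file naming, and the 1 MiB buffer size are illustrative choices, not from the original answer:

    import os

    def remove_line_chunked(path, line_to_skip, chunk_size=1024 * 1024):
        # Stream the file into a temporary copy in large chunks, dropping the
        # line with the given 0-based index, then swap the copy into place.
        tmp_path = path + '.tmp'
        leftover = ''
        line_no = 0
        with open(path, 'r') as infile, open(tmp_path, 'w') as outfile:
            while True:
                chunk = infile.read(chunk_size)
                if not chunk:
                    break
                lines = (leftover + chunk).split('\n')
                leftover = lines.pop()  # the last piece may be a partial line
                for line in lines:
                    if line_no != line_to_skip:
                        outfile.write(line + '\n')
                    line_no += 1
            if leftover:  # the file did not end with a newline
                if line_no != line_to_skip:
                    outfile.write(leftover)
        os.replace(tmp_path, path)  # replace the original with the new file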
