Remove lines that contain certain string

I'm trying to read text from a text file, go through its lines, and delete the lines that contain specific strings (in this case 'bad' and 'naughty'). The code I wrote goes like this:

9 Answers

野趣味

2020-11-30 01:10

```
to_skip = ("bad", "naughty")

with open("testin", "r") as handle, open("testout", "w") as out_handle:
    for line in handle:
        # split() with no argument also strips the trailing newline,
        # so a skip word at the end of a line is still matched.
        if set(line.split()).intersection(to_skip):
            continue
        out_handle.write(line)
```
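Note that splitting a line into words matches whole words only, whereas a plain substring test also catches a skip word embedded in a longer word such as "badly". A minimal sketch of the difference, where the helper names and the sample line are made up for illustration:

```python
to_skip = ("bad", "naughty")

def has_bad_word(line):
    # Whole-word match: split on whitespace and intersect with the skip list.
    return bool(set(line.split()).intersection(to_skip))

def has_bad_substring(line):
    # Substring match: also true when a skip word sits inside a longer word.
    return any(word in line for word in to_skip)

sample = "a badly written line\n"  # hypothetical input line
print(has_bad_word(sample))       # False
print(has_bad_substring(sample))  # True
```

Which behaviour is wanted depends on whether "badly" should count as "bad".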


你的背包

2020-11-30 01:11
Use the python-textops package:
```
from textops import *

'oldfile.txt' | cat() | grepv('bad') | tofile('newfile.txt')
```

广开言路

2020-11-30 01:12
You could simply skip writing the line to the new file instead of doing a replace.
```
for line in infile:
    if 'bad' not in line and 'naughty' not in line:
        newopen.write(line)
```


情深已故

2020-11-30 01:12

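```
bad_words = ['doc:', 'strickland:']

with open('linetest.txt') as oldfile, open('linetestnew.txt', 'w') as newfile:
    for line in oldfile:
        # Skip blank lines and lines containing any of the bad words.
        if line.strip() and not any(bad_word in line for bad_word in bad_words):
            newfile.write(line)
```

Note that '\n' (the escape sequence for a newline) should not go into bad_words: every line read from the file ends with a newline, so the substring test would match, and drop, every line. To drop blank lines, test line.strip() instead.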


名媛妹妹

2020-11-30 01:13
The else is only connected to the last if. You want elif:
```
if 'bad' in line:
    pass
elif 'naughty' in line:
    pass
else:
    newopen.write(line)
```
Also note that I removed the line substitution, as you don't write those lines anyway.
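To see the difference in isolation, here is a small sketch. It assumes the question's code used two separate if statements, which is what the remark above implies; the function names and the sample line are made up for illustration.

```python
def kept_with_two_ifs(line):
    kept = []
    if 'bad' in line:
        pass
    if 'naughty' in line:   # a new, independent if statement
        pass
    else:                   # belongs only to the 'naughty' test
        kept.append(line)
    return kept

def kept_with_elif(line):
    kept = []
    if 'bad' in line:
        pass
    elif 'naughty' in line:
        pass
    else:                   # reached only if neither test matched
        kept.append(line)
    return kept

line = "a bad line\n"  # hypothetical input line
print(kept_with_two_ifs(line))  # ['a bad line\n'] -- kept by mistake
print(kept_with_elif(line))     # []
```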


不知归路

2020-11-30 01:27

Regex was a little quicker than the accepted answer on the 23 MB test file I used, but there isn't a lot in it.

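```
import re

bad_words = ['bad', 'naughty']

# Match any whole line that contains one of the bad words.
# (?:...) is a non-capturing group; re.escape guards against special characters.
regex = rf"^.*(?:{'|'.join(map(re.escape, bad_words))}).*\n"
subst = ""

with open('oldfile.txt') as oldfile:
    lines = oldfile.read()

# Pass the flag by keyword: the fourth positional argument of re.sub is count.
result = re.sub(regex, subst, lines, flags=re.MULTILINE)

with open('newfile.txt', 'w') as newfile:
    newfile.write(result)
```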
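If you want to check the speed claim on your own data, a rough sketch with timeit might look like the following. The input here is a small synthetic string, not the 23 MB file mentioned above, and the function names are made up for illustration, so the numbers it prints say nothing about the original measurement.

```python
import re
import timeit

bad_words = ['bad', 'naughty']
# Synthetic stand-in input (an assumption, not the original test file).
text = "a good line\na bad line\na naughty line\nanother good line\n" * 10_000

pattern = re.compile(
    r"^.*(?:" + "|".join(map(re.escape, bad_words)) + r").*\n?",
    re.MULTILINE,
)

def with_regex():
    # Delete every line that contains one of the bad words.
    return pattern.sub("", text)

def with_loop():
    # Keep only the lines that contain none of the bad words.
    return "".join(
        line for line in text.splitlines(keepends=True)
        if not any(word in line for word in bad_words)
    )

# Both approaches should produce the same output before being timed.
assert with_regex() == with_loop()
print("regex:", timeit.timeit(with_regex, number=5))
print("loop: ", timeit.timeit(with_loop, number=5))
```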

