I'm trying to read text from a file, go through its lines, and delete the lines that contain specific strings (in this case 'bad' and 'naughty'). The code I wrote goes like this:
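to_skip = ("bad", "naughty")
out_handle = open("testout", "w")
with open("testin", "r") as handle:
    for line in handle:
        # split() with no argument also drops the trailing newline,
        # so a bad word at the end of a line is still detected
        if set(line.split()).intersection(to_skip):
            continue
        out_handle.write(line)
out_handle.close()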
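Matching on whole words and matching on substrings behave differently, which matters when choosing between the set-based test above and a plain `in` check. Here is a minimal sketch of the difference; the helper names `has_bad_word` and `has_bad_substring` are made up for illustration and are not part of any answer in this thread:

```python
TO_SKIP = {"bad", "naughty"}

def has_bad_word(line, to_skip=TO_SKIP):
    """True if any whitespace-separated token equals a word in to_skip."""
    return not to_skip.isdisjoint(line.split())

def has_bad_substring(line, to_skip=TO_SKIP):
    """True if any word in to_skip occurs anywhere in the line."""
    return any(word in line for word in to_skip)

print(has_bad_word("a badge of honour\n"))       # False: "badge" is not "bad"
print(has_bad_substring("a badge of honour\n"))  # True: "bad" is inside "badge"
print(has_bad_word("that was bad\n"))            # True
print(has_bad_word("that was bad.\n"))           # False: the token is "bad."
```

Which behaviour is right depends on the data: the word test misses words with punctuation attached, while the substring test also removes lines with words like "badge".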
Use the python-textops package:
from textops import *
'oldfile.txt' | cat() | grepv('bad') | tofile('newfile.txt')
Instead of doing a replace, you could simply not write the line to the new file.
for line in infile:
    if 'bad' not in line and 'naughty' not in line:
        newopen.write(line)
bad_words = ['doc:', 'strickland:','\n']
with open('linetest.txt') as oldfile, open('linetestnew.txt', 'w') as newfile:
    for line in oldfile:
        if not any(bad_word in line for bad_word in bad_words):
            newfile.write(line)
The \n is the escape sequence for a newline character.
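One thing to watch with a substring test: every line read from a file keeps its trailing '\n' (except possibly the last one), so putting '\n' in the list matches almost every line. If the intent is to drop blank lines, comparing the whole line to '\n' is probably closer to what is wanted. Here is a small sketch using in-memory sample text; the `keep_lines` helper and the sample lines are assumptions for illustration only:

```python
import io

def keep_lines(lines, bad_words):
    """Return the lines that contain none of the strings in bad_words."""
    return [line for line in lines
            if not any(bad_word in line for bad_word in bad_words)]

# Made-up sample standing in for linetest.txt
sample = io.StringIO("doc: hello\nmarty: hi\n\nstrickland: slacker\n")
lines = sample.readlines()

# With '\n' in the list, every newline-terminated line is dropped.
print(keep_lines(lines, ['doc:', 'strickland:', '\n']))   # []

# Testing for blank lines separately keeps the ordinary lines.
kept = [l for l in keep_lines(lines, ['doc:', 'strickland:']) if l != '\n']
print(kept)   # ['marty: hi\n']
```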
The else is only connected to the last if. You want elif:
if 'bad' in line:
    pass
elif 'naughty' in line:
    pass
else:
    newopen.write(line)
Also note that I removed the line substitution, as you don't write those lines anyway.
On my 23 MB test file, a regex was a little quicker than the accepted answer, but there isn't a lot in it.
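import re
bad_words = ['bad', 'naughty']
# (?:...) is a non-capturing group; each match is one whole line plus its newline
regex = f"^.*(?:{'|'.join(bad_words)}).*\n"
subst = ""
with open('oldfile.txt') as oldfile:
    lines = oldfile.read()
# flags must be passed by keyword: the fourth positional argument of re.sub is count
result = re.sub(regex, subst, lines, flags=re.MULTILINE)
with open('newfile.txt', 'w') as newfile:
    newfile.write(result)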
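To try the regex approach without touching any files, the same idea can be wrapped in a function and run on a string. This is only a sketch: `drop_lines` is a hypothetical helper, and it adds two things not in the answer above, namely `re.escape` (assuming the bad words are meant as literal text) and an optional `\n?` so that a final line without a newline is also removed.

```python
import re

def drop_lines(text, bad_words):
    """Remove every line of text that contains one of bad_words."""
    # re.escape treats each word as literal text, not as a pattern.
    alternatives = "|".join(re.escape(word) for word in bad_words)
    # \n? lets the pattern also match a final line with no newline.
    pattern = re.compile(rf"^.*(?:{alternatives}).*\n?", re.MULTILINE)
    return pattern.sub("", text)

text = "good line\nthis is bad\nvery naughty cat\nfine"
print(repr(drop_lines(text, ["bad", "naughty"])))  # 'good line\nfine'
```

Like the `in` check, this matches substrings, so a line containing "badge" would be removed as well.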