Question
I need to replace all words in a text document that are of length 4 with a different word.
For example, if a text document contained the phrase "I like to eat very hot soup", the words "like", "very", and "soup" would be replaced with "something".
Then, instead of overwriting the original text document, it needs to create a new one with the changed phrase.
Here is what I have so far:
def replacement():
    o = open("file.txt","a") #file.txt will be the file containing the changed phrase
    for line in open("y.txt"): #y.txt is the original file
        line = line.replace("????","something") #see below
        o.write(line + "\n")
    o.close()
I've tried changing "????" to something like
(str(len(line) == 4)
but that didn't work.
Answer 1:
First, let's make a function that returns 'something' if it's given a word of length 4, and the word it was given otherwise:
def maybe_replace(word, length=4):
    if len(word) == length:
        return 'something'
    else:
        return word
Now let's walk through your for loop. In each iteration you have a line of your original file. Let's split that into words. Python gives us the split method that we can use:
split_line = line.split()
The default is to split on whitespace, which is exactly what we want. See the documentation for str.split if you want more detail.
Now we want the list we get by calling our maybe_replace function on every word:
new_split_line = [maybe_replace(word) for word in split_line]
Now we can join these back together using the join method:
new_line = ' '.join(new_split_line)
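As a quick check, the split, replace, and join steps can be tried on the sample phrase from the question. This is only a minimal sketch; the variable name sample is introduced here for illustration and is not part of the original answer.

```python
def maybe_replace(word, length=4):
    # Return 'something' for words of exactly `length` characters,
    # and the word itself otherwise.
    if len(word) == length:
        return 'something'
    return word

# `sample` is a hypothetical stand-in for one line read from the file.
sample = "I like to eat very hot soup"

split_line = sample.split()                                    # split on whitespace
new_split_line = [maybe_replace(word) for word in split_line]  # check each word
new_line = ' '.join(new_split_line)                            # rebuild the line

print(new_line)  # I something to eat something hot something
```

Note that len counts every character in the word, so punctuation attached to a word (for example "soup.") counts toward its length.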
And write it back to our file:
o.write(new_line + '\n')
So our final function will be:
def replacement():
    o = open("file.txt","a") #file.txt will be the file containing the changed phrase
    for line in open("y.txt"): #y.txt is the original file
        split_line = line.split()
        new_split_line = [maybe_replace(word) for word in split_line]
        new_line = ' '.join(new_split_line)
        o.write(new_line + '\n')
    o.close()
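One possible variation, not part of the original answer: opening the output file in "w" mode creates it fresh on every run, whereas "a" appends to whatever a previous run left behind. The sketch below also uses with so that both files are closed automatically. The src and dst parameters are assumptions added for illustration; their defaults match the file names used in the question.

```python
def maybe_replace(word, length=4):
    # Same helper as above, written as a conditional expression.
    return 'something' if len(word) == length else word

def replacement(src="y.txt", dst="file.txt"):
    # src and dst are hypothetical parameters; the defaults are the
    # file names from the question.
    # "w" truncates dst, so running this twice does not duplicate the output.
    with open(src) as infile, open(dst, "w") as outfile:
        for line in infile:
            # split() drops the trailing newline and collapses runs of
            # whitespace, so the newline is added back when writing.
            new_words = [maybe_replace(word) for word in line.split()]
            outfile.write(' '.join(new_words) + '\n')
```

Calling replacement() with no arguments reads y.txt and writes file.txt in the current directory.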
Answer 2:
This will preserve any extra spaces in your text, which the solutions using str.split() do not.
import re
exp = re.compile(r'\b(\w{4})\b')
replaceWord = 'stuff'
with open('infile.txt','r') as inF, open('outfile.txt','w') as outF:
    for line in inF:
        outF.write(exp.sub(replaceWord,line))
This uses regular expressions to replace the text. The regular expression used here has three main parts. The first matches a word boundary, which here is the beginning of a word:
\b
The second part matches exactly four word characters (letters, digits, and _):
(\w{4})
The last part is the same as the first; here it matches the end of the word:
\b
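To see the pattern in action, here is a small sketch that applies it to a single string instead of a file. The sample line is made up for illustration; it contains a double space and some punctuation to show that both survive the substitution.

```python
import re

exp = re.compile(r'\b(\w{4})\b')

# Hypothetical sample line: note the two spaces after "like"
# and the punctuation after "soup" and "please".
line = "I like  to eat very hot soup, please.\n"

result = exp.sub('stuff', line)
print(repr(result))  # 'I stuff  to eat stuff hot stuff, please.\n'
```

Because \b matches between a word character and a non-word character, "soup," is treated as the four-character word "soup" followed by a comma. Since \w also matches digits and underscores, a token such as "2024" would be replaced as well.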
Answer 3:
This seems like homework, so here are some key concepts.
When you read a file, you get lines as strings. You can split a line into a list by using a string method called .split(), like this: words = line.split(). This creates a list of words.
Now, a list is iterable, meaning you can use a for loop over it and do an operation on one item of the list at a time. You want to check how long each word is, so you have to iterate over words with your loop and do something with each one. You're already close to figuring out how to check the length of a word: use len(word).
You also need a place to store your final information as you go. Outside of the loop, you need to make a list for results, and .append() the words that you've checked as you go along.
Finally, you need to do this for each line in your file, meaning a second for loop that iterates over the file.
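The steps described above could be sketched roughly as follows. This is one possible reading of the outline, not the answerer's own code; the function name replace_words and its parameters are made up for illustration. It takes any iterable of lines, so an open file object would work as well as a list of strings.

```python
def replace_words(lines, length=4, replacement='something'):
    # replace_words, length, and replacement are hypothetical names.
    new_lines = []
    for line in lines:                 # outer loop: one line at a time
        words = line.split()           # list of words in this line
        results = []                   # list created outside the word loop
        for word in words:             # inner loop: one word at a time
            if len(word) == length:
                results.append(replacement)
            else:
                results.append(word)
        new_lines.append(' '.join(results))
    return new_lines

print(replace_words(["I like to eat very hot soup"]))
# ['I something to eat something hot something']
```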
Answer 4:
with open('file.txt', 'a') as write_file:
    with open('y.txt') as read_file:
        for line in read_file.readlines():
            # Replace the needed words
            line = line.replace('????', 'something')
            write_file.write(line)
Source: https://stackoverflow.com/questions/13313778/python-3-2-replace-all-words-in-a-text-document-that-are-a-certain-length