I am a newbie to programming and have been studying Python in my spare time for the past few months. I decided to try to create a little script that converts Ame
Building on the good answers above, I wrote a new version that I think is more Pythonic. I hope it helps:
# The imported dictionary contains 1800 english:american spelling key:value pairs.
mydict = {
    'colour': 'color',
}

def replace_all(text, mydict):
    # Replace every American spelling in text with its English equivalent.
    for english, american in mydict.items():
        text = text.replace(american, english)
    return text

try:
    # Open each file once; the with statements close them automatically.
    with open('new_output.txt', 'w') as new_file:
        with open('test_file.txt', 'r') as f:
            for line in f:
                new_line = replace_all(line, mydict)
                new_file.write(new_line)
except IOError:
    print("Can't open file!")
You can also look at a question I asked earlier, whose answers contain a lot of best-practice advice: Loading large file (25k entries) into dict is slow in Python?
Here are a few more tips on writing more Pythonic Python :) http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
Good luck:)
The print statement adds a newline of its own, but your lines already have their own newlines. You can either strip the newline from your new_line, or use the lower-level output.write(new_line) instead (which writes exactly what you pass to it).
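A small sketch of the difference, assuming Python 3, where print is a function (the Python 2 print statement adds its trailing newline in the same way). The in-memory buffers printed, written and stripped are illustrative stand-ins for the output file:

```python
import io

line = "colour\n"  # a line read from a file keeps its trailing newline

# print() appends its own newline, so the output ends up double spaced.
printed = io.StringIO()
print(line, file=printed)

# write() emits exactly the string it is given.
written = io.StringIO()
written.write(line)

# Stripping the newline first also makes print() produce a single newline.
stripped = io.StringIO()
print(line.rstrip("\n"), file=stripped)

print(repr(printed.getvalue()))   # 'colour\n\n'
print(repr(written.getvalue()))   # 'colour\n'
print(repr(stripped.getvalue()))  # 'colour\n'
```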
For your second question, I think we need an actual example. replace() should indeed replace all occurrences.
>>> "abc abc abcd ab".replace("abc", "def")
'def def defd ab'
I'm not sure what your third question is asking. If you want to replace the output file, do
output = open('output_test_file.txt', 'w')
'w' means you're opening the file for writing, which discards any existing contents.
The extra blank line you are seeing is because you are using print to write out a line that already includes a newline character at the end. Since print writes its own newline too, your output becomes double spaced. An easy fix is to use outfile.write(new_line) instead.
As for the file modes, the issue is that you're opening the output file over and over. You should just open it once, at the start. It's usually a good idea to use with statements to handle opening files, since they'll take care of closing them for you when you're done with them.
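To illustrate why reopening matters, here is a rough sketch that assumes the repeated open uses 'w' mode, which I can't confirm from the question. The file names and the lines list are made up for the demonstration:

```python
import os
import tempfile

lines = ["colour\n", "tyre\n", "utilise\n"]  # stand-in for converted lines
tmp_dir = tempfile.mkdtemp()
reopened_path = os.path.join(tmp_dir, "reopened.txt")
once_path = os.path.join(tmp_dir, "once.txt")

# Reopening in 'w' mode for every line truncates the file each time,
# so only the last line survives.
for line in lines:
    with open(reopened_path, "w") as out_file:
        out_file.write(line)

# Opening once and writing inside the with block keeps every line.
with open(once_path, "w") as out_file:
    for line in lines:
        out_file.write(line)

with open(reopened_path) as f:
    reopened_text = f.read()
with open(once_path) as f:
    once_text = f.read()

print(repr(reopened_text))  # 'utilise\n'
print(repr(once_text))      # 'colour\ntyre\nutilise\n'
```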
I don't understand your other issue, with only some of the replacements happening. Is your dictionary missing the spellings for 'analyze' and 'utilize'?
One suggestion I'd make is to not do your replacements line by line. You can read the whole file in at once with file.read() and then work on it as a single unit. This will probably be faster, since it won't need to loop as often over the items in your spelling dictionary (just once, rather than once per line):
with open('test_file.txt', 'r') as in_file:
    text = in_file.read()
with open('output_test_file.txt', 'w') as out_file:
    out_file.write(replace_all(text, spelling_dict))
Edit:
To make your code correctly handle words that contain other words (like "entire" containing "tire"), you probably need to abandon the simple str.replace approach in favor of regular expressions.
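A tiny illustration of the problem, using a made-up sentence. The single-word pattern here is only meant to show what a word boundary does, not to be a full solution:

```python
import re

text = "The entire tire was flat."

# Plain substring replacement also rewrites the "tire" inside "entire".
naive = text.replace("tire", "tyre")

# \b anchors the match at word boundaries, so only the standalone word changes.
whole_word = re.sub(r"\btire\b", "tyre", text)

print(naive)       # The entyre tyre was flat.
print(whole_word)  # The entire tyre was flat.
```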
Here's a quickly thrown together solution that uses re.sub, given a dictionary of spelling changes from American to British English (that is, in the reverse order of your current dictionary):
import re

#from english_american_dictionary import ame_to_bre_spellings
ame_to_bre_spellings = {'tire':'tyre', 'color':'colour', 'utilize':'utilise'}

def replacer_factory(spelling_dict):
    def replacer(match):
        word = match.group()
        # Fall back to the original word if it has no entry in the dictionary.
        return spelling_dict.get(word, word)
    return replacer

def ame_to_bre(text):
    pattern = r'\b\w+\b'  # this pattern matches whole words only
    replacer = replacer_factory(ame_to_bre_spellings)
    return re.sub(pattern, replacer, text)

def main():
    #with open('test_file.txt') as in_file:
    #    text = in_file.read()
    text = 'foo color, entire, utilize'

    #with open('output_test_file.txt', 'w') as out_file:
    #    out_file.write(ame_to_bre(text))
    print(ame_to_bre(text))

if __name__ == '__main__':
    main()
One nice thing about this code structure is that you can easily convert from British English spellings back to American English ones, if you pass a dictionary in the other order to the replacer_factory function.
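As a rough sketch of that reverse direction: the names bre_to_ame_spellings and bre_to_ame are hypothetical additions, and inverting the dictionary like this assumes every American spelling maps to a distinct British one (otherwise entries would be lost):

```python
import re

ame_to_bre_spellings = {'tire': 'tyre', 'color': 'colour', 'utilize': 'utilise'}

# Swap keys and values to get the British-to-American direction.
# Assumption: no two American spellings share the same British spelling.
bre_to_ame_spellings = {bre: ame for ame, bre in ame_to_bre_spellings.items()}

def replacer_factory(spelling_dict):
    def replacer(match):
        word = match.group()
        # Leave words without a dictionary entry unchanged.
        return spelling_dict.get(word, word)
    return replacer

def bre_to_ame(text):
    # Hypothetical counterpart to ame_to_bre, built from the same factory.
    return re.sub(r'\b\w+\b', replacer_factory(bre_to_ame_spellings), text)

result = bre_to_ame('foo colour, entire, utilise')
print(result)  # foo color, entire, utilize
```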