I have a list of sentences, and my aim is to replace every occurrence of an abbreviated preposition such as "opp", "nr", "off", "abv", "behnd" with its correct spelling ("opposite", "near", "above", "behind", and so on). The Soundex code of each abbreviation is the same as that of the full word, so I need an expression that iterates over this list word by word and, if the Soundex codes match, replaces the word with the right spelling.
An example:
['Jack was standing nr the tree',
 'they were abv everything he planned',
 'Just stand opp the counter',
 'Go twrds the gas station']
So I need to replace the words nr, abv, opp and twrds with their full forms. The Soundex code of "towards" and "twrds" is the same, so "twrds" should be replaced.
I need to iterate over this list.
Here's the Soundex algorithm:
import string

# Ported to Python 3: string.uppercase/lowercase and string.maketrans
# no longer exist there, and print is a function.
allChar = string.ascii_uppercase + string.ascii_lowercase
charToSoundex = str.maketrans(allChar, "91239129922455912623919292" * 2)

def soundex(source):
    "convert string to Soundex equivalent"
    # Soundex requirements:
    # source string must be at least 1 character
    # and must consist entirely of letters
    if (not source) or (not source.isalpha()):
        return "0000"
    # Soundex algorithm:
    # 1. make first character uppercase
    # 2. translate all other characters to Soundex digits
    digits = source[0].upper() + source[1:].translate(charToSoundex)
    # 3. remove consecutive duplicates
    digits2 = digits[0]
    for d in digits[1:]:
        if digits2[-1] != d:
            digits2 += d
    # 4. remove all "9"s
    # 5. pad end with "0"s to 4 characters
    return (digits2.replace('9', '') + '000')[:4]

if __name__ == '__main__':
    import sys
    if sys.argv[1:]:
        print(soundex(sys.argv[1]))
    else:
        from timeit import Timer
        names = ('Woo', 'Pilgrim', 'Flingjingwaller')
        for name in names:
            statement = "soundex('%s')" % name
            t = Timer(statement, "from __main__ import soundex")
            print(name.ljust(15), soundex(name), min(t.repeat()))
I am a newbie, so if there's another approach you could suggest, it would be appreciated. Thanks.
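One caveat about the premise: with the function above, not every abbreviation shares a code with its full form, because a vowel sitting between two letters of the same digit group keeps them from collapsing into one. A minimal sketch to check this, assuming Python 3 and a compact restatement of the same function (the `pairs` list is made up for illustration):

```python
import string

# Same digit table as the function above: one digit per letter A-Z,
# with 9 marking the letters that are dropped at the end.
_LETTERS = string.ascii_uppercase + string.ascii_lowercase
_TABLE = str.maketrans(_LETTERS, "91239129922455912623919292" * 2)


def soundex(source):
    """Return the 4-character Soundex code, or "0000" for invalid input."""
    if not source or not source.isalpha():
        return "0000"
    digits = source[0].upper() + source[1:].translate(_TABLE)
    collapsed = digits[0]
    for d in digits[1:]:
        if collapsed[-1] != d:  # drop consecutive duplicates
            collapsed += d
    return (collapsed.replace("9", "") + "000")[:4]


# Hypothetical abbreviation / full-form pairs taken from the example list.
pairs = [("nr", "near"), ("twrds", "towards"),
         ("abv", "above"), ("opp", "opposite")]
for short, full in pairs:
    same = soundex(short) == soundex(full)
    print(short, soundex(short), full, soundex(full), same)
```

Here "nr"/"near" and "twrds"/"towards" match, while "abv" gives A100 against A110 for "above", and "opp" gives O100 against O123 for "opposite", so a pure Soundex comparison would leave those two unreplaced.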
I would use the enchant module:
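import enchant

# soundex() is the function from the question.
d = enchant.Dict("en_US")
phrase = ['Jack was standing nr the tree',
          'they were abv everything he planned',
          'Just stand opp the counter',
          'Go twrds the gas station']
output = []
for section in phrase:
    words = []
    for word in section.split():
        if d.check(word):
            # Already a valid dictionary word: keep it as is.
            words.append(word)
            continue
        # Take the first suggestion that shares the word's Soundex code.
        for correct_word in d.suggest(word):
            if soundex(correct_word) == soundex(word):
                words.append(correct_word)
                break
        else:
            # No suggestion matched: keep the original word instead of dropping it.
            words.append(word)
    output.append(' '.join(words))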
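If installing enchant is not an option, a lookup table keyed by Soundex code might be a smaller alternative. This is only a sketch under assumptions: `FULL_FORMS`, `CODE_TO_WORD` and `expand` are hypothetical names, the list of full forms is made up and would need extending, and the `soundex` here is a Python 3 restatement of the function from the question.

```python
import string

_LETTERS = string.ascii_uppercase + string.ascii_lowercase
_TABLE = str.maketrans(_LETTERS, "91239129922455912623919292" * 2)


def soundex(source):
    """Python 3 restatement of the question's Soundex function."""
    if not source or not source.isalpha():
        return "0000"
    digits = source[0].upper() + source[1:].translate(_TABLE)
    collapsed = digits[0]
    for d in digits[1:]:
        if collapsed[-1] != d:
            collapsed += d
    return (collapsed.replace("9", "") + "000")[:4]


# Assumption: the correct spellings are known up front (hypothetical list).
FULL_FORMS = ["near", "towards", "behind"]
CODE_TO_WORD = {soundex(w): w for w in FULL_FORMS}


def expand(sentence):
    """Replace each word whose Soundex code matches a known full form."""
    out = []
    for word in sentence.split():
        if word.lower() in FULL_FORMS:
            out.append(word)  # already spelled out, leave untouched
        else:
            # Fall back to the original word when no code matches.
            out.append(CODE_TO_WORD.get(soundex(word), word))
    return " ".join(out)


phrase = ["Jack was standing nr the tree", "Go twrds the gas station"]
print([expand(s) for s in phrase])
```

The trade-off is that any word sharing a code with a full form gets replaced too ("nor" also has the code N600 and would become "near"), so a dictionary check like `d.check(word)` above is still worth keeping if it is available.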
Source: https://stackoverflow.com/questions/21628391/replace-words-using-soundex-python