I am trying to build a string to use with the Google Geocoding API. I've checked a lot of threads, but I am still facing a problem and I don't understand how to solve it.
I nee
Generally, there are two approaches: (1) regular expressions and (2) str.translate.
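Decompose the string, then remove the characters in the Unicode range \u0300-\u036f (Combining Diacritical Marks):
import re
import unicodedata

# Split each accented character into its base letter plus combining marks.
word = unicodedata.normalize("NFD", word)
# Drop the combining marks, leaving only the base letters.
word = re.sub("[\u0300-\u036f]", "", word)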
This removes accents, circumflexes, diaereses, and so on:
pingüino > pinguino
εἴκοσι εἶσι > εικοσι εισι
For some languages, a different range may be needed, such as [\u0559-\u055f] for Armenian script.
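The regular-expression approach above can be wrapped in a small helper. This is only a sketch: the function name strip_diacritics and the sample strings are made up for illustration, and the character range is the one from the answer.

```python
import re
import unicodedata

# Combining Diacritical Marks, the same range used in the answer.
COMBINING_MARKS = re.compile("[\u0300-\u036f]")


def strip_diacritics(text):
    """Return text with combining marks removed (hypothetical helper)."""
    # NFD splits "ü" into "u" followed by U+0308, and so on.
    decomposed = unicodedata.normalize("NFD", text)
    # Deleting the marks leaves only the base letters.
    return COMBINING_MARKS.sub("", decomposed)


print(strip_diacritics("pingüino"))
print(strip_diacritics("São Paulo, Brésil"))
```

Letters that have no canonical decomposition, such as "ø" or "æ", pass through unchanged, which is where a translation table becomes useful.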
With str.translate, first create a replacement table (it is case-sensitive), then apply it:
repl = str.maketrans(
"áéúíó",
"aeuio"
)
word = word.translate(repl)
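Because the table is case-sensitive, uppercase letters need their own entries. One possible way to cover both cases, assuming each uppercase form is a single character (true for the five letters used here), is to derive the uppercase pairs from the lowercase ones. The variable names are illustrative.

```python
lower_src = "áéúíó"
lower_dst = "aeuio"

# str.maketrans needs both strings to have the same length, so the
# uppercase variants are appended to each side in the same order.
repl = str.maketrans(
    lower_src + lower_src.upper(),
    lower_dst + lower_dst.upper(),
)

# translate() returns a new string; the original is left unchanged.
print("ÁRBOL único".translate(repl))
```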
Multi-character replacements are made as follows:
repl = {
ord("æ"): "ae",
ord("œ"): "oe",
}
word = word.translate(repl)
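The two approaches can also be combined: translate the ligatures that do not decompose, then strip the combining marks. The sketch below assumes that order; the function name simplify and the uppercase entries are additions for illustration, not part of the answer above.

```python
import re
import unicodedata

# str.maketrans also accepts a dict whose keys are single characters;
# the values may be strings of any length.
LIGATURES = str.maketrans({"æ": "ae", "œ": "oe", "Æ": "AE", "Œ": "OE"})
COMBINING_MARKS = re.compile("[\u0300-\u036f]")


def simplify(text):
    """Expand ligatures, then remove diacritics (hypothetical helper)."""
    # Handle the characters that NFD leaves untouched.
    text = text.translate(LIGATURES)
    # Decompose what remains and drop the combining marks.
    text = unicodedata.normalize("NFD", text)
    return COMBINING_MARKS.sub("", text)


print(simplify("cœur brûlé"))
```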