How can I normalize / asciify Unicode characters in Google Sheets?

拈花ヽ惹草 submitted on 2019-12-03 07:32:27
Max Makhrov

[[:^alpha:]] (a negated ASCII character class) works fine in a REGEXEXTRACT formula.

But =REGEXREPLACE("público","([[:alpha:]])[[:^alpha:]]","$1") gives "pblico" as a result: the non-ASCII letter is simply deleted. A regular expression can match "ú", but it has no way of knowing which ASCII character should replace it.
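
The same behaviour can be reproduced outside Sheets. The sketch below is an assumption-based illustration, not the Sheets implementation: it rewrites the pattern with plain ASCIIranges in JavaScript, and the function name stripAfterAsciiLetter is hypothetical.

```javascript
// Rough JavaScript equivalent of REGEXREPLACE(text, "([[:alpha:]])[[:^alpha:]]", "$1").
// [A-Za-z] stands in for the ASCII-only [[:alpha:]] class.
function stripAfterAsciiLetter(text) {
  // Keep the ASCII letter (group 1) and drop the non-ASCII-letter that follows it.
  return String(text).replace(/([A-Za-z])[^A-Za-z]/g, '$1');
}
```

For "público" the only match is "pú", which is replaced by "p", so the accented letter disappears instead of becoming "u".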


Workaround

Let's take the word públicē; we need to replace two characters in it (ú and ē). Put this word in cell A1, and this formula in cell B1:

=JOIN("",ArrayFormula(IFERROR(VLOOKUP(SPLIT(REGEXREPLACE(A1,"(.)","$1-"),"-"),D:E,2,0),SPLIT(REGEXREPLACE(A1,"(.)","$1-"),"-"))))

Then build a lookup table of replacements in columns D:E:

     D    E
1    ú    u
2    ē    e
3   ...  ...

This formula is still ugly, but it is more useful because you control the replacements: just add more characters to the table. Note that it uses "-" as a temporary delimiter, so hyphens in the input text are dropped.
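
The same lookup-table idea can also be written as an Apps Script custom function. This is only a minimal sketch: the name REPLACECHARS is hypothetical, and it assumes the table is passed as a two-column range (which a custom function receives as a 2D array) and that the script runs on the V8 runtime.

```javascript
/**
 * Replaces each character of `text` using a two-column lookup table.
 * `table` is a 2D array, e.g. the values of D1:E2: [["ú","u"],["ē","e"]].
 * REPLACECHARS is a hypothetical name, not a built-in Sheets function.
 */
function REPLACECHARS(text, table) {
  // Build a character -> replacement map, skipping empty rows.
  var map = Object.create(null);
  for (var r = 0; r < table.length; r++) {
    var from = String(table[r][0]);
    if (from !== '') {
      map[from] = String(table[r][1]);
    }
  }
  // Array.from splits the string by code point; unmapped characters pass through.
  return Array.from(String(text)).map(function (ch) {
    return ch in map ? map[ch] : ch;
  }).join('');
}
```

In the sheet it would be called as =REPLACECHARS(A1, D:E). Unlike the formula, it needs no temporary delimiter, so hyphens in the text are kept.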


Or use JavaScript

I also found a good solution that works in Google Sheets.

JaimeJCandau

This did it for me in Google Sheets with Google Apps Script (GAS):

function normalizetext(text) {
    var weird = 'öüóőúéáàűíÖÜÓŐÚÉÁÀŰÍçÇ!@£$%^&*()_+?/*."';
    var normalized = 'ouooueaauiOUOOUEAAUIcC                 ';
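    var idoff = -1, new_text = '';
    var source = String(text); // cell values may arrive as numbers

    for (var i = 0; i < source.length; i++) {
        // indexOf looks the character up literally; search() would treat
        // characters such as "(" or "." as regular-expression syntax.
        idoff = weird.indexOf(source.charAt(i));
        if (idoff == -1) {
            new_text = new_text + source.charAt(i);
        } else {
            new_text = new_text + normalized.charAt(idoff);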
        }
    }

    return new_text;
}
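
If maintaining a character list by hand is not required, a different technique may be enough: Unicode normalization. The sketch below is not part of either answer above. It assumes the script runs on the V8 runtime (where String.prototype.normalize is available), and the function name asciify is hypothetical.

```javascript
/**
 * Strips diacritics by decomposing characters (NFD) and removing
 * the combining marks. `asciify` is a hypothetical name.
 */
function asciify(text) {
  return String(text)
    .normalize('NFD')                  // e.g. "ú" becomes "u" + U+0301
    .replace(/[\u0300-\u036f]/g, '');  // drop combining diacritical marks
}
```

In a sheet this would be used as =asciify(A1). Letters that have no canonical decomposition, such as "ø" or "ß", are left unchanged, so a lookup table is still needed for those.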