Remove accents/diacritics in a string in JavaScript

轻奢々 2020-11-21 13:29

How do I remove accented characters from a string? It needs to work in IE6 in particular. I had something like this:

accentsTidy = function(s){
    var r=s.toLowerCase();
           


        
29 answers
  •  情书的邮戳
    2020-11-21 13:55

    I gave this answer to a similar question. It does a quick table-lookup replacement for selected characters (Latin 1+2, code points U+00C0 to U+017F). The mapping is strictly one character for one, so it cannot turn German ü into "ue", but it works well for basic "normalization" to 7-bit ASCII.

    // Replacement table for U+00C0..U+017F, 16 characters per row.
    var TAB_00C0 = "AAAAAAACEEEEIIII" +
        "DNOOOOO*OUUUUYIs" +
        "aaaaaaaceeeeiiii" +
        "?nooooo/ouuuuy?y" +
        "AaAaAaCcCcCcCcDd" +
        "DdEeEeEeEeEeGgGg" +
        "GgGgHhHhIiIiIiIi" +
        "IiJjJjKkkLlLlLlL" +
        "lLlNnNnNnnNnOoOo" +
        "OoOoRrRrRrSsSsSs" +
        "SsTtTtTtUuUuUuUu" +
        "UuUuWwYyYZzZzZzF";
    
    function stripDiacritics(source) {
        var result = source.split('');
        for (var i = 0; i < result.length; i++) {
            var c = source.charCodeAt(i);
            if (c >= 0x00c0 && c <= 0x017f) {
                // Inside the table's range: look up the ASCII stand-in.
                result[i] = TAB_00C0.charAt(c - 0x00c0);
            } else if (c > 127) {
                // Any other non-ASCII code unit becomes '?'.
                result[i] = '?';
            }
        }
        return result.join('');
    }
    
    stripDiacritics("Šupa, čo? ľšťčžýæøåℌð"); // "Supa, co? lstczyaoa??"
    

    Any other non-ASCII character is converted to ?, so the result is guaranteed to be 7-bit ASCII. No regex, no magic, just simple character-array work.
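
The one-for-one limitation mentioned in this answer (no ü to "ue") could be worked around with a small pre-pass that expands such characters before the table lookup runs. The following is only a minimal sketch: the names MULTI_CHAR_MAP and expandMultiChar are hypothetical, and the German-style mapping is an assumption for illustration, not part of the original answer.

```javascript
// Hypothetical pre-pass: expand characters that need more than one ASCII letter.
// The entries below are an illustrative assumption (German conventions).
var MULTI_CHAR_MAP = {
    "\u00c4": "Ae", "\u00d6": "Oe", "\u00dc": "Ue", // Ä Ö Ü
    "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue", // ä ö ü
    "\u00df": "ss"                                  // ß
};

function expandMultiChar(source) {
    var out = [];
    for (var i = 0; i < source.length; i++) {
        var ch = source.charAt(i);
        // Use the expansion if one is listed, otherwise keep the character as-is.
        if (Object.prototype.hasOwnProperty.call(MULTI_CHAR_MAP, ch)) {
            out.push(MULTI_CHAR_MAP[ch]);
        } else {
            out.push(ch);
        }
    }
    return out.join('');
}

expandMultiChar("M\u00fcller Stra\u00dfe"); // "Mueller Strasse"
```

Characters not listed in the map pass through unchanged, so the output can then be handed to a one-for-one routine such as stripDiacritics to deal with the remaining accented letters.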
