How can I change extended latin characters to their unaccented ASCII equivalents?

萌比男神i 2021-01-18 09:50

I need a generic transliteration or substitution regex that will map extended Latin characters to similar-looking ASCII characters, and all other extended characters to ''.

5 answers
  •  清歌不尽
    2021-01-18 10:42

    When I want to translate whole strings, not only single characters, I use this approach:

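    use utf8;  # the source contains literal non-ASCII characters

    # Map each character (or longer string) to its replacement;
    # the last three entries map to '', i.e. they are deleted.
    my %trans = (
      'é' => 'e',
      'ê' => 'e',
      'á' => 'a',
      'ç' => 'c',
      'Ď' => 'D',
      map +($_=>''), qw(‡ Ω ‰)
    );

    # Build one alternation that matches any key of %trans literally.
    my $alternation = join '|', map quotemeta, keys %trans;
    my $re = qr/$alternation/;

    # Replace every match in $_ with its mapped value.
    s/($re)/$trans{$1}/ge;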
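
    To illustrate the "strings, not only chars" point, here is a minimal sketch in which some replacements are longer than one character. The table entries and the to_ascii name are made up for illustration, and sorting the keys longest-first only matters if one key is a prefix of another:

    ```perl
    use strict;
    use warnings;
    use utf8;    # the source contains literal non-ASCII characters

    # Hypothetical table: values may be longer than one character.
    my %trans = (
        'ß' => 'ss',
        'Æ' => 'AE',
        'é' => 'e',
        '‰' => '',    # anything mapped to '' is deleted
    );

    # Longest keys first, so a longer key wins over a shorter prefix of it.
    my $alternation = join '|',
        map { quotemeta }
        sort { length($b) <=> length($a) } keys %trans;

    # Return a copy of $text with every key of %trans replaced by its value.
    sub to_ascii {
        my ($text) = @_;
        $text =~ s/($alternation)/$trans{$1}/g;
        return $text;
    }
    ```

    Calling to_ascii on a string such as 'Straße' should then give 'Strasse'; characters that are not keys of the table are left untouched.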
    

    If you need something more complicated, you can use functions instead of string constants as the replacement values; with this approach you can do anything you want. For your case, though, tr should be more efficient:

    tr/éêáçĎ/eeacD/;
    tr/‡Ω‰//d;
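
    As for using functions instead of string constants, a rough sketch could look like the following. The table, the translate name, and the U+XXXX placeholder format are assumptions for illustration only; each value is either a plain string or a code ref that receives the matched text:

    ```perl
    use strict;
    use warnings;
    use utf8;    # the source contains literal non-ASCII characters

    # Hypothetical table: a value is a string or a code ref returning one.
    my %trans = (
        'é' => 'e',
        'ç' => 'c',
        'Ω' => sub { '' },    # drop the character
        '‰' => sub {          # hypothetical: replace with a visible code point tag
            my ($match) = @_;
            return sprintf '<U+%04X>', ord $match;
        },
    );

    my $alternation = join '|', map { quotemeta } keys %trans;

    # Return a copy of $text with every key replaced by its string value,
    # or by the result of calling its code ref on the matched text.
    sub translate {
        my ($text) = @_;
        $text =~ s{($alternation)}{
            my $to = $trans{$1};
            ref($to) eq 'CODE' ? $to->($1) : $to;
        }ge;
        return $text;
    }
    ```

    Here the /e modifier is what makes the replacement side run as Perl code, so the lookup can decide per match whether to call a function or return a constant.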
    
