Ruby method to remove accents from UTF-8 international characters

走了就别回头了 2020-12-04 15:00

I am trying to create a 'normalized' copy of a string to help reduce duplicate names in a database. The names contain many international characters (i.e. accented letters).

4 Answers
  • 2020-12-04 15:39

    The parameterize method (added to String by Rails' ActiveSupport) is a simple way to remove special characters so that the string can be used as a human-readable identifier:

    > "Françoise Isaïe".parameterize
    => "francoise-isaie"
    
  • 2020-12-04 15:46

    If you are using Rails, parameterize also accepts a custom separator (a keyword argument in Rails 5 and later):

    my_string = "L'Oréal"
    my_string.parameterize(separator: ' ')
    
  • 2020-12-04 15:56

    I generally use I18n to handle this:

    1.9.3p392 :001 > require "i18n"
     => true
    1.9.3p392 :002 > I18n.transliterate("Hé les mecs!")
     => "He les mecs!"
    
  • 2020-12-04 15:58

    So far the following is the only way I've been able to accomplish what I need:

    str.tr(
    "ÀÁÂÃÄÅàáâãäåĀāĂăĄąÇçĆćĈĉĊċČčÐðĎďĐđÈÉÊËèéêëĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħÌÍÎÏìíîïĨĩĪīĬĭĮįİıĴĵĶķĸĹĺĻļĽľĿŀŁłÑñŃńŅņŇňŉŊŋÒÓÔÕÖØòóôõöøŌōŎŏŐőŔŕŖŗŘřŚśŜŝŞşŠšſŢţŤťŦŧÙÚÛÜùúûüŨũŪūŬŭŮůŰűŲųŴŵÝýÿŶŷŸŹźŻżŽž",
    "AAAAAAaaaaaaAaAaAaCcCcCcCcCcDdDdDdEEEEeeeeEeEeEeEeEeGgGgGgGgHhHhIIIIiiiiIiIiIiIiIiJjKkkLlLlLlLlLlNnNnNnNnnNnOOOOOOooooooOoOoOoRrRrRrSsSsSsSssTtTtTtUUUUuuuuUuUuUuUuUuUuWwYyyYyYZzZzZz")
    

    But using this feels very 'hackish', and I would love to find a better way.
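One possible alternative to the hand-written table, sketched here with only the Ruby standard library (Ruby 2.2 or later): decompose the string with String#unicode_normalize(:nfd), then delete the combining marks. This is not taken from any of the answers in this thread. The helper name strip_accents and the two constants are hypothetical, and the short fallback table is illustrative rather than exhaustive.

```ruby
# Letters such as Ø, Đ and Ł have no canonical decomposition, so
# normalization alone leaves them untouched. This small fallback table
# is an illustrative assumption, not a complete list.
UNDECOMPOSABLE_FROM = "ØøĐđŁł"
UNDECOMPOSABLE_TO   = "OoDdLl"

# Hypothetical helper: returns a copy of str with accents removed.
def strip_accents(str)
  str.unicode_normalize(:nfd)   # "é" becomes "e" + U+0301 (combining acute)
     .gsub(/\p{Mn}/, "")        # delete the nonspacing combining marks
     .tr(UNDECOMPOSABLE_FROM, UNDECOMPOSABLE_TO)
end

puts strip_accents("Françoise Isaïe")  # Francoise Isaie
puts strip_accents("Łódź")             # Lodz
```

Unlike parameterize, this sketch keeps case, spaces, and punctuation, so it would still need a downcase (and possibly whitespace trimming) before being used as a duplicate-detection key.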
