Rails friendly id with non-Latin characters

邮差的信 submitted on 2019-12-01 09:15:58

There's a Rails API method for that: transliterate (ActiveSupport::Inflector#transliterate).

Example use:

transliterate('Ærøskøbing')
# => "AEroskobing"

By default it only supports Latin-based languages and Russian, but you should be able to find rules for other alphabets as well (as explained in the method's documentation).

EDIT
To get the same behaviour as WordPress, you can simply URL-encode the string, as in the example below:

URI::encode('שלום')
# => "%D7%A9%D7%9C%D7%95%D7%9D"
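
One caveat for newer setups: URI::encode (an alias of URI.escape) was removed in Ruby 3.0, so the call above raises NoMethodError there. Below is a minimal sketch of the same percent-encoding using only the standard library. It assumes ERB::Util.url_encode is an acceptable substitute for your input; unlike URI::encode it also escapes reserved characters such as / and ?. The helper name percent_encode is made up for illustration.

```ruby
require "erb"

# Hypothetical helper: percent-encode every byte outside the
# unreserved set (ASCII letters, digits, "-", "_", ".", "~").
def percent_encode(text)
  ERB::Util.url_encode(text)
end

puts percent_encode("שלום")
# prints %D7%A9%D7%9C%D7%95%D7%9D
```

Each Hebrew letter is two bytes in UTF-8, so every letter becomes two %XX escapes.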

Thanks to @michalszyndel's notes and ideas, I arrived at the following solution. I hope it helps other people too.

First, how to get non-Latin characters into the slug:

extend FriendlyId
friendly_id :slug_candidates, :use => :scoped, :scope => :account

def slug_candidates
  :title_and_sequence
end

def title_and_sequence
  # Replace each Hebrew letter with its percent-encoded (URL-encoded) form
  title_unicode = heb_to_unicode(title)

  slug = normalize_friendly_id(title_unicode)
  # ...
  # some logic to add a sequence in case of collision,
  # and whatever else you need from your slug
  # ...
end

def heb_to_unicode(str)
  heb_chars = 'אבגדהוזחטיכךלמםנןסעפףצץקרשת'
  # Map each Hebrew letter to its percent-encoded form.
  # Note: URI::encode was removed in Ruby 3.0; ERB::Util.url_encode(c)
  # produces the same escapes for these letters.
  heb_map = {}
  heb_chars.each_char { |c| heb_map[c] = URI::encode(c) }
  # This regex matches any single Hebrew letter; gsub then looks up
  # each match in heb_map and substitutes its encoded form
  heb_re = Regexp.union(heb_map.keys)

  str.gsub(heb_re, heb_map)
end
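
For comparison, here is a shorter sketch of the same idea that matches the letters with a Unicode range instead of building the map by hand. It is not the code I run in production, just an illustration: the names HEBREW_LETTERS and encode_hebrew are made up, and it uses ERB::Util.url_encode from the standard library so that it also works on Ruby 3.

```ruby
require "erb"

# Alef (U+05D0) through tav (U+05EA): the same 27 letters as heb_chars,
# including the five final forms.
HEBREW_LETTERS = /[\u05D0-\u05EA]/

# Hypothetical helper: percent-encode Hebrew letters only and
# leave every other character untouched.
def encode_hebrew(str)
  str.gsub(HEBREW_LETTERS) { |ch| ERB::Util.url_encode(ch) }
end

puts encode_hebrew("שלום world")
# prints %D7%A9%D7%9C%D7%95%D7%9D world
```

The space and the Latin text are left alone here; turning them into separators is the job of normalize_friendly_id.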

I also needed to override normalize_friendly_id so that it does not strip the % character.
I simply took the code of the parameterize method and added % to the regex:

def normalize_friendly_id(string)
  # replace accented chars with their ascii equivalents
  parameterized_string = I18n.transliterate(string)

  sep = '-'

  # Turn unwanted chars into the separator
  # We permit % so that percent-encoded characters survive in the slug
  parameterized_string.gsub!(/[^a-zA-Z0-9\-_\%]+/, sep)
  unless sep.nil? || sep.empty?
    re_sep = Regexp.escape(sep)
    # No more than one of the separator in a row.
    parameterized_string.gsub!(/#{re_sep}{2,}/, sep)
    # Remove leading/trailing separator.
    parameterized_string.gsub!(/^#{re_sep}|#{re_sep}$/, '')
  end
  parameterized_string.downcase
end
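
To see what these rules do to a string that has already been percent-encoded, here is a stand-alone trace of the same three substitutions. It is only a sketch: it skips the I18n.transliterate step (so it assumes the input is already ASCII), and the method name normalize_ascii_slug is made up.

```ruby
# Hypothetical stand-alone version of the separator rules, without I18n.
def normalize_ascii_slug(string, sep = "-")
  re_sep = Regexp.escape(sep)
  # Anything that is not a letter, digit, "-", "_" or "%" becomes the separator.
  slug = string.gsub(/[^a-zA-Z0-9\-_%]+/, sep)
  # Collapse runs of the separator into one.
  slug = slug.gsub(/#{re_sep}{2,}/, sep)
  # Drop a leading or trailing separator.
  slug = slug.gsub(/\A#{re_sep}|#{re_sep}\z/, "")
  slug.downcase
end

puts normalize_ascii_slug("%D7%A9%D7%9C%D7%95%D7%9D  Hello, World!")
# prints %d7%a9%d7%9c%d7%95%d7%9d-hello-world
```

Note that the final downcase also lowercases the hex digits of the escapes, which is why the lookup key has to be lowercased as well.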

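Now if I save a model with the title שלום, its slug is saved in percent-encoded form: %d7%a9%d7%9c%d7%95%d7%9d (lowercase, because normalize_friendly_id ends with downcase).
To find the instance through the friendly finder, I need to encode and lowercase the incoming id the same way: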

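# Re-encode the incoming id and lowercase it so it matches the stored slug.
# Note: URI::encode was removed in Ruby 3.0; ERB::Util.url_encode is one alternative.
id = URI::encode(params[:id]).downcase
Page.friendly.find(id)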
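
The lookup key can be checked outside Rails. This is a minimal sketch assuming the id reaches the controller as the raw Hebrew string; slug_lookup_key is a made-up name, and ERB::Util.url_encode stands in for URI::encode so that it runs on Ruby 3.

```ruby
require "erb"

# Hypothetical helper: turn the raw id from the request into the
# lowercase percent-encoded form that is stored in the slug column.
def slug_lookup_key(raw_id)
  ERB::Util.url_encode(raw_id).downcase
end

puts slug_lookup_key("שלום")
# prints %d7%a9%d7%9c%d7%95%d7%9d
```

Hyphens and underscores pass through unchanged, so slugs that mix Hebrew and Latin words still match.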