Rails friendly id with non-Latin characters

花落未央 2021-01-19 17:39

I have a model whose friendly id I use as a slug:

extend FriendlyId
friendly_id :slug_candidates, :use => :scoped, :scope => :account

def slug_candi         


        
2 Answers
  • 2021-01-19 18:30

    There's a Rails API method for that: transliterate

    Example use:

    transliterate('Ærøskøbing')
    # => "AEroskobing"
    

    By default it only supports Latin-based languages and Russian, but you should be able to find rules for other alphabets as well (as explained in the linked doc).
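
To make the idea of per-alphabet rules concrete, here is a minimal plain-Ruby sketch of a rule table applied character by character. It is not the I18n API: HEBREW_RULES and transliterate_with_rules are hypothetical names, and the few mappings are illustrative only, not a real transliteration scheme.

```ruby
# Hypothetical, deliberately incomplete Hebrew-to-Latin rule table.
HEBREW_RULES = {
  'ש' => 'sh',
  'ל' => 'l',
  'ו' => 'o',
  'ם' => 'm',
  'מ' => 'm'
}.freeze

# Map each character through the table. ASCII characters pass through
# unchanged; any other character without a rule becomes the replacement.
def transliterate_with_rules(str, rules = HEBREW_RULES, replacement = '?')
  str.each_char.map do |char|
    if rules.key?(char)
      rules[char]
    elsif char.ascii_only?
      char
    else
      replacement
    end
  end.join
end

puts transliterate_with_rules('שלום') # shlom
```

Characters that have no rule fall back to the replacement string, so an incomplete table shows up as question marks in the output.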

    EDIT
    To get the same behaviour as WordPress, you can simply use URL encoding, as in the example below:

    URI::encode('שלום') # => "%D7%A9%D7%9C%D7%95%D7%9D"
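
Note that URI::encode and URI.escape were removed in Ruby 3.0. Below is a minimal sketch of the same percent-encoding on current Ruby, assuming the standard library's URI.encode_www_form_component is an acceptable substitute. Unlike URI::encode, it turns spaces into + and also escapes reserved characters such as /. The helper names percent_encode and percent_decode are hypothetical.

```ruby
require 'uri'

# Percent-encode the UTF-8 bytes of a string.
def percent_encode(str)
  URI.encode_www_form_component(str)
end

# Reverse the encoding to recover the original text.
def percent_decode(str)
  URI.decode_www_form_component(str)
end

puts percent_encode('שלום')                     # %D7%A9%D7%9C%D7%95%D7%9D
puts percent_decode('%D7%A9%D7%9C%D7%95%D7%9D') # שלום
```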
    
  • 2021-01-19 18:34

    Thanks to @michalszyndel's notes and ideas I arrived at the following solution. I hope it helps other people too.

    First, how to get non-Latin characters into the slug:

    extend FriendlyId
    friendly_id :slug_candidates, :use => :scoped, :scope => :account
    
    def slug_candidates
      :title_and_sequence
    end
    
    def title_and_sequence
      # Replace every Hebrew character with its percent-encoded form
      title_unicode = heb_to_unicode(title)
    
      slug = normalize_friendly_id(title_unicode)
          :
      # some logic to add a sequence in case of collision
      # and whatever you need from your slug
          :
    end
    
    def heb_to_unicode(str)
      heb_chars = 'אבגדהוזחטיכךלמםנןסעפףצץקרשת'
      heb_map = {}
      heb_chars.split("").each {|c| heb_map.merge!({c => URI::encode(c)})}
      # This regex matches every Hebrew letter so gsub can swap it for its percent-encoded form
      heb_re = Regexp.new(heb_map.keys.map { |x| Regexp.escape(x) }.join('|'))
    
      return str.gsub(heb_re, heb_map)
    end
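
As an alternative to listing the letters by hand, Ruby regular expressions can match a whole Unicode script with \p{Hebrew}. The following is a minimal sketch under that assumption. percent_encode_hebrew is a hypothetical name, and URI.encode_www_form_component stands in for URI::encode, which no longer exists in Ruby 3.0 and later.

```ruby
require 'uri'

# Percent-encode only the characters that belong to the Hebrew script,
# leaving everything else (Latin letters, digits, spaces) untouched.
def percent_encode_hebrew(str)
  str.gsub(/\p{Hebrew}/) { |char| URI.encode_www_form_component(char) }
end

puts percent_encode_hebrew('abc שלום') # abc %D7%A9%D7%9C%D7%95%D7%9D
```

The script property also covers Hebrew vowel points and final letter forms, so the character list does not need to be maintained manually.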
    

    I also needed to modify normalize_friendly_id to stop it from stripping the %.
    I simply took the code of the parameterize method and added % to the regex:

    def normalize_friendly_id(string)
      # replace accented chars with their ascii equivalents
      parameterized_string = I18n.transliterate(string)
    
      sep = '-'
    
      # Turn unwanted chars into the separator
      # We permit % in order to allow unicode in slug
      parameterized_string.gsub!(/[^a-zA-Z0-9\-_\%]+/, sep)
      unless sep.nil? || sep.empty?
        re_sep = Regexp.escape(sep)
        # No more than one of the separator in a row.
        parameterized_string.gsub!(/#{re_sep}{2,}/, sep)
        # Remove leading/trailing separator.
        parameterized_string.gsub!(/^#{re_sep}|#{re_sep}$/, '')
      end
      parameterized_string.downcase
    end
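
To see why the % has to be in the character class, compare the two patterns on an already-encoded string. This is a small sketch, and strip_strict and strip_keep_percent are hypothetical names used only for the comparison.

```ruby
# Character class as in the stock parameterize logic: % is not allowed,
# so every percent sign is turned into the separator.
def strip_strict(str, sep = '-')
  str.gsub(/[^a-zA-Z0-9\-_]+/, sep)
end

# Same pattern with % permitted, so percent-encoded bytes survive.
def strip_keep_percent(str, sep = '-')
  str.gsub(/[^a-zA-Z0-9\-_%]+/, sep)
end

encoded = '%D7%A9%D7%9C-my title'
puts strip_strict(encoded)       # -D7-A9-D7-9C-my-title
puts strip_keep_percent(encoded) # %D7%A9%D7%9C-my-title
```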
    

    Now if I save a model with the title שלום, its slug is saved in percent-encoded form, %d7%a9%d7%9c%d7%95%d7%9d (lowercase, because normalize_friendly_id ends with downcase).
    To find the instance with the friendly finder, I need to encode and downcase the id first:

    id = URI::encode(params[:id]).downcase
    Page.friendly.find(id)
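
The lookup works because the request parameter is re-encoded into the same form as the stored slug. Here is a minimal sketch of that round trip without Rails, assuming the parameter arrives already decoded and the slug was stored in lowercase, as the final downcase in normalize_friendly_id suggests. STORED_SLUGS and lookup_key are hypothetical stand-ins for the slug column and the controller code, and URI.encode_www_form_component replaces URI::encode, which was removed in Ruby 3.0.

```ruby
require 'uri'

# Hypothetical stand-in for the slug column of the pages table.
STORED_SLUGS = { '%d7%a9%d7%9c%d7%95%d7%9d' => :hebrew_page }.freeze

# Rebuild the stored form from an already-decoded request parameter.
def lookup_key(decoded_id)
  URI.encode_www_form_component(decoded_id).downcase
end

puts lookup_key('שלום')                 # %d7%a9%d7%9c%d7%95%d7%9d
p STORED_SLUGS[lookup_key('שלום')]      # :hebrew_page
```

Without the downcase, the uppercase hex digits produced by the encoder would not match the stored value.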
    