Best way to create SEO friendly URI string

-上瘾入骨i 2021-02-14 18:55

The method should allow only the characters "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-" in URI strings.

What is the best way to make an SEO-friendly URI string?

3 Answers
  • 2021-02-14 19:15

    The following regex will do the same thing as your algorithm. I'm not aware of libraries for doing this type of thing.

    String s = input
    .replaceAll(" ?- ?","-") // remove spaces around hyphens
    .replaceAll("[ ']","-") // turn spaces and quotes into hyphens
    .replaceAll("[^0-9a-zA-Z-]",""); // remove everything not in our allowed char set
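
    For anyone who wants to try this out, here is a minimal runnable sketch wrapping the three replacements above. The class name `RegexSlug` and the method name `toSlug` are made up for illustration; only the `replaceAll` calls come from the answer.

    ```java
    final class RegexSlug {
        // Hypothetical helper name; the three replaceAll calls are the ones from the answer.
        static String toSlug(String input) {
            return input
                .replaceAll(" ?- ?", "-")          // remove spaces around hyphens
                .replaceAll("[ ']", "-")           // turn spaces and quotes into hyphens
                .replaceAll("[^0-9a-zA-Z-]", "");  // drop everything outside the allowed set
        }

        public static void main(String[] args) {
            // prints "Best-way-to-create-SEO-friendly-URI-s"
            System.out.println(toSlug("Best way - to create SEO friendly URI's!"));
        }
    }
    ```

    Note that this version keeps the original casing and does not collapse repeated separators, so two consecutive spaces come out as "--".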
    
  • 2021-02-14 19:17

    These are commonly called "slugs" if you want to search for more information.

    You may want to check out other answers such as "How can I create a SEO friendly dash-delimited url from a string?" and "How to make Django slugify work properly with Unicode strings?"

    They cover C# and Python more than JavaScript, but they include some language-agnostic discussion of slug conventions and the issues you may face when generating slugs (uniqueness, Unicode normalization problems, etc.).
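
    To make the uniqueness issue concrete, here is one possible sketch in Java. It assumes you can keep track of the slugs already in use in a `Set`; in a real application that check would more likely be a database lookup. The class `UniqueSlugs` and the method `uniqueSlug` are hypothetical names, not taken from any library.

    ```java
    import java.util.HashSet;
    import java.util.Set;

    final class UniqueSlugs {
        // Hypothetical helper: returns base if it is unused, otherwise base-2, base-3, ...
        // The chosen slug is recorded in `taken` as a side effect.
        static String uniqueSlug(String base, Set<String> taken) {
            String candidate = base;
            int suffix = 2;
            while (!taken.add(candidate)) {   // add() returns false if already present
                candidate = base + "-" + suffix++;
            }
            return candidate;
        }

        public static void main(String[] args) {
            Set<String> taken = new HashSet<>();
            System.out.println(uniqueSlug("my-post", taken)); // my-post
            System.out.println(uniqueSlug("my-post", taken)); // my-post-2
            System.out.println(uniqueSlug("my-post", taken)); // my-post-3
        }
    }
    ```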

  • 2021-02-14 19:30

    This is what the general consensus is:

    1. Lowercase the string.

      string = string.toLowerCase();
      
    2. Normalize all characters and get rid of all diacritical marks (so that e.g. é, ö, à become e, o, a).

      string = Normalizer.normalize(string, Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
      
    3. Replace each run of remaining non-alphanumeric characters with a single -.

      string = string.replaceAll("[^\\p{Alnum}]+", "-");
      

    So, summarized:

    // Requires: import java.text.Normalizer;
    //           import java.text.Normalizer.Form;
    public static String toPrettyURL(String string) {
        return Normalizer.normalize(string.toLowerCase(), Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
            .replaceAll("[^\\p{Alnum}]+", "-");
    }
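
    If you want something you can compile and run directly, here is a self-contained sketch of the method above. Two things in it are my own assumptions rather than part of the steps: it lowercases with `Locale.ROOT` so the result does not depend on the JVM's default locale, and it trims a leading or trailing hyphen, which the three steps alone leave in place when the input starts or ends with punctuation or whitespace. The class name `PrettyUrl` is made up.

    ```java
    import java.text.Normalizer;
    import java.text.Normalizer.Form;
    import java.util.Locale;

    final class PrettyUrl {
        static String toPrettyURL(String string) {
            return Normalizer.normalize(string.toLowerCase(Locale.ROOT), Form.NFD) // assumption: fixed locale
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "") // strip the decomposed accents
                .replaceAll("[^\\p{Alnum}]+", "-")                   // each other run becomes one hyphen
                .replaceAll("^-|-$", "");                            // assumption: trim edge hyphens
        }

        public static void main(String[] args) {
            // Input is "  Crème Brûlée: the Best Recipe!  ", written with Unicode escapes.
            // prints "creme-brulee-the-best-recipe"
            System.out.println(toPrettyURL("  Cr\u00e8me Br\u00fbl\u00e9e: the Best Recipe!  "));
        }
    }
    ```

    Keep in mind that `\p{Alnum}` matches only ASCII letters and digits by default, so letters that NFD cannot decompose (ø or ß, for example) are replaced by a hyphen rather than transliterated.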
    