How does Stack Overflow generate its SEO-friendly URLs?

-上瘾入骨i 2020-11-22 04:27

What is a good complete regular expression, or some other process, that would take a title such as:

How do you change a title to be part of the URL like Stack Overflow?

and turn it into an SEO-friendly URL slug?

21 answers
  •  青春惊慌失措
    2020-11-22 05:08

    No, no, no. You are all so very wrong. Except for the diacritics-fu stuff, you're getting there, but what about Asian characters (shame on Ruby developers for not considering their nihonjin brethren).

    Firefox and Safari both display non-ASCII characters in the URL, and frankly they look great. It is nice to support links like 'http://somewhere.com/news/read/お前たちはアホじゃないかい'.
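As a small illustration of what sits behind that display: as far as I know, browsers show the readable characters in the address bar while the request path itself carries the percent-encoded UTF-8 bytes. The sketch below uses PHP's built-in `rawurlencode()` and `rawurldecode()`; the sample slug is borrowed from the example further down, and the variable names are only illustrative.

```php
<?php
// Illustrative only: shows the percent-encoded form of a non-ASCII slug.
$slug = 'コリン';

// rawurlencode() percent-encodes each UTF-8 byte (RFC 3986 style).
$encoded = rawurlencode($slug);
echo $encoded, "\n";                // %E3%82%B3%E3%83%AA%E3%83%B3

// Decoding restores the readable form that the browser displays.
echo rawurldecode($encoded), "\n";  // コリン
```

So a non-ASCII slug costs nothing in readability for the visitor, although the raw URL is longer when copied into plain-text contexts.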

    So here's some PHP code that builds slugs like that, but I just wrote it and haven't stress-tested it.

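        // NOTE: the opening of this function was lost in extraction. Everything down to
        // the first "if" is a reconstruction inferred from the surviving code and the
        // example below, not the author's verbatim text.
        function slug($str)
        {
            $real_slug = '';   // slug built so far
            $hyphen = '';      // separator waiting to be emitted before the next kept character
            // Split the UTF-8 string into individual characters.
            foreach (preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) as $c)
            {
                // One character stored in more than one byte, i.e. non-ASCII: keep it as-is.
                if (strlen($c) > 1 && mb_strlen($c) === 1)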
                {
                    $real_slug .= $hyphen . $c;
                    $hyphen = '';
                }
                else
                {
                    switch($c)
                    {
                        case '&':
                            // Compare against '' explicitly: a slug of "0" is falsy in PHP.
                            $hyphen = $real_slug !== '' ? '-and-' : '';
                            break;
                        case 'a':
                        case 'b':
                        case 'c':
                        case 'd':
                        case 'e':
                        case 'f':
                        case 'g':
                        case 'h':
                        case 'i':
                        case 'j':
                        case 'k':
                        case 'l':
                        case 'm':
                        case 'n':
                        case 'o':
                        case 'p':
                        case 'q':
                        case 'r':
                        case 's':
                        case 't':
                        case 'u':
                        case 'v':
                        case 'w':
                        case 'x':
                        case 'y':
                        case 'z':
    
                        case 'A':
                        case 'B':
                        case 'C':
                        case 'D':
                        case 'E':
                        case 'F':
                        case 'G':
                        case 'H':
                        case 'I':
                        case 'J':
                        case 'K':
                        case 'L':
                        case 'M':
                        case 'N':
                        case 'O':
                        case 'P':
                        case 'Q':
                        case 'R':
                        case 'S':
                        case 'T':
                        case 'U':
                        case 'V':
                        case 'W':
                        case 'X':
                        case 'Y':
                        case 'Z':
    
                        case '0':
                        case '1':
                        case '2':
                        case '3':
                        case '4':
                        case '5':
                        case '6':
                        case '7':
                        case '8':
                        case '9':
                            $real_slug .= $hyphen . $c;
                            $hyphen = '';
                            break;
    
                        default:
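                            // Any other character is a separator. Keep a pending '-and-' if
                            // there is one, and never start the slug with a hyphen.
                            $hyphen = $hyphen !== '' ? $hyphen : ($real_slug !== '' ? '-' : '');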
                    }
                }
            }
            return $real_slug;
        }
    

    Example:

    $str = "~!@#$%^&*()_+-=[]\{}|;':\",./<>?\n\r\t\x07\x00\x04 コリン ~!@#$%^&*()_+-=[]\{}|;':\",./<>?\n\r\t\x07\x00\x04 トーマス ~!@#$%^&*()_+-=[]\{}|;':\",./<>?\n\r\t\x07\x00\x04 アーノルド ~!@#$%^&*()_+-=[]\{}|;':\",./<>?\n\r\t\x07\x00\x04";
    echo slug($str);
    

    Outputs: コリン-and-トーマス-and-アーノルド

    The '-and-' appears because any run of separator characters that contains an '&' is replaced with '-and-'.
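The same rule can be sketched more compactly with regular expressions. This is a hypothetical variant, not the code from the answer: `slug_compact()` is an illustrative name, it assumes the input is valid UTF-8, and it treats every non-ASCII character as part of the slug, just as the loop does.

```php
<?php
// Hypothetical compact variant of the loop above; slug_compact() is an
// illustrative name, not part of the original answer.
// Assumes $str is valid UTF-8 (preg_* with /u returns null otherwise).
function slug_compact(string $str): string
{
    // Characters that are kept: ASCII letters, digits, and anything non-ASCII.
    $keep = 'A-Za-z0-9\x{80}-\x{10FFFF}';

    // Drop separator runs at the very start and very end of the string.
    $trimmed = (string) preg_replace(
        '/\A[^' . $keep . ']+|[^' . $keep . ']+\z/u',
        '',
        $str
    );

    // Collapse each remaining separator run: '-and-' if it holds an '&', else '-'.
    return (string) preg_replace_callback(
        '/[^' . $keep . ']+/u',
        function (array $m): string {
            return strpos($m[0], '&') !== false ? '-and-' : '-';
        },
        $trimmed
    );
}

echo slug_compact('Tom & Jerry: the "movie"'), "\n"; // Tom-and-Jerry-the-movie
```

The character-by-character loop is easier to extend with special cases, while the regex form is shorter; which one reads better is a matter of taste.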
