How does Stack Overflow generate its SEO-friendly URLs?

-上瘾入骨i 2020-11-22 04:27

What is a good complete regular expression or some other process that would take the title:

How do you change a title to be part of the URL like Stack

21 answers
  •  别跟我提以往
    2020-11-22 05:10

    I liked the way this is done without using regular expressions, so I ported it to PHP. I just added a function called is_between to check characters:

    function is_between($val, $min, $max)
    {
        $val = (int) $val; $min = (int) $min; $max = (int) $max;
    
        return ($val >= $min && $val <= $max);
    }
    
    function international_char_to_ascii($char)
    {
        if (mb_strpos('àåáâäãą', $char) !== false)
        {
            return 'a';
        }
    
        if (mb_strpos('èéêëę', $char) !== false)
        {
            return 'e';
        }
    
        if (mb_strpos('ìíîïı', $char) !== false)
        {
            return 'i';
        }
    
        if (mb_strpos('òóôõö', $char) !== false)
        {
            return 'o';
        }
    
        if (mb_strpos('ùúûüŭů', $char) !== false)
        {
            return 'u';
        }
    
        if (mb_strpos('çćčĉ', $char) !== false)
        {
            return 'c';
        }
    
        if (mb_strpos('żźž', $char) !== false)
        {
            return 'z';
        }
    
        if (mb_strpos('śşšŝ', $char) !== false)
        {
            return 's';
        }
    
        if (mb_strpos('ñń', $char) !== false)
        {
            return 'n';
        }
    
        if (mb_strpos('ýÿ', $char) !== false)
        {
            return 'y';
        }
    
        if (mb_strpos('ğĝ', $char) !== false)
        {
            return 'g';
        }
    
        if (mb_strpos('ř', $char) !== false)
        {
            return 'r';
        }
    
        if (mb_strpos('ł', $char) !== false)
        {
            return 'l';
        }
    
        if (mb_strpos('đ', $char) !== false)
        {
            return 'd';
        }
    
        if (mb_strpos('ß', $char) !== false)
        {
            return 'ss';
        }
    
        // The caller lowercases its input first, so match the lowercase thorn as well.
        if (mb_strpos('Þþ', $char) !== false)
        {
            return 'th';
        }
    
        if (mb_strpos('ĥ', $char) !== false)
        {
            return 'h';
        }
    
        if (mb_strpos('ĵ', $char) !== false)
        {
            return 'j';
        }
        return '';
    }
    
    function url_friendly_title($url_title)
    {
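        // empty() would also reject the valid title "0", so test for null and '' explicitly.
        if ($url_title === null || $url_title === '')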
        {
            return '';
        }
    
        $url_title = mb_strtolower($url_title);
    
        $url_title_max_length   = 80;
        $url_title_length       = mb_strlen($url_title);
        $url_title_friendly     = '';
        $url_title_dash_added   = false;
        $url_title_char = '';
    
        for ($i = 0; $i < $url_title_length; $i++)
        {
            $url_title_char     = mb_substr($url_title, $i, 1);
    
            if (strlen($url_title_char) == 2)
            {
                $url_title_ascii    = ord($url_title_char[0]) * 256 + ord($url_title_char[1]);
            }
            else
            {
                $url_title_ascii    = ord($url_title_char);
            }
    
            if (is_between($url_title_ascii, 97, 122) || is_between($url_title_ascii, 48, 57))
            {
                $url_title_friendly .= $url_title_char;
    
                $url_title_dash_added = false;
            }
            elseif(is_between($url_title_ascii, 65, 90))
            {
                $url_title_friendly .= chr(($url_title_ascii | 32));
    
                $url_title_dash_added = false;
            }
            elseif (in_array($url_title_ascii, [32, 44, 46, 47, 92, 45, 95, 61])) // space , . / \ - _ =
            {
                if (!$url_title_dash_added && mb_strlen($url_title_friendly) > 0)
                {
                    $url_title_friendly .= chr(45);
    
                    $url_title_dash_added = true;
                }
            }
            else if ($url_title_ascii >= 128)
            {
                $url_title_previous_length = mb_strlen($url_title_friendly);
    
                $url_title_friendly .= international_char_to_ascii($url_title_char);
    
                if ($url_title_previous_length != mb_strlen($url_title_friendly))
                {
                    $url_title_dash_added = false;
                }
            }
    
            if ($i == $url_title_max_length)
            {
                break;
            }
        }
    
        if ($url_title_dash_added)
        {
            return mb_substr($url_title_friendly, 0, -1);
        }
        else
        {
            return $url_title_friendly;
        }
    }
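
For comparison with the regular-expression route the question asks about, here is a minimal sketch of the same idea using `strtr` and `preg_replace`. The function name `slugify_with_regex` and its small character map are illustrative assumptions, not part of the port above, and the map covers only a handful of characters.

```php
<?php
// Hypothetical regex-based variant; the function name and the small
// character map are illustrative assumptions, not part of the port above.
function slugify_with_regex(string $title, int $max_length = 80): string
{
    // Illustrative subset only; extend with the characters you expect.
    $map = ['à' => 'a', 'é' => 'e', 'è' => 'e', 'ö' => 'o', 'ü' => 'u', 'ñ' => 'n', 'ß' => 'ss'];

    $slug = strtr(mb_strtolower($title, 'UTF-8'), $map);
    // Drop everything that is neither a-z, 0-9, nor one of the separators.
    $slug = preg_replace('~[^a-z0-9 ,./\\\\_=-]+~', '', $slug);
    // Collapse each run of separators into a single dash.
    $slug = preg_replace('~[ ,./\\\\_=-]+~', '-', $slug);
    // Cut to length, then strip any leading or trailing dash.
    return trim(substr($slug, 0, $max_length), '-');
}

echo slugify_with_regex('How do you change a title to be part of the URL like Stack Overflow?'), "\n";
// how-do-you-change-a-title-to-be-part-of-the-url-like-stack-overflow
```

Unlike the loop above, this sketch truncates by output length rather than by input position, and it silently drops any accented character missing from its map, so treat it as a starting point rather than a drop-in replacement.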
    
