Why does PHP substr() show garbled characters when cutting text?

夕颜 2020-12-31 16:48

I'm using the substr() function to limit the number of characters in strings, but sometimes the output contains garbled characters, question marks, and so on.

4 Answers
  • 2020-12-31 17:12

    Just to extend Gurmbo's answer: using mb_substr will solve your problem, but if the cut lands in the middle of an HTML entity, the output still ends with a broken fragment of that entity. After some research I found that WordPress has a function, wp_html_excerpt, that handles this case.

    wp_html_excerpt strips tags, cuts by characters with mb_substr, and then removes any partial entity left at the end of the excerpt.

    Here is the source code from WordPress.

    /**
     * Safely extracts not more than the first $count characters from html string.
     *
     * UTF-8, tags and entities safe prefix extraction. Entities inside will *NOT*
     * be counted as one character. For example &amp; will be counted as 4, &lt; as
     * 3, etc.
     *
     * @since 2.5.0
     *
     * @param string $str   String to get the excerpt from.
     * @param int    $count Maximum number of characters to take.
     * @param string $more  Optional. What to append if $str needs to be trimmed. Defaults to empty string.
     * @return string The excerpt.
     */
    function wp_html_excerpt( $str, $count, $more = null ) {
        if ( null === $more )
            $more = '';
        $str = wp_strip_all_tags( $str, true );
        $excerpt = mb_substr( $str, 0, $count );
        // remove part of an entity at the end
        $excerpt = preg_replace( '/&[^;\s]{0,6}$/', '', $excerpt );
        if ( $str != $excerpt )
            $excerpt = trim( $excerpt ) . $more;
        return $excerpt;
    }
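
    The function above depends on WordPress (wp_strip_all_tags). As a rough stand-alone sketch of the same idea, assuming the mbstring extension is available, the built-in strip_tags() can stand in for it. The name html_excerpt_sketch is hypothetical and not part of WordPress.

    ```php
    <?php
    // Hypothetical stand-alone variant of the idea above.
    // Assumption: strip_tags() is an acceptable substitute for wp_strip_all_tags().
    function html_excerpt_sketch(string $str, int $count, string $more = ''): string
    {
        $str = trim(strip_tags($str));

        // Cut by characters, not bytes.
        $excerpt = mb_substr($str, 0, $count, 'UTF-8');

        // Drop a partial entity such as "&am" left at the cut point.
        $excerpt = preg_replace('/&[^;\s]{0,6}$/', '', $excerpt) ?? $excerpt;

        // Append $more only if something was actually trimmed.
        if ($str !== $excerpt) {
            $excerpt = trim($excerpt) . $more;
        }
        return $excerpt;
    }

    echo html_excerpt_sketch('<p>Fish &amp; chips</p>', 8, '...'); // prints "Fish..."
    ```

    Here the cut at 8 characters would leave "Fish &am"; the regular expression removes the dangling "&am" before the suffix is appended.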
    
  • 2020-12-31 17:24

    Because you are cutting multibyte characters in half.

    Use mb_substr for multibyte character encodings like UTF-8. substr just counts bytes while mb_substr counts characters.
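
    A small sketch of the difference, assuming the mbstring extension is enabled. The sample string is made up for illustration; "é" is written as its two UTF-8 bytes so the result does not depend on how the file is saved.

    ```php
    <?php
    // "Café au lait" with "é" spelled out as its two UTF-8 bytes (0xC3 0xA9).
    $text = "Caf\xC3\xA9 au lait";

    // substr() counts bytes: 4 bytes ends in the middle of "é".
    $byBytes = substr($text, 0, 4);

    // mb_substr() counts characters when told the encoding.
    $byChars = mb_substr($text, 0, 4, 'UTF-8');

    var_dump(mb_check_encoding($byBytes, 'UTF-8')); // bool(false): broken byte sequence
    var_dump($byChars === "Caf\xC3\xA9");           // bool(true): "Café" intact
    ```

    The broken sequence in $byBytes is what browsers typically render as a question mark or replacement character.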

  • 2020-12-31 17:27

    The reason is that your text is UTF-8, which is a multibyte encoding, while substr() works on single bytes only, so it can cut a character in the middle. htmlentities() has nothing to do with it.

    You should use mb_substr() (http://php.net/manual/en/function.mb-substr.php) and the other multibyte string functions.

  • 2020-12-31 17:34

    If you have encoding problems, you can also apply the html_entity_decode() function, which converts all HTML entities to their corresponding characters. For example:

    echo substr(html_entity_decode($string_to_cut), 0, 28) . "...";
    

    That should also work.
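
    One caveat worth checking: in current PHP versions html_entity_decode() typically produces UTF-8, so substr() on its result can still split a multibyte character. A hedged variant, assuming UTF-8 text and the mbstring extension, decodes first and then cuts with mb_substr(). The sample string and the variable names are illustrative only.

    ```php
    <?php
    // Illustrative input containing HTML entities.
    $stringToCut = 'Cr&egrave;me br&ucirc;l&eacute;e &amp; caf&eacute;';

    // Decode first so that each entity becomes a single character.
    $decoded = html_entity_decode($stringToCut, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Cut by characters, not bytes.
    $cut = mb_substr($decoded, 0, 12, 'UTF-8');

    // Re-escape before printing into an HTML page.
    echo htmlspecialchars($cut, ENT_QUOTES, 'UTF-8') . '...'; // prints "Crème brûlée..."
    ```

    Decoding before cutting also means an entity is never split at the cut point, so no trailing fragment like "&am" is left behind.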
