UTF-8 problems while reading CSV file with fgetcsv

[愿得一人] 2020-12-02 22:45

I am trying to read a CSV file and echo its content, but the characters are displayed incorrectly.

Mäx Müstermänn -> Mäx Müstermänn

Encoding of the CSV file is UTF-8.

6 answers
  • 2020-12-02 23:07

    In my case the source file is encoded as Windows-1250, and iconv printed tons of notices about illegal characters in the input string.

    So this solution helped me a lot:

    /**
     * getting CSV array with UTF-8 encoding
     *
     * @param   resource    &$handle
     * @param   integer     $length
     * @param   string      $separator
     *
     * @return  array|false
     */
    private function fgetcsvUTF8(&$handle, $length, $separator = ';')
    {
        if (($buffer = fgets($handle, $length)) !== false)
        {
            $buffer = $this->autoUTF($buffer);
            return str_getcsv($buffer, $separator);
        }
        return false;
    }
    
    /**
     * automatic conversion of a WINDOWS-1250 or ISO-8859-2 string into UTF-8
     *
     * @param   string  $s
     *
     * @return  string
     */
    private function autoUTF($s)
    {
        // detect UTF-8
        if (preg_match('#[\x80-\x{1FF}\x{2000}-\x{3FFF}]#u', $s))
            return $s;
    
        // detect WINDOWS-1250
        if (preg_match('#[\x7F-\x9F\xBC]#', $s))
            return iconv('WINDOWS-1250', 'UTF-8', $s);
    
        // assume ISO-8859-2
        return iconv('ISO-8859-2', 'UTF-8', $s);
    }
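
    If the source encoding is known in advance, a different technique from the per-line detection above is to let PHP transcode the stream while it is read, so that fgetcsv can be used directly. A minimal sketch, assuming the iconv extension is available; readCsvAsUtf8 is a hypothetical helper name and the sample bytes are made up, neither is part of the original answer:

    ```php
    <?php
    /**
     * Hypothetical helper: read a whole CSV stream in a known single-byte
     * encoding and return its rows as UTF-8 strings.
     */
    function readCsvAsUtf8($handle, string $sourceEncoding, string $separator = ';'): array
    {
        // Built-in convert.iconv.* stream filter: transcodes data as it is read.
        stream_filter_append($handle, "convert.iconv.$sourceEncoding/UTF-8", STREAM_FILTER_READ);

        $rows = [];
        // Length 0 means "no line length limit"; enclosure and escape are passed explicitly.
        while (($row = fgetcsv($handle, 0, $separator, '"', '\\')) !== false) {
            $rows[] = $row;
        }
        return $rows;
    }

    // Usage with an in-memory stream holding WINDOWS-1250 bytes (0xE4 = ä, 0xFC = ü).
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, "M\xE4x;M\xFCsterm\xE4nn\n");
    rewind($handle);
    $rows = readCsvAsUtf8($handle, 'WINDOWS-1250');
    fclose($handle);
    ```

    Unlike autoUTF, this does not guess: a file in another encoding would be converted wrongly, so it only fits when the encoding is certain.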
    

    Response to @manvel's answer - use str_getcsv instead of explode - because of cases like this:

    some;nice;value;"and;here;comes;combinated;value";and;some;others
    

    explode will split the string into these parts:

    some
    nice
    value
    "and
    here
    comes
    combinated
    value"
    and
    some
    others
    

    but str_getcsv keeps the quoted value together and splits the string into these parts:

    some
    nice
    value
    and;here;comes;combinated;value
    and
    some
    others
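
    The difference can be reproduced with a few lines. A small sketch using the sample line from above; the variable names are only for illustration:

    ```php
    <?php
    $line = 'some;nice;value;"and;here;comes;combinated;value";and;some;others';

    // explode() knows nothing about CSV quoting, so the quoted field is cut apart.
    $exploded = explode(';', $line);

    // str_getcsv() honours the enclosure character and keeps the quoted field whole.
    // Enclosure and escape are passed explicitly rather than relying on defaults.
    $parsed = str_getcsv($line, ';', '"', '\\');

    echo count($exploded), " fields from explode()\n";
    echo count($parsed), " fields from str_getcsv()\n";
    echo $parsed[3], "\n";
    ```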
    
  • 2020-12-02 23:08

    Try putting this into the top of your file (before any other output):

    <?php
    
    header('Content-Type: text/html; charset=UTF-8');
    
    ?>
    
  • 2020-12-02 23:10

    The problem is that fgetcsv does not convert anything: the strings it returns are reported as UTF-8 (you can check with mb_detect_encoding) and are treated as UTF-8, although the bytes are still in the file's original encoding. Therefore you need to convert them from the original encoding (Windows-1251, i.e. CP1251) to UTF-8 with iconv. Since fgetcsv returns an array, I suggest writing a custom function that converts each line before splitting it: [Sorry for my English]

    function customfgetcsv(&$handle, $length, $separator = ';'){
        if (($buffer = fgets($handle, $length)) !== false) {
            return explode($separator, iconv("CP1251", "UTF-8", $buffer));
        }
        return false;
    }
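
    The need for the conversion can be checked on a small sample. A sketch, assuming the mbstring and iconv extensions are loaded; the byte string is a made-up example (the word "Привет" in CP1251), not data from the question, and mb_check_encoding is used instead of mb_detect_encoding because it validates the bytes rather than guessing:

    ```php
    <?php
    // "Привет" encoded as CP1251 (Windows-1251): one byte per letter.
    $raw = "\xCF\xF0\xE8\xE2\xE5\xF2";

    // The raw bytes are not a valid UTF-8 sequence ...
    $validBefore = mb_check_encoding($raw, 'UTF-8');

    // ... so they have to be converted from the original encoding to UTF-8.
    $converted = iconv('CP1251', 'UTF-8', $raw);
    $validAfter = mb_check_encoding($converted, 'UTF-8');

    var_dump($validBefore, $validAfter);
    echo $converted, "\n";
    ```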
    
  • 2020-12-02 23:11

    Now I got it working (after removing the header command). I think the problem was that the PHP file itself was saved as ISO-8859-1. I set it to UTF-8 without BOM. I thought I had already done that, but perhaps I undid it by accident.

    Furthermore, I used SET NAMES 'utf8' for the database. Now it is also correct in the database.

  • 2020-12-02 23:18

    I encountered a similar problem: parsing a CSV file with special characters like é, è, ö, etc.

    The following worked fine for me:

    To display the characters correctly on the HTML page, this header was needed:

    header('Content-Type: text/html; charset=UTF-8');
    

    In order to parse every character correctly, I used:

    utf8_encode(fgets($file));
    

    Don't forget to use the multibyte string functions in all subsequent string operations, for example:

    mb_strtolower($value, 'UTF-8');
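
    Note that utf8_encode only converts from ISO-8859-1 and is deprecated as of PHP 8.2. A sketch of the same steps with mb_convert_encoding, assuming the input really is ISO-8859-1 and the mbstring extension is loaded; the sample bytes are made up for illustration:

    ```php
    <?php
    // "Émile" as ISO-8859-1 bytes (0xC9 = É).
    $latin1 = "\xC9mile";

    // Same conversion that utf8_encode() performs, with the source encoding stated explicitly.
    $utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

    // Byte-oriented functions count bytes; multibyte functions count characters.
    $bytes = strlen($utf8);               // "É" takes two bytes in UTF-8
    $chars = mb_strlen($utf8, 'UTF-8');

    $lower = mb_strtolower($utf8, 'UTF-8');

    echo "$bytes bytes, $chars characters, lowercased: $lower\n";
    ```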
    
  • 2020-12-02 23:30

    Try this:

    <?php
    $handle = fopen("specialchars.csv", "r");
    echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr>';
    while (($data = fgetcsv($handle, 1000, ";")) !== false) {
        // added: convert every field from ISO-8859-1 to UTF-8
        $data = array_map("utf8_encode", $data);
        echo "<tr>";
        foreach ($data as $field) {
            // output data
            echo "<td>$field</td>";
        }
        echo "</tr>";
    }
    echo "</table>";
    fclose($handle);
    ?>
    