I am programmatically exporting data (using PHP 5.2) into a .csv test file.
Example data: Numéro 1 (note the accented e).
The data is UTF-8.
Prepending a BOM (\uFEFF) worked for me (Excel 2007), in that Excel recognised the file as UTF-8. Otherwise, saving it and using the import wizard works, but is less ideal.
Echo the UTF-8 BOM before outputting the CSV data. This fixes all character issues on Windows but doesn't work on Mac.
echo "\xEF\xBB\xBF";
It works for me because I need to generate a file which will be used on Windows PCs only.
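A minimal sketch of the BOM approach, assuming the rows are already UTF-8 strings. The helper name buildCsvWithBom and the sample rows are made up for illustration; fputcsv takes care of quoting, and the CSV is built in a php://temp stream so it can be inspected before it is sent.

```php
<?php
// Hypothetical helper: builds a UTF-8 CSV string that starts with a BOM.
function buildCsvWithBom($rows) {
    $fh = fopen('php://temp', 'r+');
    fwrite($fh, "\xEF\xBB\xBF");    // UTF-8 BOM, written once at the very start
    foreach ($rows as $row) {
        fputcsv($fh, $row);         // encloses and escapes fields as needed
    }
    rewind($fh);
    $csv = stream_get_contents($fh);
    fclose($fh);
    return $csv;
}

// Sample data (assumption); "\xC3\xA9" is "é" in UTF-8.
$rows = array(
    array('id', 'label'),
    array(1, "Num\xC3\xA9ro 1"),
);
$csv = buildCsvWithBom($rows);
```

In a download script you would echo $csv after sending the download headers. As noted in this thread, this is likely to help only on Windows versions of Excel.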
Open the CSV file with Notepad++, click the Encoding menu, and select "Convert to UTF-8" (not "Convert to UTF-8 without BOM"). Save the file, then open it in Excel by double-clicking it. Hope that helps. Christophe GRISON
Note that including the UTF-8 BOM is not necessarily a good idea - Mac versions of Excel ignore it and will actually display the BOM as ASCII… three nasty characters at the start of the first field in your spreadsheet…
Below is the PHP code I use in my project when sending a Microsoft Excel file to the user:
/**
 * Export an array as a downloadable Excel CSV (tab-separated, UTF-16LE with BOM)
 * @param array $header column titles (may be empty)
 * @param array $data array of rows, each row an array of values
 * @param string $filename file name without the .csv extension
 */
function toCSV($header, $data, $filename) {
    $sep = "\t";
    $eol = "\n";
    $glue = '"' . $sep . '"';
    // Double any embedded quotes so they cannot break the quoted fields
    $csv = count($header) ? '"' . implode($glue, str_replace('"', '""', $header)) . '"' . $eol : '';
    foreach ($data as $line) {
        $csv .= '"' . implode($glue, str_replace('"', '""', $line)) . '"' . $eol;
    }
    // Prepend the UTF-16LE BOM (0xFF 0xFE) here so Content-Length counts it too
    $encoded_csv = chr(255) . chr(254) . mb_convert_encoding($csv, 'UTF-16LE', 'UTF-8');
    header('Content-Description: File Transfer');
    header('Content-Type: application/vnd.ms-excel');
    header('Content-Disposition: attachment; filename="' . $filename . '.csv"');
    header('Content-Transfer-Encoding: binary');
    header('Expires: 0');
    header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
    header('Pragma: public');
    header('Content-Length: ' . strlen($encoded_csv));
    echo $encoded_csv;
    exit;
}
UPDATED: Filename improvement and bug fix for the length calculation. Thanks to TRiG and @ivanhoe011.
This is just a question of character encodings. It looks like you're exporting your data as UTF-8: é in UTF-8 is the two-byte sequence 0xC3 0xA9, which when interpreted as Windows-1252 is é. When you import your data into Excel, make sure to tell it that the character encoding you're using is UTF-8.
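The misreading can be reproduced in a few lines. This is only a sketch, assuming the mbstring extension is loaded; it treats the two UTF-8 bytes of é as if they were Windows-1252 and re-encodes the result as UTF-8 for display.

```php
<?php
$utf8 = "\xC3\xA9";        // "é" encoded as UTF-8 (two bytes)
$hex  = bin2hex($utf8);    // hex dump of those bytes

// Misread the bytes as Windows-1252, then re-encode as UTF-8 so the
// wrongly decoded characters can be printed.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');

echo $hex, "\n";           // c3a9
echo $misread, "\n";       // Ã©
```

Each byte becomes its own character (0xC3 is Ã and 0xA9 is © in Windows-1252), which is the garbage a reader sees when the declared or guessed encoding does not match the real one.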