Question
I have the following C++ code in Visual Studio to read characters from a file.
ifstream infile;
infile.open(argv[1]);
if (infile.fail()) {
cout << "Error reading from file: " << strerror(errno) << endl;
cout << argv[0] << endl;
}
else {
char currentChar;
while (infile.get(currentChar)) {
cout << currentChar << " " << int(currentChar) << endl;
//... do something with currentChar
}
ofstream outfile("output.txt");
outfile << /* output some text based on currentChar */;
}
infile.close();
The file in this case is expected to contain mostly normal ASCII characters, with the exception of two: “ and ”.
The problem is that the code in its current form is not able to recognise those characters. Printing the character with cout outputs garbage, and its int conversion yields a negative number that's different depending on where in the file it occurs.
I have a hunch that the problem is encoding, so I've tried to imbue infile based on some examples on the internet, but I haven't managed to get it right. infile.get either fails when reaching the quote character, or the problem remains. What details am I missing?
Answer 1:
The file you are trying to read is likely UTF-8 encoded. Most characters read fine because UTF-8 is backwards compatible with ASCII: every ASCII character is stored as the same single byte in both encodings.
In order to read a UTF-8 file I'll refer you to this: http://en.cppreference.com/w/cpp/locale/codecvt_utf8
#include <fstream>
#include <iostream>
#include <string>
#include <locale>
#include <codecvt>
#include <sstream>  // std::wstringstream
...
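// Write file in UTF-8
// Note: std::locale::empty() and open() with a wide-character file name are Visual C++ extensions.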
std::wofstream wof;
wof.imbue(std::locale(std::locale::empty(), new std::codecvt_utf8<wchar_t,0x10ffff,std::generate_header>));
wof.open(L"file.txt");
wof << L"This is a test.";
wof << L"This is another test.";
wof << L"\nThis is the final test.\n";
wof.close();
// Read file in UTF-8
std::wifstream wif(L"file.txt");
wif.imbue(std::locale(std::locale::empty(), new std::codecvt_utf8<wchar_t,0x10ffff, std::consume_header>));
std::wstringstream wss;
wss << wif.rdbuf();
Answer 2:
Try:
while (infile.get(&currentChar, 1))
Also, make sure that you actually pass argv[1]. Print its value:
cout<<argv[1]<<endl;
Source: https://stackoverflow.com/questions/48985128/characters-not-recognized-while-reading-from-file