How do I read a file into a std::string, i.e., read the whole file at once?
Text or binary mode should be specified by the caller. The solution should b
Here's a version using the C++17 filesystem library with reasonably robust error checking:
#include <cstdint>
#include <exception>
#include <filesystem>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

std::string loadFile(const char *const name);
std::string loadFile(const std::string &name);

std::string loadFile(const char *const name) {
    fs::path filepath(fs::absolute(fs::path(name)));

    // Query the size up front so the string can be allocated in one step.
    std::uintmax_t fsize;
    if (fs::exists(filepath)) {
        fsize = fs::file_size(filepath);
    } else {
        throw(std::invalid_argument("File not found: " + filepath.string()));
    }

    // Make the stream throw on failure instead of silently setting error bits.
    std::ifstream infile;
    infile.exceptions(std::ifstream::failbit | std::ifstream::badbit);
    try {
        infile.open(filepath.c_str(), std::ios::in | std::ifstream::binary);
    } catch (...) {
        std::throw_with_nested(std::runtime_error("Can't open input file " + filepath.string()));
    }

    std::string fileStr;
    try {
        fileStr.resize(fsize);
    } catch (...) {
        std::stringstream err;
        err << "Can't resize to " << fsize << " bytes";
        std::throw_with_nested(std::runtime_error(err.str()));
    }

    // std::string::data() returns a writable char* since C++17.
    infile.read(fileStr.data(), fsize);
    infile.close();
    return fileStr;
}

std::string loadFile(const std::string &name) { return loadFile(name.c_str()); }
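Because loadFile reports failures with std::throw_with_nested, a caller that only prints what() of the outermost exception loses the underlying cause. The following is a minimal sketch of how the nested chain could be unwound; printNested is a hypothetical helper name that does not appear in the answer above, and the two-space indentation per level is just an illustrative choice.

```cpp
#include <cstddef>
#include <exception>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the answer above): writes e.what(), then
// each nested exception on its own line, indented by two spaces per level.
void printNested(const std::exception& e, std::ostream& os, int depth = 0) {
    os << std::string(static_cast<std::size_t>(depth) * 2, ' ') << e.what() << '\n';
    try {
        // Rethrows the nested exception if e carries one; otherwise does nothing.
        std::rethrow_if_nested(e);
    } catch (const std::exception& inner) {
        printNested(inner, os, depth + 1);
    } catch (...) {
        os << std::string(static_cast<std::size_t>(depth + 1) * 2, ' ')
           << "unknown exception\n";
    }
}
```

A caller would wrap the call to loadFile in a try block and pass the caught std::exception together with std::cerr to printNested.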
Never write into the const char * buffer returned by std::string's c_str() (or by data() before C++17). Never ever! Modifying that character array is undefined behavior.
Call reserve() to set aside space for the whole file in your std::string, read chunks of a reasonable size from your file into a buffer, and append() each one. How large the chunks should be depends on your input file size. I'm pretty sure all other portable and STL-compliant mechanisms do the same (though they may look prettier).
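The reserve-then-append approach can be sketched roughly as follows. This is only an illustration under some assumptions: the function name readInChunks, the sizeHint and chunkSize parameters, and the 64 KiB default are all made up for this example, and the file is opened in binary mode for simplicity.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: `sizeHint` and `chunkSize` are illustrative parameters.
std::string readInChunks(const std::string& path,
                         std::size_t sizeHint = 0,
                         std::size_t chunkSize = 64 * 1024) {
    if (chunkSize == 0) {
        throw std::invalid_argument("chunkSize must be positive");
    }
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("Can't open input file " + path);
    }

    std::string result;
    result.reserve(sizeHint);  // a single up-front allocation if the hint is accurate

    std::vector<char> chunk(chunkSize);  // separate scratch buffer, reused for every read
    while (in) {
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        // gcount() reports how many characters the last read() delivered;
        // it is smaller than chunkSize for the final, partial chunk.
        result.append(chunk.data(), static_cast<std::size_t>(in.gcount()));
    }
    if (in.bad()) {
        throw std::runtime_error("Read error on " + path);
    }
    return result;
}
```

If the caller already knows the file size, passing it as sizeHint avoids reallocation while appending; with a hint of 0 the string simply grows as needed.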
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <system_error>
using namespace std;

string GetStreamAsString(const istream& in)
{
    // Copy everything from the input's stream buffer into a string stream.
    stringstream out;
    out << in.rdbuf();
    return out.str();
}

string GetFileAsString(const string& filePath)
{
    ifstream stream;
    try
    {
        // Set to throw on failure
        stream.exceptions(fstream::failbit | fstream::badbit);
        stream.open(filePath);
    }
    catch (const system_error& error)
    {
        cerr << "Failed to open '" << filePath << "'\n" << error.code().message() << endl;
        return "Open fail";
    }

    return GetStreamAsString(stream);
}
usage:
const string logAsString = GetFileAsString(logFilePath);
Since this seems like a widely used utility, my approach would be to prefer an already available library over a handmade solution, especially if the Boost libraries are already linked into your project (linker flags -lboost_system -lboost_filesystem). Here (and in older Boost versions too), Boost provides a load_string_file utility:
#include <iostream>
#include <string>
#include <boost/filesystem/string_file.hpp>

int main() {
    std::string result;
    boost::filesystem::load_string_file("aFileName.xyz", result);
    std::cout << result.size() << std::endl;
}
One advantage is that this function doesn't seek through the entire file to determine its size; it uses stat() internally instead. A possibly negligible disadvantage, which is easy to see on inspecting the source code, is that the string is first resized and filled with '\0' characters, which are then overwritten by the file contents.
An updated function which builds upon CTT's solution:
#include <fstream>
#include <limits>
#include <string>
#include <string_view>

std::string readfile(const std::string_view path, bool binaryMode = true)
{
    std::ios::openmode openmode = std::ios::in;
    if(binaryMode)
    {
        openmode |= std::ios::binary;
    }
    // A string_view need not be null-terminated, so copy it into a std::string
    // before handing it to the stream constructor.
    std::ifstream ifs(std::string{path}, openmode);
    // Extract and discard everything; gcount() then holds the number of characters read.
    ifs.ignore(std::numeric_limits<std::streamsize>::max());
    std::string data(ifs.gcount(), 0);
    ifs.seekg(0);
    ifs.read(data.data(), data.size());
    return data;
}
There are two important differences:
tellg() is not guaranteed to return the offset in bytes since the beginning of the file. Instead, as Puzomor Croatia pointed out, it's more of a token which can be used within the fstream calls. gcount(), however, does return the number of unformatted characters last extracted. We therefore open the file, extract and discard all of its contents with ignore() to get the size of the file, and construct the output string based on that.
Secondly, we avoid having to copy the data of the file from a std::vector<char> to a std::string by writing to the string directly.
In terms of performance, this should be among the fastest options, since it allocates an appropriately sized string ahead of time and calls read() once, although ignore() does make an extra pass over the file. As an interesting fact, using ignore() and gcount() instead of ate and tellg() on gcc compiles down to almost the same thing, bit by bit.
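For comparison, the ate and tellg() variant that this answer contrasts itself with might look roughly like the sketch below. The name readFileTellg is hypothetical, and the code assumes that tellg() yields a byte offset, which, as noted above, the standard does not guarantee even though it commonly holds for files opened in binary mode.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical name; sketch of the ate/tellg() sizing variant, for comparison only.
std::string readFileTellg(const std::string& path) {
    // std::ios::ate positions the stream at the end of the file right after opening.
    std::ifstream ifs(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!ifs) {
        throw std::runtime_error("Can't open " + path);
    }

    // Assumption: the position at the end equals the file size in bytes.
    const std::streamoff size = ifs.tellg();
    if (size < 0) {
        throw std::runtime_error("Can't determine size of " + path);
    }

    std::string data(static_cast<std::size_t>(size), '\0');
    ifs.seekg(0, std::ios::beg);
    ifs.read(&data[0], static_cast<std::streamsize>(data.size()));
    return data;
}
```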
One way is to flush the stream buffer into a separate memory stream, and then convert that to std::string:
#include <fstream>
#include <sstream>
#include <string>

std::string slurp(std::ifstream& in) {
    std::ostringstream sstr;
    sstr << in.rdbuf();
    return sstr.str();
}
This is nicely concise. However, as noted in the question, this performs a redundant copy, and unfortunately, before C++20 there is no way of eliding this copy.
Before C++20, the only real solution that avoids redundant copies is to do the reading manually in a loop, unfortunately. Since C++ now has guaranteed contiguous strings, one could write the following (≥C++17, because of std::string_view):
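#include <cstddef>
#include <fstream>
#include <string>
#include <string_view>

auto read_file(std::string_view path) -> std::string {
    constexpr auto read_size = std::size_t{4096};
    // Copy the view into a std::string: a string_view need not be null-terminated.
    auto stream = std::ifstream{std::string{path}};
    stream.exceptions(std::ios_base::badbit);

    auto out = std::string{};
    auto buf = std::string(read_size, '\0');
    while (stream.read(& buf[0], read_size)) {
        out.append(buf, 0, stream.gcount());
    }
    // The final read() fails at EOF but may still have delivered a partial chunk.
    out.append(buf, 0, stream.gcount());
    return out;
}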
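As a side note on the string stream version: C++20 added an rvalue-qualified str() overload to the string stream classes, which lets the stream give up its buffer rather than copy it. The sketch below assumes a C++20 standard library; slurpMove is a hypothetical name, and on older standards the same code still compiles but falls back to the copying overload.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical variant of slurp(); accepts any std::istream.
std::string slurpMove(std::istream& in) {
    std::ostringstream sstr;
    sstr << in.rdbuf();
    // With a C++20 library, str() called on an rvalue stream can move the
    // underlying buffer out; before C++20 this line copies, as slurp() does.
    return std::move(sstr).str();
}
```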