So normally I do stuff like:
std::ifstream stream;
int buff_length = 8192;
boost::shared_array<char> buffer( new char[buff_length]);
str
In its simplest form:
std::vector<unsigned char> vec(
std::istreambuf_iterator<char>(std::cin)
, std::istreambuf_iterator<char>()
);
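Replace std::cin with your actual stream. If that stream is a plain local variable, wrap the first argument in an extra pair of parentheses; otherwise the compiler parses the whole statement as a function declaration (the "most vexing parse").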
The above is likely to do more than one memory allocation (for any file larger than a few bytes) because std::istreambuf_iterator<> is an input iterator, not a forward or random-access iterator, so the length of the file can't be measured by subtracting iterators (end - begin) or by calling std::distance(begin, end). This can be reduced to a single allocation: create the vector empty, call std::vector<>::reserve() to allocate memory for the file length, and then do a range insert, vec.insert(vec.end(), beg, end), with beg and end being std::istreambuf_iterator<> as above, to read the entire file.
If the file size is more than a few kilobytes, it may be most efficient to map the file into the process memory, which avoids copying the data from kernel space to user space.
std::istreambuf_iterator<char> is used because the implementation relies on std::char_traits<>, which normally has specializations only for char and wchar_t. Regardless, the C and C++ standards require all char types to have the same binary layout with no padding bits, so conversions between char, unsigned char and signed char (which are all distinct types, unlike signed int and int, which are the same type) preserve bit patterns and are therefore safe.
[basic.fundamental/1]
Plain char, signed char, and unsigned char are three distinct types, collectively called narrow character types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements; that is, they have the same object representation... For narrow character types, all bits of the object representation participate in the value representation... For unsigned narrow character types, each possible bit pattern of the value representation represents a distinct number. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined. For each value i of type unsigned char in the range 0 to 255 inclusive, there exists a value j of type char such that the result of an integral conversion from i to char is j, and the result of an integral conversion from j to unsigned char is i.
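The round-trip guarantee in the last sentence of the quote can be checked with a small sketch. The function name round_trips_all_values is hypothetical; the value comparison follows directly from the quoted text, while the byte-wise comparison additionally assumes a two's complement platform (which covers mainstream implementations and is required since C++20).

```cpp
#include <climits>
#include <cstring>

// Hypothetical check: every unsigned char value converts to char and back
// unchanged, and both objects hold the same byte.
bool round_trips_all_values()
{
    for (int i = 0; i <= UCHAR_MAX; ++i) {
        const unsigned char u = static_cast<unsigned char>(i);
        const char c = static_cast<char>(u);

        // Value round trip: unsigned char -> char -> unsigned char.
        if (static_cast<unsigned char>(c) != u)
            return false;

        // Same bit pattern (assumption: two's complement representation).
        if (std::memcmp(&c, &u, 1) != 0)
            return false;
    }
    return true;
}
```

This is why reading through std::istreambuf_iterator<char> and storing into a std::vector<unsigned char> does not lose information.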