C++ streams confusion: istreambuf_iterator vs istream_iterator?

牧云@^-^@ 提交于 2019-11-27 11:13:13

IOstreams use streambufs to as their source / target of input / output. Effectively, the streambuf-family does all the work regarding IO and the IOstream-family is only used for formatting and to-string / from-string transformation.

Now, istream_iterator takes a template argument that says what the unformatted string-sequence from the streambuf should be formatted as, like istream_iterator<int> will interpret (whitespace-delimited) all incoming text as ints.

On the other hand, istreambuf_iterator only cares about the raw characters and iterates directly over the associated streambuf of the istream that it gets passed.

Generally, if you're only interested in the raw characters, use an istreambuf_iterator. If you're interested in the formatted input, use an istream_iterator.

All of what I said also applies to ostream_iterator and ostreambuf_iterator.

Here's a really badly kept secret: an iostream per se, has almost nothing to do with actually reading or writing from/to a file on your computer.

An iostream basically acts as a "matchmaker" between a streambuf and a locale:

The iostream stores some state about how conversions should be done (e.g., the current width and precision for a conversion). It uses those to direct the locale how and where to do a conversion (e.g., convert this number to a string in that buffer with width 8 and precision 5).

Although you didn't ask directly about it, the locale in its turn is really just a container--but (for rather an oddity) a typesafe heterogeneous container. The things it contains are facets. A facet object defines a single facet of an overall locale. The standard defines a number of facets for everything from reading and writing numbers (num_get, num_put) to classifying characters (the ctype facet).

By default, a stream will use the "C" locale. This is pretty basic--numbers are just converted as a stream of digits, the only things it recognizes as letters are the 26 lower case and 26 upper case English letters, and so on. You can, however, imbue a stream with a different locale of your choice. You can choose locales to use by names specified in strings. One that's particularly interesting is one that's selected by an empty string. Using an empty string basically tells the runtime library to select the locale it "thinks" is the most suitable, usually based on how the user has configured the operating system. This allows code to deal with data in a localized format without being written explicitly for any particular locale.

So, the basic difference between an istream_iterator and an istreambuf_iterator is that the data coming out of an istreambuf_iterator hasn't gone through (most of the) transformations done by the locale, but data coming out of an istream_iterator has been transformed by the locale.

For what it's worth, that "most of the" in the previous paragraph is referring to the fact that when you read data from an istreambuf (via an iterator or otherwise) one little bit of locale-based transformation is done: along with various "formatting" kinds of things, the locale contains a codecvt facet, which is what's used to convert from some external representation to some internal representation (e.g., UTF-8 to UTF-32).

It may make more sense to ignore the fact that they're both stored in a locale, and think only of the individual facets involved:

So that's the real difference between a istream_iterator and an istreambuf_iterator. A little bit of transformation is (at least potentially) done to the data from either one, but substantially less is done to the data coming from an istreambuf_iterator.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!