What's the difference between a sentinel and an end iterator?

Question

While reading Eric Niebler's range proposal,
I've come across the term sentinel as a replacement for the end iterator.
I'm having a difficult time understanding the benefits of a sentinel over an end iterator.
Could someone provide a clear example of what a sentinel brings to the table that cannot be done with standard iterator pairs?

"A sentinel is an abstraction of a past-the-end iterator. Sentinels are Regular types that can be used to denote the end of a range. A sentinel and an iterator denoting a range shall be EqualityComparable. A sentinel denotes an element when an iterator i compares equal to the sentinel, and i points to that element." -- N4382

Is the idea that a sentinel works like a function that determines the end of a range, instead of just marking a position?

Answer 1:

A sentinel simply allows the end iterator to have a different type.

The allowed operations on a past-the-end iterator are limited, but this is not reflected in its type. It is not OK to dereference (*) an .end() iterator, but the compiler will let you.

A sentinel has no unary dereference or ++, among other things. It is generally as restricted as a past-the-end iterator of the weakest iterator category, but the restrictions are enforced at compile time.

There is a payoff. Detecting the end state is often easier than finding it. Because a sentinel has its own type, == against it can dispatch to "detect if the other argument is past the end" at compile time, instead of at run time.

The result is that some code that used to be slower than the C equivalent now compiles down to C-level speed, such as copying a null-terminated string using std::copy. Without sentinels, you either had to scan to find the end before the copy, or pass in iterators carrying a bool flag saying "I am the end sentinel" (or equivalent) and check that flag on every ==.
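As a rough illustration of the null-terminated copy described above, here is a minimal sketch. The names null_terminator, copy_until and copy_c_string are made up for this example; they are not part of the standard library or of range-v3, and the loop is a hand-written stand-in for a sentinel-aware std::copy.

```cpp
#include <iterator>
#include <string>

// Hypothetical sentinel type: an empty tag meaning "stop at the terminating '\0'".
struct null_terminator {};

// The end test looks at the character under the iterator; no end pointer is stored.
inline bool operator==(const char* it, null_terminator) { return *it == '\0'; }
inline bool operator!=(const char* it, null_terminator) { return *it != '\0'; }

// Hypothetical copy that accepts an iterator and a sentinel of different types.
template <class Iterator, class Sentinel, class Output>
Output copy_until(Iterator first, Sentinel last, Output out) {
    for (; first != last; ++first, ++out) {
        *out = *first;
    }
    return out;
}

// Copies a C string in a single pass, without calling strlen first.
inline std::string copy_c_string(const char* text) {
    std::string result;
    copy_until(text, null_terminator{}, std::back_inserter(result));
    return result;
}
```

Since null_terminator is an empty type, the comparison carries no run-time state: the loop condition is just a test of the current character, which is what the hand-written C loop would do.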

There are other similar advantages when working with counting-based ranges. In addition, some things like zip ranges¹ become easier to express: the end zip sentinel can hold both source sentinels and compare equal if either source sentinel does, whereas without sentinels zip iterators either compare only the first iterator or compare both.

Another way of thinking about it is that algorithms tend not to use the full richness of the iterator concept on the parameter passed as the past-the-end iterator, and that iterator is handled very differently in practice. A sentinel lets the caller exploit that fact, which in turn makes it easier for the compiler to exploit it.
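The zip case can be sketched as follows. This is only an illustration under simplifying assumptions: zip_iterator, zip_sentinel and dot_product are hypothetical names, the iterator zips exactly two ranges, and dereferencing returns a pair of copies rather than references.

```cpp
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical iterator over two ranges at once.
template <class I1, class I2>
struct zip_iterator {
    I1 first;
    I2 second;

    using value_type = std::pair<typename std::iterator_traits<I1>::value_type,
                                 typename std::iterator_traits<I2>::value_type>;

    value_type operator*() const { return value_type(*first, *second); }
    zip_iterator& operator++() { ++first; ++second; return *this; }
};

// Hypothetical sentinel holding the end marker of each source range.
template <class S1, class S2>
struct zip_sentinel {
    S1 first_end;
    S2 second_end;
};

// The zipped range ends as soon as either source range ends.
template <class I1, class I2, class S1, class S2>
bool operator==(const zip_iterator<I1, I2>& it, const zip_sentinel<S1, S2>& end) {
    return it.first == end.first_end || it.second == end.second_end;
}

template <class I1, class I2, class S1, class S2>
bool operator!=(const zip_iterator<I1, I2>& it, const zip_sentinel<S1, S2>& end) {
    return !(it == end);
}

// Multiplies elements pairwise and sums them, stopping at the shorter vector.
inline int dot_product(const std::vector<int>& a, const std::vector<int>& b) {
    using iter = std::vector<int>::const_iterator;
    zip_iterator<iter, iter> it{a.begin(), b.begin()};
    const zip_sentinel<iter, iter> end{a.end(), b.end()};
    int total = 0;
    for (; it != end; ++it) {
        const auto pair = *it;
        total += pair.first * pair.second;
    }
    return total;
}
```

Because the "either one ended" rule lives in the iterator/sentinel comparison, comparing two zip iterators with each other does not need any special case for ranges of unequal length.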

¹ A zip range is what you get when you start with 2 or more ranges, and "zip" them together like a zipper. The range is now over tuples of the individual range elements. Advancing a zip iterator advances each of the "contained" iterators, and ditto for dereferencing and comparing.

Answer 2:

Sentinels and end iterators are similar in that they mark the end of a range. They differ in how that end is detected; either you're testing the iterator itself, or you're testing the data value at the iterator. If you're already performing tests on the data, a sentinel can allow your algorithm to finish "for free" without any additional tests. This can either simplify the code, or make it faster.

A very common sentinel is the zero byte that is used to mark the end of a string. There's no need to keep a separate iterator for the end of the string; the end can be determined as you work with the characters of the string itself. The downside of this convention is that a string cannot contain a zero character.
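A small sketch of this classic, value-based sentinel: the loop below never receives an end pointer and stops on the zero byte it is already reading. The function name count_char is made up for the example.

```cpp
#include <cstddef>

// Counts how often `wanted` occurs in a null-terminated string.
// The terminating zero byte is the sentinel: the end is found by
// inspecting the data, not by comparing against an end pointer.
inline std::size_t count_char(const char* text, char wanted) {
    std::size_t count = 0;
    for (; *text != '\0'; ++text) {
        if (*text == wanted) {
            ++count;
        }
    }
    return count;
}
```

The same convention explains the downside mentioned above: searching for '\0' itself with this function always yields zero, because the zero byte can only ever mean "end".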

Note that I wrote this answer before reading the proposal in the link; this is the classic definition of a sentinel which may not agree with the definition proposed there.

Answer 3:

The central motivation for introducing a sentinel is that there are a lot of iterator operations which are supported but usually never needed for the end-iterator end(). For example, there is hardly any point in dereferencing it via *end(), in incrementing it via ++end(), and so on (*).

In contrast, the main use of end() is merely to compare it with an iterator it, in order to signal whether it has reached the end of the thing it iterates over. And, as usual in programming, different requirements and different applications suggest a new type.

The range-v3 library turns this observation into an assumption (implemented through a concept): it introduces a new type for end() and only requires that it is equality-comparable with the corresponding iterator, but does not require the usual iterator operations. This new type of end() is called a sentinel.

The main advantage here is the gained abstraction and better separation of concerns, which may allow the compiler to optimize better. In code, the basic idea is this (this is just for explanation and has nothing to do with the range-v3 library):

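struct my_iterator {};    // some iterator; its details do not matter here

struct my_sentinel
{
    bool is_at_end(const my_iterator& it) const
    {
        // implement here the logic that decides whether the iterator is at the end
        (void)it;
        return false;   // placeholder only: this sentinel is never reached
    }
};

// also provide the overload taking (my_sentinel, my_iterator), plus operator!=
bool operator==(const my_iterator& it, const my_sentinel& s)
{
    return s.is_at_end(it);
}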

See the abstraction? Now, you can implement any check you want in the is_at_end function, e.g.:

never stop (to get an infinite range)
stop after N increments (to get a counted range)
stop when a \0 is encountered, i.e. *it == '\0' (for looping over C-strings)
stop when it's 12 o'clock (for having lunch), and so on.
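To make the first item in that list concrete, here is a hedged sketch of a sentinel that is never reached. counting_iterator, never_stop and first_multiple are invented names for this illustration; the loop terminates only because the body returns.

```cpp
#include <cassert>

// Hypothetical iterator that yields value, value + 1, value + 2, ...
struct counting_iterator {
    int value;

    int operator*() const { return value; }
    counting_iterator& operator++() { ++value; return *this; }
};

// Hypothetical sentinel for an infinite range: it never compares equal.
struct never_stop {};

inline bool operator==(const counting_iterator&, never_stop) { return false; }
inline bool operator!=(const counting_iterator&, never_stop) { return true; }

// Returns the first value >= start that is divisible by divisor.
// Assumes divisor > 0 and that such a value exists below INT_MAX.
inline int first_multiple(int start, int divisor) {
    assert(divisor > 0);
    counting_iterator it{start};
    for (; it != never_stop{}; ++it) {
        if (*it % divisor == 0) {
            return *it;
        }
    }
    return -1;  // not reached: the sentinel never ends the loop
}
```

Since the comparison is a constant, a compiler is free to drop the end check from the loop entirely, which an ordinary end iterator of the same type as the iterator would not allow as directly.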

Moreover, regarding performance, one can make use of compile-time information in the check (e.g., think of the N above as a compile-time parameter). In this case, the compiler may be able to optimize the code better.

(*) Note that this does not mean there is in general no use for these kinds of operations. For example, --end() can be useful in some places, see e.g. this question. However, it is seemingly possible to implement the standard library without these -- this is what the range-v3 library has done.
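One possible way to encode N as a compile-time parameter is sketched below. The names counted_iterator, after_n and sum_first are assumptions made for this example and are not taken from range-v3; the caller is assumed to guarantee that the list has at least N elements.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>

// Hypothetical adaptor that remembers how many increments it has performed.
template <class Iterator>
struct counted_iterator {
    Iterator base;
    std::size_t taken;

    typename std::iterator_traits<Iterator>::reference operator*() const { return *base; }
    counted_iterator& operator++() { ++base; ++taken; return *this; }
};

// Hypothetical sentinel: "stop after N increments", with N known at compile time.
template <std::size_t N>
struct after_n {};

template <class Iterator, std::size_t N>
bool operator==(const counted_iterator<Iterator>& it, after_n<N>) { return it.taken == N; }

template <class Iterator, std::size_t N>
bool operator!=(const counted_iterator<Iterator>& it, after_n<N>) { return it.taken != N; }

// Sums the first N elements of a singly linked list.
// Precondition (not checked): values has at least N elements.
template <std::size_t N>
int sum_first(const std::forward_list<int>& values) {
    counted_iterator<std::forward_list<int>::const_iterator> it{values.begin(), 0};
    int total = 0;
    for (; it != after_n<N>{}; ++it) {
        total += *it;
    }
    return total;
}
```

With N fixed in the type, the trip count of the loop is a constant, so the compiler may unroll it; whether it actually does depends on the compiler and optimization level.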

Source: https://stackoverflow.com/questions/32900557/whats-the-difference-between-a-sentinel-and-an-end-iterator

Tags

c++

iterator

range

Sentinel

c++17