From http://www.cplusplus.com/reference/utility/pair/, we know that std::pair
has two member variables, first
and second
.
Why did
For the original C++03 std::pair
, functions to access the members would serve no useful purpose.
As of C++11 and later (we're now at C++14, with C++17 coming up fast) std::pair
is a special case of std::tuple
, where std::tuple
can have any number of items. As such it makes sense to have a parameterized getter, since it would be impractical to invent and standardize an arbitrary number of item names. Thus you can use std::get also for a std::pair
.
So, the reasons for the design are historical, that the current std::pair
is the end result of an evolution towards more generality.
In other news:
regarding
” As far as I know, it will be better if encapsulating two member variables above and give a
getFirst();
andgetSecond()
no, that's rubbish.
That's like saying a hammer is always better, whether you're driving in nails, fastening with screws, or trimming a piece of wood. Especially in the last case a hammer is just not a useful tool. Hammers can be very useful, but that doesn't mean that they're “better” in general: that's just nonsense.
It could be argued that std::pair
would be better off having accessor functions to access its members! Notably for degenerated cases of std::pair
there could be an advantage. For example, when at least one of the types is an empty, non-final class, the objects could be smaller (the empty base could be made a base which wouldn't need to get its own address).
At the time std::pair
was invented these special cases were not considered (and I'm not sure if the empty base optimization was allowed in the draft working paper at that time). From a semantic point there isn't much reason to have accessor functions, though: clearly, the accessors would need to return a mutable reference for non-const
objects. As a result the accessor does not provide any form of encapsulation.
On the other hand, it makes it [slightly] harder on the optimizer to see what's going on when accessor functions are used e.g. because additional sequence points are introduced. I could imagine that Meng Lee and Alexander Stepanov even measured whether there is a difference (nor did I). Even if they didn't, providing access to the members directly is certainly not slower than going through an accessor function while the reverse is not necessarily true.
I wasn't part of the decision and the C++ standard doesn't have a rationale but I guess it was a deliberate decision to make the members public data members.
I was appalled by the number of comments that show no basic understanding of object-oriented design (does that prove c++ is not an OO-language?). Yes the design of std::pair has some historical traits, but that does not make a bad design good; nor should it be used as an excuse to deny the fact. Before I rant on it, let me answer some of the questions in the comments:
Don't you think int should also have a setter and getter
Yes, from a design point of view we should use accessors because by doing so we lose nothing but gain additional flexibility. Some newer algorithms may want to pack additional bits into the key/values, and you cannot encode/decode them without accessors.
Why wrap something in a getter if there is no logic in the getter?
How do you know there would be no logic in the getter/setter? A good design should not limit the possibility of implementation based on guess. It should offer as much flexibility as possible. Remember the design of std:pair also decides the design of iterator, and by requiring users to directly access member variables, the iterator has to return structures that actually store key/values together. That turns out to be a big limitation. There are algorithms that need to keep them separate. There are algorithms that don't store key/values explicitly at all. Now they have to copy the data during iteration.
Contrary to popular belief, having objects that do nothing but store member variables with getters and setters is not "the way things should be done"
Another wild guess.
OK, I would stop here.
To answer the original question: std::pair chose to expose member variables because whoever designed it did not recognize and/or prioritize the importance of a flexible contract. They obviously had a very narrow idea/vision about how key-value pairs in a map/hashtable should be implemented, and to make it worse, they let such a narrow view on implementation spill over the top to compromise the design. For example, what if I want to implement a replacement of std:unordered_map that stores key and values in separate arrays based on an open addressing scheme with linear probing? This can greatly boost cache performance for pairs with small keys and large values, as you don't need long-jump across the spaces occupied by values to probe the keys. Had std::pair chosen accessors, it would be trivial to write an STL-style iterator for this. But now it is simply impossible to achieve this without eliciting additional data copying.
I noticed that they also mandate the use of open hashing (i.e., closed chaining) for the implementation of std::unordered_map. This is not only strange from a design point of view (why you want to restrict how things are implemented???), but also pretty dumb in terms of implementation - chained hashtables using linked list is perhaps the slowest of all categories. Go google the web, we can easily find that std:unordered_map is often the doormat of a hashtable benchmark. It even tends to be slower than Java's HashMap (I don't know how they managed to lag behind in this case, as HashMap is also a chained hashtable). An old excuse is that chained hashtable tend to perform better when the load_factor approaches 1, which is totally invalid because 1) there are plenty of techniques in open addressing family to deal with this problem - ever heard of hopscotching or robin-hood hashing, and the latter has actually been there for 30 freakish years; 2) a chained hashtable adds the overhead of a pointer (a good 8 bytes on 64 bit machines) for each entry, so when we say the load_factor of an unordered_map approaches 1, it is not 100% memory usage! We should take that into consideration and compare the performance of unordered_map with alternatives with same memory usage. And it turns out that alternatives like Google Dense HashMap is 3-4 times faster than std::unordered_map.
Why these are relevant? Because interestingly, mandating open hashing does make the design of std::pair look less bad, now that we do not need the flexibility of an alternative storage structure. Moreover, the presence of std::pair makes it almost impossible to adopt newer/better algorithms to write a drop-in replacement of std::unordered_map. Sometimes you wonder whether they did that intentionally so that the poor design of std::pair and the pedestrian implementation of std::unordered_map can survive longer together. Of course I am kidding, so whoever wrote those, don't get offended. In fact people using Java or Python (OK, I admit Python's a stretch) would want to thank you for making them feel good about being "as fast as C++".
Getters and setters are usually useful if one thinks that getting or setting the value requires extra logic (changing some internal state). This can then be easily added into the method. In this case std::pair
is only used to provide 2 data values. Nothing more, nothing less. And thus, adding the verbosity of a getter and setter would be pointless.
The reason is that no real invariant needs to be imposed on the data structure, as std::pair
models a general-purpose container for two elements. In other words, an object of type std::pair<T, U>
is assumed to be valid for any possible first
and second
element of type T
and U
, respectively. Similarly, subsequent mutations in the value of its elements cannot really affect the validity of the std::pair
per se.
Alex Stepanov (the author of the STL) explicitly presents this general design principle during his course Efficient Programming with Components, when commenting on the singleton
container (i.e., a container of one element).
Thus, albeit the principle in itself can be a source of debate, this is the reason behind the shape of std::pair
.
Getters and setters are useful if one believes that abstraction is warranted to insulate users from design choices and changes in those choices, now or in the future.
The typical example for "now" is that the setter/getter might have logic to validate and/or calculate the value - e.g., use a setter for a phone number, instead of directly exposing the field, so that you can check the format; use a getter for a collection so that the getter can provide a read-only view of the member's value (a collection) to the caller.
The canonical (though bad) example for "changes in the future" is Point
- should you expose x
and y
or getX()
and getY()
? The usual answer is to use getters/setters because at some time in the future you might want to change the internal representation from Cartesian to polar and you don't want your users to be impacted (or to have them depend on that design decision).
In the case of std::pair
- it is the intent that this class now and forever represent two and exactly two values (of arbitrary type) directly, and provide their values on demand. That's it. And that's why the design uses direct member access, rather than go through a getter/setter.