I am learning STL now. I read about set
container. I have question when you want to use set
? After reading description of set it looks like it is u
They are different things: you decide how vectors are ordered, and you can also put as many equal things into a vector as you please. Sets are ordered in accordance to that set's internal rules (you may set the rules, but the set will deal with the ordering), and you cannot put multiple equal items into a set.
Of course you could maintain a vector of unique items, but your performance would suffer a lot when you do set-oriented operations. For example, assume that you have a set of 10000 items and a vector of 10000 distinct unordered items. Now suppose that you need to check if a value X is among the values in the set (or among the values in the vector). When X is not among the items, searching the vector would be some 100 times slower. You would see similar performance differences on calculating set unions and intersections.
To summarize, sets and vectors have different purposes. You can use a vector instead of a set, but it would require more work, and would likely hurt the performance rather severely.
form cpluplus.com set:
Sets are containers that store unique elements following a specific order.
so the set is ordered AND item are uniquely represented
while vect:
Vectors are sequence containers representing arrays that can change in size.
so vector is in the order you fill it AND can hold multiple identical items
prefer set:
prefer vector:
it is faster to search an item against a set than a vector (O(log(n)) vs O(n)). To search an item against a vector, you need to iterate all items in the vector, but the set use red-black tree to optimize the search, only a few item will be looked to find a match.
The set is ordered, it means you can only iterate it from smallest one to biggest one by order, or the reversed order.
But the vector is unordered, you can travel it by the insert order.
The simple difference is that set can contain only unique values, and it is sorted. So you can use it for the cases where you need to continuously sort the values after every insert / delete.
set<int> a;
vector<int> b;
for (int i = 0; i < 10; ++i)
{
int val = rand() % 10;
a.insert(val);
b.push_back(val);
}
cout << "--SET---\n"; for (auto i : a) cout << i << ","; cout << endl;
cout << "--VEC---\n"; for (auto j : b) cout << j << ","; cout << endl;
The output is
--SET---
0,1,2,4,7,8,9,
--VEC---
1,7,4,0,9,4,8,8,2,4,
A set
is ordered. It is guaranteed to remain in a specific ordering, according to a functor that you provide. No matter what elements you add or remove (unless you add a duplicate, which is not allowed in a set
), it will always be ordered.
A vector
has exactly and only the ordering you explicitly give it. Items in a vector
are where you put them. If you put them in out of order, then they're out of order; you now need to sort
the container to put them back in order.
Admittedly, set
has relatively limited use. With proper discipline, one could insert items into a vector
and keep it ordered. However, if you are constantly inserting and removing items from the container, vector
will run into many issues. It will be doing a lot of copying/moving of elements and so forth, since it is effectively just an array.
The time it takes to insert an item into a vector
is proportional to the number of items already in the vector
. The time it takes to insert an item into a set
is proportional to the log₂ of the number of items. If the number of items is large, that's a huge difference. log₂(100,000) is ~16; that's a major speed improvement. The same goes for removal.
However, if you do all of your insertions at once, at initialization time, then there's no problem. You can insert everything into the vector
, sort it (paying that price once), and then use standard algorithms for sorted vectors
to find elements and iterate over the sorted list. And while iteration over the elements of a set
isn't exactly slow, iterating over a vector
is faster.
So there are cases where a sorted vector
beats a set
. That being said, you really shouldn't bother with the expense of this kind of optimization unless you know that it is necessary. So use a set
unless you have experience with the kind of system you're writing (and thus know that you need that performance) or have profiling data in hand that tells you that you need a vector
and not a set
.