问题
Unless I am missing something or missunderstand the mechanism (very likely) Shouldn't the "1" duplicate not exist in this vector ?
chunks.erase( std::unique ( chunks.begin(), chunks.end(),
[]( std::string &s1, std::string &s2 ){
return ( s1.compare(s2) == 0 ? true : false );}),
chunks.end() );
Before Executing the above:
1 l:1
1+ l:2
1+1 l:3
1+1= l:4
+ l:1
+1 l:2
+1= l:3
1 l:1
1= l:2
= l:1
After executing the above code:
1 l:1
1+ l:2
1+1 l:3
1+1= l:4
+ l:1
+1 l:2
+1= l:3
1 l:1
1= l:2
= l:1
I have tried without a predicate (assuming std::strings that are identical would be removed). For some reason the "ones" are identified as identical? I have looked at their length (assuming a whitespace was stuck as a prefix or postfix) but they have the same length.
Am I missing something ?
回答1:
You are (probably) misunderstanding something.
std::unique
only removes contiguous duplicates, so if you wish to remove all duplicates a precondition to applying std::unique
is to sort your range using the same predicate.
回答2:
std::unique
assumes that non-unique elements are adjacent, as they would be if (for one example) chunks
were sorted. This allows std::unique
to have O(n) complexity.
If you want to maintain a specific order in your vector
and remove duplicates, that's a problem with O(n2) complexity. You can use the logic provided here to do that.
// Create a new vector without the duplicates
std::vector<string> unique_chunks;
for (std::vector<string>::iterator x = chunks.begin(); x != chunks.end();) {
if ( unique_chunks.find(*x) != unique_chunks.end() ) {
unique_chunks.push_back( *x );
}
}
// Make chunks hold this new vector (allowing the old vector to be destroyed)
std::swap( chunks, unique_chunks );
And no, you didn't need that predicate.
回答3:
As mentioned in other answer, unique
deletes contiguous blocks of duplicate, if you need remove duplicates and preserve order of rest elements(order of first occurrence, here) in O(N log N)
time you may do the following:
template<typename T>
bool bySecond(const pair<T, int>& a, const pair<T, int>& b) {
return a.second < b.second;
}
template<typename T>
bool firstEqual(const pair<T, int>& a, const pair<T, int>& b) {
return a.first == b.first;
}
template<typename it>
it yourUnique(it begin, it end){
typedef typename std::iterator_traits<it>::value_type value_t;
vector<pair<value_t, int>> v;
for(it c = begin; c != end; ++c){
v.push_back(make_pair(*c, v.size())); // second is start index;
}
sort(v.begin(), v.end()); // sort by value then by index
v.erase(unique(v.begin(), v.end(), firstEqual<value_t>), v.end());
sort(v.begin(), v.end(), bySecond<value_t>); // restore order.
it c = begin;
for(const auto& x: v){
*(c++) = x.first;
}
return it;
}
Possibility to have own predicate isn't implemented. It's possible but one disadvantage is that you will have to provide less-than
function, not equality
one, that's maybe impossible in some cases.
回答4:
The std::unique
algorithm assumes that the input range is in order and removes duplicates by comparing two consecutive values. To be able to use the algorithm you will need to sort the input first.
来源:https://stackoverflow.com/questions/15413943/stdunique-with-predicate-comparing-stdstring-not-removing-duplicate