Question
I wrote a function to determine if string a is a permutation of string b. The definition is as follows:
bool isPermutation(std::string a, std::string b){
    if(a.length() != b.length())
        return false;
    int a_sum, b_sum;
    a_sum = b_sum = 0;
    for(int i = 0; i < a.length(); ++i){
        a_sum += a.at(i);
        b_sum += b.at(i);
    }
    return a_sum == b_sum;
}
The issue with my approach is that if a = 600000 and b = 111111, the function returns true.
Is there any way I can keep my general approach to this problem (as opposed to sorting the strings then doing strcmp) and maintain correctness?
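To make the collision concrete, assuming an ASCII-compatible encoding: '6' is 54, '0' is 48 and '1' is 49, so "600000" sums to 54 + 5 * 48 = 294 and "111111" sums to 6 * 49 = 294. A minimal sketch that reproduces this (charSum and demoCollision are hypothetical helper names, not part of the code above):

```cpp
#include <cassert>
#include <string>

// Sums the character codes of s, as the loop in the question does.
int charSum(const std::string& s)
{
    int sum = 0;
    for (unsigned char ch : s)
        sum += ch;
    return sum;
}

// Both sums are equal, so comparing sums cannot tell the strings apart.
void demoCollision()
{
    assert(charSum("600000") == charSum("111111"));
}
```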
Answer 1:
You can count characters separately:
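#include <cassert>
#include <climits>
#include <string>

bool isPermutation(std::string a, std::string b)
{
    if(a.length() != b.length())
        return false;
    // int counters cannot overflow as long as the lengths fit in an int
    assert(a.length() <= INT_MAX);
    assert(b.length() <= INT_MAX);
    int counts[256] = {}; // one counter per possible byte value
    for (unsigned char ch : a)
        ++counts[ch];
    for (unsigned char ch : b)
        --counts[ch];
    // any non-zero counter means the strings differ in that byte value
    for (int count : counts)
        if (count)
            return false;
    return true;
}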
Answer 2:
A simple approach if you don't need UTF-8 support
The solution to this problem is surprisingly easy. There is a function in the standard library that handles this.
Assume that a and b are two strings:
return is_permutation(a.begin(), a.end(), b.begin(), b.end());
Or, if you don't have access to C++14 yet:
return a.size() == b.size() && is_permutation(a.begin(), a.end(), b.begin());
Note, though, that the complexity of this is only guaranteed to be no worse than quadratic in the size of the string. So, if this matters, sorting both strings could indeed be a better solution:
string aa(a); sort(aa.begin(), aa.end());
string bb(b); sort(bb.begin(), bb.end());
return (aa == bb);
And if this is also too slow, use John Zwinck's answer above, which is linear in complexity.
Link to the documentation for is_permutation: http://en.cppreference.com/w/cpp/algorithm/is_permutation
Link to the documentation for sort: http://en.cppreference.com/w/cpp/algorithm/sort
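The fragments above can be wrapped into complete functions. The following is only a sketch; the function names are made up for illustration, and both functions compare char values, so they share the byte-level limitation of the fragments.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Library version: may take quadratic time in the worst case.
bool isPermutationStd(const std::string& a, const std::string& b)
{
    // The size check keeps the three-iterator (pre-C++14) overload safe.
    return a.size() == b.size()
        && std::is_permutation(a.begin(), a.end(), b.begin());
}

// Sorting version: O(n log n). Passing by value gives us copies to sort.
bool isPermutationSorted(std::string a, std::string b)
{
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}

void demoSimpleApproaches()
{
    assert(isPermutationStd("listen", "silent"));
    assert(!isPermutationSorted("600000", "111111"));
}
```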
A (little) more complex approach if UTF-8 support is required
The above may fail on UTF-8 strings. The issue here is that UTF-8 is a multibyte character encoding, that is, a single character may be encoded in multiple char variables. None of the approaches mentioned above are aware of this, and all assume that a single character is also a single char variable. An example of two UTF-8 strings where these approaches fail is here: http://ideone.com/erfNmC
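As an illustration of the failure mode, here is a small sketch with two strings chosen for this example (they are not taken from the linked snippet). Both consist of the same four bytes in a different order, yet they have no character in common, so a byte-wise check reports a false positive.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Byte-wise check: it compares char values, not characters.
bool isBytePermutation(const std::string& a, const std::string& b)
{
    return a.size() == b.size()
        && std::is_permutation(a.begin(), a.end(), b.begin());
}

// U+00E9 followed by U+0131, encoded in UTF-8 as C3 A9 C4 B1.
const std::string utf8First = "\xC3\xA9\xC4\xB1";
// U+00F1 followed by U+0129, encoded in UTF-8 as C3 B1 C4 A9.
const std::string utf8Second = "\xC3\xB1\xC4\xA9";

void demoFalsePositive()
{
    // The bytes match as a multiset even though the characters do not.
    assert(isBytePermutation(utf8First, utf8Second));
}
```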
The solution may be to temporarily copy our UTF-8 string to a fixed-length UTF-32 encoded string. Assume that a and b are two UTF-8 encoded strings:
u32string a32 = wstring_convert<codecvt_utf8<char32_t>, char32_t>{}.from_bytes(a);
u32string b32 = wstring_convert<codecvt_utf8<char32_t>, char32_t>{}.from_bytes(b);
Then you can correctly use the aforementioned functions on those UTF-32 encoded strings:
return is_permutation(a32.begin(), a32.end(), b32.begin(), b32.end());
or:
sort(a32.begin(), a32.end());
sort(b32.begin(), b32.end());
return (a32 == b32);
The downside is that now John Zwinck's approach becomes a little bit less practical. You'd have to declare an array of 1114112 (0x110000) elements, as this is how many Unicode code points exist.
More about conversions to UTF-32: http://en.cppreference.com/w/cpp/locale/wstring_convert/from_bytes
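One way to keep the counting idea without a huge array is to count code points in a hash map. The sketch below makes two assumptions that are not in the answer: it swaps wstring_convert (deprecated since C++17) for a small hand-written decoder, and that decoder trusts its input to be well-formed UTF-8 and does no validation. The names decodeUtf8 and isPermutationUtf8 are hypothetical. Note that it compares code points, not user-perceived characters such as combining sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Decodes UTF-8 into code points. Assumes well-formed input (no validation).
std::u32string decodeUtf8(const std::string& s)
{
    std::u32string out;
    for (std::size_t i = 0; i < s.size(); ) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        char32_t cp = 0;
        int extra = 0; // number of continuation bytes that follow
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        ++i;
        // Each continuation byte contributes its low six bits.
        for (int k = 0; k < extra && i < s.size(); ++k, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Linear-time check on code points; the map only stores those that occur.
bool isPermutationUtf8(const std::string& a, const std::string& b)
{
    const std::u32string a32 = decodeUtf8(a);
    const std::u32string b32 = decodeUtf8(b);
    if (a32.size() != b32.size())
        return false;
    std::unordered_map<char32_t, int> counts;
    for (char32_t cp : a32)
        ++counts[cp];
    for (char32_t cp : b32)
        if (--counts[cp] < 0) // b has more of cp than a
            return false;
    return true;
}

void demoUtf8Permutation()
{
    // U+00E9 'a' versus 'a' U+00E9: same characters, different order.
    assert(isPermutationUtf8("\xC3\xA9" "a", "a\xC3\xA9"));
    // Same four bytes, but different characters.
    assert(!isPermutationUtf8("\xC3\xA9\xC4\xB1", "\xC3\xB1\xC4\xA9"));
}
```

The map grows only with the number of distinct code points in the input, at the cost of hashing overhead compared with a plain array.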
Answer 3:
std::sort( strOne.begin(), strOne.end() );
std::sort( strTwo.begin(), strTwo.end() );
return strOne == strTwo;
will be sufficient.
My suggestion is to use std::unordered_map, i.e.:
std::unordered_map< char, unsigned > umapOne;
std::unordered_map< char, unsigned > umapTwo;
for( char c : strOne ) ++umapOne[c];
for( char c : strTwo ) ++umapTwo[c];
return umapOne == umapTwo;
As an optimization, you can add a length check at the top of either solution:
if( strOne.size() != strTwo.size() ) return false;
A better std::unordered_map solution:
if( strOne.size() != strTwo.size() ) return false; // required
std::unordered_map< char, int > umap;
for( char c : strOne ) ++umap[c];
for( char c : strTwo ) if( --umap[c] < 0 ) return false;
return true;
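For reference, the single-map fragment above can be written as a complete function. This is a sketch with a made-up function name; the comments spell out why the length check is required for the early exit to be correct.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

bool isPermutationCounted(const std::string& strOne, const std::string& strTwo)
{
    // Required: with equal lengths, two strings that differ as multisets
    // must have some character that occurs more often in strTwo than in
    // strOne, and the second loop below catches exactly that case.
    if (strOne.size() != strTwo.size())
        return false;

    std::unordered_map<char, int> umap;
    for (char c : strOne)
        ++umap[c];
    for (char c : strTwo)
        if (--umap[c] < 0) // strTwo has more of c than strOne
            return false;
    return true;
}

void demoCounted()
{
    assert(isPermutationCounted("listen", "silent"));
    assert(!isPermutationCounted("600000", "111111"));
}
```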
If you just need to solve the problem without implementing it yourself, you may use std::is_permutation:
return std::is_permutation( strOne.begin(), strOne.end(), strTwo.begin(), strTwo.end() );
Source: https://stackoverflow.com/questions/36818877/determine-if-a-is-permutation-of-b-using-ascii-values