Determine if A is permutation of B using ASCII values

爱一瞬间的悲伤 2021-01-22 02:40

I wrote a function to determine whether string a is a permutation of string b. The definition is as follows:

bool isPermutation(std::string         


        
3 answers
  •  心在旅途
    2021-01-22 03:05

    A simple approach if you don't need UTF-8 support

    The solution to this problem is surprisingly easy: the standard library already has a function for it, std::is_permutation from <algorithm>.

    Assume that a and b are two strings:

    return is_permutation(a.begin(), a.end(), b.begin(), b.end());
    

    Or, if you don't have access to C++14 yet:

    return a.size() == b.size() && is_permutation(a.begin(), a.end(), b.begin());
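
A minimal self-contained sketch of the C++14 form, wrapped in a function. The name isPermutationStd is hypothetical, chosen for illustration; it compares the strings one char (byte) at a time.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper name; compares the two strings char by char.
bool isPermutationStd(const std::string& a, const std::string& b) {
    // The four-iterator overload (C++14) also returns false
    // when the two ranges have different lengths.
    return std::is_permutation(a.begin(), a.end(), b.begin(), b.end());
}
```

For example, isPermutationStd("listen", "silent") should be true, while isPermutationStd("aab", "abb") should be false.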
    

    Note, though, that the complexity of this is only guaranteed to be at worst quadratic in the length of the strings. If that matters, sorting both strings, which is O(n log n), could indeed be a better solution:

    string aa(a); sort(aa.begin(), aa.end());
    string bb(b); sort(bb.begin(), bb.end());
    return (aa == bb);
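
The same idea as a complete function, as a sketch. The name isPermutationSorted is hypothetical, and the early length check is an addition that is not in the snippet above.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper name. Both strings are taken by value,
// so the caller's originals are left unsorted.
bool isPermutationSorted(std::string a, std::string b) {
    if (a.size() != b.size()) return false;  // cheap early exit
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    // Two strings are permutations of each other exactly when
    // their sorted forms are identical.
    return a == b;
}
```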
    

    And if this is also too slow, use John Zwinck's answer above, which is linear in complexity.
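
John Zwinck's answer is not reproduced here. Assuming it counts occurrences per character value in an array (which is what the later remark about array size suggests), a linear-time check might look roughly like this. The function name and the details are my own illustration, not his code.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical sketch of a counting approach, linear in the string length.
bool isPermutationCounting(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    // One counter per possible byte value, all starting at zero.
    std::array<std::ptrdiff_t, 256> counts{};
    // Iterate as unsigned char so the index is never negative.
    for (unsigned char c : a) ++counts[c];
    for (unsigned char c : b) --counts[c];
    // Every counter is back at zero only if both strings
    // contain each byte value the same number of times.
    for (std::ptrdiff_t n : counts) {
        if (n != 0) return false;
    }
    return true;
}
```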

    Link to the documentation for is_permutation: http://en.cppreference.com/w/cpp/algorithm/is_permutation

    Link to the documentation for sort: http://en.cppreference.com/w/cpp/algorithm/sort

    A (little) more complex approach if UTF-8 support is required

    The approaches above may fail on UTF-8 strings. UTF-8 is a multibyte character encoding, so a single character may be encoded in several char values. None of the approaches above are aware of this; they all assume that one character is exactly one char. An example of two UTF-8 strings where these approaches fail is here: http://ideone.com/erfNmC
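
To make the failure concrete, here is a small sketch with two strings of my own choosing (not necessarily the ones from the linked example). They contain different characters but exactly the same bytes in a different order, so a byte-wise check reports a false positive. The names kFirst, kSecond and byteWisePermutation are hypothetical.

```cpp
#include <algorithm>
#include <string>

// Illustrative strings, chosen for this sketch.
// U+00E9 (e with acute) then U+00A2 (cent sign): bytes C3 A9 C2 A2.
const std::string kFirst = "\xC3\xA9\xC2\xA2";
// U+00E2 (a with circumflex) then U+00A9 (copyright sign): bytes C3 A2 C2 A9.
const std::string kSecond = "\xC3\xA2\xC2\xA9";

// Byte-wise check: it sees the same four bytes on both sides.
bool byteWisePermutation(const std::string& a, const std::string& b) {
    return std::is_permutation(a.begin(), a.end(), b.begin(), b.end());
}
```

Here byteWisePermutation(kFirst, kSecond) returns true even though the two strings have no character in common.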

    One solution is to temporarily copy each UTF-8 string into a fixed-width UTF-32 encoded string. Assume that a and b are two UTF-8 encoded strings:

    u32string a32 = wstring_convert<codecvt_utf8<char32_t>, char32_t>{}.from_bytes(a);
    u32string b32 = wstring_convert<codecvt_utf8<char32_t>, char32_t>{}.from_bytes(b);
    

    Then you can correctly use the aforementioned functions on those UTF-32 encoded strings:

    return is_permutation(a32.begin(), a32.end(), b32.begin(), b32.end());
    

    or:

    sort(a32.begin(), a32.end());
    sort(b32.begin(), b32.end());
    return (a32 == b32);
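
Putting the conversion and the comparison together, a sketch might look like the following. The name isPermutationUtf8 is hypothetical. Note that std::wstring_convert and std::codecvt_utf8 have been deprecated since C++17, so newer compilers may warn here. The comparison is per code point.

```cpp
#include <algorithm>
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper name. Decodes both UTF-8 strings to UTF-32,
// then compares them code point by code point.
// from_bytes throws std::range_error if the input is not valid UTF-8.
bool isPermutationUtf8(const std::string& a, const std::string& b) {
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    const std::u32string a32 = conv.from_bytes(a);
    const std::u32string b32 = conv.from_bytes(b);
    return std::is_permutation(a32.begin(), a32.end(),
                               b32.begin(), b32.end());
}
```

With this, two strings that merely share the same bytes, such as "\xC3\xA9\xC2\xA2" and "\xC3\xA2\xC2\xA9", are no longer reported as permutations of each other.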
    

    The downside is that John Zwinck's approach now becomes a little less practical. You'd have to declare an array of 1114112 elements, as this is how many Unicode code points exist (U+0000 through U+10FFFF).
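
For what it's worth, such a table is still feasible if you allocate it on the heap. This is only a sketch under my own assumptions: the name isPermutationCodePoints is hypothetical, the input is already UTF-32, and the table costs several megabytes per call.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: counting over all 1114112 (0x110000) code points.
bool isPermutationCodePoints(const std::u32string& a, const std::u32string& b) {
    if (a.size() != b.size()) return false;
    // One counter per code point, allocated on the heap.
    std::vector<std::size_t> counts(0x110000, 0);
    for (char32_t c : a) {
        // Values above U+10FFFF are not valid Unicode; this sketch rejects them.
        if (c >= counts.size()) return false;
        ++counts[c];
    }
    for (char32_t c : b) {
        // A value that b has more often than a makes the check fail.
        if (c >= counts.size() || counts[c] == 0) return false;
        --counts[c];
    }
    // The lengths are equal and no counter ran out,
    // so every counter is back at zero.
    return true;
}
```

If most strings are short, a std::unordered_map keyed by char32_t would avoid the large allocation, at the cost of hashing.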

    More about conversions to UTF-32: http://en.cppreference.com/w/cpp/locale/wstring_convert/from_bytes
