How do I find all the positions of a substring in a string?

臣服心动 2020-12-17 17:07

I want to search a large string for all the locations of a substring.

4 Answers
  • 2020-12-17 17:55

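    The other answers are correct, but a repeated naive search can be slow: in the worst case it takes O(N*M) time for a text of length N and a pattern of length M, which approaches O(N^2) when the pattern is long. The Knuth-Morris-Pratt algorithm finds all occurrences in O(N + M) time.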
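
    The answer names Knuth-Morris-Pratt without showing it, so here is a minimal sketch of how it could look. The function name kmp_find_all and the choice to treat an empty pattern as "no matches" are assumptions made for illustration, not part of the original answer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: start index of every occurrence of pattern in text,
// overlapping occurrences included.
std::vector<std::size_t> kmp_find_all(const std::string& text, const std::string& pattern)
{
    std::vector<std::size_t> positions;
    const std::size_t m = pattern.size();
    if (m == 0 || m > text.size()) return positions;  // assumption: empty pattern means no matches

    // fail[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    std::vector<std::size_t> fail(m, 0);
    for (std::size_t i = 1, k = 0; i < m; ++i) {
        while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        fail[i] = k;
    }

    // Scan the text once; k is the number of pattern characters matched so far.
    for (std::size_t i = 0, k = 0; i < text.size(); ++i) {
        while (k > 0 && text[i] != pattern[k]) k = fail[k - 1];
        if (text[i] == pattern[k]) ++k;
        if (k == m) {
            positions.push_back(i + 1 - m);
            k = fail[k - 1];  // fall back so overlapping matches are also reported
        }
    }
    return positions;
}
```

    Calling kmp_find_all(main_string, substring) would return the positions as a vector, and the text index never moves backwards, which is where the linear bound comes from.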

    Edit:

    There is also another algorithm, the so-called "Z-function", which likewise runs in O(N) time. I couldn't find an English source for it (maybe because the similarly named Riemann zeta function is far more famous), so I will just put its code here and explain what it does.

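    #include <iostream>
    #include <string>
    #include <vector>
    using namespace std;

    // Fills z so that z[i] is the length of the longest common prefix
    // of s and the suffix of s starting at i (z[0] is left at 0).
    void calc_z (const string &s, vector<int> & z)
    {
        int len = s.size();
        z.assign (len, 0);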
    
        int l = 0, r = 0;
        for (int i=1; i<len; ++i)
            if (z[i-l]+i <= r)
                z[i] = z[i-l];
            else
            {
                l = i;
                if (i > r) r = i;
                for (z[i] = r-i; r<len; ++r, ++z[i])
                    if (s[r] != s[z[i]])
                        break;
                --r;
            }
    }
    
    int main()
    {
        string main_string = "some string where we want to find substring or sub of string or just sub";
        string substring = "sub";
        string working_string = substring + main_string;
        vector<int> z;
        calc_z(working_string, z);
    
        // After this call, z[i] is the length of the longest prefix of
        // working_string that also starts at position i of working_string.
        // Every i with z[i] >= substring.size() therefore marks an
        // occurrence of substring.

        const int sub_len = static_cast<int>(substring.size());
        for (int i = sub_len; i < static_cast<int>(working_string.size()); ++i)
            if (z[i] >= sub_len)
                cout << i - sub_len << endl; // position in main_string
    }
    
  • 2020-12-17 17:58

    Simply use std::string::find(), which returns the position of the first occurrence of the substring, or std::string::npos if there is none. To get every occurrence, call it again with a start position just past the previous match.

    Here is an example taken from the documentation for std::string::find:

    // string::find
    #include <iostream>
    #include <string>
    using namespace std;
    
    int main ()
    {
      string str ("There are two needles in this haystack with needles.");
      string str2 ("needle");
      size_t found;
    
      // different member versions of find in the same order as above:
      found=str.find(str2);
      if (found!=string::npos)
        cout << "first 'needle' found at: " << int(found) << endl;
    
      found=str.find("needles are small",found+1,6);
      if (found!=string::npos)
        cout << "second 'needle' found at: " << int(found) << endl;
    
      found=str.find("haystack");
      if (found!=string::npos)
        cout << "'haystack' also found at: " << int(found) << endl;
    
      found=str.find('.');
      if (found!=string::npos)
        cout << "Period found at: " << int(found) << endl;
    
      // let's replace the first needle:
      str.replace(str.find(str2),str2.length(),"preposition");
      cout << str << endl;
    
      return 0;
    }
    
  • 2020-12-17 18:05

    For completeness, there is another approach using std::search. It works like std::string::find, except that you work with iterators. Something like:

    std::string::iterator it(str.begin()), end(str.end());
    std::string::iterator s_it(search_str.begin()), s_end(search_str.end());
    
    it = std::search(it, end, s_it, s_end);
    
    while(it != end)
    {
      // do something with this position..
    
      // std::advance returns void, so move the iterator past this match first,
      // then search again (a tiny optimisation could be to buffer the result
      // of std::distance)
      std::advance(it, std::distance(s_it, s_end));
      it = std::search(it, end, s_it, s_end);
    }
    

    I find that this sometimes outperforms std::string::find, especially if you represent your string as a vector<char>.
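
    As a rough sketch of the vector<char> variant mentioned above, the same loop could be wrapped in a function that collects the offsets. The name search_all and the decision to skip past each match (so occurrences do not overlap) are assumptions for illustration; no performance claim is made for it.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: offsets of every non-overlapping occurrence of needle in haystack.
std::vector<std::size_t> search_all(const std::vector<char>& haystack,
                                    const std::vector<char>& needle)
{
    std::vector<std::size_t> offsets;
    if (needle.empty()) return offsets;  // std::search matches an empty range everywhere

    const auto len = static_cast<std::ptrdiff_t>(needle.size());
    auto it = std::search(haystack.begin(), haystack.end(), needle.begin(), needle.end());
    while (it != haystack.end()) {
        offsets.push_back(static_cast<std::size_t>(std::distance(haystack.begin(), it)));
        // Resume just past this match; std::next(it) here would allow overlaps instead.
        it = std::search(std::next(it, len), haystack.end(), needle.begin(), needle.end());
    }
    return offsets;
}
```

    Since C++17, std::search also accepts a searcher object such as std::boyer_moore_searcher, which may be worth measuring for long patterns.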

  • 2020-12-17 18:13

    Use std::string::find in a loop. You can do something like:

    std::string::size_type start_pos = 0;
    while( std::string::npos != 
              ( start_pos = mystring.find( my_sub_string, start_pos ) ) )
    {
        // do something with start_pos or store it in a container
        ++start_pos;
    }
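
    The loop above can be packaged so the positions end up in a container. This is only a sketch: the name find_all and the allow_overlap parameter are made up for illustration. Advancing by one, as the loop above does, reports overlapping matches; advancing by the substring length skips them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: every position at which sub occurs in text.
std::vector<std::size_t> find_all(const std::string& text, const std::string& sub,
                                  bool allow_overlap = true)
{
    std::vector<std::size_t> positions;
    if (sub.empty()) return positions;  // find("") succeeds at every index, so stop early

    // Step by 1 to report overlapping matches, or by sub.size() to skip them.
    const std::size_t step = allow_overlap ? 1 : sub.size();
    for (std::size_t pos = text.find(sub); pos != std::string::npos;
         pos = text.find(sub, pos + step))
        positions.push_back(pos);
    return positions;
}
```

    For example, searching "aaaa" for "aa" yields positions 0, 1 and 2 with overlaps allowed, and 0 and 2 without.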
    

    EDIT: Doh! Thanks for the remark, Nawaz! Better?
