How to find repeating sequence of characters in a given array?

后端未结

关注

 14  776

My problem is to find the repeating sequence of characters in the given array. simply, to identify the pattern in which the characters are appearing.


                      
              相关标签:


      
      
        
          14条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  不思量自难忘°        
                
              
                            
                2020-12-02 13:07
              
            
            
                                                                       
Using C++:

//Splits the string into the fragments of given size
//Returns the set of of splitted strings avaialble
set<string> split(string s, int frag)
{
    set<string> uni;
    int len = s.length();
    for(int i = 0; i < len; i+= frag)
    {
        uni.insert(s.substr(i, frag));
    }

    return uni;
}

int main()
{

    string out;
    string s = "carpentercarpenter";
    int len = s.length();

      //Optimistic approach..hope there are only 2 repeated strings
      //If that fails, then try to break the strings with lesser number of
      //characters
    for(int i = len/2; i>1;--i)
    {
        set<string> uni = split(s,i);
        if(uni.size() == 1)
        {
            out = *uni.begin();
            break;
        }
    }

    cout<<out;
    return 0;

}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  礼貌的吻别        
                
              
                            
                2020-12-02 13:07
              
            
            
                                                                       
The first idea that comes to my mind is trying all repeating sequences of lengths that divide length(S) = N. There is a maximum of N/2 such lengths, so this results in a O(N^2) algorithm.

But i'm sure it can be improved...
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  我寻月下人不归        
                
              
                            
                2020-12-02 13:11
              
            
            
                                                                       
Pseudocode

len = str.length
for (i in 1..len) {
   if (len%i==0) {
      if (str==str.substr(0,i).repeat(len/i)) {
         return str.substr(0,i)
      }
   }
}


Note: For brevity, I'm inventing a "repeat" method for strings, which isn't actually part of Java's string; "abc".repeat(2)="abcabc"
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  慢半拍i        
                
              
                            
                2020-12-02 13:12
              
            
            
                                                                       
Tongue-in-cheek O(NlogN) solution

Perform an FFT on your string (treating characters as numeric values).  Every peak in the resulting graph corresponds to a substring periodicity.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  渐次进展        
                
              
                            
                2020-12-02 13:13
              
            
            
                                                                       
For your examples, my first approach would be to


get the first character of the array (for your last example, that would be C)
get the index of the next appearance of that character in the array (e.g. 9)
if it is found, search for the next appearance of the substring between the two appearances of the character (in this case CARPENTER)
if it is found, you're done (and the result is this substring).


Of course, this works only for a very limited subset of possible arrays, where the same word is repeated over and over again, starting from the beginning, without stray characters in between, and its first character is not repeated within the word. But all your examples fall into this category - and I prefer the simplest solution which could possibly work :-)

If the repeated word contains the first character multiple times (e.g. CACTUS), the algorithm can be extended to look for subsequent occurrences of that character too, not only the first one (so that it finds the whole repeated word, not only a substring of it).

Note that this extended algorithm would give a different result for your second example, namely RONRON instead of RON.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  没有蜡笔的小新        
                
              
                            
                2020-12-02 13:14
              
            
            
                                                                       
Put all your character in an array e.x. a[]

i=0; j=0;
for( 0 < i < count ) 
{
if (a[i] == a[i+j+1])
    {++i;}
else
    {++j;i=0;}
}


Then the ratio of (i/j) = repeat count in your array.
You must pay attention to limits of i and j, but it is the simple solution.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     上一页
1
2
3
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复