More efficient algorithm for shortest superstring search

Happy的楠姐 2021-01-31 12:32

My problem below is NP-complete; however, I'm trying to find at least a marginally faster string search function or module that might help in reducing some of the computation t

3 answers
  •  后悔当初
    2021-01-31 12:59

    This is plain backtracking, but it always tries the string with the largest overlap first. Once a good candidate answer has been found, any path whose current string is already at least as long as that candidate can be abandoned, because extending it can never produce a shorter result.

    Tested in my Jupyter notebook. It seems to be much faster than the other two answers here (11/18/2018)

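    def shortestSuperstring(A):
        """
        :type A: List[str]
        :rtype: str
        """
        if len(A) == 1:
            return A[0]
        # dic[(i, j)]: longest proper overlap between a suffix of A[i] and a prefix of A[j]
        dic = {}
        for i in range(len(A)):
            for j in range(len(A)):
                if i != j:
                    ol = 0
                    for k in range(1, min(len(A[i]), len(A[j]))):
                        if A[j][:k] == A[i][-k:]:
                            ol = k
                    dic[(i, j)] = ol
        if max(dic.values()) == 0:
            # No two strings overlap, so plain concatenation is already optimal
            return "".join(A)
        else:
            ret = "".join(A)  # initial candidate answer
            l = len(ret)
            stack = []
            for i, wd in enumerate(A):
                tmp = set(range(len(A)))
                tmp.remove(i)
                stack.append((wd, i, tmp))
            while stack:
                ans, cur, remain = stack.pop()
                # The source listing is cut off after "if len(ans)". Everything from
                # here on is reconstructed from the description above (an assumption).
                if len(ans) < l:
                    if not remain:
                        ret, l = ans, len(ans)
                    else:
                        # Push in ascending overlap order so the largest overlap is popped first
                        for ol, idx in sorted((dic[(cur, idx)], idx) for idx in remain):
                            stack.append((ans + A[idx][ol:], idx, remain - {idx}))
            return ret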
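To make the "most overlapped first" ordering concrete, here is a minimal sketch of just that step, separate from the function above. The helper names `overlap` and `candidates_by_overlap` are hypothetical and not part of the original answer; the overlap rule (a proper suffix/prefix match, never a whole string) is assumed to match the nested loop in the listing.

```python
def overlap(a, b):
    """Largest k < min(len(a), len(b)) such that a[-k:] == b[:k], else 0."""
    best = 0
    for k in range(1, min(len(a), len(b))):
        if a[-k:] == b[:k]:
            best = k
    return best


def candidates_by_overlap(current, words):
    """Return (overlap, word) pairs in the order they would be explored:
    the word that overlaps `current` the most comes first."""
    return sorted(((overlap(current, w), w) for w in words), reverse=True)


if __name__ == "__main__":
    current = "abcd"
    for k, word in candidates_by_overlap(current, ["qrs", "cdef", "dxyz"]):
        # Appending only the non-overlapping tail extends the partial answer
        print(k, word, current + word[k:])
```

Exploring the largest overlap first tends to reach a short candidate early, which is what lets the length check prune most of the remaining paths.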

    The test case in the problem takes

    1.93 s ± 160 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    

    to run and gives the answer:

    'CCGTGGTAGGAGT'
    

    Some other test cases (longer strings, where this approach starts to beat the other two methods; each takes about 1 to 5 seconds):

        ****************************************************************************************************
    
    
        case: 
    
         ['mftpvodataplkewcouz', 'krrgsoxpsnmzlhprsl', 'qhbfymytxzbmqma', 'hunjgeaolcuznhpodi', 'kewcouzbwlftz', 'xzbmqmahunjgeaolcu', 'zlhprslqurnqbhsjr', 'rrgsoxpsnmzlhprslqur', 'diqukrrgsoxpsnmz', 'sjrxzavamftpvoda']
    
    
        ****************************************************************************************************
    
    
        ans:  qhbfymytxzbmqmahunjgeaolcuznhpodiqukrrgsoxpsnmzlhprslqurnqbhsjrxzavamftpvodataplkewcouzbwlftz
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['cedefifgstkyxfcuajfa', 'ooncedefifgstkyxfcua', 'assqjfwarvjcjedqtoz', 'fcuajfassqjfwarvjc', 'fwarvjcjedqtozctcd', 'zppedxfumcfsngp', 'kyxfcuajfassqjfwa', 'fumcfsngphjyfhhwkqa', 'fassqjfwarvjcjedq', 'ppedxfumcfsngphjyf', 'dqtozctcdk']
    
    
        ****************************************************************************************************
    
    
        ans:  zppedxfumcfsngphjyfhhwkqaooncedefifgstkyxfcuajfassqjfwarvjcjedqtozctcdk
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['ekpijtseahvmprvefkgn', 'yyevvcmeekpijtseahvm', 'vsfcyyevvcmeekp', 'xwmkoqhxvrovlmmvsfcy', 'cmeekpijtseahvmpr', 'oqhxvrovlmmvsfcyy', 'zpuemtclxbxwsypfxevx', 'clxbxwsypfxevxw', 'fkgnjgdvfygnlckyiju', 'xevxwmkoqhxvrovlmm']
    
    
        ****************************************************************************************************
    
    
        ans:  zpuemtclxbxwsypfxevxwmkoqhxvrovlmmvsfcyyevvcmeekpijtseahvmprvefkgnjgdvfygnlckyiju
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['ppgortnmsy', 'czmysoeeyugbiylso', 'nbfzpppvhbjydtx', 'rnzynedhoiunkpon', 'ornzynedhoiunkpo', 'ylsomoktkyfgljcf', 'jtvkrornzynedhoiunk', 'hvhhihwdffmxnczmyso', 'ktkyfgljcfbkqcpp', 'nzynedhoiunkponbfz', 'nedhoiunkponbfzpppvh']
    
    
        ****************************************************************************************************
    
    
        ans:  hvhhihwdffmxnczmysoeeyugbiylsomoktkyfgljcfbkqcppgortnmsyjtvkrornzynedhoiunkponbfzpppvhbjydtx
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['amefulhsdgvjvoab', 'giqxpqszaitzfzvtalx', 'cyqeolfgkihssycmiodg', 'glhhcfuprwazet', 'cmiodgiqxpqszaitzf', 'lhsdgvjvoabdviglhhcf', 'ssycmiodgiqxpqsza', 'bxtdqnamefulhsdg', 'namefulhsdgvjvo', 'ihssycmiodgiqxp', 'itzfzvtalxfybxtdqn']
    
    
        ****************************************************************************************************
    
    
        ans:  cyqeolfgkihssycmiodgiqxpqszaitzfzvtalxfybxtdqnamefulhsdgvjvoabdviglhhcfuprwazet
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['yobbobwqymlordokxka', 'jllfoebgbsrguls', 'rgulsnatnpuuwiyba', 'ordokxkamymamofefr', 'wqymlordokxkamy', 'fycxifzsjllfoebgbsrg', 'lordokxkamymamofe', 'kxkamymamofefrmfycx', 'frmfycxifzsjllf', 'srgulsnatnpuuwiy']
    
    
        ****************************************************************************************************
    
    
        ans:  yobbobwqymlordokxkamymamofefrmfycxifzsjllfoebgbsrgulsnatnpuuwiyba
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['jnbbbbsczcscxawcze', 'bsczcscxawczeumyyr', 'lyofvbhvjmquhkgz', 'quhkgzyzdwtjnbbb', 'kgzyzdwtjnbbbbsczc', 'uouxnfplptpkgnronf', 'pqgyfqglyofvbhv', 'kgnronftgswvpqgy', 'marvhdxtbmkcpnli', 'qgyfqglyofvbhvjmquhk', 'xtbmkcpnliz']
    
    
        ****************************************************************************************************
    
    
        ans:  marvhdxtbmkcpnlizuouxnfplptpkgnronftgswvpqgyfqglyofvbhvjmquhkgzyzdwtjnbbbbsczcscxawczeumyyr
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['qrwpawefqzfjsan', 'jsanzdukfkdlmyox', 'neaxnkedjxbpgsyq', 'nqjvzryhfjdsxmwolwo', 'hfjdsxmwolwomeeewvi', 'lmyoxbpvkneaxnkedjxb', 'qbhpqrwpawefqzfjsa', 'pawefqzfjsanzdukfk', 'bqbhpqrwpawefqzfj', 'dlmyoxbpvkneaxnk', 'xnkedjxbpgsyqovvh']
    
    
        ****************************************************************************************************
    
    
        ans:  bqbhpqrwpawefqzfjsanzdukfkdlmyoxbpvkneaxnkedjxbpgsyqovvhnqjvzryhfjdsxmwolwomeeewvi
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['vgrikrnwezryimj', 'umwgwvzpsfpmctzt', 'pjourlpgeemdjor', 'urlpgeemdjorpzbkbz', 'jorpzbkbzcqyewih', 'xuwkzvoczozhhvf', 'ihbumoogibirbsvch', 'nwezryimjivvpjourlp', 'kzvoczozhhvfwgeplv', 'ezryimjivvpjourlpgee', 'zhhvfwgeplvqngglu', 'rikrnwezryimjivvp']
    
    
        ****************************************************************************************************
    
    
        ans:  xuwkzvoczozhhvfwgeplvqngglumwgwvzpsfpmctztvgrikrnwezryimjivvpjourlpgeemdjorpzbkbzcqyewihbumoogibirbsvch
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['nbsgonqmpreelpbr', 'hnysjajtiguehrokus', 'udgzbzmevnkzzba', 'axtbmcpbmoubyoscn', 'vqnbsgonqmpreel', 'xvqnbsgonqmpree', 'ajtiguehrokustktudgz', 'brgkgihuetpqrhhbhn', 'dgzbzmevnkzzbaxtbmcp', 'ehrokustktudgzbzmevn', 'uetpqrhhbhnysjaj', 'vnkzzbaxtbmcpbmo']
    
    
        ****************************************************************************************************
    
    
        ans:  xvqnbsgonqmpreelpbrgkgihuetpqrhhbhnysjajtiguehrokustktudgzbzmevnkzzbaxtbmcpbmoubyoscn
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['orugbsuuxowmhjh', 'zjyxzmpduthlsioor', 'qtxocgehmhfqnstl', 'tlrlcnnrsyryfrywuebq', 'hozjyxzmpduthlsio', 'hjhdmnqtxocgehm', 'mjhzwdudlnbfkjawqacf', 'hfqnstlrlcnnrsyryfry', 'yfrywuebqhvwewzmq', 'zzieemjhzwdudlnbfkj', 'nnrsyryfrywuebqhvw', 'acfgaihbhozjyxzmpdut']
    
    
        ****************************************************************************************************
    
    
        ans:  zzieemjhzwdudlnbfkjawqacfgaihbhozjyxzmpduthlsioorugbsuuxowmhjhdmnqtxocgehmhfqnstlrlcnnrsyryfrywuebqhvwewzmq
    
    
        ****************************************************************************************************
    
    
        case: 
    
         ['phuutlgczfspygaljkv', 'fspygaljkvahvuii', 'csywjodtnkynkjckq', 'poyykqyrhbvcwvjl', 'xijupvzzwphuutlg', 'aljkvahvuiivqbqrw', 'vahvuiivqbqrwryd', 'wjodtnkynkjckqurgu', 'ecdmbshotqbxjqgbou', 'hvuiivqbqrwrydgnr', 'ivqbqrwrydgnrubcsywj', 'wphuutlgczfspyga']
    
    
        ****************************************************************************************************
    
    
        ans:  ecdmbshotqbxjqgbouxijupvzzwphuutlgczfspygaljkvahvuiivqbqrwrydgnrubcsywjodtnkynkjckqurgupoyykqyrhbvcwvjl
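Any case/ans pair like the ones listed above can be sanity-checked with a few lines. This is only a sketch: the name `is_superstring` is hypothetical, and the check confirms that every input string occurs in the answer, not that the answer is the shortest possible.

```python
def is_superstring(candidate, words):
    """True if every word occurs somewhere inside candidate."""
    return all(w in candidate for w in words)


if __name__ == "__main__":
    # Toy input for illustration; paste any case/ans pair in the same way.
    words = ["abc", "bcd", "cde"]
    print(is_superstring("abcde", words))
    print(is_superstring("abcd", words))
```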
    

    Also see the dynamic programming approach: https://leetcode.com/problems/find-the-shortest-superstring/solution/
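For comparison, a dynamic programming approach to this problem is usually written as a bitmask DP over subsets of the strings. The sketch below is a generic version of that idea, not the code from the linked page. It assumes that no input string is contained in another, and the names `overlap` and `shortest_superstring_dp` are made up for this example.

```python
def overlap(a, b):
    """Longest proper suffix of a that is also a prefix of b (0 if none)."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring_dp(words):
    n = len(words)
    if n == 0:
        return ""
    ov = [[0 if i == j else overlap(words[i], words[j]) for j in range(n)]
          for i in range(n)]
    full = (1 << n) - 1
    # best[mask][last]: maximum total overlap when the words in `mask` are
    # chained in some order ending with `last`; -1 marks an unreachable state.
    best = [[-1] * n for _ in range(1 << n)]
    parent = [[-1] * n for _ in range(1 << n)]
    for i in range(n):
        best[1 << i][i] = 0
    for mask in range(1 << n):
        for last in range(n):
            if best[mask][last] < 0:
                continue
            for nxt in range(n):
                if mask & (1 << nxt):
                    continue
                new_mask = mask | (1 << nxt)
                cand = best[mask][last] + ov[last][nxt]
                if cand > best[new_mask][nxt]:
                    best[new_mask][nxt] = cand
                    parent[new_mask][nxt] = last
    # Walk the parent pointers back from the best final state
    last = max(range(n), key=lambda i: best[full][i])
    mask, order = full, []
    while last != -1:
        order.append(last)
        prev = parent[mask][last]
        mask ^= 1 << last
        last = prev
    order.reverse()
    result = words[order[0]]
    for prev, cur in zip(order, order[1:]):
        result += words[cur][ov[prev][cur]:]
    return result


if __name__ == "__main__":
    print(shortest_superstring_dp(["abc", "bcd", "cde"]))
```

This takes O(n^2 * 2^n) time and O(n * 2^n) memory, so it is only practical for a small number of strings, whereas the pruned backtracking above has no such fixed bound but also no guarantee on its running time.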
