Algorithm for checking if a string was built from a list of substrings

醉酒成梦 2021-02-02 14:35

You are given a string and an array of strings. How can you quickly check whether this string can be built by concatenating some of the strings in the array?

This is a theoretica

10 answers
  • 2021-02-02 14:54

    Inspired by @cnicutar's answer:

    • function Possible(array A, string s)
      • If s is empty, return true.
      • Compute the array P of all strings in A that are a prefix of s.
      • If P is empty, return false.
      • For each string p in P:
        • If Possible(A with p removed, s with prefix p removed), return true.
      • return false
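
The pseudocode above can be sketched in Python roughly as follows. This is a minimal sketch, assuming each array entry may be used at most once (which is what "A with p removed" implies); the name `possible` and the choice to skip empty strings are illustrative assumptions, not part of the original answer.

```python
from typing import List


def possible(parts: List[str], s: str) -> bool:
    """Return True if s is a concatenation of distinct entries of parts.

    Mirrors the recursive pseudocode: every entry is used at most once.
    """
    if not s:
        return True
    for i, p in enumerate(parts):
        # Assumption: empty strings are skipped because they never shorten s.
        if p and s.startswith(p):
            rest = parts[:i] + parts[i + 1:]  # "A with p removed"
            if possible(rest, s[len(p):]):
                return True
    return False


if __name__ == "__main__":
    print(possible(["ab", "abc", "d"], "abcd"))
    print(possible(["ab"], "abab"))
```

In the first call the branch starting with "ab" dead-ends and the search backtracks to "abc" followed by "d". The worst case is still exponential in the number of array entries, since every ordering of matching prefixes may be tried.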
  • 2021-02-02 15:03

    What you are looking for is a parser. A parser checks whether a given word belongs to a given language. I am not sure of the exact computational complexity of your problem. Some of the above seem to be correct (there is no need at all for exhaustive search). One thing is for sure: it is not NP-complete.

    The alphabet of your language would be all the small substrings. The word you are looking for is the string you have. The regular expression can be a simple Kleene star, or a very simple context-free grammar that is nothing but ORs.
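
As a rough illustration of the Kleene-star idea, the array can be turned into a regular expression of the form `(?:s1|s2|...)*` and matched against the whole string. This is only a sketch: it assumes that array entries may be reused any number of times, and the function name `matches_star` is made up for the example.

```python
import re
from typing import List


def matches_star(parts: List[str], s: str) -> bool:
    """Check s against (p1|p2|...)* built from the array entries."""
    # re.escape keeps characters such as '.' or '*' in the entries literal.
    alternatives = "|".join(re.escape(p) for p in parts if p)
    if not alternatives:
        # Assumption: with no usable entries only the empty string matches.
        return s == ""
    return re.fullmatch(f"(?:{alternatives})*", s) is not None


if __name__ == "__main__":
    print(matches_star(["ab", "abc", "cd"], "abcd"))
    print(matches_star(["ab", "abc"], "abca"))
```

Python's `re` module is a backtracking engine, so this can become very slow on adversarial inputs even though the language itself is regular.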

    The main issue in the algorithm is: what if some of the substrings are actually substrings of other substrings? That is, what if we have the substrings "ab", "abc", "abcd", ...? In this case, the order in which the substrings are checked changes the complexity. For this, we have LR parsers. I guess they are the best at solving such problems.
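
One way to sidestep the ordering issue without an LR parser is dynamic programming over positions in the string: record every position that can be reached by some concatenation, so it no longer matters which of "ab", "abc", "abcd" is tried first. This is a different technique from the one named above, sketched under the assumption that entries may be reused; `can_build` is a hypothetical name.

```python
from typing import List


def can_build(parts: List[str], s: str) -> bool:
    """Return True if s is a concatenation of entries of parts (reuse allowed)."""
    n = len(s)
    # reachable[i] is True when s[:i] can be built from the entries.
    reachable = [False] * (n + 1)
    reachable[0] = True
    for i in range(n):
        if not reachable[i]:
            continue
        for p in parts:
            # Empty entries are skipped; they cannot advance the position.
            if p and s.startswith(p, i):
                reachable[i + len(p)] = True
    return reachable[n]


if __name__ == "__main__":
    print(can_build(["ab", "abc", "abcd", "d"], "abcd"))
    print(can_build(["ab", "abc"], "abcd"))
```

Each position is expanded at most once, so the running time is bounded by the string length times the total length of the array entries.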

    I will find you the exact solution soon.

  • 2021-02-02 15:05

    It seems to me the problem can be solved by simply traversing the array linearly and comparing. However, multiple passes may be needed. You can devise a strategy to minimize the passes, for example by constructing, in the first pass, a sub-array of all the array strings that are substrings of the original string. Then try out different variations linearly.

  • 2021-02-02 15:06

    This is how I would do it.

    1. Determine the length of the target string.
    2. Determine the length of each string in the substring array.
    3. Determine which combinations of substrings would yield a string with the same length as the target string (if there are none, you're done).
    4. Generate all permutations of the substring combinations determined in step 3. Check if any of them match the target string.

    Generating all permutations is a processor-heavy task, so if you can cut down on your 'n' (input size), you'll gain considerable efficiency.
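
The four steps above could be sketched like this. It is a brute-force illustration only, assuming each array entry is used at most once; the function name `build_by_permutations` is invented for the example.

```python
from itertools import combinations, permutations
from typing import List


def build_by_permutations(parts: List[str], target: str) -> bool:
    """Brute force: filter combinations by total length, then try orderings."""
    if not target:
        return True
    for r in range(1, len(parts) + 1):
        for combo in combinations(parts, r):
            # Step 3: keep only combinations whose lengths add up to the target.
            if sum(len(p) for p in combo) != len(target):
                continue
            # Step 4: try every ordering of the surviving combination.
            if any("".join(perm) == target for perm in permutations(combo)):
                return True
    return False


if __name__ == "__main__":
    print(build_by_permutations(["cd", "ab", "x"], "abcd"))
    print(build_by_permutations(["ab", "cd"], "abdc"))
```

The length check prunes many combinations cheaply, but the number of subsets and orderings still grows factorially, which is why reducing 'n' matters so much here.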

  • 2021-02-02 15:06

    Here is a rough idea that should work.

    1. Copy the source string into a new string.
    2. While the copy string still has data and there are still substrings:
       a. Grab a substring; if copy.contains(substr), then copy.remove(substr).
    3. If the copy is now empty, then yes, you could construct the string.
    4. If copy is not empty, throw out the first substr that was removed from the string and repeat.
    5. If all substrings are gone and copy is still not empty then no, you can't construct it.

    Edit: A way to possibly improve this would be to first iterate all of the substrings and throw out any that are not contained in the main string. Then go through the above steps.
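
The pre-filtering step mentioned in the edit might look like the following. It is a small sketch of that one step only, with `prefilter` as a hypothetical helper name; dropping empty strings is an added assumption.

```python
from typing import List


def prefilter(parts: List[str], target: str) -> List[str]:
    """Keep only the entries that occur somewhere in the target string."""
    # Assumption: empty strings are dropped too, since they cannot contribute.
    return [p for p in parts if p and p in target]


if __name__ == "__main__":
    print(prefilter(["ab", "xy", "cd", ""], "abcd"))
```

An entry that never occurs in the target cannot be part of any valid concatenation, so discarding it up front is safe for any of the search strategies in this thread.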

  • 2021-02-02 15:06

    Let me suggest using suffix trees (built with Ukkonen's online algorithm), which seem suitable for searching for common substrings in two texts. You can find more information on Wikipedia or in specialized sources. The task is:

    Find all z occurrences of the patterns P1..Pn of total length m
    as substrings in O(m + z) time.
    

    So you see, there exists a very cool solution. I hope this works for you. It is actually more suitable for repeated scans than for a single scan.
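
To make the task statement concrete, here is a naive stand-in that lists every occurrence of each pattern using plain scanning. It is not a suffix tree and does not reach the O(m + z) bound; it only shows the output such a structure would provide. The name `find_occurrences` is an assumption for illustration.

```python
from typing import Dict, List


def find_occurrences(text: str, patterns: List[str]) -> Dict[str, List[int]]:
    """Map each pattern to the start indices of all its occurrences in text."""
    found: Dict[str, List[int]] = {}
    for p in patterns:
        starts: List[int] = []
        # Empty patterns are treated as having no occurrences (an assumption).
        pos = text.find(p) if p else -1
        while pos != -1:
            starts.append(pos)
            pos = text.find(p, pos + 1)  # step by one to allow overlaps
        found[p] = starts
    return found


if __name__ == "__main__":
    print(find_occurrences("abcab", ["ab", "c", "x"]))
```

The occurrence lists tell you which array entries can start at which position, which is the information a position-by-position check of the target string needs.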
