Checking if a list of strings can be chained

庸人自扰 2021-01-31 23:45

Question

Implement a function bool chainable(vector v), which takes a set of strings as parameters and returns true if they

8 Answers
  • 2021-01-31 23:53

    Isn't that similar to the infamous traveling salesman problem?

    If you have n strings, you can construct a graph out of them, where each node corresponds to one string. You construct the edges the following way:

    • If string (resp. node) a and b are chainable, you introduce an edge a -> b with weight 1.
    • For all unchainable strings (resp. nodes) a and b, you introduce an edge a -> b with weight n.

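    Then all your strings are chainable (without repetition) if and only if the optimal TSP route in the graph has weight less than 2n. A route that cheap contains at most one edge of weight n, so removing its most expensive edge leaves a chain through all n strings.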

    Note that your problem is no harder than TSP, since string chaining can always be transformed into a TSP instance, but not necessarily the other way around.
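
    The graph construction above can be sketched as a weight matrix. This is only an illustration: the helper names `canFollow` and `buildTspWeights` are made up here, and the chaining rule (the last character of one word equals the first character of the next) is an assumption based on the examples in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption: word a can be followed by word b when the last character of a
// equals the first character of b (both words non-empty).
bool canFollow(const std::string& a, const std::string& b) {
    return !a.empty() && !b.empty() && a.back() == b.front();
}

// Builds the n x n weight matrix: weight 1 for a chainable ordered pair
// (a, b), weight n otherwise. The diagonal is left at 0 (no self-edges).
std::vector<std::vector<int>> buildTspWeights(const std::vector<std::string>& words) {
    const int n = static_cast<int>(words.size());
    std::vector<std::vector<int>> w(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i != j) {
                w[i][j] = canFollow(words[i], words[j]) ? 1 : n;
            }
        }
    }
    return w;
}
```

    Any TSP solver could then be run on this matrix and its result compared against 2n; the solver itself is left out of the sketch.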

  • 2021-01-31 23:56

    Here's a case where your algorithm doesn't work:

    ship
    pass
    lion
    nail
    

    Your start and end lists are both s, p, l, n, but you can't make a single chain (you get two chains - ship->pass and lion->nail).

    A recursive search is probably going to be best - pick a starting word (1), and, for each word that can follow it (2), try to solve the smaller problem of creating a chain starting with (2) that contains all of the words except (1).
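
    A minimal sketch of that recursive search, assuming each word must be used exactly once and that a word can follow another when its first character equals the other's last character. The names `extendChain` and `chainableBySearch` are illustrative, and treating an empty list as chainable is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Tries to place the remaining unused words after a word ending in lastChar.
bool extendChain(const std::vector<std::string>& words, std::vector<bool>& used,
                 char lastChar, std::size_t placed) {
    if (placed == words.size()) return true;  // every word is in the chain
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (used[i] || words[i].front() != lastChar) continue;
        used[i] = true;
        if (extendChain(words, used, words[i].back(), placed + 1)) return true;
        used[i] = false;  // backtrack and try another successor
    }
    return false;
}

bool chainableBySearch(const std::vector<std::string>& words) {
    if (words.empty()) return true;  // assumption: nothing to chain
    for (const auto& w : words) {
        if (w.empty()) return false;  // empty strings have no first/last letter
    }
    std::vector<bool> used(words.size(), false);
    // Pick each word in turn as the start of the chain.
    for (std::size_t i = 0; i < words.size(); ++i) {
        used[i] = true;
        if (extendChain(words, used, words[i].back(), 1)) return true;
        used[i] = false;
    }
    return false;
}
```

    On the four words above this returns false, because each two-word chain loops back onto a word that is already used. The worst-case running time is exponential in the number of words.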

  • 2021-01-31 23:56

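    This can be solved by a reduction to the Eulerian path problem: consider a digraph G whose vertex set N(G) is the alphabet Σ and whose edge set E(G) contains one edge a -> e for each word of the form aWe, that is, a word with first letter a and last letter e.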

  • 2021-02-01 00:00

    As phimuemue pointed out, this is a graph problem. You have a set of strings (vertices), with (directed) edges. Clearly, the graph must be connected to be chainable -- this is easy to check. Unfortunately, the rules beyond this are a little unclear:

    If strings may be used more than once, but links can't, then the problem is to find an Eulerian path, which can be done efficiently. An Eulerian path uses each edge once, but may use vertices more than once.

    // this can form a valid Eulerian path
    yard
    dog
    god
    glitter
    
    yard -> dog -> god -> dog -> glitter
    

    If the strings may not be used more than once, then the problem is to find a Hamiltonian path. Since the Hamiltonian path problem is NP-complete, no exact efficient solution is known. Of course, for small n, efficiency isn't really important and a brute force solution will work fine.
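
    For small n, the brute force mentioned above might look like the following sketch. It tries every ordering with `std::next_permutation` and assumes the chaining rule is "last character of one word equals first character of the next"; the function name `chainableByPermutations` is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Takes the words by value because the orderings are generated in place.
bool chainableByPermutations(std::vector<std::string> words) {
    for (const auto& w : words) {
        if (w.empty()) return false;  // empty strings have no first/last letter
    }
    // next_permutation enumerates all orderings only from sorted order.
    std::sort(words.begin(), words.end());
    do {
        bool ok = true;
        for (std::size_t i = 0; i + 1 < words.size(); ++i) {
            if (words[i].back() != words[i + 1].front()) {
                ok = false;
                break;
            }
        }
        if (ok) return true;  // this ordering visits every word exactly once
    } while (std::next_permutation(words.begin(), words.end()));
    return false;
}
```

    This checks up to n! orderings, so it is only practical for a handful of words.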

    However, things are not quite so simple, because the set of graphs that can occur as inputs to this problem are limited. For example, the following is a valid directed graph (in dot notation) (*).

    digraph G {
        alpha -> beta;
        beta -> gamma;
        gamma -> beta;
        gamma -> delta;
    }
    

    However, this graph cannot be constructed from strings using the rules of this puzzle: Since alpha and gamma both connect to beta, they must end with the same character (let's assume they end with 'x'), but gamma also connects to delta, so delta must also start with 'x'. But delta cannot start with 'x', because if it did, then there would be an edge alpha -> delta, which is not in the original graph.

    Therefore, this is not quite the same as the Hamiltonian path problem, because the set of inputs is more restricted. It is possible that an efficient algorithm exists to solve the string chaining problem even if no efficient algorithm exists to solve the Hamiltonian path problem.

    But... I don't know what that algorithm would be. Maybe someone else will come up with a real solution, but in the meantime I hope someone finds this answer interesting.

    (*) It also happens to have a Hamiltonian path: alpha -> beta -> gamma -> delta, but that's irrelevant for what follows.

  • 2021-02-01 00:03

    Check "is chainable" and "is cyclic" separately.

    For the list to be cyclic, it must be chainable first, so you could do something like this:

    // only test for a cycle once a chain is known to exist
    if (IsChainable())
    {
      if (IsCyclic()) { ... }
    }
    

    Note: this applies if the check for "cyclic" only looks at the first and last elements of the chain.

  • 2021-02-01 00:08

    The problem is to check whether an Eulerian path exists in the directed graph whose vertices are the letters occurring as the first or last letter of at least one of the supplied words, and whose edges are the supplied words (each word is an edge from its first letter to its last).

    Some necessary conditions for the existence of Eulerian paths in such graphs:

    1. The graph has to be connected (weakly, that is, when edge directions are ignored).
    2. All vertices with at most two exceptions have equally many incoming and outgoing edges. If exceptional vertices exist, there are exactly two, one of them has one more outgoing edge than incoming, the other has one more incoming edge than outgoing.

    The necessity is easily seen: if a graph has an Eulerian path, any such path meets all vertices except the isolated ones (those with neither outgoing nor incoming edges). By construction, there are no isolated vertices in the graph under consideration here. In an Eulerian path, every visit to a vertex other than the start and end uses one incoming edge and one outgoing edge, so each vertex, with the possible exception of the starting and ending vertices, has equally many incoming and outgoing edges. The starting vertex has one more outgoing edge than incoming, and the ending vertex one more incoming edge than outgoing, unless the Eulerian path is a cycle, in which case all vertices have equally many incoming and outgoing edges.

    Now the important thing is that these conditions are also sufficient. One can prove that by induction on the number of edges.

    That allows for a very efficient check:

    • record all edges and vertices as obtained from the words
    • use a union find structure/algorithm to count the connected components of the graph
    • record indegree - outdegree for all vertices

    If the number of components is greater than 1, or there is at least one vertex with |indegree - outdegree| > 1, or there are more than two vertices with indegree != outdegree, the words are not chainable; otherwise they are.
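
    The check described above could be sketched as follows. It is an illustration, not a reference implementation: the names `LetterUnionFind` and `hasEulerianPath` are made up, vertices are taken to be single bytes (so at most 256 of them), and an empty word list is treated as chainable.

```cpp
#include <array>
#include <cassert>
#include <cstdlib>
#include <numeric>
#include <string>
#include <vector>

// Union-find over the 256 possible byte values used as vertices.
struct LetterUnionFind {
    std::array<int, 256> parent;
    LetterUnionFind() { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    void unite(int a, int b) { parent[find(a)] = find(b); }
};

bool hasEulerianPath(const std::vector<std::string>& words) {
    LetterUnionFind uf;
    std::array<int, 256> balance{};  // indegree - outdegree per vertex
    std::array<bool, 256> seen{};    // vertices that occur in some word
    for (const auto& w : words) {
        if (w.empty()) return false;  // empty strings define no edge
        const int from = static_cast<unsigned char>(w.front());
        const int to = static_cast<unsigned char>(w.back());
        --balance[from];
        ++balance[to];
        seen[from] = true;
        seen[to] = true;
        uf.unite(from, to);  // connectivity ignores edge direction
    }
    int components = 0;
    int unbalanced = 0;
    for (int v = 0; v < 256; ++v) {
        if (!seen[v]) continue;
        if (uf.find(v) == v) ++components;
        if (std::abs(balance[v]) > 1) return false;
        if (balance[v] != 0) ++unbalanced;
    }
    return components <= 1 && unbalanced <= 2;
}
```

    The whole check is a single pass over the words plus a pass over the 256 possible vertices, so it runs in time essentially linear in the number of words.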
