Find first common child of two commits

旧时难觅i 2021-01-12 07:29
           :
           A
T         / \
i        B   C
m        :   :
e        D   E
          \ /
|          F
V          :

git merge-base

3 answers
  •  悲&欢浪女
    2021-01-12 07:37

    There isn't necessarily a unique answer to this problem, so you have to decide on a few constraints and/or heuristics, or accept the possibility of more than one "downstream" merge. The heart of the problem is the same as the problem of multiple merge base candidates: use git merge-base --all to list them all, otherwise Git just picks whichever one pops up first in its algorithm. We can do the same, or find all best merge candidates.
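
    As a rough illustration of the multiple-candidate case, the sketch below lists every best merge base of two tips together with its subject line. The function name merge_base_candidates is made up for this example; only git merge-base --all and git log come from Git itself.

    ```shell
    # Hypothetical helper: merge_base_candidates <commit-a> <commit-b>
    # Prints "<short hash> <subject>" for every best common ancestor,
    # instead of the single one a plain `git merge-base` would pick.
    merge_base_candidates() {
        git merge-base --all "$1" "$2" |
        while read -r base; do
            git log -1 --format='%h %s' "$base"
        done
    }
    ```

    In a criss-cross history, where each branch has merged an earlier tip of the other, merge_base_candidates branch1 branch2 should print more than one line.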

    You've drawn what I usually prefer to render sideways as, e.g.:

      B--...--D
     /         \
    A           F--G--H   <-- branch1
     \         /
      C--...--E   <-- branch2
    

    but we might have this:

      B--C---D--E--...   <-- branch1
     /    \ /
    A      X
     \    / \
      F--G---H--I--...   <-- branch2
    

    In this case both merges D and H are equally good candidates for "the place where the branches re-merge" if you allow both branch1 and branch2 to be considered. Even if you don't, if branch2 merges back into branch1 later:

      B--C---D--E---J--...   <-- branch1
     /    \ /      /
    A      X      /
     \    / \    /
      F--G---H--I--...   <-- branch2
    

    then just starting from (or ending at) branch1, both D and H are equally good candidates.

    In any case, what we need here is to enumerate commits that end in one or all of the branches you want to consider. To do that, we can use, e.g.:

    git rev-list --ancestry-path ^B ^E branch1 branch2
    

    This finds commits that are ancestors of branch1 or branch2, and are also descendants of commit B or of commit E.

    To really get the right answer, we want to add --children. That way we'll get the hash ID of each commit, along with the children of that commit that go in this same direction. Git achieves the --children by reversing the backwards connections from the children to the parents as it traverses the links, which is good enough; but we won't see commits B or E. This is kind of a problem. To get them shown, we can add --boundary. This is not ideal: --boundary sometimes includes some commits we don't want. Fortunately, they're all marked with - so we can exclude extra boundary commits by knocking out ones that aren't the commits we care about.

    I'm not going to show any of that, but if you did that, you would now have a list, one entry per line, of each node (vertex) and the edges that connect it to its children. You can now ask: what is the LCA of the DAG formed by these (V,E) sets?
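
    That listing step can be sketched roughly as follows. The function name child_edges and its argument order are assumptions made for this example; it keeps the boundary lines for the two starting commits (stripping the leading -) and drops any other boundary commit.

    ```shell
    # Hypothetical helper: child_edges <commit1> <commit2> <tip>...
    # Prints one line per node: "<commit> <child> <child>...".
    child_edges() {
        # Resolve the two starting commits to full hashes so they can be
        # compared with the boundary lines that rev-list prints.
        ce_c1=$(git rev-parse --verify "$1^{commit}") || return 1
        ce_c2=$(git rev-parse --verify "$2^{commit}") || return 1
        shift 2
        git rev-list --ancestry-path --children --boundary \
            "^$ce_c1" "^$ce_c2" "$@" |
        while read -r node children; do
            case $node in
            "-$ce_c1" | "-$ce_c2")
                # A starting commit: keep it, minus the boundary marker.
                printf '%s %s\n' "${node#-}" "$children" ;;
            -*)
                # Some other boundary commit: not part of the graph we want.
                ;;
            *)
                printf '%s %s\n' "$node" "$children" ;;
            esac
        done
    }
    ```

    For example, child_edges B E branch1 branch2 would produce the node-and-children list described above, ready to feed into whatever LCA routine you write.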

    It would be nice if we could just use Git's LCA algorithm, but Git does not have a way to invoke it on arbitrary graphs: we can only invoke it on commits, and the actual commits have parents, not children. So you will have to write your own. See the question "Algorithm to find lowest common ancestor in directed acyclic graph?" (which, unfortunately, has no accepted answer). This algorithm looks correct at first blush; it has one of the two standard definitions for LCA in a graph.

    If we're willing to settle for a not-nearly-as-good answer, though, we can get something that's probably sufficient in most cases by adding --topo-order (to make sure parents come out after all their children) and --merges (to omit everything that's not a merge commit). This will get a list of all merges.

    I have made here a test repository with a simple case:

    $ git log --all --decorate --oneline --graph
    * 91fcef6 (HEAD -> master) J
    * d1e5905 I
    *   5bf18a0 merge
    |\  
    | * 49b2ba7 (sidebr) D
    | * 725e5ea C
    | * 36b830d (tag: B) B
    * | 198a982 (tag: G) G
    * | 216bc01 F
    * | e905e59 E
    |/  
    * 5df9428 initial
    

    So I can now name commits B and G using B and G, and the branch I want for a "move in this direction" is just master. So:

    $ git rev-list --topo-order --merges --ancestry-path ^B ^G master
    5bf18a0797dfd78107928a9a4095f357cfabe914
    

    The last line here is the merge that's "closest" to the two commits. In this case, that's also the only line, and that's the merge we want.

    The flaw here is clear enough once we draw it. Suppose I had a more complex graph, such as:

          I--J
         /    \
        H      M--N
       / \    /    \
      /   K--L      \
     /               \
    A                 P--Q  <-- master
     \               /
      \   C--D      /
       \ /    \    /
        B      G--O
         \    /
          E--F
    

    If I now run git rev-list --topo-order --merges --ancestry-path ^B ^H master, I'll enumerate commit P, then both G and M in some order. So the last line will either be commit G or commit M, and while both of these are merges, they don't meet the right criterion: they don't merge B and H. Only commit P does that.

    Hence, to check whether you have a right answer—without handling the multiple LCA issue—you should take each of the output lines from this git rev-list command, probably in reverse order (consider adding --reverse), and see if both commits are ancestors of each. "Internal" merges like G and M will have only one commit as an ancestor. To do the is-ancestor test, use git merge-base --is-ancestor:

    if git merge-base --is-ancestor "$commit1" "$mergecommit" &&
           git merge-base --is-ancestor "$commit2" "$mergecommit"; then
        : # both commits are ancestors: we've found a correct candidate
    else
        : # an "internal" merge: move on to another candidate
    fi
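
    Putting the pieces together, a minimal sketch of the whole check might look like the following. The function name first_remerge and its three arguments are assumptions for this example; it ignores the multiple-LCA issue and simply reports the first merge, walking from the oldest candidate, that has both commits as ancestors.

    ```shell
    # Hypothetical helper: first_remerge <commit1> <commit2> <tip>
    # Prints the hash of the earliest merge (in reverse topological order)
    # that descends from both commits, or nothing if there is none.
    first_remerge() {
        git rev-list --topo-order --merges --ancestry-path --reverse \
            "^$1" "^$2" "$3" |
        while read -r candidate; do
            if git merge-base --is-ancestor "$1" "$candidate" &&
               git merge-base --is-ancestor "$2" "$candidate"; then
                printf '%s\n' "$candidate"
                break
            fi
        done
    }
    ```

    On the more complex graph above, first_remerge B H master should skip the "internal" merges G and M and print the hash of commit P. An empty result means no merge on the way to the tip joins the two commits.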
    
