`git log --follow --graph` skips commits

北恋 2021-01-06 08:37

Setup

git version 2.11.0.windows.1

Here is a bash snippet to reproduce my test repository:

git init

# Create a file
echo         


        
1 answer
  • 2021-01-06 09:07

    You're being bitten by git log's cheap and sleazy implementation of --follow, plus the fact that git log often doesn't even look inside merges.

    Fundamentally, --follow works internally by changing the name of the file it's looking for. It does not remember both names, so when the linearization algorithm (a breadth-first search driven by a priority queue) goes down the other leg of the merge, it has the wrong name. You are correct that the order of commit visits matters: Git switches the name it is searching for at the moment it deduces a rename.

    In this graph (it looks like you ran the script several times because the hashes changed; the hashes here are from the first sample):

    *   06b5bb7 Merge branch 'feature'
    |\
    | * 07ccfb6 Change
    * | 448ad99 Move
    |/
    * 31eae74 First commit
    

    git log will visit commit 06b5bb7, and put 448ad99 and 07ccfb6 on the queue. With the topological order that --graph implies, it will next visit 448ad99, examine the diff, and see the rename. It is now looking for a.txt instead of b.txt. Commit 448ad99 is selected, so git log will print it to the output; and Git adds 31eae74 to the visit queue. Next, Git visits 07ccfb6, but it is now looking for a.txt so this commit is not selected. Git adds 31eae74 to the visit queue (but it's already there so this is a no-op). Finally, Git visits 31eae74; comparing that commit's tree to the empty tree, Git finds an added a.txt so this commit gets selected.

    Note that had Git visited 07ccfb6 before 448ad99, it would have selected both, because at the start it is looking for b.txt.
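
    The order dependence can be probed directly. The sketch below builds a throwaway repository with the same shape as the graph above and runs the same --follow query under two traversal orders. The setup commands, file contents, fixed timestamps, and the demo identity are assumptions made for illustration (the script in the question is cut off), so treat the printed lists as something to compare rather than as a reference result.

```shell
#!/bin/sh
# Assumed reconstruction of the test repository; not the question's script.
set -e
cd "$(mktemp -d)"
git init -q .
# Demo identity so the commits do not depend on local configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# at <unix-time> <git arguments...>: run one Git command with fixed dates.
at() {
    stamp="$1 +0000"; shift
    GIT_AUTHOR_DATE="$stamp" GIT_COMMITTER_DATE="$stamp" git "$@"
}

echo Hello > a.txt
git add a.txt
at 1500000000 commit -q -m 'First commit'

git checkout -q -b feature      # side branch: edit the file under its old name
echo Bye >> a.txt
git add a.txt
at 1500000100 commit -q -m 'Change'

git checkout -q -               # back to the initial branch, whatever it is called
git mv a.txt b.txt
at 1500000200 commit -q -m 'Move'
at 1500000300 merge -q --no-edit -m "Merge branch 'feature'" feature

# Same query, two visit orders.
all=$(git log --format=%s)
topo=$(git log --topo-order --follow --format=%s -- b.txt)
bydate=$(git log --follow --format=%s -- b.txt)

printf 'every commit:\n%s\n\n' "$all"
printf -- '--follow with --topo-order:\n%s\n\n' "$topo"
printf -- '--follow with the default order:\n%s\n' "$bydate"
```

    If the two --follow lists differ, the only thing that changed between the runs is the order in which git log reached the two sides of the merge.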

    The -m flag works by "splitting" a merge into two separate internal "virtual commits" (with the same tree, but with the (from ...) added to their "names" so as to be able to tell which virtual commit resulted from which parent). This has the side effect of retaining both of the split merges and looking at their diffs, since the result of splitting this merge is two ordinary non-merge commits. So now (using your new repository, with its different hashes, from the second sample) Git visits commit 36c80a8 (from 1a07e48), diffs 1a07e48 vs 36c80a8, sees a change to b.txt and selects the commit, and puts 1a07e48 on the visit queue. Next, it visits commit 36c80a8 (from 05116f1), diffs 05116f1 vs 36c80a8, and puts 05116f1 on the visit queue. The rest is fairly obvious from here.
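
    To watch the split happen, the merge can be listed once per parent. This is a minimal sketch on an assumed reconstruction of the repository (the setup commands, file contents, and demo identity are illustrative, not taken from the question); the only feature it relies on is git log's -m option.

```shell
#!/bin/sh
# Assumed reconstruction of the test repository; not the question's script.
set -e
cd "$(mktemp -d)"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo Hello > a.txt
git add a.txt
git commit -q -m 'First commit'

git checkout -q -b feature      # side branch: edit the file under its old name
echo Bye >> a.txt
git add a.txt
git commit -q -m 'Change'

git checkout -q -               # back to the initial branch
git mv a.txt b.txt
git commit -q -m 'Move'
git merge -q --no-edit -m "Merge branch 'feature'" feature

# With -m the merge is diffed against each parent separately,
# so it can be selected even though plain --follow passes over it.
git log -m --oneline --follow -- b.txt

# Subjects only, kept in a variable for scripting.
split=$(git log -m --follow --format=%s -- b.txt)
```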

    "How can I display cleanly all of the commits that changed a file, following through renames?"

    The answer for Git is that you can't, at least not using what is built into Git.

    You can (sometimes) get a little closer by adding --cc or -c to your git log command. This makes git log look inside merge commits and produce what Git calls a combined diff. Even this doesn't necessarily work, though, because hidden away in a different part of the documentation is this key sentence:

    Note that combined diff lists only files which were modified from all parents.

    Here is what I get with --cc added (note, the ... is literally there, in git log's output):

    $ git log --graph --oneline --follow --cc -- b.txt
    *   e5a17d7 (HEAD -> master) Merge branch 'feature'
    |\  
    | | 
    ... 
    * | 52e75c9 Move
    |/  
    |   diff --git a/a.txt b/b.txt
    |   similarity index 100%
    |   rename from a.txt
    |   rename to b.txt
    * 7590cfd First commit
      diff --git a/a.txt b/a.txt
      new file mode 100644
      index 0000000..e965047
      --- /dev/null
      +++ b/a.txt
      @@ -0,0 +1 @@
      +Hello
    

    Fundamentally, though, you'd need git log to be much more aware of file renames at merge commits: it would have to look for the old name down any leg that still uses the old name, and for the new name down any leg that uses the new one. This would require git log to use (most of) the -m option internally on each merge, i.e., split each merge into N separate diffs, one per parent, so as to find which legs have what renames, and then keep a list of which name to use down which branches of merges. But when the forks come back together, i.e., when the multiple legs of the merge (which becomes a fork in our reverse direction) rejoin, it's not clear which name is the correct one to use!
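
    One partial way around this, when every name the file has had is already known, is to skip --follow and hand git log all of the names as ordinary pathspecs. The sketch below does that on an assumed reconstruction of the test repository (the setup commands, file contents, and demo identity are illustrative). It is not rename following: Git discovers nothing here, and a name left off the list is simply not searched for.

```shell
#!/bin/sh
# Assumed reconstruction of the test repository; not the question's script.
set -e
cd "$(mktemp -d)"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo Hello > a.txt
git add a.txt
git commit -q -m 'First commit'

git checkout -q -b feature      # side branch: edit the file under its old name
echo Bye >> a.txt
git add a.txt
git commit -q -m 'Change'

git checkout -q -               # back to the initial branch
git mv a.txt b.txt
git commit -q -m 'Move'
git merge -q --no-edit -m "Merge branch 'feature'" feature

# No --follow: both names are plain pathspecs, so each leg of the
# merge is matched against whichever name it happens to use.
git log --graph --oneline -- a.txt b.txt

# Subjects only, kept in a variable for scripting.
named=$(git log --format=%s -- a.txt b.txt)
```

    The "--" matters here because a.txt no longer exists in the working tree, so Git needs to be told that it is a path and not a revision.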
