What happens to orphaned commits?

失恋的感觉 2020-12-31 06:04

I have a repo with four commits:

$ git log --oneline --decorate
6c35831 (HEAD, master) C4
974073b C3
e27b22c C2
9f2d694 C1

I reset --

3 Answers
  • 2020-12-31 06:32

    Short answer: Commits C3 and C4 will remain in the Git object database until they are garbage collected.

    Long answer: Garbage collection is triggered automatically by various Git porcelain commands, or you can run it explicitly. Many scenarios can trigger an automatic garbage collection; take a look at the gc.* configuration settings to get an idea. You can garbage collect explicitly with the git gc builtin command. Let's look at an example to see what happens.
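
    As a rough sketch of how to inspect those settings, the commands below list whatever gc.* values are configured and read two of them. The fallback values (6700 for gc.auto, 2.weeks.ago for gc.pruneExpire) are Git's documented defaults as far as I know; the temporary repository and the variable names are assumptions made for this illustration.

```shell
# Throwaway repository so the sketch does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# List any gc.* settings that are explicitly configured (may print nothing).
git config --get-regexp '^gc\.' || true

# Read individual settings, falling back to the documented defaults when unset.
gc_auto=$(git config --get gc.auto || echo 6700)
prune_expire=$(git config --get gc.pruneExpire || echo 2.weeks.ago)
echo "gc.auto=$gc_auto gc.pruneExpire=$prune_expire"

# Let Git decide whether a collection is needed according to its thresholds.
git gc --auto
```

    Here git gc --auto only does work when the thresholds are exceeded, which is the same check the porcelain commands perform.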

    First, let's set up our environment (I am using Linux; make changes as necessary for your environment) so we hopefully get the same object hashes in different Git repositories.

    export GIT_AUTHOR_NAME='Wile E. Coyote'
    export GIT_AUTHOR_EMAIL=coyote@acme.com
    export GIT_AUTHOR_DATE=2015-01-01T12:00:00
    export GIT_COMMITTER_NAME='Roadrunner'
    export GIT_COMMITTER_EMAIL=roadrunner@acme.com
    export GIT_COMMITTER_DATE=2015-01-01T12:00:00
    

    Since commit object hashes are generated using this information, if we use the same author and committer values, we should all now get the same hashes.
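
    To check that claim on a single machine, here is a minimal sketch that creates the same empty commit in two fresh repositories and compares the hashes. The helper name make_c1 and the temporary directories are made up for this illustration.

```shell
export GIT_AUTHOR_NAME='Wile E. Coyote'
export GIT_AUTHOR_EMAIL=coyote@acme.com
export GIT_AUTHOR_DATE=2015-01-01T12:00:00
export GIT_COMMITTER_NAME='Roadrunner'
export GIT_COMMITTER_EMAIL=roadrunner@acme.com
export GIT_COMMITTER_DATE=2015-01-01T12:00:00

# Hypothetical helper: create an empty commit C1 in a new repository
# and print its full hash.
make_c1 () {
    dir=$(mktemp -d)
    git -C "$dir" init -q
    git -C "$dir" commit -q --allow-empty -m C1
    git -C "$dir" rev-parse HEAD
}

hash_a=$(make_c1)
hash_b=$(make_c1)
echo "$hash_a"
echo "$hash_b"
[ "$hash_a" = "$hash_b" ] && echo 'same hash'
```

    One caveat, as far as I can tell: the dates above carry no time zone, so Git interprets them in the local zone, and machines in different time zones may still end up with different hashes.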

    Now let's define a shell function that logs object information using git log, git reflog, git count-objects, git rev-list and git fsck.

    function git_log_objects () {
        echo 'Log ...'
        git log --oneline --decorate
        echo 'Reflog ...'
        git reflog show --all
        echo 'Count ...'
        git count-objects -v
        echo 'Hashes ...'
        # See: https://stackoverflow.com/a/7350019/649852
        {
            git rev-list --objects --all --reflog
            git rev-list --objects -g --no-walk --all
            git rev-list --objects --no-walk $(
                git fsck --unreachable 2>/dev/null \
                    | grep '^unreachable commit' \
                    | cut -d' ' -f3
            )
        } | sort | uniq
    }
    

    Now let's initialize a Git repository.

    git --version
    git init
    git_log_objects
    

    Which, for me, outputs:

    git version 2.4.0
    Initialized empty Git repository in /tmp/test/.git/
    Log ...
    fatal: bad default revision 'HEAD'
    Reflog ...
    fatal: bad default revision 'HEAD'
    Count ...
    count: 0
    size: 0
    in-pack: 0
    packs: 0
    size-pack: 0
    prune-packable: 0
    garbage: 0
    size-garbage: 0
    Hashes ...
    

    As expected, we have an initialized repository with no objects in it. Let's make some commits and take a look at the objects.

    git commit --allow-empty -m C1
    git commit --allow-empty -m C2
    git tag T1
    git commit --allow-empty -m C3
    git commit --allow-empty -m C4
    git commit --allow-empty -m C5
    git_log_objects
    

    Which gives me the following output:

    [master (root-commit) c11e156] C1
     Author: Wile E. Coyote <coyote@acme.com>
    [master 10bfa58] C2
     Author: Wile E. Coyote <coyote@acme.com>
    [master 8aa22b5] C3
     Author: Wile E. Coyote <coyote@acme.com>
    [master 1abb34f] C4
     Author: Wile E. Coyote <coyote@acme.com>
    [master d1efc10] C5
     Author: Wile E. Coyote <coyote@acme.com>
    Log ...
    d1efc10 (HEAD -> master) C5
    1abb34f C4
    8aa22b5 C3
    10bfa58 (tag: T1) C2
    c11e156 C1
    Reflog ...
    d1efc10 refs/heads/master@{0}: commit: C5
    1abb34f refs/heads/master@{1}: commit: C4
    8aa22b5 refs/heads/master@{2}: commit: C3
    10bfa58 refs/heads/master@{3}: commit: C2
    c11e156 refs/heads/master@{4}: commit (initial): C1
    Count ...
    count: 6
    size: 24
    in-pack: 0
    packs: 0
    size-pack: 0
    prune-packable: 0
    garbage: 0
    size-garbage: 0
    Hashes ...
    10bfa58a7bcbadfc6c9af616da89e4139c15fbb9
    1abb34f82523039920fc629a68d3f82bc79acbd0
    4b825dc642cb6eb9a060e54bf8d69288fbee4904 
    8aa22b5f0fed338dd13c16537c1c54b3496e3224
    c11e1562835fe1e9c25bf293279bff0cf778b6e0
    d1efc109115b00bac9d4e3d374a05a3df9754551
    

    Now we have six objects in the repository: five commits and one empty tree. We can see Git has branch, tag and/or reflog references to all five commit objects. As long as Git references an object, that object will not be garbage collected. Explicitly running a garbage collection will not remove any objects from the repository. (I'll leave verifying this as an exercise for you to complete.)
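
    One possible way to do that verification is sketched below: count every object in the object database before and after an explicit collection. It assumes a Git new enough to support git cat-file --batch-all-objects, and it runs in its own throwaway repository rather than the one used in this walkthrough.

```shell
export GIT_AUTHOR_NAME='Wile E. Coyote' GIT_AUTHOR_EMAIL=coyote@acme.com
export GIT_COMMITTER_NAME='Roadrunner' GIT_COMMITTER_EMAIL=roadrunner@acme.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for msg in C1 C2 C3 C4 C5; do
    git commit -q --allow-empty -m "$msg"
done

# Count all objects (loose and packed), whether reachable or not.
count_objects () {
    git cat-file --batch-check --batch-all-objects | wc -l | tr -d ' '
}

before=$(count_objects)
git gc -q --aggressive --prune=now
after=$(count_objects)
echo "objects before gc: $before, after gc: $after"
```

    Both counts should be the same, because every commit is still reachable from the branch; the collection only moves the objects into a pack.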

    Now let's remove Git references to the C3, C4 and C5 commits.

    git reset --soft T1
    git reflog expire --expire=all --all
    git_log_objects
    

    Which outputs:

    Log ...
    10bfa58 (HEAD -> master, tag: T1) C2
    c11e156 C1
    Reflog ...
    Count ...
    count: 6
    size: 24
    in-pack: 0
    packs: 0
    size-pack: 0
    prune-packable: 0
    garbage: 0
    size-garbage: 0
    Hashes ...
    10bfa58a7bcbadfc6c9af616da89e4139c15fbb9
    1abb34f82523039920fc629a68d3f82bc79acbd0
    4b825dc642cb6eb9a060e54bf8d69288fbee4904 
    8aa22b5f0fed338dd13c16537c1c54b3496e3224
    c11e1562835fe1e9c25bf293279bff0cf778b6e0
    d1efc109115b00bac9d4e3d374a05a3df9754551
    

    Now we see only two commits are being referenced by Git. However, all six objects are still in the repository. They will remain in the repository until they are automatically or explicitly garbage collected. You could even, for example, revive an unreferenced commit with git cherry-pick or look at it with git show. For now though, let's explicitly garbage collect the unreferenced objects and see what Git does behind the scenes.

    GIT_TRACE=1 git gc --aggressive --prune=now
    

    This will output a bit of information.

    11:03:03.123194 git.c:348               trace: built-in: git 'gc' '--aggressive' '--prune=now'
    11:03:03.123625 run-command.c:347       trace: run_command: 'pack-refs' '--all' '--prune'
    11:03:03.124038 exec_cmd.c:129          trace: exec: 'git' 'pack-refs' '--all' '--prune'
    11:03:03.126895 git.c:348               trace: built-in: git 'pack-refs' '--all' '--prune'
    11:03:03.128298 run-command.c:347       trace: run_command: 'reflog' 'expire' '--all'
    11:03:03.128635 exec_cmd.c:129          trace: exec: 'git' 'reflog' 'expire' '--all'
    11:03:03.131322 git.c:348               trace: built-in: git 'reflog' 'expire' '--all'
    11:03:03.133179 run-command.c:347       trace: run_command: 'repack' '-d' '-l' '-f' '--depth=250' '--window=250' '-a'
    11:03:03.133522 exec_cmd.c:129          trace: exec: 'git' 'repack' '-d' '-l' '-f' '--depth=250' '--window=250' '-a'
    11:03:03.136915 git.c:348               trace: built-in: git 'repack' '-d' '-l' '-f' '--depth=250' '--window=250' '-a'
    11:03:03.137179 run-command.c:347       trace: run_command: 'pack-objects' '--keep-true-parents' '--honor-pack-keep' '--non-empty' '--all' '--reflog' '--indexed-objects' '--window=250' '--depth=250' '--no-reuse-delta' '--local' '--delta-base-offset' '.git/objects/pack/.tmp-8973-pack'
    11:03:03.137686 exec_cmd.c:129          trace: exec: 'git' 'pack-objects' '--keep-true-parents' '--honor-pack-keep' '--non-empty' '--all' '--reflog' '--indexed-objects' '--window=250' '--depth=250' '--no-reuse-delta' '--local' '--delta-base-offset' '.git/objects/pack/.tmp-8973-pack'
    11:03:03.140367 git.c:348               trace: built-in: git 'pack-objects' '--keep-true-parents' '--honor-pack-keep' '--non-empty' '--all' '--reflog' '--indexed-objects' '--window=250' '--depth=250' '--no-reuse-delta' '--local' '--delta-base-offset' '.git/objects/pack/.tmp-8973-pack'
    Counting objects: 3, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (2/2), done.
    Writing objects: 100% (3/3), done.
    Total 3 (delta 1), reused 0 (delta 0)
    11:03:03.153843 run-command.c:347       trace: run_command: 'prune' '--expire' 'now'
    11:03:03.154255 exec_cmd.c:129          trace: exec: 'git' 'prune' '--expire' 'now'
    11:03:03.156744 git.c:348               trace: built-in: git 'prune' '--expire' 'now'
    11:03:03.159210 run-command.c:347       trace: run_command: 'rerere' 'gc'
    11:03:03.159527 exec_cmd.c:129          trace: exec: 'git' 'rerere' 'gc'
    11:03:03.161807 git.c:348               trace: built-in: git 'rerere' 'gc'
    

    And finally, let's look at the objects.

    git_log_objects
    

    Which outputs:

    Log ...
    10bfa58 (HEAD -> master, tag: T1) C2
    c11e156 C1
    Reflog ...
    Count ...
    count: 0
    size: 0
    in-pack: 3
    packs: 1
    size-pack: 1
    prune-packable: 0
    garbage: 0
    size-garbage: 0
    Hashes ...
    10bfa58a7bcbadfc6c9af616da89e4139c15fbb9
    4b825dc642cb6eb9a060e54bf8d69288fbee4904 
    c11e1562835fe1e9c25bf293279bff0cf778b6e0
    

    Now we see we only have three objects: the two commits and one empty tree.
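
    The whole cycle can also be condensed into a small check on a single commit. This is only a sketch in a fresh throwaway repository; the variable names are illustrative, and git cat-file -e is used simply to ask whether an object exists.

```shell
export GIT_AUTHOR_NAME='Wile E. Coyote' GIT_AUTHOR_EMAIL=coyote@acme.com
export GIT_COMMITTER_NAME='Roadrunner' GIT_COMMITTER_EMAIL=roadrunner@acme.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m C1
git commit -q --allow-empty -m C2
git tag T1
git commit -q --allow-empty -m C3
c3=$(git rev-parse HEAD)

# Drop every reference to C3: move the branch back and empty the reflogs.
git reset -q --soft T1
git reflog expire --expire=all --all

# Unreferenced, but the object is still in the database.
git cat-file -e "$c3" 2>/dev/null && before=present || before=missing

# Collect immediately, without the usual grace period.
git gc -q --aggressive --prune=now
git cat-file -e "$c3" 2>/dev/null && after=present || after=missing

echo "C3 before gc: $before, after gc: $after"
```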

  • 2020-12-31 06:44

    Run git show 6c35831 to see that C4, for instance, is still there. Run git reflog master to see (lots of) what master used to reference. One of the entries (master@{1} most likely, but perhaps an older one if you have made other changes as well) should correspond to 6c35831, and git show master@{1} (or whichever entry it is) should show the same output as the first git show command I mentioned.
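
    A minimal sketch of that lookup follows. The reset command in the question is cut off, so the git reset --hard HEAD~2 here is only a stand-in, and the repository is a throwaway one whose hashes will not match 6c35831.

```shell
export GIT_AUTHOR_NAME='Wile E. Coyote' GIT_AUTHOR_EMAIL=coyote@acme.com
export GIT_COMMITTER_NAME='Roadrunner' GIT_COMMITTER_EMAIL=roadrunner@acme.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for msg in C1 C2 C3 C4; do
    git commit -q --allow-empty -m "$msg"
done
c4=$(git rev-parse HEAD)
# The default branch may be master or main depending on configuration.
branch=$(git symbolic-ref --short HEAD)

# Stand-in for the truncated reset in the question (an assumption).
git reset -q --hard HEAD~2

# The branch reflog still remembers where the branch pointed before the reset.
git reflog "$branch"
found=$(git rev-parse "$branch@{1}")
subject=$(git show -s --format=%s "$found")
echo "$branch@{1} = $found ($subject)"
```

    With the commit located, something like git branch rescued "$found" (the branch name is arbitrary) would make it reachable again.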

  • 2020-12-31 06:46

    Orphaned commits just stay there until they are garbage collected, whether by an automatic collection or by explicitly running git gc.
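
    One detail worth checking for yourself: while the reflog still mentions a commit, even an explicit collection keeps it. The sketch below assumes default reflog expiry settings (90 days, or 30 days for unreachable entries, as far as I know) and again uses a stand-in reset in a throwaway repository.

```shell
export GIT_AUTHOR_NAME='Wile E. Coyote' GIT_AUTHOR_EMAIL=coyote@acme.com
export GIT_COMMITTER_NAME='Roadrunner' GIT_COMMITTER_EMAIL=roadrunner@acme.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for msg in C1 C2 C3 C4; do
    git commit -q --allow-empty -m "$msg"
done
c4=$(git rev-parse HEAD)

# Orphan C3 and C4 on the branch, but leave the reflogs untouched.
git reset -q --hard HEAD~2

# Explicit collection with no grace period for unreachable objects.
git gc -q --prune=now

git cat-file -e "$c4" 2>/dev/null && state=present || state=missing
echo "C4 after gc with reflog intact: $state"
```

    The commit should only disappear once its reflog entries have expired, or have been expired by hand with git reflog expire, and a later collection has run.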
