git: can't find blob - want to get rid of it from pack

广开言路 2021-01-02 01:32

I've a large blob that I want to get rid of! I thought I removed the file using this solution: http://dound.com/2009/04/git-forever-remove-files-or-folders-from-history/ (I

5 answers
  • 2021-01-02 01:36

    Having the same issue. Discovered my troublesome blob is referenced by an unreachable tree. Adding to the git-find-blob script:

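    # $obj_name is expected to hold the blob's hash, as set in the git-find-blob script.
    git fsck --full --unreachable | \
    while read -r unreachable obj tree
    do
        # fsck prints "unreachable <type> <hash>"; only trees are of interest here.
        if [[ $obj != "tree" ]]; then
            continue
        fi
        if git ls-tree -r "$tree" | grep -q "$obj_name" ; then
            echo "$unreachable $obj $tree"
        fi
    done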
    

    I was able to remove the blob using BFG Repo-Cleaner but I'd be much happier solving the problem using native git commands.

  • 2021-01-02 01:40

    The blob doesn't appear on the other side of a clean push, so this will be my solution (push to a new location, then clone from that location). Any easier way of doing it?
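
    One possibly simpler variant, assuming the blob really is unreachable from every branch and tag: clone through a file:// URL. Unlike a clone from a plain local path, this goes through the regular pack transport, so only reachable objects are copied. The function name and both paths below are hypothetical placeholders, not anything from the question.

```shell
# Sketch: make a clean copy that carries only reachable objects.
# clean_clone SRC DEST  (hypothetical helper; SRC and DEST are placeholder paths)
clean_clone() {
    # file:// URLs need an absolute path, so resolve SRC first.
    src=$(cd "$1" && pwd) || return 1
    # The file:// scheme forces the normal transport instead of copying
    # or hard-linking the whole object store.
    git clone --quiet "file://$src" "$2"
}
```

    Usage would look like: clean_clone ~/work/my-repo ~/work/my-repo-clean. A normal clone only creates a local branch for the default branch; passing --mirror to git clone would copy every ref into a bare repository instead.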

  • 2021-01-02 01:43

    You want to use the BFG Repo-Cleaner, a faster, simpler alternative to git-filter-branch designed for removing large files from Git repos.

    Download the Java jar (requires Java 6 or above) and run this command:

    $ java -jar bfg.jar --strip-blobs-bigger-than 20M my-repo.git
    

    Any blob over 20M in size (that isn't in your latest commit) will be totally removed from your repository's history. You can then use git gc to clean away the dead data:

    $ git gc --prune=now --aggressive
    

    The BFG is typically 10-50x faster than running git-filter-branch and the options are tailored around these two common use-cases:

    • Removing Crazy Big Files
    • Removing Passwords, Credentials & other Private data

    Full disclosure: I'm the author of the BFG Repo-Cleaner.

  • 2021-01-02 01:54

    You can use git repack -Ad to force git to reconstruct your packs, and to unpack any unreachable objects into loose objects. At this point you can use git gc --prune=now to discard the unreachable objects.

    You should also double-check that you actually expired your reflogs. I believe git reflog expire --all will default to 90 days (or 30 for unreachable objects), so you may want to use git reflog expire --expire-unreachable=now --all instead (this needs to be done before the repack+gc).
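
    Put together in the order described above, the sequence might look like the sketch below. It assumes no branch, tag, or stash still points at the unwanted history; the function name is hypothetical and only wraps the three commands already mentioned.

```shell
# Sketch: expire reflogs, repack, then prune, in that order.
# purge_unreachable [REPO]  (hypothetical helper; defaults to the current directory)
purge_unreachable() {
    repo=${1:-.}
    # 1. Drop reflog entries that only keep unreachable commits alive.
    git -C "$repo" reflog expire --expire-unreachable=now --all &&
    # 2. Rebuild the packs; unreachable objects are ejected as loose objects.
    git -C "$repo" repack -A -d -q &&
    # 3. Delete the now-loose unreachable objects immediately.
    git -C "$repo" gc --quiet --prune=now
}
```

    Afterwards, git cat-file -e <hash> on the blob's hash should fail, which is a quick way to confirm the object is gone.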

  • 2021-01-02 01:59

    Firstly, in your git gc invocation, you should use --prune=now, since the default is to keep objects which are less than 2 weeks old.

    Secondly, the git-find-blob command you've used by default only looks in the history of HEAD for commits, so if the blob is on another branch then that script will miss it. Try invoking it as:

    ./git-find-blob ba9d1d27ee64154146b37dfaf42ededecea847e1 --all
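
    As a cross-check that does not depend on the script, git rev-list --objects --all lists every object reachable from any ref together with its path, so a grep for the hash shows whether some branch or tag still holds the blob. The helper name below is hypothetical.

```shell
# Sketch: print "<hash> <path>" if the blob is still reachable from any ref.
# blob_reachable REPO HASH  (hypothetical helper; exits non-zero when nothing matches)
blob_reachable() {
    # --all walks every ref under refs/ plus HEAD; --objects also lists
    # the trees and blobs those commits reference, one per line.
    git -C "$1" rev-list --objects --all | grep "^$2"
}
```

    No output means no ref reaches the blob any more. Adding --reflog to the rev-list call would also count objects that only the reflogs keep alive.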
    