Git repo still huge after large files removed from repository history

Posted by 五迷三道 on 2019-12-01 05:56:38
Roberto Tyley

Use --prune=now on git gc

Although you'd successfully rewritten your unwanted objects out of history, it looks like they were not being pruned because they were too young: by default, git gc only prunes unreachable objects that are more than two weeks old (see gc.pruneExpire in the configuration docs on git gc for a bit more detail). Using git gc --prune=now should handle that, or you could see this answer for a more nuclear option.
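To see the age cutoff in action, here is a minimal sketch in a throwaway repository. The temporary directory, the committer identity and the blob contents are made up for illustration. The only behaviour it relies on is that a plain git gc keeps unreachable objects younger than gc.pruneExpire, while --prune=now drops them regardless of age.

```shell
# Throwaway repository; nothing here touches a real project.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
git -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "base"

# Write a blob that no commit references, i.e. an unreachable object.
blob=$(echo "unwanted data" | git hash-object -w --stdin)

# Default gc: the object is younger than gc.pruneExpire, so it is kept.
git gc -q
if git cat-file -e "$blob" 2>/dev/null; then after_default=kept; else after_default=pruned; fi

# --prune=now removes unreachable objects regardless of their age.
git gc -q --prune=now
if git cat-file -e "$blob" 2>/dev/null; then after_now=kept; else after_now=pruned; fi

echo "plain gc: $after_default, gc --prune=now: $after_now"
```

On a stock Git configuration this should report the blob as kept after the plain gc and pruned after the second one.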

Although that should fix your final problem, an underlying problem was the difficulty of finding big blobs in order to remove them using git filter-branch - to which I would say:

...don't use git filter-branch

git filter-branch is painful to use for a task like this, and there's a much better, less well-known tool called The BFG, specifically designed for removing Large Files from Git repos.

The core command to remove big files looks just like this:

$ bfg --strip-blobs-bigger-than 10M my-repo.git

Any blob over 10MB in size (that isn't in your latest commit) will be totally removed from your repository's history - you don't have to manually find the files yourself, and files in protected commits are safe.
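If you want to preview which blobs are above a given size before rewriting anything, plain git can list them. This is only a sketch: list_big_blobs is a hypothetical helper name, the default threshold of 10485760 bytes (10 MiB) is my assumption for "10MB", and unlike the BFG it does not skip files that are still in your latest commit.

```shell
# Hypothetical helper: print "size path" for every blob in history
# larger than the given number of bytes, biggest first.
list_big_blobs() {
    min_bytes=${1:-10485760}   # assumed default: 10 MiB
    # rev-list prints "<id> <path>"; cat-file keeps the path as %(rest).
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$min_bytes" \
            '$1 == "blob" && $2 + 0 > min + 0 { sub(/^blob /, ""); print }' |
        sort -rn
}

# Usage, from inside the repository:
#   list_big_blobs            # blobs over 10 MiB
#   list_big_blobs 1048576    # blobs over 1 MiB
```

A blob that appears under several paths is listed under the first path rev-list reports for it, so treat the output as a guide rather than a complete inventory.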

You can then use git gc to clean away the dead data:

$ git gc --prune=now --aggressive
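One caveat when you run this in an ordinary working clone rather than a fresh bare mirror: reflog entries count as references, so objects that are only reachable through the reflog survive even --prune=now. The BFG's own usage notes pair the gc with a reflog expiry for that reason. A minimal sketch, where shrink_repo is a hypothetical wrapper name and the reflog step is an addition to the command above:

```shell
# Hypothetical wrapper: expire every reflog entry, then prune and repack.
# Assumes you no longer need the reflog as an undo history.
shrink_repo() {
    repo_dir=${1:-.}
    git -C "$repo_dir" reflog expire --expire=now --all &&
    git -C "$repo_dir" gc --quiet --prune=now --aggressive
}

# Usage:
#   shrink_repo my-repo.git
```

Expiring the reflog throws away your local safety net, so only do this once you are happy with the rewritten history.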

The BFG is typically hundreds of times faster than running git filter-branch on a big repo, and the options are tailored around these two common use-cases:

  • Removing Crazy Big Files
  • Removing Passwords, Credentials & other Private data

Full disclosure: I'm the author of the BFG Repo-Cleaner.

You need to run David Underhill's script on each branch in the repository to ensure the references are removed from all branches.

Then, as in the further discussion, initialize a new repository with git init and either git pull from the original, or git remote add origin <original> and then fetch all branches.

$ du -sh ./BIG
299M ./BIG
$ cd BIG
$ git checkout master
$ git-remove-history REMOVE_ME
....
$ git checkout branch2
$ git-remove-history REMOVE_ME
...
$ mkdir ../SMALL
$ cd ../SMALL
$ git init
$ git remote add origin ../BIG
$ git fetch --all
$ git checkout master
$ cd ..
$ du -sh ./SMALL ./BIG
26M ./SMALL
244M ./BIG
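With more than a couple of branches, checking each one out by hand gets tedious. The loop below is a sketch of the same per-branch procedure. for_each_branch is a hypothetical helper name, and git-remove-history stands for wherever you saved David Underhill's script. It assumes a clean working tree and that HEAD is on a branch.

```shell
# Hypothetical helper: check out every local branch in turn, run the
# given command on it, then return to the branch you started on.
for_each_branch() {
    feb_start=$(git symbolic-ref --short HEAD) || return 1
    # Branch names cannot contain whitespace, so word splitting is safe.
    for feb_branch in $(git for-each-ref --format='%(refname:short)' refs/heads); do
        git checkout --quiet "$feb_branch" || return 1
        "$@" || return 1
    done
    git checkout --quiet "$feb_start"
}

# Usage, from inside ./BIG (script name as in the transcript above):
#   for_each_branch git-remove-history REMOVE_ME
```

The loop stops at the first branch where the command fails, leaving that branch checked out so you can inspect it.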

I had accidentally stored large .jpa backups of my site in git. This is the command I used to remove them:

git filter-branch --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch MY_BIG_DIRECTORY_OR_FILE' --tag-name-filter cat -- --all

Replace MY_BIG_DIRECTORY_OR_FILE with the folder or file in question. The command completely rewrites your history, including tags.
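Note that git filter-branch keeps a backup of the pre-rewrite refs under refs/original/, and those backups keep the old objects reachable, so the repository will not shrink until they are gone. The sketch below deletes them and checks whether a path still appears anywhere in history. Both function names are hypothetical, and deleting the backups is irreversible, so only run it once you have verified the rewrite.

```shell
# Hypothetical helper: delete the backup refs that filter-branch leaves
# under refs/original/. Does nothing if there are none.
drop_filter_branch_backups() {
    git for-each-ref --format='%(refname)' refs/original |
    while read -r backup_ref; do
        git update-ref -d "$backup_ref"
    done
}

# Hypothetical helper: succeeds if any commit on any ref touches the path.
path_in_history() {
    [ -n "$(git rev-list --all -- "$1")" ]
}

# Usage:
#   drop_filter_branch_backups
#   path_in_history MY_BIG_DIRECTORY_OR_FILE && echo "still referenced"
```

After that, expiring the reflog and running git gc --prune=now lets git actually discard the old objects.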

source:

http://naleid.com/blog/2012/01/17/finding-and-purging-big-files-from-git-history
