I have a codebase that (until now) used git to store its dependencies. The repository itself is available here (warning: it's HUGE). Needless to say, I need to remove the d
Use --prune=now on git gc

Although you'd successfully written your unwanted objects out of history, it looks like those unwanted objects were not being pruned because they were too young to be pruned by default (see the configuration docs on git gc for a bit more detail). Using git gc --prune=now should handle that, or you could see this answer for a more nuclear option.
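As a rough sketch of that step: by default git gc keeps unreachable objects younger than gc.pruneExpire (two weeks unless configured otherwise), so the prune has to be asked for explicitly. The helper name prune_now is made up for illustration, and expiring the reflog first is an assumption on my part about what else might still be referencing the old objects in your repo.

```shell
#!/bin/sh
# Sketch only: prune unreachable objects immediately instead of waiting
# for the default grace period (gc.pruneExpire, two weeks by default).
# prune_now is a hypothetical helper name, not a git command.
prune_now() {
    repo=${1:-.}
    # Reflog entries count as references, so commits that were rewritten
    # away can stay alive through them. Expire those entries first.
    # (After git filter-branch, backup refs under refs/original/ keep old
    # objects alive too; this sketch does not delete them for you.)
    git -C "$repo" reflog expire --expire=now --all &&
    # Repack and drop every unreachable object, regardless of its age.
    git -C "$repo" gc --prune=now --quiet
}
```

Running git count-objects -vH before and after prune_now /path/to/repo should show whether the repository actually shrank.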
Although that should fix your final problem, an underlying problem was the difficulty of finding big blobs in order to remove them using git filter-branch - to which I would say:

git filter-branch is painful to use for a task like this, and there's a much better, less well-known tool called The BFG, specifically designed for removing Large Files from Git repos.
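If you'd still like to see which blobs are large (say, to pick a sensible size threshold), something along these lines lists them using plain git plumbing. This is a sketch, not part of the BFG; largest_blobs is a hypothetical helper name, and the output columns are size in bytes, object id and path.

```shell
#!/bin/sh
# Sketch only: list the largest blobs anywhere in history, biggest first.
# largest_blobs is a hypothetical helper name.
# Usage: largest_blobs [repo-path] [how-many]
largest_blobs() {
    repo=${1:-.}
    count=${2:-10}
    # rev-list prints "<object-id> <path>" for every object reachable
    # from any ref; cat-file adds each object's type and size, and
    # %(rest) carries the path through unchanged.
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        # Keep blobs only, then drop the type column.
        awk '$1 == "blob"' |
        cut -d' ' -f2- |
        # Sort numerically on size, largest first.
        sort -k1,1nr |
        head -n "$count"
}
```

For example, largest_blobs my-repo.git 20 would print the twenty biggest blobs in that repository's history.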
The core command to remove big files looks just like this:
$ bfg --strip-blobs-bigger-than 10M my-repo.git
Any blob over 10MB in size (that isn't in your latest commit) will be totally removed from your repository's history - you don't have to manually find the files yourself, and files in protected commits are safe.
You can then use git gc to clean away the dead data:
$ git gc --prune=now --aggressive
The BFG is typically hundreds of times faster than running git-filter-branch on a big repo, and its options are tailored around two common use-cases: removing very large files, and removing passwords, credentials and other private data.
Full disclosure: I'm the author of the BFG Repo-Cleaner.