Remove large commits from git

再見小時候 2021-02-14 05:37

We're running a central git repository (gforge) that everyone pulls from and pushes to. Unfortunately, some inept co-workers have decided that pushing several 10-100 MB jar file

5 answers
  • 2021-02-14 05:42

    Check out https://help.github.com/articles/remove-sensitive-data . It covers removing sensitive data from a Git repository, but the same approach works just as well for removing large files from your commits.

  • 2021-02-14 05:42

    Use filter-branch!

    git filter-branch --tree-filter 'find . -name "*.jar" -exec rm {} \;'
    

    Then prune all the commits that no longer change anything (the ones that only touched the removed files) with:

    git filter-branch -f --prune-empty -- --all
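
To try the two commands above without touching a real repository, here is a minimal sketch that builds a throwaway repository and runs them against it. The file names, commit messages, and identity are made up for the demo, and `FILTER_BRANCH_SQUELCH_WARNING` is only set to silence the notice that newer Git releases print before running filter-branch.

```shell
#!/bin/sh
# Scratch demo: build a tiny repository, then strip every *.jar from its history.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1    # silence the filter-branch notice in newer Git

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"          # hypothetical identity for the scratch repo
git config user.email "demo@example.com"

echo "source" > Main.java
git add Main.java
git commit -q -m "Add source"

echo "pretend this is 100 MB" > big.jar
git add big.jar
git commit -q -m "Add jar only"           # this commit changes nothing once the jar is gone

echo "more source" >> Main.java
git commit -q -am "Change source"

# Step 1: delete every *.jar from the checked-out tree of each commit.
git filter-branch --tree-filter 'find . -name "*.jar" -exec rm {} \;' >/dev/null 2>&1

# Step 2: drop the commits that no longer change anything.
git filter-branch -f --prune-empty -- --all >/dev/null 2>&1

commits_left=$(git rev-list --count HEAD)
jar_paths=$(git log --all --name-only --format= -- '*.jar' | wc -l | tr -d ' ')
echo "commits left: $commits_left, jar paths in history: $jar_paths"
```

Note that the first command, as written, only rewrites the current branch; appending `-- --all` makes it cover every branch.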
    
  • 2021-02-14 05:42

    GForge guy here. Even though this is primarily a git question, I'd like to offer two things:

    1. Starting in GForge 6.3, site admins can identify projects that are using too much disk, as well as old and orphaned projects. This might help you avoid full-disk situations, especially if you have lots of separate teams and projects.
    2. Implementing git hooks (SCM hooks in general) is easy to do in GForge. Site administrators can configure any number of hook commands, and project-level people can then select which hooks they want for their project. Adding a hook that prevents certain types (or sizes?) of file would be a good fit for this feature.
  • 2021-02-14 05:52

    In addition to the other answers, you may want to consider adding some pre-emptive protection against future giant jar files, in the form of a pre-receive hook in the repo that forbids users (or at least "non-admin users") from pushing very large files, or files named *.jar, or whatever seems best.

    We've done this sort of thing before, including forbidding specific commit IDs because of certain users who just couldn't get the hang of "save your work on a temp branch, reset and pull, and re-apply your work, minus the giant file".

    Note that the pre-receive hook runs in a rather interesting context: the files have already been uploaded, and only the references (usually branch heads) have not changed yet. You can prevent the branch heads from changing, but you'll still be using disk space (temporarily, until gc'ed) and network bandwidth.
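
A hook along these lines could be sketched as follows. This is an illustration under assumptions, not a drop-in policy: the 10 MiB limit, the messages, and the helper names `is_zero` and `check_push` are all made up here, and the last line assumes the script is installed as `hooks/pre-receive`.

```shell
#!/bin/sh
# pre-receive hook sketch: reject pushes that add *.jar files or oversized blobs.
MAX_BYTES=$((10 * 1024 * 1024))   # hypothetical 10 MiB limit

# An all-zero object name marks a ref that is being created or deleted.
is_zero() { case "$1" in (*[!0]*) return 1 ;; (*) return 0 ;; esac; }

check_push() {
    # stdin: one "<old> <new> <ref>" line per updated ref, as Git feeds the hook.
    while read -r old new ref; do
        if is_zero "$new"; then continue; fi       # ref deletion: nothing to inspect
        if is_zero "$old"; then
            range="$new --not --all"               # new ref: only objects not yet reachable
        else
            range="$old..$new"
        fi
        # Walk every object the push introduces and collect the offending paths.
        offenders=$(git rev-list --objects $range | while read -r sha path; do
            [ -n "$path" ] || continue
            [ "$(git cat-file -t "$sha")" = blob ] || continue
            size=$(git cat-file -s "$sha")
            case "$path" in
                (*.jar) echo "$path: jar files are not allowed" ;;
                (*) [ "$size" -le "$MAX_BYTES" ] || echo "$path: $size bytes is over the limit" ;;
            esac
        done)
        if [ -n "$offenders" ]; then
            echo "push to $ref rejected:" >&2
            echo "$offenders" >&2
            return 1
        fi
    done
    return 0
}

# Run the check only when this file is executed as the hook itself.
case "$0" in (*/pre-receive) check_push || exit 1 ;; esac
```

To try it, save the script as `hooks/pre-receive` in the central repository and make it executable. A non-zero exit status rejects the whole push, but the objects have already been transferred by then.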

  • 2021-02-14 06:02

    The easiest way to avoid chaos is to give the server more disk.

    This is a tough one. Removing the files requires removing them from the history, too, which means rewriting it, for example with git filter-branch. This command would remove <file> from the history:

    git filter-branch --index-filter 'git rm --cached --ignore-unmatch <file>' \
    --prune-empty --tag-name-filter cat -- --all
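
One thing worth knowing: after the rewrite the old objects are still on disk, kept alive by the backup refs under refs/original/ and by the reflog. The sketch below runs the command above in a throwaway repository (with a made-up `big.jar` standing in for `<file>`) and then drops those references so that `git gc` can actually free the space. The identity and file names are assumptions for the demo only.

```shell
#!/bin/sh
# Scratch demo: remove one file from history, then actually free its disk space.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1      # silence the filter-branch notice in newer Git

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"            # hypothetical identity for the scratch repo
git config user.email "demo@example.com"

echo "source" > Main.java
head -c 1048576 /dev/urandom > big.jar      # 1 MiB stand-in for a huge jar
git add Main.java big.jar
git commit -q -m "Add source and jar"
blob=$(git rev-parse HEAD:big.jar)          # remember the jar's object name

# Rewrite the history with big.jar in place of <file>.
git filter-branch --index-filter 'git rm --cached --ignore-unmatch big.jar' \
    --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# The old commits are still reachable from the backup refs and the reflog.
git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
    blob_state=present
else
    blob_state=gone
fi
echo "blob $blob is $blob_state"
```

On a shared server the same cleanup has to happen in the central repository itself, and every old clone still carries the big objects until it is reset or re-cloned.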
    

    The problem is that this rewrites SHA-1 hashes, meaning everyone on the team will need to reset to the new branch version or risk some serious headaches. That's all well and good if no one has work in progress and you all use topic branches. If you're more centralized, your team is large, or many of them keep dirty working directories while they work, there's no way to do this without a little bit of chaos and discord. You could spend quite a while getting everyone's local repository working correctly. That said, git filter-branch is probably the best solution. Just make sure you've got a plan, your team understands it, and everyone backs up their local repository in case some vital work in progress gets lost or munged.

    One possible plan would be:

    1. Get the team to generate patches of their work in progress, something like git diff > ~/my_wip.
    2. Get the team to generate patches for their committed but unshared work: git format-patch <branch>
    3. Run git filter-branch. Make sure the team knows not to pull while this is happening.
    4. Have the team issue git fetch && git reset --hard origin/<branch> or have them clone the repository afresh.
    5. Apply their previously committed work with git am <patch>.
    6. Apply their work in progress with git apply, e.g. git apply ~/my_wip.
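
The six steps above can be rehearsed end to end in a scratch directory before trying them on the real repository. The sketch below plays both roles (admin and one team member); the repository names `central` and `dev`, the file names, and the identities are all made up for the demo.

```shell
#!/bin/sh
# Scratch demo of the plan, seen from one team member's clone.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1      # silence the filter-branch notice in newer Git
work=$(mktemp -d)
cd "$work"

# A stand-in for the central repository, with a jar already in its history.
git init -q central
cd central
git config user.name "Admin"; git config user.email "admin@example.com"
echo "v1" > app.txt; echo "jar bytes" > big.jar
git add .; git commit -q -m "Initial commit with jar"
branch=$(git symbolic-ref --short HEAD)
cd ..

# The team member's clone: one unshared commit plus uncommitted edits.
git clone -q central dev
cd dev
git config user.name "Dev"; git config user.email "dev@example.com"
echo "feature" > feature.txt; git add feature.txt; git commit -q -m "Add feature"
echo "wip" >> app.txt

git diff > "$work/my_wip"                                  # step 1: save work in progress
git format-patch -q -o "$work/patches" "origin/$branch"    # step 2: save unshared commits

# Step 3 (admin): rewrite the central history while nobody pulls.
( cd ../central &&
  git filter-branch --index-filter 'git rm --cached --ignore-unmatch big.jar' \
      --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1 )

git fetch -q origin                                        # step 4: adopt the new history
git reset -q --hard "origin/$branch"
git am -q "$work"/patches/*.patch                          # step 5: replay unshared commits
git apply "$work/my_wip"                                   # step 6: restore work in progress

git status --short
```

In this rehearsal the patches apply cleanly because none of them touch the removed file; a patch that adds or modifies the jar itself would need to be edited or dropped first.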