I am looking for opinions on how to handle large binary files on which my source code (web application) is dependent. We are currently discussing several alternatives:
I would use submodules (as Pat Notz suggests) or two distinct repositories. If you modify your binary files too often, then I would try to minimize the impact of the huge repository by cleaning its history:
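A minimal sketch of the submodule approach, assuming the binaries live in their own repository (the URL and path below are illustrative, not from the original answer):

```sh
# Keep the big binaries in a separate repository and mount it as a submodule
git submodule add https://example.com/myapp-assets.git assets

# Record the submodule reference in the main repository
git commit -m "Track binary assets as a submodule"
```

This keeps the main repository small, since it only stores a pointer to a specific commit of the assets repository.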
I had a very similar problem several months ago: ~21 GB of MP3 files, unclassified (bad names, bad ID3 tags, not knowing whether I even liked a given MP3 file or not...), and replicated on three computers.
I used an external hard disk drive with the main Git repository, and I cloned it onto each computer. Then, I started to classify them in the usual way (pushing, pulling, merging... deleting and renaming many times).
At the end, I had only ~6 GB of MP3 files and ~83 GB in the .git directory. I used git-write-tree and git-commit-tree to create a new commit with no ancestors, and started a new branch pointing at that commit. The "git log" for that branch then showed only one commit.
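Roughly, the plumbing commands look like this. A sketch only, assuming the cleaned-up files are staged in the index; the branch name new-master is illustrative:

```sh
# Write the current index out as a tree object and capture its hash
tree=$(git write-tree)

# Create a commit pointing at that tree, with no parent commits
commit=$(echo "Collapse history into a single commit" | git commit-tree "$tree")

# Start a fresh branch at that parentless (root) commit
git branch new-master "$commit"
```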
Then, I deleted the old branch, kept only the new branch, deleted the reflogs, and ran "git prune": after that, my .git folders weighed only ~6 GB...
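The cleanup step would be roughly the following sequence, continuing with the illustrative branch names from the sketch above (a modern alternative is a single `git gc --prune=now`):

```sh
# Move onto the new branch and drop the old one
git checkout new-master
git branch -D master

# Expire all reflog entries so the old commits lose their last references
git reflog expire --expire=now --all

# Delete the objects that are now unreachable
git prune
```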
You could "purge" the huge repository from time to time in the same way: Your "git clone"'s will be faster.