Git - repository and file size limits

Posted by 泄露秘密 on 2019-12-13 05:50:19

Question


I've read in various places online that Git does not handle large files very well, and that it also seems to have problems with large overall repository sizes. This seems to have prompted projects such as git-annex, git-media, git-fat, git-bigfiles, and probably more.

However, after reading Git Internals, it looks to me as though Git's packfile concept should solve all the problems with large files.

Q1: What's the fuss about large files in Git?

Q2: What's the fuss about Git and large repositories?

Q3: Suppose we have a project with two binary dependencies (e.g. around 25 DLL files, each roughly 500 KB to 1 MB) that are updated monthly. Is this really going to be a problem for Git? Is only the initial clone going to be slow, or will working with the repository (switching branches, committing, pulling, pushing, etc.) be an everyday problem?


Answer 1:


In a nutshell, today's computers are bad with large files. Moving megabytes around is pretty fast, but gigabytes take time. Only specialized tools are built to handle gigabytes of data, and Git isn't one of them.

More specific to Git: Git compares files all the time. If the files are small (a few KB), these operations are fast. If they are huge, Git has to compare many, many bytes, and that takes time, memory and nerves.

The projects you list add special handling for large files, such as storing them as individual blobs without trying to compare them to previous versions. That makes everyday operations faster, but at the cost of repository size. Git also needs free disk space on the order of the repository size for some operations, or you'll get errors (and maybe a corrupted repo, since this code path is likely to be the least tested).
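A common approach in this space is to keep the large content in a store outside the normal history and commit only a tiny placeholder. The following is a minimal Python sketch of that general idea, not the actual implementation of git-annex, git-media, git-fat or git-bigfiles. The function names, the `sha256:` pointer format and the store layout are all assumptions made up for this illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so it never has to fit in memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stash(path: Path, store: Path) -> str:
    """Copy the file into the store under its hash, then replace it with a pointer."""
    key = sha256_of(path)
    store.mkdir(parents=True, exist_ok=True)
    target = store / key
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(path, target)
    path.write_text(f"sha256:{key}\n")  # hypothetical pointer format
    return key


def restore(path: Path, store: Path) -> None:
    """Replace a pointer file with the real content it refers to."""
    key = path.read_text().strip().split(":", 1)[1]
    shutil.copyfile(store / key, path)
```

After `stash`, the file in the working tree is well under 100 bytes, so Git only ever diffs and packs the pointer. The trade-off is that the store has to be shipped and backed up by some other means.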

Lastly, the initial clone will take a long time.

Regarding Q3: Git isn't a backup tool. You will probably never need to retrieve a DLL from ten years ago.

Put the sources for those libraries under Git and then use a backup/release process to handle the binaries (like keeping the last 12 months' worth on some network drive).
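To put rough numbers on Q3, a back-of-the-envelope upper bound helps. The sketch below assumes the worst case: every one of the 25 DLLs changes every month, and Git's compression and delta packing save nothing. Both are assumptions for illustration, and a real repository will usually come in lower.

```python
def worst_case_growth_mb(files: int, size_mb: float, updates_per_year: int, years: int) -> float:
    """Upper bound on added history, assuming every file is stored in full on every update."""
    return files * size_mb * updates_per_year * years


# Figures from Q3: 25 DLLs of 0.5 to 1 MB each, updated monthly.
for years in (1, 10):
    low = worst_case_growth_mb(25, 0.5, 12, years)
    high = worst_case_growth_mb(25, 1.0, 12, years)
    print(f"{years:>2} year(s): {low:.0f} to {high:.0f} MB")
```

Under those assumptions the history grows by at most about 150 to 300 MB per year, or 1.5 to 3 GB over ten years, and every fresh clone has to download all of it.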
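The retention half of such a process could look something like this. It is a minimal sketch that assumes the binaries sit as plain files in a single archive directory and that file modification time is a good enough proxy for release date. `prune_old_binaries` and its parameters are hypothetical names, not part of any existing tool.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_old_binaries(archive_dir: Path, max_age_days: int = 365,
                       now: Optional[float] = None) -> List[Path]:
    """Delete files in archive_dir whose mtime is older than max_age_days.

    Returns the paths that were removed. Subdirectories are left alone.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for entry in sorted(archive_dir.iterdir()):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed.append(entry)
    return removed
```

Run from a scheduled job against the network drive, this keeps roughly the last 12 months of binaries while the Git repository holds only the sources.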



Source: https://stackoverflow.com/questions/24382257/git-repository-and-file-size-limits
