Version control for large binary files and >1TB repositories?

逝去的感伤 2021-02-01 03:58

Sorry to bring up this topic again, as there are so many other related questions already — but none that covers my problem directly.

What I'm searching for is a good version control solution for large binary files and repositories of more than 1 TB.

10 Answers
  • 2021-02-01 04:59

    Given the scale of data you are describing, you might be much better off simply relying on a NAS device that provides filesystem-accessible snapshots combined with single-instance store / block-level deduplication ...

    (The question also mentions .cab & .msi files: usually the CI software of your choice has some method of archiving builds. Is that what you are ultimately after?)

  • 2021-02-01 05:04

    Take a look at Boar, "Simple version control and backup for photos, videos and other binary files". It can easily handle huge files and huge repositories.

  • 2021-02-01 05:04

    Update May 2017:

    Git, with the addition of GVFS (Git Virtual File System), can support virtually any number of files of any size (starting with the Windows repository itself: "The largest Git repo on the planet", with 3.5M files and 320GB).
    That is not yet >1TB, but it can scale there.

    The work done on GVFS is slowly being proposed upstream (that is, to Git itself), but it is still a work in progress.
    GVFS is implemented on Windows, with macOS support coming soon (because the team at Microsoft developing Office for Mac demands it), followed by Linux.


    April 2015

    Git can actually be considered a viable VCS for large data, with Git Large File Storage (LFS) (by GitHub, April 2015).

    git-lfs (see git-lfs.github.com) can be tested with a server that supports it, such as lfs-test-server (or directly with github.com itself):
    you store only small pointer metadata in the git repo, while the large file content lives elsewhere.

    https://cloud.githubusercontent.com/assets/1319791/7051226/c4570828-ddf4-11e4-87eb-8fc165e5ece4.gif
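    The metadata-only point above can be sketched in a few commands. The `*.cab`/`*.msi` patterns come from the question; the `git lfs` commands need the git-lfs extension installed, so the `.gitattributes` lines they would generate are written out directly here:

    ```shell
    # Create a repo and mark the binary types from the question as LFS-tracked.
    git init -q big-repo
    cd big-repo

    # With git-lfs installed this would be:
    #   git lfs install
    #   git lfs track "*.cab"
    #   git lfs track "*.msi"
    # which records exactly these lines in .gitattributes:
    printf '%s\n' \
      '*.cab filter=lfs diff=lfs merge=lfs -text' \
      '*.msi filter=lfs diff=lfs merge=lfs -text' > .gitattributes

    git add .gitattributes
    # From now on, committed *.cab/*.msi files are stored in git as small
    # pointer files (an OID plus a size, roughly a hundred bytes each);
    # the real content is uploaded to the configured LFS server.
    grep 'filter=lfs' .gitattributes
    ```

    This is why the git repository itself stays small no matter how large the tracked binaries are: clones only download pointer files plus the LFS objects actually needed for the checkout.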

  • 2021-02-01 05:04

    When you really have to use a VCS, I would use svn, since svn does not require copying the entire repository history into the working copy. But it still needs roughly twice the disk space, since it keeps a pristine copy of every file alongside the working one.
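    The point about not materializing the whole tree can be taken further with svn's sparse working copies (`--depth` / `--set-depth`, available since Subversion 1.5). A minimal sketch using a throwaway local `file://` repository; the `assets` directory is made up for illustration:

    ```shell
    # Create a local repository and a directory inside it.
    svnadmin create bigrepo
    svn mkdir -q -m "add assets dir" "file://$PWD/bigrepo/assets"

    # Check out an *empty* working copy: no file data is copied yet.
    svn checkout -q --depth empty "file://$PWD/bigrepo" wc
    cd wc

    # Pull in only the subtree you actually need.
    svn update -q --set-depth infinity assets
    ```

    For a multi-TB repository this means each user's working copy only pays the (doubled) disk cost for the subtrees they explicitly pull in, not for the whole tree.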

    With this amount of data I would look for a document management system, or (lower level) use a read-only network share with a defined input process.
