Does anyone know what the Git limits are for the number of files and the size of files?
According to the Google Code GitFAQ linked below, repositories hosted there have a 4 GB (32-bit) limit:
http://code.google.com/p/support/wiki/GitFAQ
There is no real limit: everything is named with a 160-bit SHA-1 hash. The size of a file must be representable in a 64-bit number, so no real limit there either.
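As a quick illustration of those 160-bit names, you can ask Git to hash arbitrary content yourself. This is just a sketch against a stock Git install, and it writes nothing to your repository unless you add -w:
%echo "some content" | git hash-object --stdin
(the output is a 40-character hexadecimal object name, i.e. 160 bits)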
There is a practical limit, though. I have a repository that's ~8GB with >880,000 files, and git gc takes a while. The working tree is rather large, so operations that inspect the entire working directory take quite a while. This repo is only used for data storage, though, so it's handled entirely by automated tools. Pulling changes from the repo is much, much faster than rsyncing the same data.
%find . -type f | wc -l
791887
%time git add .
git add . 6.48s user 13.53s system 55% cpu 36.121 total
%time git status
# On branch master
nothing to commit (working directory clean)
git status 0.00s user 0.01s system 0% cpu 47.169 total
%du -sh .
29G .
%cd .git
%du -sh .
7.9G .
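If you want a similar breakdown for your own repository, the following stock Git commands are a reasonable starting point (the output format varies a bit between Git versions, and -H on count-objects needs a not-too-ancient Git; plain -v works everywhere):
%git count-objects -vH   # size and count of loose and packed objects under .git
%git gc                  # repack and prune unreachable objects to shrink .git
%time git status         # timing this shows the cost of a full working-tree scan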
It depends on what you mean. There are practical size limits (if you have a lot of big files, it can get boringly slow). If you have a lot of files, scans can also get slow.
There aren't really inherent limits to the model, though. You can certainly use it poorly and be miserable.
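If big files or a deep history are what hurts, a shallow or partial clone keeps day-to-day operations fast. Partial clone needs a reasonably recent Git on both the client and the server, so take this as a sketch (the URL is a placeholder) rather than something guaranteed to work against every host:
%git clone --depth 1 https://example.com/big-repo.git           # shallow clone: only the latest commit's history
%git clone --filter=blob:none https://example.com/big-repo.git  # partial clone: blobs are fetched lazily at checkout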
Back in February 2012, there was a very interesting thread on the Git mailing list from Joshua Redstone, a Facebook software engineer, testing Git on a huge test repository:
The test repo has 4 million commits, linear history and about 1.3 million files.
Tests that were run show that Git is unusable for such a repo (cold operations lasting minutes), but this may change in the future. Basically, performance is penalized by the number of stat() calls to the kernel's filesystem layer, so it will depend on the number of files in the repo and on the efficiency of the filesystem cache. See also this Gist for further discussion.
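Git has since grown several knobs aimed at exactly that stat() bottleneck. They all postdate the 2012 thread, and their availability and effect depend on your Git version and platform, so treat this as a list of options worth trying rather than a guaranteed fix:
%git config core.untrackedCache true   # cache the results of untracked-file scans between runs
%git config feature.manyFiles true     # umbrella setting that tunes the index for huge working trees
%git config core.fsmonitor true        # built-in filesystem monitor so status can skip most stat() calls (newest Git only)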