Is there any way to estimate the size of a public Git repository without having to clone it?
I'd like to use this information to make sure the repository isn't too large before I commit to cloning it.
Short answer: "no."
If space is a concern at all, clone the repo to wherever you have the most free space; if it turns out to be small enough to live somewhere else, moving it afterwards is cheap.
A really brute-force way to get it: put this in e.g. your post-receive hook on the server
# Delete any previous size refs, then record the current size of .git/objects as a new ref.
git for-each-ref refs/size | while read _ _ ref; do git update-ref -d "$ref"; done
set -- $(du -sh .git/objects)
git update-ref refs/size/$1-as-of-$(date +%Y%m%dT%H%M%S%Z) HEAD
and you can just ls-remote for it.
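For instance, a minimal sketch of reading those size refs from any client (the URL is a placeholder):
git ls-remote https://example.com/repo.git 'refs/size/*'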
Short answer: Nnn...maybe.
Long answer: There are some heuristics, and you can poke around with the Git transfer protocols to glean some information.
My personal observation is that for most text-based projects, the .git size is rarely more than the checkout size, even for very old projects.
Fetching info/refs will tell you how many tags and branches are in the repository.
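As a rough sketch, assuming the server still speaks the dumb HTTP protocol (it may not) and using a placeholder URL, you could count them with something like:
# info/refs is "<sha> TAB <refname>" per line; annotated tags also appear as peeled "^{}" entries
curl -s https://example.com/repo.git/info/refs | cut -f2 | grep -v '\^{}' | grep -c '^refs/heads/'
curl -s https://example.com/repo.git/info/refs | cut -f2 | grep -v '\^{}' | grep -c '^refs/tags/'
If the server only speaks the smart protocol, git ls-remote --heads URL and git ls-remote --tags URL will list the same refs.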
Fetching objects/info/packs will tell you what packfiles the project has. You can then do a HEAD request (assuming it's HTTP) on objects/pack/pack-WHATEVERTHEIDIS.pack to see how big the pack files are. That will give you a lower bound for the repository size.
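Something like this sketches that lower bound, again assuming the dumb-protocol layout; the URL is a placeholder, and smart-protocol-only servers won't expose these paths:
base=https://example.com/repo.git
# objects/info/packs has one "P pack-<sha>.pack" line per packfile
for pack in $(curl -s "$base/objects/info/packs" | awk '/^P /{print $2}'); do
  # HEAD request only: read the size from the Content-Length header instead of downloading the pack
  curl -sI "$base/objects/pack/$pack" | grep -i '^content-length'
done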
If disk space is the problem (disk is cheap, buy a new one), you can do a git clone --bare to save yourself the space of a working-tree checkout. You can then clone that local, bare copy to get a full checkout.
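For example (the URL and directory names are placeholders):
git clone --bare https://example.com/repo.git repo.git   # object store only, no working tree
git clone repo.git repo                                  # later: a full checkout from the local bare copy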
Finally, if you're clever, you could walk the object tree, issuing a HEAD request for each object to get its size (or cancelling a GET once you've received just the headers, ignoring the body). That would give you the size of the repository without having to download the whole thing.
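Here's a very rough sketch of that idea for a single loose object, with a placeholder URL and object id; note that the Content-Length you get back is the zlib-compressed on-disk size, and anything already inside a packfile won't be reachable at a loose-object path:
base=https://example.com/repo.git
oid=0123456789abcdef0123456789abcdef01234567   # placeholder object id
curl -sI "$base/objects/${oid:0:2}/${oid:2}" | grep -i '^content-length'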