I'm working with a repository with a very large number of files that takes hours to check out. I'm looking into the possibility of whether Git would work well with this kin
Yes, it is possible to download a single folder instead of the whole repository, even at any commit or only the latest one.
A nice way to do this:
D:\Lab>git svn clone https://github.com/Qamar4P/LolAdapter.git/trunk/lol-adapter -r HEAD
-r HEAD downloads only the latest revision and ignores all history.
Note the trunk and /specific-folder parts of the URL: copy your repository URL, put it before /trunk/, and put the folder path after it.
I hope this helps someone. Enjoy :)
Updated on 26 Sep 2019
Sadly, none of the above worked for me, so I spent a very long time trying different combinations in the sparse-checkout file.
In my case I wanted to skip folders with IntelliJ IDEA configs.
Here is what I did:
Run git clone https://github.com/myaccount/myrepo.git --no-checkout
Run git config core.sparsecheckout true
Create .git\info\sparse-checkout with the following content:
!.idea/*
!.idea_modules/*
/*
Run 'git checkout --' to get all files.
The critical thing to make it work was to add /* after the folder's name.
I have git 1.9.
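For anyone who wants to try the exclusion idea without touching a real project, here is a minimal sketch on a throwaway repository. The directory names (source, sparse) and the file contents are made up for illustration. Note that it lists /* first and the exclusion second, which is the order the Git documentation shows for these patterns, and that it uses git read-tree -mu HEAD to populate the working tree; both are my assumptions about a safe variant, not the exact steps above.

```shell
#!/bin/sh
set -e

# Throwaway source repository (hypothetical layout, for illustration only).
work=$(mktemp -d)
cd "$work"
git init -q source
mkdir -p source/.idea source/src
echo '<project/>' > source/.idea/workspace.xml
echo 'hello' > source/src/main.txt
echo 'readme' > source/README
git -C source add .
git -C source -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# Clone without populating the working tree, then enable sparse checkout.
git clone -q --no-checkout source sparse
cd sparse
git config core.sparseCheckout true

# Include everything at the top level, then exclude the .idea directory.
mkdir -p .git/info
printf '%s\n' '/*' '!/.idea/' > .git/info/sparse-checkout

# Populate the index and working tree according to the patterns.
git read-tree -mu HEAD

# List what actually landed in the working tree.
ls -A
```

If the patterns behave as I expect, src and README are present and .idea is not.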
Based on this answer by apenwarr and this comment by Miral I came up with the following solution which saved me nearly 94% of disk space when cloning the linux git repository locally while only wanting one Documentation subdirectory:
$ cd linux
$ du -sh .git .
2.1G .git
894M .
$ du -sh
2.9G .
$ mkdir ../linux-sparse-test
$ cd ../linux-sparse-test
$ git init
Initialized empty Git repository in /…/linux-sparse-test/.git/
$ git config core.sparseCheckout true
$ git remote add origin ../linux
# Parameter "origin master" saves a tiny bit if there are other branches
$ git fetch --depth=1 origin master
remote: Enumerating objects: 65839, done.
remote: Counting objects: 100% (65839/65839), done.
remote: Compressing objects: 100% (61140/61140), done.
remote: Total 65839 (delta 6202), reused 22590 (delta 3703)
Receiving objects: 100% (65839/65839), 173.09 MiB | 10.05 MiB/s, done.
Resolving deltas: 100% (6202/6202), done.
From ../linux
* branch master -> FETCH_HEAD
* [new branch] master -> origin/master
$ echo "Documentation/hid/*" > .git/info/sparse-checkout
$ git checkout master
Branch 'master' set up to track remote branch 'master' from 'origin'.
Already on 'master'
$ ls -l
total 4
drwxr-xr-x 3 abe abe 4096 May 3 14:12 Documentation/
$ du -sh .git .
181M .git
100K .
$ du -sh
182M .
So I got down from 2.9 GB to 182 MB, which is already quite nice.
I didn't get this to work with git clone --depth 1 --no-checkout --filter=blob:none file:///…/linux linux-sparse-test (hinted here), though, as then the missing files were all added as removed files to the index. So if anyone knows the equivalent of git clone --filter=blob:none for git fetch, we can probably save some more megabytes. (Reading the man page of git-rev-list also hints that there is something like --filter=sparse:path=…, but I didn't get that to work either.)
(All tried with git 2.20.1 from Debian Buster.)
I'm new to git, but it seems that if I run git checkout for each directory then it works. Also, the sparse-checkout file needs to have a trailing slash after every directory, as indicated. Could someone more experienced please confirm that this will work?
Interestingly, if you check out a directory that is not in the sparse-checkout file, it seems to make no difference. Such directories don't show up in git status, and git read-tree -m -u HEAD doesn't cause them to be removed. git reset --hard doesn't remove them either. Would anyone more experienced care to comment on what git thinks of directories that are checked out but not listed in the sparse-checkout file?
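This does not settle the question above, but one way to see how git treats each path is to look at the skip-worktree bit, which sparse checkout sets on tracked paths that the patterns exclude. The sketch below uses a made-up repository (demo, with docs and src directories) and git ls-files -t, which prefixes skip-worktree entries with S and ordinary cached entries with H.

```shell
#!/bin/sh
set -e

# Throwaway repository with two directories (hypothetical names).
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
mkdir -p docs src
echo 'doc' > docs/guide.txt
echo 'code' > src/main.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# Enable sparse checkout and keep only the docs directory.
git config core.sparseCheckout true
mkdir -p .git/info
echo '/docs/' > .git/info/sparse-checkout
git read-tree -mu HEAD

# Show the per-path status tag: S = skip-worktree, H = cached as usual.
git ls-files -t
```

Paths tagged S are still tracked in the index; git simply does not expect them in the working tree.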
In git 2.27, it looks like git sparse-checkout has evolved. The solution in this answer does not work exactly the same way (compared to git 2.25):
git clone <URL> --no-checkout <directory>
cd <directory>
git sparse-checkout init --cone # to fetch only root files
git sparse-checkout set apps/my_app libs/my_lib # etc, to list sub-folders to checkout
# they are checked out immediately after this command, no need to run git pull
These commands worked better:
git clone --sparse <URL> <directory>
cd <directory>
git sparse-checkout init --cone # to fetch only root files
git sparse-checkout add apps/my_app
git sparse-checkout add libs/my_lib
See also: git-clone --sparse and git-sparse-checkout add
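To check the clone --sparse flow locally, something like the following should do. It is only a sketch: the source repository and its apps/other_app directory are invented for the demo, and it assumes a git recent enough to have git sparse-checkout add (2.26 or later, as far as I know).

```shell
#!/bin/sh
set -e

# Throwaway source repository mirroring the apps/ and libs/ layout (hypothetical).
work=$(mktemp -d)
cd "$work"
git init -q source
mkdir -p source/apps/my_app source/apps/other_app source/libs/my_lib
echo 'app' > source/apps/my_app/main.txt
echo 'other' > source/apps/other_app/main.txt
echo 'lib' > source/libs/my_lib/lib.txt
echo 'readme' > source/README
git -C source add .
git -C source -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# Sparse clone: only files in the repository root are checked out at first.
git clone -q --sparse source sparse
cd sparse
git sparse-checkout init --cone

# Add the directories one at a time.
git sparse-checkout add apps/my_app
git sparse-checkout add libs/my_lib

# Print the directories currently in the sparse-checkout set.
git sparse-checkout list
```

After this, apps/other_app should be absent from the working tree while README and the two added directories are present.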
In 2020 there is a simpler way to deal with sparse-checkout without having to worry about .git files. Here is how I did it:
git clone <URL> --no-checkout <directory>
cd <directory>
git sparse-checkout init --cone # to fetch only root files
git sparse-checkout set apps/my_app libs/my_lib # etc, to list sub-folders to checkout
# they are checked out immediately after this command, no need to run git pull
Note that it requires Git 2.25 or later. Read more about it here: https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/
UPDATE:
The above git clone command will still clone the repo with its full history, though without checking the files out. If you don't need the full history, you can add the --depth parameter to the command, like this:
# create a shallow clone,
# with only 1 (since depth equals 1) latest commit in history
git clone <URL> --no-checkout <directory> --depth 1
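Here is a small sketch that combines the shallow clone with the cone-mode commands on a throwaway repository. The repository layout and names are hypothetical. Two assumptions of mine: the clone uses a file:// URL because git ignores --depth for plain local paths, and the final git read-tree -mu HEAD is a precaution, since on newer git versions sparse-checkout set on a fresh --no-checkout clone may leave the working tree empty until something populates the index.

```shell
#!/bin/sh
set -e

# Throwaway source repository with two commits (hypothetical layout).
work=$(mktemp -d)
cd "$work"
git init -q source
mkdir -p source/apps/my_app source/libs/my_lib source/libs/unused
echo 'v1' > source/apps/my_app/main.txt
echo 'lib' > source/libs/my_lib/lib.txt
echo 'unused' > source/libs/unused/lib.txt
git -C source add .
git -C source -c user.name=demo -c user.email=demo@example.com commit -q -m 'first'
echo 'v2' > source/apps/my_app/main.txt
git -C source -c user.name=demo -c user.email=demo@example.com commit -q -am 'second'

# Shallow clone without checkout; file:// makes --depth take effect locally.
git clone -q --no-checkout --depth 1 "file://$work/source" shallow
cd shallow
git sparse-checkout init --cone
git sparse-checkout set apps/my_app libs/my_lib

# Precaution (assumption): populate index and working tree if set did not.
git read-tree -mu HEAD

# Number of commits reachable from HEAD in the shallow clone.
git rev-list --count HEAD
```

The commit count shows how much history came along, and libs/unused should not appear in the working tree.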