Is it possible to do a sparse checkout without checking out the whole repository first?

醉梦人生 2020-11-22 09:32

I'm working with a repository with a very large number of files that takes hours to check out. I'm looking into whether Git would work well with this kin

14 Answers
  • 2020-11-22 10:04

    Yes, it is possible to download a single folder instead of the whole repository, even at a specific commit or just the latest one.

    A nice way to do this:

    D:\Lab>git svn clone https://github.com/Qamar4P/LolAdapter.git/trunk/lol-adapter -r HEAD
    
    1. -r HEAD downloads only the latest revision and ignores all history.

    2. Note the trunk and /specific-folder parts of the URL.

    Copy the URL and change the parts before and after /trunk/ to match your repository and folder. I hope this helps someone. Enjoy :)

    Updated on 26 Sep 2019

  • 2020-11-22 10:07

    Sadly, none of the above worked for me, so I spent a very long time trying different combinations of patterns in the sparse-checkout file.

    In my case I wanted to skip folders with IntelliJ IDEA configs.

    Here is what I did:


    Run git clone https://github.com/myaccount/myrepo.git --no-checkout

    Run git config core.sparsecheckout true

    Create .git\info\sparse-checkout with the following content:

    !.idea/*
    !.idea_modules/*
    /*
    

    Run 'git checkout --' to get all files.


    The critical thing to make it work was adding /* after each folder's name.

    I am using Git 1.9.
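
    To try these patterns without touching a real project, the steps can be replayed against a throwaway repository. This is only a minimal sketch, not the author's exact setup: the directory names src and clone, the file names, and the master branch name are made up for illustration. It also uses git checkout master for the final step (the answer ran git checkout --) and writes the pattern file with forward slashes, as on a Unix-like shell.

```shell
#!/bin/sh
# Sketch: exclude IDE folders with the legacy sparse-checkout pattern file.
set -eu
work=$(mktemp -d)
cd "$work"

# Build a tiny throwaway source repository (all names are hypothetical).
git -c init.defaultBranch=master init -q src
mkdir -p src/.idea src/.idea_modules src/app
echo cfg  > src/.idea/workspace.xml
echo mod  > src/.idea_modules/demo.iml
echo code > src/app/main.txt
echo top  > src/README.txt
git -C src add .
git -C src -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Clone without populating the working tree, then enable sparse checkout.
git clone -q --no-checkout src clone
cd clone
git config core.sparseCheckout true

# Same three patterns as in the answer: two exclusions, then the top level.
mkdir -p .git/info
printf '%s\n' '!.idea/*' '!.idea_modules/*' '/*' > .git/info/sparse-checkout

# Populate the working tree according to the patterns.
git checkout -q master
ls -A
```

    If the patterns behave as described, the listing should show README.txt and app next to .git, but neither .idea nor .idea_modules.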

  • 2020-11-22 10:08

    Based on this answer by apenwarr and this comment by Miral I came up with the following solution which saved me nearly 94% of disk space when cloning the linux git repository locally while only wanting one Documentation subdirectory:

    $ cd linux
    $ du -sh .git .
    2.1G    .git
    894M    .
    $ du -sh 
    2.9G    .
    $ mkdir ../linux-sparse-test
    $ cd ../linux-sparse-test
    $ git init
    Initialized empty Git repository in /…/linux-sparse-test/.git/
    $ git config core.sparseCheckout true
    $ git remote add origin ../linux
    # Parameter "origin master" saves a tiny bit if there are other branches
    $ git fetch --depth=1 origin master
    remote: Enumerating objects: 65839, done.
    remote: Counting objects: 100% (65839/65839), done.
    remote: Compressing objects: 100% (61140/61140), done.
    remote: Total 65839 (delta 6202), reused 22590 (delta 3703)
    Receiving objects: 100% (65839/65839), 173.09 MiB | 10.05 MiB/s, done.
    Resolving deltas: 100% (6202/6202), done.
    From ../linux
     * branch              master     -> FETCH_HEAD
     * [new branch]        master     -> origin/master
    $ echo "Documentation/hid/*" > .git/info/sparse-checkout
    $ git checkout master
    Branch 'master' set up to track remote branch 'master' from 'origin'.
    Already on 'master'
    $ ls -l
    total 4
    drwxr-xr-x 3 abe abe 4096 May  3 14:12 Documentation/
    $  du -sh .git .
    181M    .git
    100K    .
    $  du -sh
    182M    .
    

    So I got down from 2.9 GB to 182 MB, which is already quite nice.

    However, I didn't get this to work with git clone --depth 1 --no-checkout --filter=blob:none file:///…/linux linux-sparse-test (hinted here), because the missing files were then all added to the index as removed files. So if anyone knows the equivalent of git clone --filter=blob:none for git fetch, we can probably save some more megabytes. (Reading the man page of git-rev-list also hints that there is something like --filter=sparse:path=…, but I didn't get that to work either.)

    (All tried with git 2.20.1 from Debian Buster.)
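
    The same init / fetch --depth=1 / sparse-checkout sequence can be rehearsed on a small local repository before running it against something as large as the kernel tree. The following is a sketch under assumptions: the upstream and sparse-test directory names, the file names, and the two dummy commits are invented for the demo, and it says nothing about the disk savings on a real repository.

```shell
#!/bin/sh
# Sketch: shallow fetch plus legacy sparse checkout on a throwaway repository.
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical "upstream" with two commits, so --depth=1 has history to cut off.
git -c init.defaultBranch=master init -q upstream
mkdir -p upstream/Documentation/hid upstream/drivers
echo hid > upstream/Documentation/hid/intro.txt
echo drv > upstream/drivers/demo.c
git -C upstream add .
git -C upstream -c user.name=demo -c user.email=demo@example.com commit -q -m "first"
echo more >> upstream/drivers/demo.c
git -C upstream -c user.name=demo -c user.email=demo@example.com commit -q -am "second"

# Same order of steps as in the transcript above.
git -c init.defaultBranch=master init -q sparse-test
cd sparse-test
git config core.sparseCheckout true
git remote add origin ../upstream
git fetch -q --depth=1 origin master
mkdir -p .git/info
echo "Documentation/hid/*" > .git/info/sparse-checkout
git checkout -q master

# List the checked-out files and the number of commits that were fetched.
find . -path ./.git -prune -o -type f -print
git rev-list --count HEAD
```

    With a shallow fetch of depth 1, the commit count should come out as 1, and only the file under Documentation/hid should appear in the working tree.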

  • 2020-11-22 10:14

    I'm new to Git, but it seems that it works if I run git checkout for each directory. Also, the sparse-checkout file needs a trailing slash after every directory, as indicated. Could someone more experienced please confirm that this will work?

    Interestingly, if you check out a directory that is not in the sparse-checkout file, it seems to make no difference. Such directories don't show up in git status, and git read-tree -m -u HEAD doesn't remove them. git reset --hard doesn't remove the directory either. Would anyone more experienced care to comment on what Git thinks of directories that are checked out but are not in the sparse-checkout file?

  • 2020-11-22 10:14

    In Git 2.27, it looks like git sparse-checkout has evolved. The solution in this answer does not work exactly the same way as it did in Git 2.25:

    git clone <URL> --no-checkout <directory>
    cd <directory>
    git sparse-checkout init --cone # to fetch only root files
    git sparse-checkout set apps/my_app libs/my_lib # etc, to list sub-folders to checkout
    # they are checked out immediately after this command, no need to run git pull
    

    These commands worked better:

    git clone --sparse <URL> <directory>
    cd <directory>
    git sparse-checkout init --cone # to fetch only root files
    git sparse-checkout add apps/my_app
    git sparse-checkout add libs/my_lib
    

    See also: git-clone --sparse and git-sparse-checkout add
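
    For anyone who wants to see what git clone --sparse followed by git sparse-checkout add leaves in the working tree, here is a small local rehearsal. It is a sketch under assumptions: the src and clone directories and the other_app folder are hypothetical, and it needs a Git version that has the sparse-checkout add subcommand.

```shell
#!/bin/sh
# Sketch: clone --sparse, then grow the checkout with sparse-checkout add.
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical monorepo layout with one folder we do not want.
git -c init.defaultBranch=master init -q src
mkdir -p src/apps/my_app src/apps/other_app src/libs/my_lib
echo root > src/README.txt
echo app  > src/apps/my_app/a.txt
echo oth  > src/apps/other_app/b.txt
echo lib  > src/libs/my_lib/c.txt
git -C src add .
git -C src -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# --sparse starts with only the files in the repository root checked out.
git clone -q --sparse src clone
cd clone
git sparse-checkout init --cone
git sparse-checkout add apps/my_app
git sparse-checkout add libs/my_lib

# Show which directories are now part of the sparse checkout.
git sparse-checkout list
```

    After the two add calls, apps/my_app and libs/my_lib should be populated, while apps/other_app stays out of the working tree.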

  • 2020-11-22 10:16

    In 2020 there is a simpler way to deal with sparse-checkout without having to worry about .git files. Here is how I did it:

    git clone <URL> --no-checkout <directory>
    cd <directory>
    git sparse-checkout init --cone # to fetch only root files
    git sparse-checkout set apps/my_app libs/my_lib # etc, to list sub-folders to checkout
    # they are checked out immediately after this command, no need to run git pull
    

    Note that this requires Git 2.25 or later. Read more about it here: https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/

    UPDATE:

    The above git clone command will still clone the repo with its full history, though without checking the files out. If you don't need the full history, you can add --depth parameter to the command, like this:

    # create a shallow clone,
    # with only 1 (since depth equals 1) latest commit in history
    git clone <URL> --no-checkout <directory> --depth 1
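
    Putting the shallow clone and the sparse-checkout commands together, the whole flow can be tried on a local repository. This is a sketch with assumptions stated up front: the src and clone names and the docs folder are hypothetical, a file:// URL is used because Git ignores --depth for plain local paths, and the final git checkout master is an extra step that is not in the answer. It is included because, on newer Git versions, a --no-checkout clone may leave the working tree empty after sparse-checkout set; where the files are already present it should be harmless.

```shell
#!/bin/sh
# Sketch: shallow, no-checkout clone combined with cone-mode sparse checkout.
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical source repository with two commits and one unwanted folder.
git -c init.defaultBranch=master init -q src
mkdir -p src/apps/my_app src/libs/my_lib src/docs
echo app > src/apps/my_app/a.txt
echo lib > src/libs/my_lib/c.txt
echo doc > src/docs/d.txt
git -C src add .
git -C src -c user.name=demo -c user.email=demo@example.com commit -q -m "first"
echo more >> src/docs/d.txt
git -C src -c user.name=demo -c user.email=demo@example.com commit -q -am "second"

# file:// makes Git use the normal transport, so --depth is honored.
git clone -q --no-checkout --depth 1 "file://$work/src" clone
cd clone
git sparse-checkout init --cone
git sparse-checkout set apps/my_app libs/my_lib

# Extra step (assumption): make sure the selected folders are materialized.
git checkout -q master

git rev-list --count HEAD
ls
```

    The commit count should be 1 because of --depth 1, and the listing should contain apps and libs but not docs.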
    