Ways to improve git status performance

遇见更好的自我 2020-12-02 06:51

I have a 10 GB repo on a Linux machine, and it lives on NFS. The first git status takes 36 minutes and subsequent runs take 8 minutes. Se

10 Answers
  • 2020-12-02 07:23

    This is a pretty old question, but I am surprised that no one has mentioned binary files, given the repository size.

    You mentioned that your git repo is ~10GB. Apart from the NFS issue and other git issues (resolvable by git gc and the git configuration changes outlined in other answers), git commands (git status, git diff, git add) might be slow because of a large number of binary files in the repository. git is not good at handling binary files. You can remove unnecessary binary files from the history with the following command (the example is for NetCDF files; back up the repository first, since this rewrites history):

    git filter-branch --force --index-filter \
    'git rm --cached --ignore-unmatch *.nc' \
    --prune-empty --tag-name-filter cat -- --all
    

    Do not forget to add '*.nc' to the .gitignore file so that git does not commit these files again.
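
    One way to confirm that the ignore rule works is git check-ignore. The following is a minimal sketch that runs in a throwaway repository; the temporary directory and the file name sample.nc are assumptions for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch: ignore NetCDF files and confirm git no longer picks them up.
# The throwaway repo and the file name sample.nc are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Add the pattern only once (grep -qxF avoids duplicate lines on re-runs).
touch .gitignore
grep -qxF '*.nc' .gitignore || echo '*.nc' >> .gitignore

# Create an empty file that matches the pattern.
: > sample.nc

# check-ignore prints the path and exits 0 when the path is ignored.
ignored=$(git check-ignore sample.nc)

# An ignored file does not appear in status output.
status=$(git status --porcelain -- sample.nc)
echo "ignored: $ignored"
```

    In a real repository only the grep/echo line and the check-ignore call are needed; the rest just builds a sandbox.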

  • 2020-12-02 07:28

    To be more precise, git depends on the efficiency of the lstat(2) system call, so tweaking your client’s “attribute cache timeout” might do the trick.

    The manual for git-update-index — essentially a manual mode for git-status — describes what you can do to alleviate this, by using the --assume-unchanged flag to suppress its normal behavior and manually update the paths that you have changed. You might even program your editor to unset this flag every time you save a file.
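
    As a rough illustration of the flag described above, here is a sketch in a throwaway repository. The file name tracked.txt and the commit identity are placeholders; the two update-index flags are the documented ones.

```shell
#!/bin/sh
# Sketch: toggle --assume-unchanged on one path and watch git status.
# Repo location, file name, and commit identity are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo one > tracked.txt
git add tracked.txt
git -c user.name=example -c user.email=example@example.invalid \
    commit -q -m initial

# Tell git to stop checking this path for changes.
git update-index --assume-unchanged tracked.txt
echo "a longer second version" > tracked.txt
hidden=$(git status --porcelain)     # the edit is not reported

# Clear the flag again (what an editor save hook could run).
git update-index --no-assume-unchanged tracked.txt
visible=$(git status --porcelain)    # the edit is reported again
echo "with flag: '$hidden'  without flag: '$visible'"
```

    The trade-off is that git will not notice edits to flagged paths until the flag is cleared, so this only suits files you track manually.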

    The alternative, as you suggest, is to reduce the size of your checkout (the size of the packfiles doesn’t really come into play here). The options are a sparse checkout, submodules, or Google’s repo tool.
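
    For the sparse checkout option, a minimal sketch using the long-standing core.sparseCheckout mechanism looks like this (newer git versions also offer a git sparse-checkout command). The directory names needed/ and bulky/ are made up for the demo.

```shell
#!/bin/sh
# Sketch: keep only one directory in the working tree via sparse checkout.
# The repo and the directory names needed/ and bulky/ are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir needed bulky
echo src > needed/a.txt
echo data > bulky/b.txt
git add .
git -c user.name=example -c user.email=example@example.invalid \
    commit -q -m initial

# Enable sparse checkout and list the paths to keep.
git config core.sparseCheckout true
mkdir -p .git/info
echo 'needed/' > .git/info/sparse-checkout

# Re-read HEAD into the index and update the working tree accordingly.
git read-tree -mu HEAD
ls
```

    Files outside the listed paths stay in the repository history but are not checked out, so git status has fewer paths to lstat.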

    (There’s a mailing list thread about using Git with NFS, but it doesn’t answer many questions.)

  • 2020-12-02 07:28

    If your git repo makes heavy use of submodules, you can greatly speed up the performance of git status by editing the config file in the .git directory and setting ignore = dirty on any particularly large/heavy submodules. For example:

    [submodule "mysubmodule"]
    url = ssh://mysubmoduleURL
    ignore = dirty
    

    You'll lose the convenience of a reminder that there are unstaged changes in any of the submodules that you may have forgotten about, but you'll still retain the main convenience of knowing when the submodules are out of sync with the main repo. Plus, you can still change your working directory to the submodule itself and use git status within it as per usual to see more information. See this question for more details about what "dirty" means.
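
    The same setting can be written with git config instead of editing the file by hand. This is a sketch in a throwaway repository; mysubmodule is the placeholder name from the example above, and no real submodule is needed just to write the key.

```shell
#!/bin/sh
# Sketch: set ignore = dirty for a submodule from the command line.
# The throwaway repo is hypothetical; "mysubmodule" is a placeholder name.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Writes ignore = dirty under [submodule "mysubmodule"] in .git/config.
git config submodule.mysubmodule.ignore dirty
value=$(git config --get submodule.mysubmodule.ignore)
echo "submodule.mysubmodule.ignore = $value"

# One-off alternative that leaves the config untouched.
git status --ignore-submodules=dirty > /dev/null
```

    The --ignore-submodules form is handy for trying the effect on a single run before making it permanent.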

  • 2020-12-02 07:29

    Leftover index.lock files

    git status can be pathologically slow when you have leftover index.lock files.

    This happens especially when you have git submodules, because then you often don't notice such leftover files.

    Summary: Run find .git/ -name index.lock, and delete the leftover files after checking that they are indeed not used by any currently running program.


    Details

    I found that my shell git status was extremely slow in my repo, with git 2.19 on Ubuntu 16.04.

    Dug in and found that /usr/bin/time git status in my assets git submodule took 1.7 seconds.

    Found with strace that git read all my big files in there with mmap. It doesn't usually do that; normally stat is enough.

    I googled the problem and found the Use of index and Racy Git problem.

    Tried git update-index somefile (in my case gitignore in the submodule checkout) shown here but it failed with

    fatal: Unable to create '/home/niklas/src/myproject/.git/modules/assets/index.lock': File exists.
    
    Another git process seems to be running in this repository, e.g.
    an editor opened by 'git commit'. Please make sure all processes
    are terminated then try again. If it still fails, a git process
    may have crashed in this repository earlier:
    remove the file manually to continue.
    

    This is a classic error. Usually you notice it on any git operation, but for submodules that you don't often commit to, you may not notice it for months, because it only appears when adding something to the index; the warning is not raised on a read-only git status.

    After I removed the index.lock file, git status became fast immediately, the mmaps disappeared, and it's now over 1000x faster.

    So if your git status is unnaturally slow, check find .git/ -name index.lock and delete the leftovers.
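
    A small helper can narrow the search to locks that have been sitting around for a while. This is only a sketch: the function name find_stale_index_locks and the 10-minute default are my own assumptions, and an old timestamp does not prove that no process holds the lock, so still check for running git processes before deleting anything.

```shell
#!/bin/sh
# Sketch: list index.lock files older than a given number of minutes.
# find_stale_index_locks and the 10-minute default are hypothetical choices.
# Usage: find_stale_index_locks [GIT_DIR] [MINUTES]
find_stale_index_locks() {
    git_dir=${1:-.git}
    min_age=${2:-10}
    # -mmin +N matches files last modified more than N minutes ago.
    find "$git_dir" -name index.lock -type f -mmin +"$min_age" -print
}
```

    Calling find_stale_index_locks .git 10 from the top of the repository also covers submodules, since their lock files live under .git/modules/. The function only prints paths; removal stays a manual step.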
