Git is really slow for 100,000 objects. Any fixes?

忘了有多久 2020-11-27 13:23

I have a "fresh" git-svn repo (11.13 GB) that has over 100,000 objects in it.

I have performed

git fsck
git gc

on the repo afte

11 answers
  • 2020-11-27 13:54

    I'd create a partition using a different file system. HFS+ has always been sluggish for me compared to doing similar operations on other file systems.

  • 2020-11-27 13:56

    Maybe Spotlight is trying to index the files. Try disabling Spotlight for your code directory, and check Activity Monitor to see which processes are running.

  • 2020-11-27 13:58

    It came down to a couple of items that I can see right now.

    1. git gc --aggressive
    2. Opening up file permissions to 777

    There has to be something else going on, but these were the things that clearly made the biggest impact.
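
    One way to see what the first item changes is to compare git count-objects -v before and after the gc. The following is only a sketch on a throwaway repository; the temp directory, file names, and demo identity are made up for illustration, and on a repo of the size in the question the gc step can take a long time.

```shell
#!/bin/sh
# Sketch: compare object counts before and after an aggressive gc.
# The throwaway repo, file names, and identity are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
for i in 1 2 3; do
    echo "revision $i" > "file$i.txt"
    git add "file$i.txt"
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $i"
done

# "count" is the number of loose objects, "in-pack" the number already packed.
git count-objects -v

# Repack everything; --aggressive spends more time searching for better deltas.
git gc --aggressive --quiet

# Loose objects should now have moved into a pack file.
git count-objects -v
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packs_after=$(git count-objects -v | awk '/^packs:/ {print $2}')
```

    In a real repository, run only the two git count-objects -v calls around git gc --aggressive and compare the count and in-pack lines.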

  • 2020-11-27 14:02

    git status has to look at every file in the repository every time. You can tell it to stop looking at trees that you aren't working on with

    git update-index --assume-unchanged <trees to skip>
    

    source

    From the manpage:

    When these flags are specified, the object names recorded for the paths are not updated. Instead, these options set and unset the "assume unchanged" bit for the paths. When the "assume unchanged" bit is on, git stops checking the working tree files for possible modifications, so you need to manually unset the bit to tell git when you change the working tree file. This is sometimes helpful when working with a big project on a filesystem that has very slow lstat(2) system call (e.g. cifs).

    This option can be also used as a coarse file-level mechanism to ignore uncommitted changes in tracked files (akin to what .gitignore does for untracked files). Git will fail (gracefully) in case it needs to modify this file in the index e.g. when merging in a commit; thus, in case the assumed-untracked file is changed upstream, you will need to handle the situation manually.

    Many operations in git depend on your filesystem to have an efficient lstat(2) implementation, so that st_mtime information for working tree files can be cheaply checked to see if the file contents have changed from the version recorded in the index file. Unfortunately, some filesystems have inefficient lstat(2). If your filesystem is one of them, you can set "assume unchanged" bit to paths you have not changed to cause git not to do this check. Note that setting this bit on a path does not mean git will check the contents of the file to see if it has changed — it makes git to omit any checking and assume it has not changed. When you make changes to working tree files, you have to explicitly tell git about it by dropping "assume unchanged" bit, either before or after you modify them.

    ...

    In order to set "assume unchanged" bit, use --assume-unchanged option. To unset, use --no-assume-unchanged.

    The command looks at core.ignorestat configuration variable. When this is true, paths updated with git update-index paths… and paths updated with other git commands that update both index and working tree (e.g. git apply --index, git checkout-index -u, and git read-tree -u) are automatically marked as "assume unchanged". Note that "assume unchanged" bit is not set if git update-index --refresh finds the working tree file matches the index (use git update-index --really-refresh if you want to mark them as "assume unchanged").
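
    Since update-index operates on file paths rather than on a tree as a whole, one way to flag an entire directory is to feed it the output of git ls-files. This is a sketch on a throwaway repository; the vendor/ and src/ directories and the demo identity are hypothetical names chosen for the example.

```shell
#!/bin/sh
# Sketch: set, inspect, and clear the "assume unchanged" bit for a subtree.
# The repo layout (vendor/, src/) is made up for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p vendor src
echo lib > vendor/lib.c
echo main > src/main.c
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Expand the directory into its tracked files and flag each of them.
git ls-files -z -- vendor | xargs -0 git update-index --assume-unchanged

# Flagged paths are listed with a lowercase status letter.
git ls-files -v

# A modification under vendor/ is no longer reported by git status...
echo changed >> vendor/lib.c
status_flagged=$(git status --porcelain)

# ...until the bit is dropped again.
git ls-files -z -- vendor | xargs -0 git update-index --no-assume-unchanged
status_unflagged=$(git status --porcelain)
echo "$status_unflagged"
```

    The middle step is the caveat from the manpage in practice: while the bit is set, edits under the flagged paths are silently ignored.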


    Now, clearly, this solution is only going to work if there are parts of the repo that you can conveniently ignore. I work on a project of similar size, and there are definitely large trees that I don't need to check on a regular basis. The semantics of git-status make it a generally O(n) problem (n being the number of files). You need domain-specific optimizations to do better than that.

    Note that if you work in a stitching pattern, that is, if you integrate changes from upstream by merge instead of rebase, then this solution becomes less convenient, because a change to an --assume-unchanged object merging in from upstream becomes a merge conflict. You can avoid this problem with a rebasing workflow.

  • 2020-11-27 14:03

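    Try running the prune command. Note that git remote prune origin removes stale remote-tracking branches; unreachable loose objects are removed by git prune, which git gc also runs.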

    git remote prune origin
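
    Loose objects are handled by git prune rather than by git remote prune. The sketch below shows the effect on a throwaway repository; the temp directory, the orphan blob, and the demo identity are made up for illustration, and --expire=now is used only so the freshly written object qualifies for removal.

```shell
#!/bin/sh
# Sketch: git prune removes unreachable loose objects and keeps reachable ones.
# The throwaway repo and the orphan blob are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo tracked > a.txt
git add a.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Write a blob that no commit, tree, or index entry references.
orphan=$(echo "never committed" | git hash-object -w --stdin)

# Dry run: list what would be removed without deleting anything.
git prune -n --expire=now

# Remove it for real; objects reachable from HEAD are left alone.
git prune --expire=now
if git cat-file -e "$orphan" 2>/dev/null; then
    orphan_exists=yes
else
    orphan_exists=no
fi
head_type=$(git cat-file -t HEAD)
```

    Running the dry-run form first is a cheap way to check whether there is anything to prune at all.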
