How can I resume a git history rewrite?

Posted by 喜欢而已 on 2019-11-29 07:14:07

git filter-branch doesn't itself support a suspend/resume pattern of use - although it writes temporary data out to a .git-rewrite folder, there's no actual support for resuming based on the contents of this directory. If you run git filter-branch on a repository that's had a previously aborted filter-branch operation, it'll either ask you to delete that temp folder, or, with the --force option, do it itself.
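As a rough illustration of the leftover-directory behaviour described above, here is a minimal sketch that checks for a stale temp folder before re-running the rewrite. The helper name `has_stale_rewrite_dir` and the file name `secrets.txt` are made up for this example; `.git-rewrite` is assumed to be in its default location at the top of the working tree.

```shell
#!/bin/sh
# Sketch: detect the temp directory left behind by an aborted filter-branch run.
# `has_stale_rewrite_dir` is a hypothetical helper, not a git command.
has_stale_rewrite_dir() {
    repo_dir="${1:-.}"
    # .git-rewrite is filter-branch's default temp directory.
    [ -d "$repo_dir/.git-rewrite" ]
}

# Example use from the top of a working tree (commented out so that pasting
# this block does not rewrite anything; secrets.txt is a placeholder):
#
#   if has_stale_rewrite_dir .; then
#       echo "stale .git-rewrite found; it will be discarded, not resumed" >&2
#   fi
#   git filter-branch --force \
#       --index-filter 'git rm --cached --ignore-unmatch secrets.txt' HEAD
```

Either way the rewrite starts again from the first commit; the check only tells you that a previous run was interrupted.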

The underlying problem is that git-filter-branch runs slowly on big repos - if the process were much faster, there'd be no motivation to attempt a resume. So you've got a few options:

Make git-filter-branch go a bit faster...

  • use a RAM-disk - git-filter-branch is very IO-intensive, and will run faster with your repository sitting in RAM.
  • use --index-filter rather than --tree-filter - it's similar to tree filter but doesn't check out the file-tree, which makes it faster, but does require you to rewrite your file alterations in terms of git index commands.
  • use cloud computing and hire a machine with fast RAM and a high clock speed (don't bother with multiple cores unless your own commands are multi-threaded, as git-filter-branch itself is single-threaded)
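To make the first two suggestions concrete, here is a hedged sketch that removes one file from history both ways. The file name `passwords.txt` and the helper names are placeholders invented for this example. Pointing filter-branch's `-d` option at `/dev/shm` is a lighter variant of the RAM-disk idea: it only puts the scratch directory on tmpfs rather than the whole repository, and it assumes a Linux system where `/dev/shm` is RAM-backed.

```shell
#!/bin/sh
# Sketch only: FILE and the helper names below are hypothetical.
FILE="passwords.txt"

# Choose a scratch directory for filter-branch's -d option. /dev/shm is a
# RAM-backed tmpfs on most Linux systems; elsewhere fall back to the default.
pick_scratch_dir() {
    if [ -d /dev/shm ] && [ -w /dev/shm ]; then
        echo "/dev/shm/git-rewrite-$$"
    else
        echo ".git-rewrite"
    fi
}

# Slower variant: checks out every commit's tree before running the command.
remove_with_tree_filter() {
    git filter-branch -d "$(pick_scratch_dir)" \
        --tree-filter "rm -f '$FILE'" HEAD
}

# Faster variant: edits the index directly, with no checkout of the tree.
remove_with_index_filter() {
    git filter-branch -d "$(pick_scratch_dir)" \
        --index-filter "git rm --cached --ignore-unmatch '$FILE'" HEAD
}
```

Nothing runs until you call one of the two functions from inside the repository. The `--ignore-unmatch` flag keeps `git rm` from failing on commits where the file does not exist.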

...or use The BFG (way faster)

The BFG Repo-Cleaner is a simpler, faster alternative to git-filter-branch - on large repos it's 50-150x faster. That turns your job that takes several hours into one that takes just a few minutes.
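For a sense of what a BFG run looks like, here is a minimal sketch based on the usage shown in the BFG's own documentation. The wrapper function `bfg_strip_large_blobs`, the `BFG_JAR` variable and the example repository URL are assumptions for illustration; it also assumes Java is installed and the jar has been downloaded as `bfg.jar`.

```shell
#!/bin/sh
# Sketch: strip oversized blobs with the BFG. The function name, BFG_JAR and
# the example arguments are hypothetical names chosen for this illustration.
BFG_JAR="${BFG_JAR:-bfg.jar}"

bfg_strip_large_blobs() {
    if [ "$#" -ne 2 ]; then
        echo "usage: bfg_strip_large_blobs <mirror-repo.git> <size, e.g. 100M>" >&2
        return 2
    fi
    repo="$1"
    size="$2"

    # The BFG rewrites history but leaves the old objects in place...
    java -jar "$BFG_JAR" --strip-blobs-bigger-than "$size" "$repo" || return 1

    # ...so expire the reflog and garbage-collect to actually drop them.
    (
        cd "$repo" || exit 1
        git reflog expire --expire=now --all &&
            git gc --prune=now --aggressive
    )
}

# Example (commented out; run against a fresh mirror clone, never your only copy):
#   git clone --mirror git://example.com/some-big-repo.git
#   bfg_strip_large_blobs some-big-repo.git 100M
```

The BFG only covers a fixed set of operations (removing large blobs, deleting files, replacing text), so arbitrary per-commit scripts still need filter-branch.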

Full disclosure: I'm the author of the BFG Repo-Cleaner.

Roberto mentioned this in his answer, but I want to give a benchmark for it: If your git filter-branch operation is taking too long to complete, consider an AWS high memory instance.

I once had to filter-branch and merge together 35 different repositories, each with two years of dozens-of-commits-per-day history. My script failed to complete in 25 hours on my laptop. It completed in 45 minutes on an m2.4xlarge instance in Amazon.

Total cost?

$1.64 -- less than I spend on a 20oz soda.

BFG sounds like a great tool and I'd encourage anyone who routinely rewrites history to try it out. But if you just need something to work and have easy access to AWS, filter-branch is trivially easy.

In 2016 this is even cheaper. Just mosey on over to the Spot Advisor and find yourself something of the "cluster compute for $0.30 / hour" variety.
