What I have:
---A----B-----C-----D--------*-----E-------> (master)
                     \      /
                      1----2 (foo)
What I am proposing will give you a clean, linear history, which is essentially what rebase is supposed to do. I am hoping it also gives you a way to remove B and B' from the commit history. Here is the explanation:
Repo recreation output:
---A----B-----B'-----C--------D-------> (master)
                      \      /
                       1----2 (foo)
git log --graph --all --oneline --decorate #initial view the git commit graph
* dfa0f63 (HEAD -> master) add E
* 843612e Merge branch 'foo'
|\
| * 3fd261f (foo) add 2
| * ed338bb add 1
|/
* bf79650 add C
* ff94039 modify B
* 583110a add B
* cd8f6cd add A
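For anyone who wants to follow along, a throwaway repo with the same shape as the log above can be built roughly like this. It is only a sketch: the temp directory, the placeholder identity, the one-file-per-commit layout and the `add_commit` helper are my own assumptions, and your hashes will differ from the ones shown.

```shell
set -e

# Throwaway repo in a temp directory (the path is arbitrary).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # placeholder identity for the demo commits
git config user.email "demo@example.com"

# Hypothetical helper: create <name>.txt and commit it as "add <name>".
add_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "add $1"
}

add_commit A
add_commit B
echo "changed" >> B.txt
git commit -q -a -m "modify B"
add_commit C

git checkout -q -b foo
add_commit 1
add_commit 2

git checkout -q master
git merge -q --no-ff -m "Merge branch 'foo'" foo   # --no-ff keeps the merge commit
add_commit E

git log --graph --all --oneline --decorate
```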
$ git rebase -i HEAD~5 # in the todo list, drop 583110a/add B and ff94039/modify B
$ git log --graph --all --oneline --decorate # view the git commit graph
* 701d9e7 (HEAD -> master) add E
* 5a4be4f add 2
* 75b43d5 add 1
* 151742d add C
| * 3fd261f (foo) add 2
| * ed338bb add 1
| * bf79650 add C
| * ff94039 modify B
| * 583110a add B
|/
* cd8f6cd add A
$ git rebase -i master foo #drop 583110a/add B and ff94039/modify B again
$ git log --graph --all --oneline --decorate #view the git commit graph
* 701d9e7 (HEAD -> foo, master) add E
* 5a4be4f add 2
* 75b43d5 add 1
* 151742d add C
* cd8f6cd add A
Lastly, the final output might not be in the order you expected (A--C--1--2--E). However, you can rearrange the commits in interactive mode again: try git rebase -i HEAD~n.
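If you would rather not edit the todo list by hand, the two rebases above can be scripted by pointing GIT_SEQUENCE_EDITOR at a sed command. This is a sketch on a throwaway repo, not something to run blindly on real history: the repo layout, the `add_commit` helper and the `drop_b` variable are assumptions made for the demo.

```shell
set -e

# Rebuild the demo history in a temp directory.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: create <name>.txt and commit it as "add <name>".
add_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "add $1"
}

add_commit A
add_commit B
echo "changed" >> B.txt
git commit -q -a -m "modify B"
add_commit C
git checkout -q -b foo
add_commit 1
add_commit 2
git checkout -q master
git merge -q --no-ff -m "Merge branch 'foo'" foo
add_commit E

# Stand-in for editing the todo list by hand: change "pick" to "drop"
# on the "add B" and "modify B" lines. -i.bak works with GNU and BSD sed.
drop_b="sed -i.bak -E '/ (add|modify) B\$/s/^pick/drop/'"

GIT_SEQUENCE_EDITOR="$drop_b" git rebase -i HEAD~5       # rewrite master first
GIT_SEQUENCE_EDITOR="$drop_b" git rebase -i master foo   # then bring foo along

git log --graph --all --oneline --decorate
```

The second rebase only has the two B commits left to drop, because git skips the commits of foo that are already applied on the rewritten master.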
Note: It's best to avoid changing published commit history. I am a newbie exploring git, so hopefully the above solution holds up. That said, I am sure there are tons of other, easier solutions available online. I found this article quite helpful, for future reference.
The first thing to understand is that commits are immutable objects. When you rewrite history as you propose, you will end up with a completely different set of commits. The parent is part of each commit's immutable hash, among other things that you can't change. If you do what you propose, your history will look like this:
     D'-----E'-----> (master)
    /
---A----B-----C-----D--------E-------> (abandoned)
                     \      /
                      1----2 (foo)
To achieve this, you would simply rebase D..E onto A and reset master to E'. You can (but really don't have to) then rebase 1..foo onto D'.
A much simpler, and in my opinion correct, way would be to just delete the file in a new commit:
---A----B-----C-----D--------E-----F-----> (master)
                     \      /
                      1----2 (foo)
Here F is the result of git rm that_file. The purpose of git is to maintain history. Pruning it just because it doesn't look pretty isn't productive (again, my opinion). The only time I would recommend the former option is if the file in question has sensitive information like passwords in it.
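To make the trade-off concrete, here is a small sketch on a throwaway repo (the temp directory, identity and file contents are placeholders). It shows that after git rm plus a commit the file is gone from the tip but still present in history.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "A" > A.txt
git add A.txt
git commit -q -m "add A"

echo "unwanted" > that_file
git add that_file
git commit -q -m "add that_file"

# The "F" commit: delete the file going forward, keep the history as it is.
git rm -q that_file
git commit -q -m "remove that_file"

# The file is no longer in the tip commit...
if git cat-file -e HEAD:that_file 2>/dev/null; then in_head=yes; else in_head=no; fi
# ...but the commits that touched it are still in the log.
touching=$(git rev-list --count HEAD -- that_file)
echo "in HEAD: $in_head, commits touching it: $touching"
```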
If, on the other hand, scrubbing the file is what you want, you will have to take more extreme measures. For example: How to remove file from Git history?
So I use rebase -i f0e0796 and remove B 5ccb371 and C a46df1c, correct? If I interpret the result correctly, this is what gitk shows me for my repo, although git branches still lists the second branch:
...A---1---2---E master
Can anyone tell me what happened here?
That's what it's built to do: produce a merge-free linear history from a single tip to a single base, preserving all the parts that might still need a mergeback to the new base.
The rebase docs could be clearer about this: "commits which are clean cherry-picks (as determined by git log --cherry-mark …) are always dropped." is mentioned only as an aside in an option for how to treat empty commits, and "by default, a rebase will simply drop merge commits from the todo list, and put the rebased commits into a single, linear branch." is only mentioned farther along, in the description of another option. But that's what it's for, to automate the tedious identification and elimination of already-applied fixes and noise merges from an otherwise-straightforward cherry-pick.
Is git rebase the feature I am looking for my problem?
Not really. The --rebase-merges option is being beefed up, and Inigo's answer works well for your specific case, but see the warnings in its docs: it has real limitations and caveats. As Inigo's answer points out, "[t]hese steps assume the exact repo you show in your question", and "git rebase just automates a series of steps that you can just as well do manually". The reason for this answer is that, for one-off work, it's generally better to just do it.
Rebase was built to automate a workflow where you have a branch you're merging from or otherwise keeping in sync with during development, and at least for the final mergeback (and maybe a few times before that) you want to clean up your history.
It's handy for lots of other uses (notably carrying patches), but again: it's not a cure-all. You need lots of hammers. Many of them can be stretched to serve in a pinch, and I'm a big fan of "whatever works", but I think that's best for people who are already very well acquainted with their tools.
What you want isn't to produce a single, clean linear history; you want something different.
The general way to do it with familiar tools is easy. Starting from your demo script, it'd be:
git checkout :/A; git cherry-pick :/D :/1 :/2; git branch -f foo
git checkout foo^{/D}; git merge --no-ff foo; git cherry-pick :/E; git branch -f master
and you're done.
Yes, you could get git rebase -ir to set this up for you, but when I looked at the pick list that produces, editing in the right instructions did not seem simpler or easier than the above sequence. There's figuring out what exact result you want, and figuring out how to get git rebase -ir to do it for you, and there's just doing it.
git rebase -r --onto :/A :/C master
git branch -f foo :/2
is the "whatever works" answer I'd probably use for, as Inigo says "the exact repo you show in your question". See the git help revisions docs for the message-search syntax.
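If the :/ and ^{/...} forms are new to you, this sketch shows what they resolve to on a tiny throwaway repo. The repo, the `add_commit` helper and the variable names are made up for the demo, and the commit messages are assumed to be of the form "add X".

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: create <name>.txt and commit it as "add <name>".
add_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "add $1"
}

add_commit A
add_commit C
git checkout -q -b foo
add_commit 1
git checkout -q master

# :/<regex> is the youngest commit reachable from any ref whose message matches.
by_message=$(git rev-parse ':/add 1')
# <rev>^{/<regex>} is the youngest matching commit reachable from <rev>.
from_foo=$(git rev-parse 'foo^{/add C}')

echo ":/add 1       -> $(git log -1 --format='%h %s' "$by_message")"
echo "foo^{/add C}  -> $(git log -1 --format='%h %s' "$from_foo")"
```

Because the search is by message, short patterns can match more commits than you expect on a real repo, so it is worth checking what they resolve to before feeding them to rebase or cherry-pick.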
git rebase by default only rebases to a single lineage of commit history, because that is more commonly what people want. If you don't tell it otherwise, it will do it for the branch you have checked out (in your case that was master). That is why you ended up with a rebased master branch with the foo commits grafted on rather than merged in, and with foo itself unchanged and no longer connected.
If you have git version 2.18 or greater you can use the --rebase-merges option* to tell git to recreate the merge history rather than linearize it as it does by default. The rebased history will have the same branch-offs and merges-back in. Below I'll walk you through the steps for achieving what you want using --rebase-merges.
These steps assume the exact repo you show in your question.
git checkout master
git rebase -i --rebase-merges f0e0796
In the interactive rebase todo file:
1 - For the two commits you want to remove, change pick to drop (or d).
2 - On a new line after label foo, add the following: exec git branch -f foo HEAD (see below for explanation).

The todo file explained

git rebase just automates a series of steps that you can just as well do manually. This sequence of steps is represented in the todo file. git rebase --interactive allows you to modify the sequence before it executes.
I'll annotate it with an explanation including how you would do it manually (good learning experience). It's important to get a feel for this if you do a lot of rebases in the future, so you have good bearings when merge conflicts occur, or when you tell the rebase to pause at points so you can do some manual mods.
label onto                                 // labels the "rebase onto" commit (f0e0796)
                                           // this is what you would do in your head
                                           // if doing this manually
# Branch foo
reset onto                                 // git reset --hard <onto>
drop 5ccb371 add B                         // skip this commit
drop a46df1c modify B                      // skip this commit
pick 8eb025b add C                         // git cherry-pick 8eb025b
label branch-point                         // label this commit so we can reset back to it later
pick f5b0116 add 1                         // git cherry-pick f5b0116
pick 175e01f add 2                         // git cherry-pick 175e01f
label foo                                  // label this commit so we can merge it later
                                           // This is just a rebase internal label.
                                           // It does not affect the `foo` branch ref.
exec git branch -f foo HEAD                // point the `foo` branch ref to this commit
reset branch-point # add C                 // git reset --hard <branch-point>
merge -C b763a46 foo # Merge branch 'foo'  // git merge --no-ff foo
                                           // use comment from b763a46
exec git branch -f foo HEAD explained

As I mentioned above, git rebase only operates on one branch. What this exec command does is change the ref foo to point to the current HEAD. As you can see in the sequence in the todo file, you are telling it to do this right after it has committed the last commit of the foo branch ("add 2"), which is conveniently labeled label foo in the todo file.
If you don't need the foo ref anymore (e.g. it's a feature branch and this is its final merge) you can skip adding this line to the todo file.
You can also skip adding this line and separately repoint foo to the commit you want it to after the rebase is done:
git branch -f foo <hash of the rebased commit that should be the new head of `foo`>
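For completeness, here is a sketch of the same todo edits done non-interactively on a throwaway repo, in case you want to experiment before touching the real one. Everything in it is an assumption for the demo: the repo layout, the `add_commit` helper, the temp editor script, and the fact that the generated todo contains a line reading exactly label foo (which is what git produced for the repo in the question).

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: create <name>.txt and commit it as "add <name>".
add_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "add $1"
}

add_commit A
add_commit B
echo "changed" >> B.txt
git commit -q -a -m "modify B"
add_commit C
git checkout -q -b foo
add_commit 1
add_commit 2
git checkout -q master
git merge -q --no-ff -m "Merge branch 'foo'" foo
add_commit E

# Small "editor" that applies the two todo edits described above.
editor=$(mktemp)
cat > "$editor" <<'EOF'
# $1 is the todo file that git asks the "editor" to modify
awk '
    / (add|modify) B$/ { sub(/^pick/, "drop") }            # drop the two B commits
    { print }
    /^label foo$/ { print "exec git branch -f foo HEAD" }  # repoint the foo ref
' "$1" > "$1.new" && mv "$1.new" "$1"
EOF

base=$(git rev-list --max-parents=0 HEAD)   # the root commit, "add A"
GIT_SEQUENCE_EDITOR="sh $editor" git rebase -i --rebase-merges "$base"

git log --graph --all --oneline --decorate
```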
Let me know if you have any questions.
*If you have an older version of git, you can use the now deprecated --preserve-merges option, though it isn't compatible with rebase's interactive mode.
To rearrange the commit history, there are several ways.
The problem with rebase, when you want to change an entire repo's history, is that it only moves one branch at a time. Additionally it has problems dealing with merges, so you cannot simply rebase D and E onto A while preserving the more recent history as it exists now (because E is a merge).
You can work around all that, but the method is complicated and error-prone. There are tools that are designed for full-repo rewrites. You might want to look at filter-repo (a tool that replaces filter-branch), but it sounds like you're just trying to scrub a particular file from your history, which (1) might be a good job for the BFG Repo Cleaner, or (2) is actually an easy enough task with filter-branch. (If you want to look into BFG, https://rtyley.github.io/bfg-repo-cleaner/ ; if you want to look into filter-repo, https://github.com/newren/git-filter-repo)
To use filter-branch for this purpose:
git filter-branch --index-filter 'git rm --cached --ignore-unmatch path/to/file' --prune-empty -- --all
However, you indicated that you need the file not to be in the repo (as a counter to someone's suggestion to just delete it from the next commit). So you need to understand that git doesn't give up information quite that easily. After using any of these techniques, you could still extract the file from the repo.
This is kind of a big topic and has been discussed a number of times in various questions/answers on SO, so I suggest searching for what you really need to be asking: how to permanently remove a file that should never have been under source control.
A few notes:
1 - If there are passwords and they were ever pushed to a shared remote, those passwords are compromised. There is nothing you can do about it; change the passwords.
2 - Each repo (the remote and each and every clone) has to be deliberately scrubbed, or thrown away and replaced. (The fact that you can't force someone to do that if they don't want to cooperate is one of the reasons for (1).)
3 - In the local repo where you made the repairs, you have to get rid of the reflogs (as well as backup refs that may have been created if you used a tool like filter-branch) and then run gc. Or, it may be easier to re-clone to a new repo that only fetches the new versions of the branches.
4 - Cleaning up the remote may not even be possible, depending on how it's hosted. Sometimes the best you can do is nuke the remote and then recreate it from scratch.
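As an illustration of note 3, here is a sketch of the local cleanup on a throwaway repo. The repo, the file name secret.txt and its contents are invented for the demo, and this only covers the local copy: it does nothing for remotes or other clones, and it is one possible sequence rather than the only one.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning and delay

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "A" > A.txt
git add A.txt
git commit -q -m "add A"
echo "hunter2" > secret.txt
git add secret.txt
git commit -q -m "add secret"
echo "C" > C.txt
git add C.txt
git commit -q -m "add C"

blob=$(git rev-parse HEAD~1:secret.txt)   # remember the object that holds the secret

# Rewrite every ref so that no commit contains secret.txt.
git filter-branch --index-filter \
    'git rm -q --cached --ignore-unmatch secret.txt' \
    --prune-empty -- --all >/dev/null

# The old commits are still reachable from backup refs and reflogs; remove both.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done
git reflog expire --expire=now --all
git gc -q --prune=now

# Check whether the blob with the secret can still be read from this repo.
if git cat-file -e "$blob" 2>/dev/null; then blob_state=present; else blob_state=removed; fi
echo "secret blob: $blob_state"
```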