How to make a git rebase and keep the commit timestamp?

长发绾君心 2021-02-02 04:11

I want to do a rebase to remove a certain commit from my history. I know how to do that. However, if I do it, the commit timestamp is set to the moment I completed the rebase.

2 Answers
  • 2021-02-02 04:59

    So, here is a tedious way to do it (depending on how many commits you need to rebase), but I tried it out and it works. When you do an interactive rebase, mark each commit with "e" so that you can edit it. This will cause git to pause after every commit. At each pause, you can specify which date to use and continue to the next commit with:

    GIT_COMMITTER_DATE="Wed Feb 16 14:00 2011 +0100" git commit --amend   
    git rebase --continue
    

    This is, of course, a major pain in the rear, and you have to know all of the commit dates beforehand, but if you can't do it any other way, it should at least work.
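
    To see the effect in isolation, here is a minimal sketch in a throwaway repository. The repository, file name, identity and date are all made up for illustration, and --no-edit is added only so that the sketch does not open an editor. It amends a commit with GIT_COMMITTER_DATE set and prints the resulting committer date.

```shell
#!/bin/sh
# Sketch: amend a commit while forcing its committer date.
# The repository, file name and identity below are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "first"          # committer date is the current time

# Re-create the commit with an explicit committer date.
GIT_COMMITTER_DATE="2011-02-16 14:00:00 +0100" git commit -q --amend --no-edit

committer_date=$(git log -1 --format=%cI)
echo "$committer_date"
```

    During an interactive rebase the same amend would be followed by git rebase --continue, once per commit marked with "e".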

  • The setup

    Let's say this is the history around the commit you want to remove

    ... o - o - o - o ...       ... o
            ^   ^   ^               ^
            |   |   +- next         |
            |   +- bad              +-- master (HEAD)
          start
    

    where:

    • bad is the commit you want to remove;
    • start is the parent of the commit you want to remove;
    • next is the next commit after bad; it is good, you want to keep it and all the timeline after it; it will replace bad after rebase.

    Prerequisites

    In order to be able to safely remove bad, it's important that no other branch existing at the time when bad was created was merged into the main timeline after bad. I.e. by removing bad and its connections with its parent and child commits from the history graph, you get two disconnected timeline pieces.

    It is probably possible to remove bad even if another existing branch was merged after bad. I didn't check this situation but I expect some impediments because of the merge commit.

    The idea

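    Each git commit is identified by a hash that is computed from the commit's properties: its content (the tree), its parent commit(s), the message, and the name, email and date of both the author and the committer.

    When a rebase re-creates a commit, it sets the committer date to the current time by default. It can also change the committer name and email, the commit message and the content.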

    In order to restore the original committer dates after a rebase we need to save them together with some information that can identify each commit after the rebase.

    Because you want to remove a commit, the commit contents change during the rebase. Adding or removing files or commits changes the contents of all subsequent commits.

    This leaves us without a single property that uniquely identifies a commit and does not change during the desired rebase. We can try to use two or more properties that do not change during the rebase.

    The emails (author and committer) are of almost no use. If a single person worked on the project, they are the same for all commits and cannot be used. The properties that remain (they differ on most commits and are not affected by the rebase) are the author date and the commit message (its first line).

    If the pair (author date, commit message) provides unique values for all the commits affected by the rebase then we can restore the commit dates afterwards without errors.

    Verify if it can be done safely

    There is a simple way to verify if the (author date, commit message) pairs are unique for the affected commits.

    Run the following two commands:

    $ git log --format="%aI %s" start...master | sort -u | wc -l
    $ git log --oneline start...master | wc -l
    

    If they display the same number then you are lucky: the pair (author date, commit message) can be used to uniquely identify the commits. Read on.

    If the numbers are different (the first command always produces a number smaller than or equal to the one produced by the second command) then you are out of luck: at least two commits share the same author date and subject, so the pair cannot tell them apart.
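
    The check can be tried on sample data first. The sketch below uses three made-up lines standing in for the output of git log --format="%aI %s". Note that uniq only collapses adjacent duplicate lines, so the keys are sorted first (sort -u) to catch duplicates that are not next to each other.

```shell
#!/bin/sh
# Sketch: count distinct "author-date subject" keys in sample data.
# The three lines stand in for `git log --format="%aI %s"` output (made up).
keys='2021-01-03T09:00:00+01:00 fix typo
2021-01-02T09:00:00+01:00 add feature
2021-01-03T09:00:00+01:00 fix typo'

# tr strips the padding that some wc implementations add around the number.
total=$(printf '%s\n' "$keys" | wc -l | tr -d ' ')
adjacent_only=$(printf '%s\n' "$keys" | uniq | wc -l | tr -d ' ')
distinct=$(printf '%s\n' "$keys" | sort -u | wc -l | tr -d ' ')

echo "total=$total uniq=$adjacent_only sort-u=$distinct"
```

    Here the first and third lines are identical but not adjacent, so plain uniq does not merge them while sort -u does.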

    Extract the information needed to fix the commit dates after the rebase

    This command

    $ git log --format="%H %cI %aI %s" start...master > /tmp/hashlist
    

    extracts the commit hash, committer date (the payload), author date and commit message (the key) for all the commits after start, up to and including master, and stores them in a file.

    Back up the current master

    While it is a common misconception that git "rewrites history", in fact it just generates an alternative history line and decides it is the correct history. It does not change or remove the "rewritten" commits; they are still present for some time in its database and can be restored in case the operation fails.

    We can proactively back up the current history line to easily restore it if needed. All we have to do is to create a new branch that points to master. This way, when git rebase moves master to the new timeline, the old one is still accessible using the new branch.

    $ git branch old_master
    

    The command above creates a branch named old_master that keeps the current timeline in focus until we complete all the changes and are satisfied with the new world order.

    Do the rebase

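    Removing the commit bad from the history is as simple as the command below. It uses --rebase-merges, which replaced --preserve-merges (deprecated in Git 2.22 and removed in Git 2.34); on a Git older than 2.18, use --preserve-merges instead.

    $ git rebase --rebase-merges --onto start bad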
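
    As a rough illustration, the sketch below replays this step in a throwaway repository with a linear history, so the merge-related option is left out. The tags start and bad mirror the labels used above; the commits, files, identity and dates are made up. It shows that the rebase keeps the author date but resets the committer date.

```shell
#!/bin/sh
# Sketch: drop one commit with `git rebase --onto` in a throwaway repository.
# Tags "start" and "bad" mirror the labels used above; everything else is made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example"
git config user.email "example@example.com"

# make_commit <name> <date>: add one file and commit it with fixed dates
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    GIT_AUTHOR_DATE="$2" GIT_COMMITTER_DATE="$2" git commit -q -m "$1"
}

make_commit start "2021-01-01 10:00:00 +0100"; git tag start
make_commit bad   "2021-01-02 10:00:00 +0100"; git tag bad
make_commit next  "2021-01-03 10:00:00 +0100"
make_commit last  "2021-01-04 10:00:00 +0100"

# Replay everything after "bad" onto "start"; "bad" drops out of master.
git rebase -q --onto start bad

subjects=$(git log --format=%s | tr '\n' ' ')
author_date=$(git log -1 --format=%aI)      # preserved by the rebase
committer_date=$(git log -1 --format=%cI)   # reset to the time of the rebase
echo "$subjects"
echo "author:    $author_date"
echo "committer: $committer_date"
```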
    

    Fix the commit dates

    The following command "rewrites" the history and changes the committer date using the values we saved before:

    $ git filter-branch --env-filter 'export GIT_COMMITTER_DATE=$(fgrep -m 1 "$(git log -1 --format="%aI %s" $GIT_COMMIT)" /tmp/hashlist | cut -d" " -f2)' -f start...master
    

    How it works:

    git walks the history between the commits labelled start and master and, for each commit, runs the command provided as argument to --env-filter before rewriting the commit. It sets the environment variable GIT_COMMIT to the hash of the commit being rewritten.

    Since we already did a rebase that modified the hashes of all the commits, we cannot use $GIT_COMMIT directly to look up the original committer date: $GIT_COMMIT is a commit generated by git rebase, and its committer date is exactly the one we want to replace.

    The command we provide to --env-filter

    export GIT_COMMITTER_DATE=$(fgrep -m 1 "$(git log -1 --format="%aI %s" $GIT_COMMIT)" /tmp/hashlist | cut -d" " -f2)
    

    runs git log -1 --format="%aI %s" $GIT_COMMIT to generate the key pair (author date, commit message) discussed above. Its output is passed as argument to the command fgrep -m 1 "..." /tmp/hashlist | cut -d" " -f2 that finds the pair in the list of previously saved hashes (fgrep) and extracts the original commit date from the saved line (cut). Finally, the value of the commit date is stored in the environment variable GIT_COMMITTER_DATE that is used by git to rewrite the commit.
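
    The lookup itself can be tried outside of git. The sketch below runs it against a small file laid out like /tmp/hashlist ("%H %cI %aI %s"); the hashes, dates and subjects are made up, and grep -F is used as the equivalent of fgrep.

```shell
#!/bin/sh
# Sketch: look up a saved committer date by its "author-date subject" key.
# The file mimics the "%H %cI %aI %s" layout; hashes and dates are made up.
hashlist=$(mktemp)
cat > "$hashlist" <<'EOF'
1111111 2021-01-04T10:00:00+01:00 2021-01-04T09:30:00+01:00 last
2222222 2021-01-03T10:00:00+01:00 2021-01-03T09:30:00+01:00 next
3333333 2021-01-02T10:00:00+01:00 2021-01-02T09:30:00+01:00 bad
EOF

# In the real filter this key comes from `git log -1 --format="%aI %s" $GIT_COMMIT`.
key="2021-01-03T09:30:00+01:00 next"

# grep -F treats the key as a fixed string (same as fgrep); -m 1 stops at the first hit.
# Field 2 of the matching line is the saved committer date.
original_committer_date=$(grep -F -m 1 "$key" "$hashlist" | cut -d" " -f2)
echo "$original_committer_date"
rm -f "$hashlist"
```

    One caveat: the match is a substring match, so if one commit's subject is a prefix of another's and both have the same author date, the first hit may be the wrong line. The uniqueness check above does not rule that case out.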

    Verification

    Using the git log command again

    $ git log --format="%cI %aI %s" start...master
    

    you can verify that the rewritten history matches the original history. If you use a graphical git client, you can check the results more easily by visual inspection. The branch old_master keeps the old history line visible in the client, and you can easily compare the dates of each commit of the old_master branch with the corresponding one of the master branch.
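
    Without a graphical client, one possible way to compare is to save the listing for old_master and the listing for master to two files and print the lines that differ. The sketch below does this on made-up sample listings; in practice the two files would come from running the git log command above on start...old_master and on start...master.

```shell
#!/bin/sh
# Sketch: find commits whose "%cI %aI %s" line changed between two listings.
# Both listings are made-up stand-ins for `git log` output before and after.
before=$(mktemp)
after=$(mktemp)
cat > "$before" <<'EOF'
2021-01-04T10:00:00+01:00 2021-01-04T09:30:00+01:00 last
2021-01-03T10:00:00+01:00 2021-01-03T09:30:00+01:00 next
2021-01-02T10:00:00+01:00 2021-01-02T09:30:00+01:00 bad
EOF
cat > "$after" <<'EOF'
2021-01-04T10:00:00+01:00 2021-01-04T09:30:00+01:00 last
2021-01-03T10:00:00+01:00 2021-01-03T09:30:00+01:00 next
EOF

# Print lines of "before" that have no exact counterpart in "after".
# -F: fixed strings, -x: whole-line match, -v: invert, -f: patterns from file.
missing=$(grep -F -x -v -f "$after" "$before")
echo "$missing"
rm -f "$before" "$after"
```

    If the dates were restored correctly, the only line reported should be the removed commit bad.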

    If something didn't go well or you need to modify the procedure you can easily start over by:

    $ git reset --hard old_master
    

    Cleanup

    When you are satisfied with the result, you can remove the backup branch and the file used to store the original commit dates:

    $ git branch -D old_master
    $ rm /tmp/hashlist
    

    That's all!
