Question
This is a very small project with only a master branch.
I (lightweight) tagged a version of the source which was in production and pushed the tag to origin.
Then I committed some changes to the master (which triggers a build onto our dev system so we can test it) and pushed these to origin.
Now I want master to contain the tagged version, something like "revert/reset", but I don't want to lose the changes I have made which may be useful at some point.
The answer to "How do I revert master branch to a tag in git?" says to do the following:
git checkout master
git reset --hard tag_ABC
git push --force origin master
I have no idea what this does, but it looks dangerous/drastic, and I am looking for a simpler (less likely to go wrong) solution.
Presumably I need to do something like check out the master branch and merge in the tag, or check out the tagged version and merge in master?
E.g.
$ git checkout master
$ git pull
$ git merge mytag
$ git push
Or would this get confused because the changes I want to backout are newer?
I have seen you can set the master branch "position tag" to a commit. So I am guessing I just need to do
git reset XXX
Where XXX is the commit number of the commit from the tag. If this method works, how do I get the commit number of a tag (git hist or git history does not work on the mac)? If this is so easy, why the force and hard stuff?
If I checkout my tag, and do git status, it says "HEAD detached at mytag"
If I don't follow a cook book recipe, it usually ends in disaster, so hoping someone has done this before.
UPDATE
I got several replies, which was great, but unfortunately none has a complete recipe. For lack of a better solution, I did this:
- checked out old tagged version.
- cut and paste the contents of the files I changed into onenote.
- checkout the head.
- opened the modified files in an editor, and pasted the contents from onenote.
- committed the changes to master.
I am sure there are better ways.
Answer 1:
To keep a pointer to the current head commit of your master branch, just create a branch there:
git branch new/features master
git push origin new/features
After that, you can reset --hard and push -f master all you want.
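As a rough end-to-end sketch of that recipe, the script below runs the steps in a throwaway setup. The bare repository standing in for origin, the file app.txt, and the identity are assumptions made up for the demonstration; tag_ABC and new/features are the names used in the question and in this answer.

```shell
set -e
work=$(mktemp -d)                        # scratch area (assumption)
git init -q --bare "$work/origin.git"    # stand-in for the real origin
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"
git remote add origin "$work/origin.git"

# The production version: committed, tagged (lightweight), and pushed.
echo "production" > app.txt
git add app.txt
git commit -q -m "production version"
git tag tag_ABC
git push -q origin master tag_ABC

# Later work that went onto master and was pushed.
echo "experimental" > app.txt
git commit -q -am "experimental change"
git push -q origin master

# 1. Keep a name for the current tip of master, locally and on origin.
git branch new/features master
git push -q origin new/features

# 2. Move master back to the tagged commit and publish that.
git checkout -q master
git reset -q --hard tag_ABC
git push -q --force origin master

git log --oneline --decorate --all
```

Afterwards master (here and on origin) names the tagged commit, while new/features still names the later work, so nothing is lost.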
Answer 2:
This is a very small project with only a master branch.
This is, perhaps, the real error here. 😀 Branches in Git are cheap and good, and you should start using them. See LeGEC's answer for a recipe (but do read through all of this for caveats and enlightenment!).
One thing to realize is that branches in Git don't actually mean anything. That is, there's nothing special about master except that people start with it.[1] This is why they're so cheap. The only real meaning to any branch name is whatever meaning you give it.
If I don't follow a cook book recipe, it usually ends in disaster ...
What this means is that you should learn what Git really does here. It's only a little bit complicated!
[1] Well, the fact that Git uses it by default, as the starting name, is something you could call "special". Note that this is about to change, though: GitHub in particular are reported to be switching to main, and Git is growing a feature in which you can set the name (to main like GitHub, for instance) in your system or per-user configuration (this is more or less what GitHub plan to use to set their new default). There are a few other minor quirks with this, and it has taken a few rounds of review to find and tweak all the places where something funky happens due to a built-in comparison of the six letter sequence master. But other than that there's nothing special about master.
Git is all about commits
As a casual user of Git, you probably think of Git as being about branches and/or files. That's where you get led astray, and why things end in disaster: Git isn't about files, and is not about branches either. Git stores files, and uses branch names, but it stores the files in commits, and uses branch names to find the commits. In the end, everything is about the commits themselves.
There are some things to know about commits:
Every commit has a unique number. The numbers aren't simple counting numbers though: they don't start with commit #1 and count up to #2, #3, and so on. Instead, each commit gets a random-looking—but not actually random at all—big ugly hash ID like 385c171a018f2747b329bcfa6be8eda1709e5abd.
These numbers have to be so big and ugly because that number means that commit, from now on. No other commit can ever use that number.[2] A commit's number (its hash ID) is actually a cryptographic checksum of the contents of that commit. This means that no commit, once made, can ever be changed, which is hugely consequential.
Git can look up a commit—or any internal Git object, but we'll just worry about commits here—by the big ugly number. The Git repository is mostly just a big key-value database with the hash ID "numbers" being the keys (well, plus a second database of names-to-hash-IDs, which we'll get to in a bit).
Each commit stores two things:
It stores a full snapshot of every file that Git knows about (or knew about at the time you, or whoever, made the commit, that is). The files inside a commit are stored in a compressed and de-duplicated form. Since the files are all read-only—every part of a commit is read-only, as we just noted—it's OK to share them with other commits, with this de-duplication. So it doesn't actually hurt to store the same file a thousand times in a thousand commits: they all just re-use the one version of that file. It's only when files change that a new commit has to store a new version.
It stores some metadata, or information about the commit itself. That includes the name and email address of whoever makes the commit, for instance. There's a date-and-time-stamp—actually two of these—in each commit as well, and your log message goes here. Most important for Git itself, though, each commit stores, in this metadata, the big ugly hash ID of the previous commit.
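To see both halves of a commit for yourself, a small sketch like the following may help. It assumes a scratch repository made with mktemp, and the file readme.txt and the identity are made up for illustration.

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory (assumption)
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "first commit"
echo "hello again" >> readme.txt
git commit -q -am "second commit"

hash=$(git rev-parse HEAD)         # the big ugly hash ID of the latest commit
git cat-file -t "$hash"            # the kind of object stored under that key
git cat-file -p "$hash"            # tree (the snapshot), parent, author, committer, message
git ls-tree -r "$hash"             # the files in that commit's snapshot
parent=$(git rev-parse "$hash^")   # the hash ID recorded on the "parent" line
echo "parent: $parent"
```

The tree line is the snapshot, and the parent line is the stored hash ID of the previous commit.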
It's this last bit that makes everything work, or break, when you have a disaster. 😀 To understand what's going on here, let's draw a simple, and very small, Git repository, in which we have just three commits. Because actual hash IDs are too big and ugly, let's call these commits A, B, and C, and draw them like this:
A <-B <-C
Commit C (whatever its hash ID really is) is the last commit, and it stores a bunch of files and some metadata, all of which are frozen for all time now. Inside commit C's metadata, Git has stored B's hash ID. So from C, Git can work backwards one hop, to find B. Meanwhile B has files and metadata, and in B's metadata, Git stored A's actual hash ID. So from B, Git can step back to commit A.
Git calls these backwards-pointing links parents. The parent of the last commit C is B, and the parent of B is A. As you can see, Git actually works backwards. We start with the last commit (which, so far, is C) and go back one commit at a time. Commit A, being the very first commit, is special in exactly one way: its metadata doesn't list a previous commit. It has no parent (it's an orphan, sort of). That's how Git knows that it can stop going backwards. But there is one hitch here: how do we find the actual hash ID of commit C?
[2] Technically, your Git could re-use that number, as long as you never introduce your Git to a Git holding the repository for Git itself. For much more about this, see How does the newly found SHA-1 collision affect Git?
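The backwards walk can be watched directly. This is a sketch in a scratch repository (an assumption), using empty commits whose messages are A, B, and C to stand in for the drawing.

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory (assumption)
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"

for name in A B C; do
  git commit -q --allow-empty -m "$name"
done

# Start at the last commit and follow the parent links backwards.
git log --format='%h %s (parent: %p)' master

# The first commit is the one that lists no parent at all.
root=$(git rev-list --max-parents=0 master)
git log -1 --format='root commit: %s' "$root"
```

The log lists C first, then B, then A, and the line for A shows an empty parent.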
A branch name stores one commit hash ID
This is where branch names like master come in. Each name stores one hash ID. The hash ID inside a branch name is, by definition, the last commit in that chain. So we can re-draw the above as:
A--B--C <-- master
The name master, which is easy for a human to remember, holds the actual hash ID of commit C. We'll check this commit out, and that will be our current commit, with master being our current branch. So now that we're on master, if we add a new commit (by the usual means that we haven't described here) what Git will do is:
- package up a new snapshot;
- add some metadata: name, email address, and so on;
- include in that metadata, the hash ID of the current commit C; and
- write all that out as a new commit, which will acquire a new unique big ugly hash ID, but we'll just call that D.
Let's draw that:
A--B--C   <-- master
       \
        D
As you can see, D's parent is C. Now git commit performs its special trick: the last step of git commit is to write D's new hash ID into the name master. The result is:
A--B--C
       \
        D   <-- master
which we can just draw out as:
A--B--C--D <-- master
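A sketch of that last step, in a scratch repository (an assumption) with empty commits named after the drawing: the hash ID stored in master changes when D is made, and D's parent is the old value.

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory (assumption)
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"

for name in A B C; do
  git commit -q --allow-empty -m "$name"
done

before=$(git rev-parse master)           # hash ID of C
git commit -q --allow-empty -m "D"
after=$(git rev-parse master)            # hash ID of D: the name has moved

echo "master was: $before"
echo "master is:  $after"
echo "D's parent: $(git rev-parse 'master^')"
```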
Using more than one branch name
Let's go back to our three-commit repository, before we make D:
A--B--C <-- master
Now, before we do make D, let's create a new branch name, dev for develop. A Git branch name must select some commit, so which of the three commits should we select? Well, the latest one makes a lot of sense, so let's use C, the commit we're using through the name master:
A--B--C <-- master, dev
Now all three commits are on both branches. But now we have a problem with our drawing: which name are we using? We have two names! We need a way to tell which one we're actually using. Right now it's not super-important, because both names hold the hash ID of commit C, but we're about to make a new commit D.
Let's pick the name dev to use, with git checkout dev, and draw it like this:
A--B--C <-- master, dev (HEAD)
Here, we've used the special name HEAD, in all uppercase, and attached it to one of the branch names. That tells us, and Git, which name we're using.
Now let's make commit D while we're using this dev name. Git will write out a new commit as before, but this time, the name it updates is dev, not master. So we end up with:
A--B--C   <-- master
       \
        D   <-- dev (HEAD)
New commit D is now the last commit in the dev branch. Commits A-B-C are now on both branches, with commit C being the last commit in the master branch.
That's all there is to it! Well, OK, almost all. There are several more wrinkles that will come up in a moment. But that's what branch names are all about: A branch name just holds the hash ID of the last commit in the chain. Git will start here, and work backwards whenever it needs to.
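The same drawing can be reproduced with a few commands. This sketch assumes a scratch repository and uses empty commits; dev is the branch name from the drawing.

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory (assumption)
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"

for name in A B C; do
  git commit -q --allow-empty -m "$name"
done

git branch dev                     # dev and master now both name commit C
git checkout -q dev                # attach HEAD to dev
git symbolic-ref --short HEAD      # shows which branch name HEAD is attached to

git commit -q --allow-empty -m "D" # only dev moves; master stays on C
git log --oneline --decorate --all
```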
Short sidebar: the index and your work-tree
To keep this answer shorter, I won't go into a lot of detail here, but think about the fact that every commit is frozen for all time. The files inside each commit are in a special Git-only de-duplicated format, that only Git can read. How can these files actually be any use? To be of use, files have to be readable by other programs, and usually at least a few of them need to be writable too.
All version control systems have this problem, and all of them use similar approaches: there's the version controlled "file", frozen for all time, and then there's a separate file that's actually usable. The usable files go in a work area. Git calls this work area your working tree or work-tree.
This means that the files you see and work with, when you're working with a Git repository, are not actually in the repository at all. They were copied out of the repository (by git checkout or git switch) so that you could use them, but now that they're out, they are literally outside the repository. Those aren't Git's files: they're yours.
Where Git departs from most version control systems, though, is that Git keeps a third copy—well, sort of a copy—of each file. This extra copy sits "between" the frozen file, in Git's commit, and the usable file in your work-tree. It's in Git's frozen and de-duplicated format, but it's not actually frozen, because it's not in a commit. This extra copy is in what Git calls, variously, the index, or the staging area, or sometimes—rarely these days—the cache. Because it's pre-de-duplicated, it's not really a copy (and what's inside Git's index is another one of those big ugly hash IDs, for an internal blob object, rather than the actual file data directly). But thinking of it as a copy works well.
When you run git add on some file you've changed in your work-tree, what you are really doing is telling Git: make the index copy of this file match the work-tree copy. Git will remove and replace the de-duplicated frozen-format file, making a new copy (but already de-duplicated, if it matches any previous version) of the file, ready to be committed.
Because the index holds the de-duplicated, ready-to-commit copies of each file, a good way to think of Git's index is that it holds your proposed next commit. Running git add is your way of telling Git: Change my proposed next commit now, using the updates I've made in my work-tree.
Tags
Now that we have a good way to draw what's going on with commits, let's draw what a tag does. A tag name is a whole lot like a branch name: it holds one hash ID. In this case, that's a commit hash ID.
There are several key differences between branch names and tag names:
- Branch names are forced to hold only commit hash IDs. Tag names can hold other kinds of hash IDs, and that's what an annotated tag is about. You get an internal Git object that holds extra information (the annotation) and then holds a hash ID: normally, a commit hash ID. So the annotated tag gets you a commit, but lets you add information first. You mentioned that you're using a lightweight tag, and those just hold commit hash IDs directly, so that's what I will draw here.
- Branch names move, as we saw above when we made new commit D. Whatever branch name you have as your attached-HEAD, that's the name that moves. Tag names don't move.[3]
- Branch and tag names are in different namespaces. We won't go into any detail here, but tag names are meant to be more "global" than branch names: every Git repository gets its own branch names, but in general, when you connect two clones and have them share, they tend to share their tag names so that everyone has the same ones.
Since tag names don't move, let's draw that. We'll start with this:
...--G--H <-- master (HEAD)
and then we'll add a tag name, tag:ABC for instance, like this:
...--G--H   <-- master (HEAD)
        ^
        |
     tag:ABC
If we now create a new commit, we'll get:
...--G--H--I   <-- master (HEAD)
        ^
        |
     tag:ABC
Note that we could draw this like this:
          I   <-- master (HEAD)
         /
...--G--H   <-- tag:ABC
which emphasizes that tag names and branch names are a whole lot alike. You could have used a branch name, where you actually used a tag name. The distinctions—that tag names don't move, but branches do, and so on—are mostly for human use. Git itself doesn't really care: Git cares about the hash IDs.
[3] You can move a tag. There are several ways to do that, with the most obvious being: delete the tag, then create one that's spelled the same but that selects a different commit. This is often a bad idea, and the reason is that both humans and Git repositories don't expect tags to move. Anyone who grabbed the "wrong" tag earlier is likely to hang on to this wrong value: you'll have to convince them to delete-and-re-create, or otherwise move, their copy of the tag, too.
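Since the question asks how to get the commit number of a tag, here is a sketch, assuming a scratch repository. tag_ABC is the question's lightweight tag; the annotated tag v1.0 is a made-up name added only for contrast.

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory (assumption)
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"

git tag tag_ABC                          # lightweight: the name holds a commit hash ID
git tag -a v1.0 -m "annotated example"   # annotated: name -> tag object -> commit

git cat-file -t tag_ABC                  # object type the lightweight tag names
git cat-file -t v1.0                     # object type the annotated tag names

git rev-parse tag_ABC                    # the commit hash ID, directly
git rev-parse 'v1.0^{commit}'            # follow the tag object through to its commit
```

The ^{commit} suffix works for lightweight tags too, so it is a safe habit whenever you need the commit a tag selects.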
Detached HEAD mode
You noticed that when you run git checkout tag_ABC, or whatever the actual spelling is for your ABC tag, you wind up in detached HEAD mode. That's because HEAD itself can only be attached to a branch name.
Branch names move, and the method by which they most often move is by having HEAD attached to them. Tag names are not supposed to move (see footnote 3 again), and to enforce that, Git won't attach HEAD to a tag name.
In general, you can also check out any historic commit, to view or use it in some way. For instance, suppose you decide you want to look at commit G for a while, or build it, or whatever. You can just direct Git to check out that commit by its raw hash ID (as seen in git log output, for instance) and you'll get this:
          I   <-- master
         /
        H   <-- tag:ABC   # drawn on right to save space
       /
...--G   <-- HEAD
A "detached HEAD" just means that the special name HEAD points directly to some commit. So if you now git checkout tag_ABC, you get:
          I   <-- master
         /
...--G--H   <-- HEAD
        ^
        |
     tag:ABC
Your index and work-tree are full of files from commit H. Your HEAD identifies commit H, as does your tag. Meanwhile your name master still identifies commit I.
To get out of detached HEAD mode, you simply git checkout master or git switch master. This re-attaches HEAD to the branch name, and extracts the commit identified by the branch name (commit I in our drawings here) into Git's index and your work-tree, so that you see the files from that version.
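Going into and out of detached HEAD mode can be sketched as follows, assuming a scratch repository with empty commits H and I and the tag tag_ABC on H. git symbolic-ref -q HEAD succeeds only while HEAD is attached to a branch name.

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory (assumption)
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git tag tag_ABC
git commit -q --allow-empty -m "I"

git checkout -q tag_ABC                  # HEAD now points directly at commit H
if git symbolic-ref -q HEAD >/dev/null; then
  at_tag="attached"
else
  at_tag="detached"
fi
echo "after checking out the tag: $at_tag"

git checkout -q master                   # re-attach HEAD to the branch name
on_branch=$(git symbolic-ref --short HEAD)
echo "after checking out master: on $on_branch"
```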
Drastic? Perhaps
The other answer you linked includes:
git checkout master
git reset --hard tag_ABC
git push --force origin master
I have no idea what this does, but it looks dangerous/drastic ...
Dangerous, yes: in particular that --hard tells git reset to remove all seat belts and disable all the air bags, as it were, and the --force is similar. It's perhaps less drastic than it looks though.
The git reset command is terribly complicated, but we'll just look at the --hard mode here.[4] What this does is actually three things:
- First, it moves the current branch name. For this to have any effect, HEAD has to be attached to a branch name. That's why we have the git checkout master.
- Then, it resets Git's index, so that the proposed next commit matches the commit you just moved to.
- Last, it resets your work-tree, so that the files you see are those from the commit you just moved to. It does this without asking whether some of those files have stuff you never saved anywhere, and since those files are not in Git, any data that get overwritten, Git can't recover, either. That's the most dangerous or drastic part, right there.
The commit you choose (tag_ABC here) is the one that the name now selects, so after this git reset --hard, we have this picture:
          I   ???
         /
...--G--H   <-- master (HEAD)
        ^
        |
     tag:ABC
You might wonder: What happened to commit I? The answer is: Nothing at all. It's still there. But how will you find it?
If you jotted down the commit number (the hash ID) before your git reset, you could find commit I that way. Git also has various "recover from a mistake" logs and commands that will let you find commit I again. These keep track of the commit for at least another 30 days, by default. So the commit is still there. You can get it back!
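Here is a sketch of both ways of finding commit I again after the reset, assuming a scratch repository with a made-up app.txt. The branch name recovered is hypothetical, and master@{1} means "what master named one move ago", as recorded in the local reflog.

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory (assumption)
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "H"
git tag tag_ABC
echo "v2" > app.txt
git commit -q -am "I"

saved=$(git rev-parse master)            # jot down I's hash ID before resetting
git reset -q --hard tag_ABC              # moves master, resets index and work-tree
cat app.txt                              # the work-tree file is back to the tagged content

git reflog show master                   # the record of where master has pointed
git branch recovered 'master@{1}'        # give commit I a name again
git log --oneline --decorate --all
```

Note that this only recovers committed work: anything in the work-tree that was never committed is gone after --hard.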
The git push --force is actually more drastic, but to see why, we need to talk about multiple Git repositories, and this part really does get a little complicated.
[4] I view this much the same as git checkout was before git checkout got split into git switch and git restore: it has too many modes. The new split-up commands are simpler because each one only does a few things. Reset probably should be split up as well.
Other Git repositories
We say that Git is a distributed version control system (DVCS). What this means is perhaps unclear. It might be better to refer to it as a replicated VCS: it's not distributed in the way that distributed computing is, for instance. In short, though, the way this works is that different Git repositories will connect to each other, and having connected, can now share—replicate—commits.
Since each commit has a unique number, the two Gits can decide whether one has a commit that the other has, just by passing around the numbers. That's what the main phase of a git fetch or git push is all about: one of the two Git repositories has some commits that the other one maybe doesn't. The sending Git offers the receiving Git the hash IDs. The receiving Git looks in its big database of Git objects, and tells the sender: please send that or no thanks, I already have it.
Each commit, of course, remembers the hash ID of its parent commit. The sending Git is obligated to offer the parent (or for merge commits, parents plural) of each commit it sends. So if you have three new commits that they don't, all in a row, and you tell your Git to send the last of these three, your Git will actually send all three.
Having received some new commits, though, the receiving Git now needs some way to find these commits. We already noted that one way we find commits is with branch names. So the receiving Git could set some branch name(s) to remember any new commits.
The git fetch and git push commands differ here in the way they work: when you run git fetch, your Git is the one receiving, and their Git is the one sending. Your Git doesn't set your branch names. Instead, your Git sets some other names. This is fancier (and nicer in many ways) than git push, but we'll skip right over this and consider git push instead.
When you run git push, your Git is the sender and their Git is the receiver. You send any new commits that you have, that they don't, that they will need. At the end of this process, your Git normally now sends a polite request: Now, if it's OK, please set your branch name ______ to ______. Let me know if that was OK. Your Git fills in the first blank with a branch name, and the second one with a hash ID.
The branch name your Git asks them to set comes from your git push command. If you run:
git push origin HEAD:master
the master here means their master branch. The HEAD here means that the commit you'll ask them to set is whatever commit is your current commit. (The origin part is the way you specify the Git you are sending to.)
When you use:
git push origin master
you're really saying master:master, i.e., you want your Git to find your master commit (the last one on the chain ending at your master) and send that commit, and then ask them to set their master.
So, suppose you and they both start out with:
...--G--H <-- master
Your Git and their Git are in sync. But now you create a new commit I on your own master (whether or not you create a tag). You now have:
...--G--H--I <-- master
If you run git push origin master, your Git calls up their Git and offers commit I. They don't have that one, so they say please send it. Your Git now offers H, because I's parent is H; theirs says no thanks, I have that already. Your Git now knows that they have G and everything earlier too, because H is the last commit in a chain, and they must have the entire chain.[5] All of this fancy footwork essentially allows your Git to send, not the entire I commit, but only the parts of the I commit that they don't already have. It's remarkably efficient, and it all comes about by exchanging just two hash IDs.
Anyway, your Git sends over commit I (or just enough to let them reconstruct it) and they now have I. Now your Git asks their Git to please, if it's OK, have them set their master to remember commit I.
They will say that this is OK, and the reason they will say that is that this just adds to their collection. Starting from I, they can go back to H, so they won't lose H, nor G, nor anything earlier.
Note that when you git push a tag, your Git ends the conversation with a polite request that they create or update their tag of the same name. Except for the fact that they should not move a tag, but should move a branch name if it just adds on, this is all pretty much the same.
[5] This papers over the way shallow repositories work, but let's not worry about that now.
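The two spellings of the push request can be tried against a local stand-in for origin. This is a sketch only: the bare repository path and the identity are assumptions, and the commits are empty ones named after the drawing.

```shell
set -e
work=$(mktemp -d)                        # scratch area (assumption)
git init -q --bare "$work/origin.git"    # stand-in for the real origin
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"
git remote add origin "$work/origin.git"

git commit -q --allow-empty -m "H"
git push -q origin master                # short for master:master

git commit -q --allow-empty -m "I"
git push -q origin HEAD:master           # send my current commit, ask them to set their master

mine=$(git rev-parse HEAD)
theirs=$(git ls-remote origin refs/heads/master | cut -f1)
echo "my HEAD:      $mine"
echo "their master: $theirs"
```

Both pushes are accepted because each one only adds commits to what origin already had.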
How, when, and why git push --force is dangerous
Suppose you've sent commit I to the other Git, and now decide to retract it, by using git reset as we saw above. You have, in your repository:
          I   <-- new/features
         /
...--G--H   <-- master (HEAD)
because you cleverly saved the hash ID of commit I in a new branch (see LeGEC's answer) before you did the git reset. Moreover, you also did a git push origin new/features, which had your Git call up their Git, offer them commit I (which they already have) and ask them to set their new/features to remember commit I. They said OK to that too. So right at that moment, they have:
...--G--H--I <-- master, new/features
But we just said that they're taking git push commands. What if some third user has a third Git repository?
Suppose this third user has grabbed commit I and has used that to create a new commit J. This third user (let's call him Bob) made his repository have:
...--G--H--I--J <-- master (HEAD)
He then runs git push origin master to send commit J to the repository you're about to git push --force. They accept commit J and add it to their master.
They now have:
...--G--H--I   <-- new/features
            \
             J   <-- master
Bob thinks: Great, my work is done. Then, for some reason, Bob removes his entire repository.[6] Commit J is, after all, safe somewhere that's all backed up and everything, maybe on GitHub or whatever.[7]
Now you come along and offer the second repository commit H, which they already have, then ask them to set their master to point to H. By default, they will say no, and the reason is that this causes their master to drop commits I and J.
You, of course, know that they already have commit I, and you want them to drop it. So you use git push --force. This changes the last operation from "Please, if it's OK" to "Do this now! I command it!" If they obey this command (that's up to them, but usually they're set up to obey) they will dutifully change their master to point to H:
...--G--H   <-- master
         \
          I   <-- new/features
           \
            J   ???
In your Git repository, above, we noted that there are some ways for you to find commit I if you forgot to save its hash ID somewhere first. These methods rely on what Git calls reflogs. Server repositories normally have reflogs disabled, which means they don't have a way to find commit J any more.
Without a way to find commit J, they may quickly remove commit J entirely. Their repository drops commit J. Bob had commit J, but we just said Bob removed his repository too.
What happens here, then, is that Bob's commit J
is lost, perhaps forever. If Bob keeps his repository, Bob still has his commit, and can restore it to this shared Git repository (on GitHub, or wherever it might be).
[6] This is probably a mistake. 😀
[7] It Ain’t What You Don’t Know That Gets You Into Trouble. It’s What You Know for Sure That Just Ain’t So.
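The difference between the polite request and the command can be sketched against a local stand-in for origin. The bare repository path and the identity are assumptions, and there is no Bob in this small setup.

```shell
set -e
work=$(mktemp -d)                        # scratch area (assumption)
git init -q --bare "$work/origin.git"    # stand-in for the shared repository
git init -q "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/master
git config user.name "You"               # made-up identity
git config user.email "you@example.com"
git remote add origin "$work/origin.git"

git commit -q --allow-empty -m "H"
git commit -q --allow-empty -m "I"
git push -q origin master

# Retract I locally, then try the ordinary, polite push.
git reset -q --hard 'master^'
if git push -q origin master 2>/dev/null; then
  plain_push="accepted"
else
  plain_push="rejected"                  # it would drop commit I from their master
fi
echo "plain push: $plain_push"

# The command form: their master is set to H regardless.
git push -q --force origin master
git ls-remote origin refs/heads/master
```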
Is git push --force really dangerous?
Well, maybe. If we know for sure there's no Bob, nothing is lost, and if Bob is careful to keep his repository, he can restore the lost commit. As someone who has used shared repositories like this (and occasionally taken on the Bob role but without having removed the repository), I will say that being Repository Janitor is not all that much fun. As an occasional rescue, sure, it's OK. Just don't make me do it all the time. 😀
There is a less-dangerous alternative though. Instead of git push --force, consider using git push --force-with-lease. This rather odd name really means that the last request-or-command ("please, if it's OK, set _____ to _____" or "set _____ to _____!") changes to: "I think your _____ is set to _____. If so, change it to _____. In any case, let me know." Your Git fills in all of these blanks:
- The branch name comes from your git push remote mine:theirs command. The theirs after the colon (or the one name, if you omit the colon, that provides everything) is the branch name you ask their Git to set.
- The "I think yours is _____" blank gets filled in from your own Git's remote-tracking name. For instance, if you're pushing to master on origin, this is filled in from your own origin/master. You can inspect this value (with git log, typically) before you start the git push. That way you know exactly what commits you're going to ask them to throw away.
- The "set it to _____" blank gets filled in from the hash ID of the mine side of the colon, or from your branch name if you use just the one name for everything.
So git push --force-with-lease origin master means: call up origin, then ask them to forcibly set their master, but only if it matches what I can see in my origin/master right now. So you can check before you force-push. If the force-push fails because your check was wrong, that means Bob (or whoever) managed to sneak a git push in between, and you'd best pick up Bob's new commit and figure out what to do about that, before you go force-pushing again.
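The Bob scenario can be replayed locally to see the lease fail. In this sketch the bare repository and the two clones (you and bob) are made-up stand-ins, and the commits are empty ones named after the drawings.

```shell
set -e
work=$(mktemp -d)                        # scratch area (assumption)
git init -q --bare "$work/origin.git"    # stand-in for the shared repository

# Your clone: commits H and I, pushed to origin.
git init -q "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/master
git config user.name "You"               # made-up identity
git config user.email "you@example.com"
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "H"
git commit -q --allow-empty -m "I"
git push -q origin master                # also records I in your origin/master

# Bob's clone: adds J on top of I and pushes it.
git clone -q -b master "$work/origin.git" "$work/bob"
git -C "$work/bob" config user.name "Bob"
git -C "$work/bob" config user.email "bob@example.com"
git -C "$work/bob" commit -q --allow-empty -m "J"
git -C "$work/bob" push -q origin master

# You retract I, unaware of J. Your origin/master still says I.
git reset -q --hard 'master^'
if git push -q --force-with-lease origin master 2>/dev/null; then
  lease_push="accepted"
else
  lease_push="rejected"                  # their master is J, not the I you expected
fi
echo "force-with-lease: $lease_push"

git fetch -q origin                      # pick up Bob's commit before deciding what to do
git log --oneline origin/master
```

A plain --force at the same point would have been accepted and would have dropped J from their master.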
Answer 3:
A short answer, which should do the job:
git checkout -b my_tagged_branch tagname
checkout -b creates a new branch and checks it out. From the docs here:
$ git checkout v2.0  # or
$ git checkout master^^

   HEAD (refers to commit 'b')
    |
    v
a---b---c---d  branch 'master' (refers to commit 'd')
    ^
    |
  tag 'v2.0' (refers to commit 'b')
Notice that regardless of which checkout command we use, HEAD now refers directly to commit b. This is known as being in detached HEAD state. It means simply that HEAD refers to a specific commit, as opposed to referring to a named branch.
You can then work on my_tagged_branch and commit. If necessary, you check out master again and then git merge my_tagged_branch.
If you work with a remote, don't forget to push. If you would like to see that workflow in the history later on, use git merge --no-ff my_tagged_branch (the resulting files are of course the same; just check with git log --graph --oneline).
For details see torek's answer.
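Put together, the workflow in this answer might look like the sketch below. It assumes a scratch repository; tagname and my_tagged_branch are the names used above, while the files app.txt and other.txt and the commit messages are made up.

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory (assumption)
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # made-up identity
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "production version"
git tag tagname
echo "new work" > other.txt
git add other.txt
git commit -q -m "later change on master"

# Start a branch at the tagged commit and work there.
git checkout -q -b my_tagged_branch tagname
echo "hotfix" >> app.txt
git commit -q -am "hotfix on the tagged version"

# Bring that work back into master, keeping the branch visible in history.
git checkout -q master
git merge -q --no-ff -m "merge my_tagged_branch" my_tagged_branch
git log --graph --oneline
```

Note that this merges work done on top of the tag into master; it does not move master back to the tagged commit.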
Source: https://stackoverflow.com/questions/63982737/git-how-get-an-old-tagged-version-into-master-without-losing-history