Composer package conflict in git repository; how to untrack files but avoid deletion of files when pushing to remote

Posted by 放荡痞女 on 2021-02-17 06:22:09

Question


I installed a package in my web application via Composer and added the package folder to .gitignore, while committing composer.json and composer.lock.

To deploy to our server, we push to a bare Git remote on the server, which in turn pushes the modified files to the relevant location on the server.

This workflow was all working fine.

At a later date, someone else working on the repository added the package files to the repository and removed the package from gitignore.

We want the package version to be managed purely by composer and not by the git repository, as it was before.

My only idea so far is to do the following:

  1. Remove the files from the repo and add the package folder back to gitignore. Commit this.
  2. Push to the remote (which will obviously push the removed files)
  3. Run composer update quickly on the server once pushed, to reinstall the removed package.

BUT the problem here is that this will remove the package for a few seconds from the server, and we want to avoid that if possible as it is a core plugin on the site. We don't want to cause something to break.

Is there any way I can remove the package folder from being tracked, whilst NOT causing the package to be deleted from the remote when the commit is pushed?

I have read about assume-unchanged and skip-worktree here (Git - Difference Between 'assume-unchanged' and 'skip-worktree'), but I am unsure which to use and what effect either of these commands will have (if any) specifically on the remote?


Answer 1:


"Is there any way I can remove the package folder from being tracked, whilst NOT causing the package to be deleted from the remote when the commit is pushed?"

No.

Fortunately, you may not need to.

Unfortunately, whatever you do here will be somewhat ugly and painful to use.

"I have read about assume-unchanged and skip-worktree ... but I am unsure which to use and what effect either of these commands will have (if any) specifically on the remote?"

Either will work but --skip-worktree is the one that you are supposed to use here. Neither will have any effect on any other Git repository.
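As a minimal sketch of what setting the flag looks like, the following uses a throwaway repository. The path vendor/pkg/lib.php and the demo identity are made up for illustration; they are not from the question.

```shell
# Throwaway repository; every path and name below is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p vendor/pkg
echo "v1" > vendor/pkg/lib.php
git add vendor/pkg/lib.php
git commit -q -m "Track the package file"

# Set the flag on the index entry of the tracked file.
git update-index --skip-worktree vendor/pkg/lib.php

# `git ls-files -v` tags flagged entries with "S".
flags=$(git ls-files -v vendor/pkg/lib.php)
echo "$flags"

# A local edit is no longer reported by `git status`.
echo "local edit" >> vendor/pkg/lib.php
status=$(git status --porcelain)
echo "status while flagged: '$status'"

# Clearing the flag makes the edit visible again.
git update-index --no-skip-worktree vendor/pkg/lib.php
status_after=$(git status --porcelain)
echo "status after clearing: '$status_after'"
```

The flag lives only in this repository's index file, which is why it cannot influence any other clone or remote.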


To understand all of this, you need a correct model of what Git actually does.

Remember first that the basic unit of storage in Git is the commit. Each commit has a unique, big ugly hash ID, such as 083378cc35c4dbcc607e4cdd24a5fca440163d17. That hash ID is the "true name" of the commit. Every Git repository everywhere agrees that that hash ID is reserved for that commit, even if the Git repository in question does not have that commit yet. (This is where all the real magic in Git comes from: the uniqueness of these seemingly-random, but actually totally-not-random, hash IDs.)

What a commit stores comes in two parts: the data, which consist of a snapshot of all of your files; plus the metadata, where Git stores information such as who made the commit, when (date-and-time stamps), and why (log message). As a crucial piece of metadata, each commit also stores some set of previous commit hash IDs, as raw hash IDs in text. This lets Git go from any given commit, backwards, to some previous commit.

The actual hash ID for any Git commit is simply a checksum of all of its data. (Technically it's just a checksum of the metadata, because the snapshot itself is stored as a separate Git object whose hash ID goes into the commit object. However, this separate object's hash ID is a checksum as well, so through the mathematics of Merkle trees, it all works out.) This is why everything in a commit is totally read-only, frozen for all time. If you try to change anything inside a commit, you don't actually change the commit. Instead, you get a new commit, with a new and different hash ID. The old commit still exists, with its unchanged hash ID.
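To make this concrete, here is a small sketch in a scratch repository (file names and identity are invented for the demo). It prints a raw commit object and then shows that rewording a commit produces a new hash, while the old commit stays available under its old hash.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "hello" > file.txt
git add file.txt
git commit -q -m "first"
echo "more" >> file.txt
git commit -q -a -m "second"

# Raw commit object: a tree line (the snapshot), a parent line,
# then author, committer and the log message.
git cat-file -p HEAD

old=$(git rev-parse HEAD)
tree=$(git rev-parse 'HEAD^{tree}')

# "Changing" only the message yields a commit with a different hash ...
git commit -q --amend -m "second, reworded"
new=$(git rev-parse HEAD)
echo "old: $old"
echo "new: $new"

# ... while the untouched snapshot keeps the same tree hash,
new_tree=$(git rev-parse 'HEAD^{tree}')

# and the old commit object still exists under its old hash.
old_type=$(git cat-file -t "$old")
echo "old object is still a $old_type"
```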

So: Git is all about commits, and Git finds commits by their hash IDs. But we humans can't deal with hash IDs (quick, was that 08337-something or 03887-something?). We would like to have names, like master. Meanwhile, Git would like a quick way to find the last commit in some chain of commits that ends at some point. So Git offers us names, by letting us create branch names.

A branch name simply holds the hash ID of the last commit in some chain. That commit holds, as its parent, the hash ID of the previous commit in the chain. The parent commit holds, as its parent—our last commit's grandparent—the hash ID of the commit one step further back, and so on:

... <-F <-G <-H   <-- master

If commit hash IDs were single letters like H, this might be an accurate drawing: the name master would hold hash ID H, commit H would hold hash ID G as its parent, commit G would hold hash ID F as its parent, and so on.

The act of making a new commit consists of:

  • writing out a snapshot of all files; and
  • adding the appropriate metadata: you as author and committer, "now" as the date-and-time-stamps, and so on. The parent of this new commit should be whatever the current commit is, as recorded in the current branch name. If master points to H, then the parent of the new commit (which we'll call I) will be H, so that I points back to H.

Having actually made this commit (and found its hash ID in the process), Git simply writes the new hash ID I into the branch name master:

... <-F <-G <-H <-I   <-- master

and we have a new commit.
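The drawing above can be checked in a scratch repository. In this sketch the commit messages "H" and "I" merely mirror the letters in the diagram, and the branch name is read from Git rather than assumed, since the default may be master or main.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "one" > a.txt
git add a.txt
git commit -q -m "H"

# The branch name is just a pointer to the last commit in the chain.
branch=$(git symbolic-ref --short HEAD)
h=$(git rev-parse "$branch")

echo "two" >> a.txt
git commit -q -a -m "I"

# After the commit, the same name holds the new hash,
# and the new commit records the old tip as its parent.
i=$(git rev-parse "$branch")
parent_of_i=$(git rev-parse "$branch^")
echo "$branch was $h"
echo "$branch is  $i (parent $parent_of_i)"
```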

To see what happened in a commit such as I, Git extracts the commit (all its files) to a temporary area, then extracts the previous commit H's files to a temporary area, and compares them. For those that are the same, Git says nothing. For those that are different, Git shows the difference. For those that are new, Git says they are "added", and for those that are in the previous commit but not in this commit, Git says that they are "deleted".
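A rough illustration of that comparison, using invented file names in a scratch repository: the second commit modifies one file, adds one, deletes one and leaves one alone, and git diff --name-status summarizes the result.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "keep" > same.txt
echo "old" > changed.txt
echo "bye" > gone.txt
git add .
git commit -q -m "H"

echo "newer content" > changed.txt
echo "hi" > fresh.txt
git rm -q gone.txt
git add changed.txt fresh.txt
git commit -q -m "I"

# Compare the two snapshots: M = different, A = added, D = deleted.
# Identical files (same.txt) are not mentioned at all.
summary=$(git diff --no-renames --name-status HEAD~1 HEAD)
echo "$summary"
```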

Now, doing a git checkout of some particular commit means writing that commit's content—i.e., data—out in a form you can use. The frozen-for-all-time copies of files inside the commit are in a Git-only format, which is fine for archival, but useless for getting new work done. So Git has to extract the commit to a work area, where you can see and work with your files. Git calls this work area your work-tree or working tree (or some variant of these names). Aside from writing files into it when you ask, Git is mostly hands-off of this work area: that's your playground, not Git's.

But where does the new snapshot, in a new commit, come from? In some version control systems, the new snapshot comes from the files in your work-tree. This is not the case in Git. Instead, Git makes new commits from whatever is in Git's index. You can't see these files—at least, not easily—but when Git first extracts some commit, it effectively copies all of that commit's saved, frozen files into Git's index. Only once they're in the index does Git copy (and defrost / rehydrate) them into your work-tree so that you can work with them.

The crucial difference between the frozen copies in a commit, and the "soft-frozen" copies in the index, is that you can overwrite the index copy (see footnote 1). You can't overwrite the committed copy, but that's OK: commits cannot be changed, but you can make new and better commits, and that's what version control is about anyway.

Whenever you run git commit, what Git does in that first step—making the snapshot—is that it simply packages up all the pre-frozen index copies of each file. So we can think of the index as the proposed next commit. This is also why you have to git add files all the time, even if they're already in the previous commit. What git add is doing is copying the work-tree file over top of whatever was in the index for that file (though see footnote 1 again for technical details).

What this means is there are three "live" copies of each file at all times. One is frozen in the current commit. One is semi-frozen, in the index, which Git also calls the staging area. The last one is your copy, in your work-tree, which you can do whatever you want with: it's a normal file, not in a special Git-only format.
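The three copies can be inspected side by side. This is a sketch with a made-up file, notes.txt: git show HEAD:path reads the committed copy, git show :path reads the index copy, and cat reads the work-tree copy. The final commit shows that the snapshot comes from the index, not from the work-tree.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "v1 committed" > notes.txt
git add notes.txt
git commit -q -m "first"

echo "v2 staged" > notes.txt
git add notes.txt                    # copies the work-tree file into the index
echo "v3 work-tree only" > notes.txt # edited again, but not added

head_copy=$(git show HEAD:notes.txt)   # frozen copy in the current commit
index_copy=$(git show :notes.txt)      # copy in the index / staging area
work_copy=$(cat notes.txt)             # ordinary file in the work-tree
echo "HEAD:      $head_copy"
echo "index:     $index_copy"
echo "work-tree: $work_copy"

# The new commit packages up the index copy, not the work-tree copy.
git commit -q -m "second"
committed=$(git show HEAD:notes.txt)
echo "committed: $committed"
```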

When you run git status, Git runs two separate comparisons:

  • First, git status compares all the files in the current (HEAD) commit to all the files in the index. For every file that is the same, Git says nothing. For every file that is different, Git says that this file is staged for commit. If a file in the index is new—isn't in HEAD—Git calls it new; and if a file is gone from the index, Git says it's deleted.

  • Then, git status compares all the files in the index to all the files in the work-tree. For every file that is the same, Git says nothing. For every file that is different, Git says that this file is not staged for commit. If a file in the work-tree is new—isn't in the index—Git complains that the file is untracked. If a file is gone from the work-tree, Git says it's deleted.

This last case is where untracked files come from. It also gives us the very definition of untracked: a file that exists in the work-tree is untracked if it does not also exist in the index. Since we can't see the index, we only see this is the case when git status whines about these untracked files.
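Both comparisons show up in the short status format, where the first column reports HEAD versus index and the second reports index versus work-tree. The file names in this sketch are invented to match the case each one demonstrates.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "a" > staged.txt
echo "b" > unstaged.txt
git add .
git commit -q -m "first"

echo "a changed" > staged.txt
git add staged.txt                 # HEAD and index now differ
echo "b changed" > unstaged.txt    # index and work-tree now differ
echo "c" > untracked.txt           # in the work-tree, not in the index

# Column 1: HEAD vs index. Column 2: index vs work-tree. "??": untracked.
status=$(git status --porcelain)
echo "$status"
```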

Listing an untracked file in a .gitignore file makes Git shut up: git status won't whine any more. It also makes git add not add the file to the index if it's not already there, but it has no effect on files that are in the index. If the file is in the index, it is, by definition, tracked, and git add will happily add it.
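A quick way to see that .gitignore does nothing for tracked files, sketched with a hypothetical vendor/pkg directory: the tracked file still shows up as modified, while a new file in the same ignored directory is silently skipped.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p vendor/pkg
echo "v1" > vendor/pkg/lib.php
git add vendor/pkg/lib.php
git commit -q -m "Package file is tracked"

# Ignore the directory only after its file is already in the index.
echo "vendor/" > .gitignore

echo "edited" >> vendor/pkg/lib.php   # tracked: still reported
echo "x" > vendor/pkg/new.php         # untracked and ignored: not reported

status=$(git status --porcelain)
echo "$status"
```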

This, at last, is where --assume-unchanged and --skip-worktree come in. These are flags that you can set on files that are in the index. Setting either flag tells Git: Hey, when you're about to consider the work-tree copy of this file ... you can maybe just skip it now. That is, git add looks through the index and work-tree, and checks .gitignore files, to see what's tracked, what's untracked, what's newer in the work-tree and needs updating in the proposed next commit, and so on. If some file is untracked and listed in .gitignore, git add will skip it. If it's tracked, Git will add it if the work-tree copy is different ... unless the skipping flags are set. If the --assume-unchanged flag is set, Git will assume it's not changed, and not add it. If the --skip-worktree flag is set, Git knows it definitely should not add it, even if the file is actually changed.

So --skip-worktree means what we want here: don't git add this file, even if it's changed. The --assume-unchanged flag works as well, because Git assumes it's not changed and hence doesn't git add it either. There's no difference in actual operation today, but "skip worktree" expresses the right intent.

Note that because these flags are set on an index (aka staging-area) copy of the file, they only work on tracked files. Tracked files are those in the index / staging-area. The file has to be in the index before you can set the flags. And, if the file is in the index, that copy of the file—the one that's in the index right now—is the one that will be in the next commit you make.

But where did this copy of the file come from? The answer is in our git checkout earlier: git checkout copied all the files from the commit we chose, to the index. It got into the index, and then into our work-tree, by our first git checkout. If we've fussed with the work-tree copy since then, well, the flag we set means that git add never copied the work-tree copy back into the index copy, so it's still the same as the old commit. We've been making new commits, perhaps for days or months or whatever, using the old copy of the file, as saved in the index.
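The effect described here can be reproduced in a scratch repository. This is only a sketch: the file contents and the vendor/pkg path are invented, and composer.json stands in for any other file you keep committing normally.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p vendor/pkg
echo "server copy" > vendor/pkg/lib.php
echo '{}' > composer.json
git add .
git commit -q -m "Initial commit with package file"

# Flag the package file, then change it locally.
git update-index --skip-worktree vendor/pkg/lib.php
echo "my local experiment" > vendor/pkg/lib.php

# Keep working and committing other files as usual.
echo '{"require": {}}' > composer.json
git add composer.json
git commit -q -m "Update composer.json"

# The new commit still carries the old index copy of the flagged file.
committed=$(git show HEAD:vendor/pkg/lib.php)
worktree=$(cat vendor/pkg/lib.php)
echo "in new commit: $committed"
echo "in work-tree:  $worktree"
```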

What makes this a pain in the butt is that if we git checkout some other commit, and the other commit has a different copy of the file in it, Git is going to want to replace our index copy with the one from the commit we are trying to switch to. Copying that to the index won't remove the flag we set, but it will overwrite the work-tree copy. If we've changed the work-tree copy, Git will either overwrite it without asking (this is probably bad) or say: I can't check out that commit, it will overwrite your (assumed/skipped, but I won't mention that) work-tree copy of that file. In practice, Git takes the latter approach.

To work around it, every time you git checkout a commit that would overwrite your flagged file, you'll have to move or copy your work-tree copy out of the way, let git checkout overwrite the index and work-tree copies, then move or copy your work-tree copy back into place. It's clearly better never to get into this situation in the first place.

But, if you git rm these files, what happens to someone else who moves from a commit that has the files, to a commit that doesn't? For instance, perhaps the remote you're pushing to has that file checked out right now, and they're going to then git checkout a new commit you make that doesn't have those files. Of course their Git will dutifully remove those files from their Git's index, and from their Git's user's work-tree. That's what you don't want, so now you're stuck with keeping their copy of that file in your Git's index, so that it goes into your new commits.
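The removal can be seen with two scratch repositories. In this sketch, "dev" plays your repository and "server" is an ordinary clone standing in for whatever checkout the deployment maintains; the real setup in the question (a bare remote plus a deploy step) is not reproduced here, and all names are made up.

```shell
base=$(mktemp -d)
dev="$base/dev"
server="$base/server"

git init -q "$dev"
cd "$dev" || exit 1
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p vendor/pkg
echo "package code" > vendor/pkg/lib.php
git add .
git commit -q -m "Package files committed"

# The other side has the commit with the files checked out.
git clone -q "$dev" "$server"

# Untrack the package; --cached leaves your own work-tree files in place.
git rm -r -q --cached vendor
echo "vendor/" > .gitignore
git add .gitignore
git commit -q -m "Untrack vendor/"

# When the other repository moves to the new commit,
# its Git removes the files from its work-tree.
git -C "$server" pull -q --ff-only

ls "$dev/vendor/pkg"
ls "$server"
```

Your own copy survives because --cached only touched the index, but the other work-tree had the files tracked, so moving to a commit without them deletes them there.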

That's what this complicated dance is all about. Every commit is a snapshot and in your new commits, you want your snapshots to have their copy of some particular file(s). So you have to get their copy into your Git's index. You get that from some commit, copying it into your index. Then you keep it in place, in your Git's index / staging-area, even though you don't use it in your own work-tree. While working with the three copies, you keep the right copy—which is not your work-tree one—in your own Git's index.


Footnote 1: Technically, what's in the index is a reference to the frozen copy. Updating the index copy consists of making a new frozen copy, ready for commit, and writing the new reference into the index. These details matter if you start using git update-index directly to put new files in, or use git ls-files --stage to view the index: you'll see Git's internal blob object hash IDs here. But you can just think of the index as holding a full copy of each file, in the internal, frozen format: that mental model works well enough for the level at which you normally work with Git.
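For the curious, here is a small sketch of what that reference looks like, using an invented file.txt. Each line of git ls-files --stage holds the file mode, the blob hash, a stage number and the path, and the hash changes whenever a new copy is added.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q

echo "hello" > file.txt
git add file.txt

# Index entry: <mode> <blob hash> <stage number> <path>
entry=$(git ls-files --stage file.txt)
echo "$entry"

# The hash in the index is the blob hash of the content that was added.
blob=$(git hash-object file.txt)
echo "blob for current content: $blob"

# Adding a changed file writes a new blob and a new reference.
echo "more" >> file.txt
git add file.txt
entry_after=$(git ls-files --stage file.txt)
echo "$entry_after"
```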



Source: https://stackoverflow.com/questions/59845959/composer-package-conflict-in-git-repository-how-to-untrack-files-but-avoid-dele
