I\'m not clear on what the following means (from the Git submodule update documentation):
...will make the submodules HEAD be detached, unless
-
To update each submodule, you could invoke the following command (at the root of the repository):
git submodule -q foreach git pull -q origin master
You can remove the -q option to follow the whole process.
This GitPro page does summarize the consequence of a git submodule update nicely
When you run
git submodule update
, it checks out the specific version of the project, but not within a branch. This is called having a detached head — it means the HEAD file points directly to a commit, not to a symbolic reference.
The issue is that you generally don’t want to work in a detached head environment, because it’s easy to lose changes.
If you do an initial submodule update, commit in that submodule directory without creating a branch to work in, and then run git submodule update again from the superproject without committing in the meantime, Git will overwrite your changes without telling you. Technically you won’t lose the work, but you won’t have a branch pointing to it, so it will be somewhat difficult to retrieve.
Note March 2013:
As mentioned in "git submodule tracking latest", a submodule now (git1.8.2) can track a branch.
# add submodule to track master branch
git submodule add -b master [URL to Git repo];
# update your submodule
git submodule update --remote
# or (with rebase)
git submodule update --rebase --remote
See "git submodule update --remote vs git pull".
MindTooth's answer illustrate a manual update (without local configuration):
git submodule -q foreach git pull -q origin master
In both cases, that will change the submodules references (the gitlink, a special entry in the parent repo index), and you will need to add, commit and push said references from the main repo.
Next time you will clone that parent repo, it will populate the submodules to reflect those new SHA1 references.
The rest of this answer details the classic submodule feature (reference to a fixed commit, which is the all point behind the notion of a submodule).
To avoid this issue, create a branch when you work in a submodule directory with git checkout -b work or something equivalent. When you do the submodule update a second time, it will still revert your work, but at least you have a pointer to get back to.
Switching branches with submodules in them can also be tricky. If you create a new branch, add a submodule there, and then switch back to a branch without that submodule, you still have the submodule directory as an untracked directory:
So, to answer your questions:
can I create branches/modifications and use push/pull just like I would in regular repos, or are there things to be cautious about?
You can create a branch and push modifications.
WARNING (from Git Submodule Tutorial): Always publish (push) the submodule change before publishing (push) the change to the superproject that references it. If you forget to publish the submodule change, others won't be able to clone the repository.
how would I advance the submodule referenced commit from say (tagged) 1.0 to 1.1 (even though the head of the original repo is already at 2.0)
The page "Understanding Submodules" can help
Git submodules are implemented using two moving parts:
- the
.gitmodules
file and- a special kind of tree object.
These together triangulate a specific revision of a specific repository which is checked out into a specific location in your project.
From the git submodule page
you cannot modify the contents of the submodule from within the main project
100% correct: you cannot modify a submodule, only refer to one of its commits.
This is why, when you do modify a submodule from within the main project, you:
A submodule enables you to have a component-based approach development, where the main project only refers to specific commits of other components (here "other Git repositories declared as sub-modules").
A submodule is a marker (commit) to another Git repository which is not bound by the main project development cycle: it (the "other" Git repo) can evolves independently.
It is up to the main project to pick from that other repo whatever commit it needs.
However, should you want to, out of convenience, modify one of those submodules directly from your main project, Git allows you to do that, provided you first publish those submodule modifications to its original Git repo, and then commit your main project refering to a new version of said submodule.
But the main idea remains: referencing specific components which:
The list of specific commits you are refering to in your main project defines your configuration (this is what Configuration Management is all about, englobing mere Version Control System)
If a component could really be developed at the same time as your main project (because any modification on the main project would involve modifying the sub-directory, and vice-versa), then it would be a "submodule" no more, but a subtree merge (also presented in the question Transferring legacy code base from cvs to distributed repository), linking the history of the two Git repo together.
Does that help understanding the true nature of Git Submodules?
To address the --rebase
vs. --merge
option:
Let's say you have super repository A and submodule B and want to do some work in submodule B. You've done your homework and know that after calling
git submodule update
you are in a HEAD-less state, so any commits you do at this point are hard to get back to. So, you've started work on a new branch in submodule B
cd B
git checkout -b bestIdeaForBEver
<do work>
Meanwhile, someone else in project A has decided that the latest and greatest version of B is really what A deserves. You, out of habit, merge the most recent changes down and update your submodules.
<in A>
git merge develop
git submodule update
Oh noes! You're back in a headless state again, probably because B is now pointing to the SHA associated with B's new tip, or some other commit. If only you had:
git merge develop
git submodule update --rebase
Fast-forwarded bestIdeaForBEver to b798edfdsf1191f8b140ea325685c4da19a9d437.
Submodule path 'B': rebased into 'b798ecsdf71191f8b140ea325685c4da19a9d437'
Now that best idea ever for B has been rebased onto the new commit, and more importantly, you are still on your development branch for B, not in a headless state!
(The --merge
will merge changes from beforeUpdateSHA to afterUpdateSHA into your working branch, as opposed to rebasing your changes onto afterUpdateSHA.)
Git 1.8.2 features a new option ,--remote
, that will enable exactly this behavior. Running
git submodule update --rebase --remote
will fetch the latest changes from upstream in each submodule, rebase them, and check out the latest revision of the submodule. As the documentation puts it:
--remote
This option is only valid for the update command. Instead of using the superproject’s recorded SHA-1 to update the submodule, use the status of the submodule’s remote-tracking branch.
This is equivalent to running git pull
in each submodule, which is generally exactly what you want.
(This was copied from this answer.)