How to use mercurial subrepos for shared components and dependencies?

Submitted by China☆狼群 on 2019-11-30 12:59:00
Clare Macrae

This may not be the answer you were looking for, but we have recent experience of novice Mercurial users using sub-repos, and I've been looking for an opportunity to pass on our experience...

In summary, my advice based on experience is: however appealing Mercurial sub-repos may be, do not use them. Instead, find a way to lay out your directories side-by-side, and to adjust your builds to cope with that.

However appealing it seems to be to tie together revisions in the sub-repo with revisions in the parent repo, it just doesn't work in practice.

During all the preparation for the conversion, we received advice from multiple different sources that sub-repos were fragile and not well-implemented - but we went ahead anyway, as we wanted atomic commits between repo and sub-repo. The advice - or my understanding of it - talked more about the principles rather than the practical consequences.

It was only once we went live with Mercurial and a sub-repo that I really understood the advice properly. Here (from memory) are examples of the sorts of problems we encountered.

  • Your users will end up fighting the update and merge process.
  • Some people will update the parent repo and not the sub-repo
  • Some people will push from the sub-repo, and .hgsubstate won't get updated.
  • You will end up "losing" revisions that were made in the sub-repo, because someone will manage to leave the .hgsubstate in an incorrect state after a merge.
  • Some users will get into the situation where the .hgsubstate has been updated but the sub-repo hasn't, and then you'll get really cryptic error messages, and will spend many hours trying to work out what's going on.
  • And if you do tagging and branching for releases, the instructions for how to get this right for both parent and sub-repo will be many dozens of lines long. (And I even had a nice, tame Mercurial expert help me write the instructions!)
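
To make the .hgsubstate problems above a little more concrete, here is a minimal sketch of a check that compares what the parent repo has recorded against what is actually checked out in each sub-repo. It relies on .hgsubstate holding one "<changeset-hash> <path>" pair per line. The function names are my own, and current_node assumes hg is on the PATH; treat it as an illustration rather than a tested tool.

```python
import subprocess


def parse_hgsubstate(text):
    """Parse .hgsubstate content: one '<changeset-hash> <path>' pair per line."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        node, path = line.split(" ", 1)
        state[path] = node
    return state


def current_node(path):
    """Ask hg which changeset is checked out in the repo at `path`.

    Assumption: `hg` is on the PATH and `path` is a Mercurial repository.
    """
    result = subprocess.run(
        ["hg", "--repository", path, "log", "--rev", ".", "--template", "{node}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def find_mismatches(recorded, actual):
    """Return {path: (recorded_node, actual_node)} for sub-repos that differ.

    `actual` maps sub-repo path -> checked-out changeset hash; a path that is
    missing from `actual` is reported with None as its actual node.
    """
    mismatches = {}
    for path, node in sorted(recorded.items()):
        current = actual.get(path)
        if current != node:
            mismatches[path] = (node, current)
    return mismatches
```

In practice you would read .hgsubstate from the parent repo, build `actual` by calling current_node for each path, and show any mismatches to the user before they commit or push.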

All of these things are annoying enough in the hands of expert users - but when you are rolling out Mercurial to novice users, they are a real nightmare, and the source of much wasted time.

So, having put in a lot of time to get a conversion with a sub-repo, several weeks later we then converted the sub-repo to a repo. Because we had large amounts of history in the conversion that referred to the sub-repo, via .hgsubstate, it's left us with something much more complicated.

I only wish I'd really appreciated the practical consequences of all the advice much earlier on, e.g. in Mercurial's Features of Last Resort page:

But I need to have managed subprojects!

Again, don't be so sure. Significant projects like Mozilla that have tons of dependencies do just fine without using subrepos. Most smaller projects will almost certainly be better off without using subrepos.


Edit: Thoughts on shell repos

With the disclaimer that I don't have any experience of them...

No, I don't think many of them are. You are still using sub-repos, so all the same user issues apply (unless you can provide a wrapper script for every step, of course, to remove the need for humans to supply the correct options to handle sub-repos.)

Also note that the wiki page you quoted does list some specific issues with shell repos:

  • overly-strict tracking of relationship between project/ and somelib/
  • impossible to check or push project/ if somelib/ source repo becomes unavailable
  • lack of well-defined support for recursive diff, log, and status
  • recursive nature of commit surprising

Edit 2 - do a trial, involving all your users

The point at which we really started realising we had an issue was once multiple users started making commits, and pulling and pushing - including changes to the sub-repo. For us, it was too late in the day to respond to these issues. If we'd known about them sooner, we could have responded much more easily and simply.

So at this point, the best advice I think I can offer is to recommend that you do a trial run of the project layout before the layout is carved in stone.

We left the full-scale trial until too late to make changes, and even then people only made changes in the parent repo, and not the sub-repos - so we still didn't see the full picture until too late.

In other words, whatever layout you consider, create a repository structure in that layout, and get lots of people making edits. Try to put enough real code into the various repos/sub-repos so that people can make real edits, even though they will be throw-away ones.

Possible outcomes:

  • You might find it all works fine - in which case, you'll have spent some time to gain certainty.
  • On the other hand, you might identify issues much more quickly than by trying to work out in advance what the outcomes would be.
  • And your users will learn a lot too.
codyzu

Question 1:

This command, when executed in the parent "shell" repo, will traverse all subrepos and list the changesets from the default pull location that are not present locally:

hg incoming --subrepos

The same thing can be accomplished by clicking on the "Incoming" button on the "Synchronize" pane in TortoiseHg if you have the "--subrepos" option checked (on the same pane).

Thanks to the users in the mercurial IRC channel for helping here.

Questions 2 & 3:

First I need to modify my repo structures so that the parent repos are truly "shell" repos as recommended on the hg wiki. I will take this to the extreme and say that the shell should contain no content, only subrepos as children. In summary, rename src to main, move docs into the subrepo under main, and change the prod folder to a subrepo.

SHARED1_SLN:

SHARED1_SLN-+-libs----NLOG
            |
            +-misc----KEY
            |
            +-main----SHARED1-+-docs
            |                 +-proj1
            |                 +-proj2
            |
            +-tools---NANT

SHARED2_SLN:

SHARED2_SLN-+-libs--+-SHARED1-+-docs
            |       |         +-proj1
            |       |         +-proj2
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-main----SHARED2-+-docs
            |                 +-proj3
            |                 +-proj4
            |
            +-tools---NANT            

PROD_SLN:

PROD_SLN----+-libs--+-SHARED1-+-docs
            |       |         +-proj1
            |       |         +-proj2
            |       |
            |       +-SHARED2-+-docs
            |       |         +-proj3
            |       |         +-proj4
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-main----PROD----+-docs
            |                 +-proj5
            |                 +-proj6
            |
            +-tools---NANT
  1. All shared libs and products have their own repo (SHARED1, SHARED2, and PROD).
  2. If you need to work on a shared lib or product independently, there is a shell available (my repos ending with _SLN) that uses hg to manage the revisions of the dependencies. The shell is only for convenience because it contains no content, only subrepos.
  3. When rolling a release of a shared lib or product, the developer should list all of the dependencies and their hg revs/changesets (or preferably human-friendly tags) that were used to create the release. This list should be saved in a file in the repo for the lib or product (SHARED1, SHARED2, or PROD), not the shell. See Note A below for how this could solve Questions 2 & 3.
  4. If I roll a release of a shared lib or product, I should put matching tags in the project's repo and its shell for convenience. However, if the shell gets out of whack (a concern expressed from real experience in @Clare 's answer), it really should not matter because the shell itself is dumb and contains no content.
  5. Visual Studio sln files go into the root of the shared lib or product's repo (SHARED1, SHARED2, or PROD), again, not the shell. The result is that if I include SHARED1 in PROD, I may end up with some extra solutions that I never open, but it doesn't matter. Furthermore, if I really want to work on SHARED1 and run its unit tests (while working in the PROD_SLN shell), it's really easy: just open the said solution.

Note A:

In regards to point 3 above, if the dependency file uses a format similar to .hgsub but with the addition of the rev/changeset/tag, then getting the dependencies could be automated. For example, I want SHARED1 in my new product. Clone SHARED1 to my libs folder and update to the tip or the last release label. Now, I need to look at the dependencies file and a) clone the dependency to the correct location and b) update to the specified rev/changeset/tag. Very feasible to automate this. To take it further, it could even track the rev/changeset/tag and alert the developer if there is a dependency conflict between shared libs.
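
A rough sketch of what that automation might look like. The file format here is hypothetical: the "path = source" layout of .hgsub with " @ rev" appended. The example URL and function names are invented for illustration. The script only plans the hg commands rather than running them, so the plan can be reviewed first.

```python
import posixpath


def parse_dependencies(text):
    """Parse lines of the (hypothetical) form 'path = source @ rev'.

    Blank lines and lines starting with '#' are skipped.
    Returns a list of (path, source, rev) tuples in file order.
    """
    deps = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        path, rest = line.split("=", 1)
        # Split on the last '@' so sources such as ssh://user@host/repo survive.
        source, rev = rest.rsplit("@", 1)
        deps.append((path.strip(), source.strip(), rev.strip()))
    return deps


def plan_commands(deps, root):
    """Return the hg commands that would a) clone each dependency to its
    location under `root` and b) update it to the recorded rev/changeset/tag."""
    commands = []
    for path, source, rev in deps:
        target = posixpath.join(root, path)
        commands.append(["hg", "clone", "--noupdate", source, target])
        commands.append(["hg", "--repository", target, "update", "--rev", rev])
    return commands


if __name__ == "__main__":
    example = """
    # dependencies of SHARED1 (made-up location and tag)
    NLOG = https://hg.example.invalid/NLOG @ v2.0
    """
    for cmd in plan_commands(parse_dependencies(example), "libs"):
        print(" ".join(cmd))
```

Each planned command could then be passed to subprocess.run. A real tool would also need to handle the case where the dependency has already been cloned (pull, then update).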

A hole remains if Alice is actively developing SHARED1 while Bob is developing PROD. If Alice updates SHARED1_SLN to use NLog v3.0, Bob may not ever know this. If Alice updates her dependency file to reflect the change then Bob does have the info, he just has to be made aware of the change.
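
Making Bob aware could be as simple as a check that compares the revisions each component pins. A minimal sketch, assuming the dependency files have already been read into plain dicts; the v2.0 and v1.0 tags below are made up to go with the NLog v3.0 example, and find_conflicts is just an illustrative name:

```python
from collections import defaultdict


def find_conflicts(pins):
    """Find dependencies that are pinned to more than one revision.

    `pins` maps component name -> {dependency name: rev/changeset/tag}.
    Returns {dependency: {rev: [components pinning that rev]}} containing
    only the dependencies on which the components disagree.
    """
    by_dep = defaultdict(lambda: defaultdict(list))
    for component, deps in pins.items():
        for dep, rev in deps.items():
            by_dep[dep][rev].append(component)
    return {
        dep: {rev: sorted(users) for rev, users in revs.items()}
        for dep, revs in by_dep.items()
        if len(revs) > 1
    }


if __name__ == "__main__":
    pins = {
        "SHARED1": {"NLOG": "v3.0", "KEY": "v1.0"},  # Alice moved to NLog v3.0
        "PROD": {"NLOG": "v2.0", "KEY": "v1.0"},     # Bob is still on v2.0
    }
    for dep, revs in find_conflicts(pins).items():
        print(dep, "is pinned inconsistently:", revs)
```

Run as part of the build or a pre-commit step, a check like this would flag the NLOG mismatch as soon as Bob pulls Alice's updated dependency file.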

Bigger Questions 1 & 4:

I believe that this is a source control issue and not something that can be solved with a dependency management tool, since those generally work with binaries and only get dependencies (they don't allow committing changes back to the dependencies). My dependency problems are not unique to Mercurial. From my experience, all source control tools have the same problem. One solution in SVN would be to just use svn:externals (or svn copies) and recursively have every component include its dependencies, creating a possibly huge tree to build a product. However, this falls apart in Visual Studio, where I really only want to include one instance of a shared project and reference it everywhere. As implied by @Clare 's answer and Greg's response to my email to the hg mailing list, keep components as flat as possible.

Bigger Questions 2 & 3:

There is a better structure as I have laid out above. I believe we have a strong use case for using subrepos and I do not see a viable alternative. As mentioned in @Clare 's answer, there is a camp that believes dependencies can be managed without subrepos. However, I have yet to see any evidence or actual references to back this statement up.

Bigger Question 5:

Still open to better ideas...
