Using long-lived stable branches

For the last couple of years I’ve been using subversion on all the commercial software projects I’ve done. At joost and after that at the BBC we’ve usually used long-lived stable branches for most of the codebases. Since I cannot find a good explanation of the pattern online I thought I’d write up the basics.

Working on trunk

Imagine a brand new software project. There are two developers: Bob and Fred. They create a new project with a new trunk and happily code away for a while.

Stable branch for release

At some point (after r7, in fact) the project is ready to start getting some QA, and it’s Bob’s job to cut a first release and get it to the QA team. Bob creates the new stable branch (svn cp -r7 ../trunk ../branches/stable, resulting in r8). Then he fixes one last thing (r9), which he merges to stable (using svnmerge, r10). (Not paying much attention to the release work, Fred’s continued working and fixed a bug in r11.) Bob then makes a tag of the stable branch (svn cp branches/stable tags/1.0.0, r12) to create the first release.

QA reproduce the bug Fred has already fixed, so Bob merges that change to stable (r14) and tags 1.0.1 (r15). 1.0.1 passes all tests and is eventually deployed to live.
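The bookkeeping Bob does here can be sketched in a few lines of Python. This is a toy model, not how svnmerge stores its state: revisions are plain integers, the helper name available_for_merge is made up for illustration, and I am assuming r1 to r7 were all trunk commits. It answers the question Bob asks before each release: which trunk commits have not been merged to stable yet?

```python
# Toy model of the walkthrough above. Revision numbers come from the text;
# the function name and data layout are illustrative assumptions.

def available_for_merge(trunk_commits, branch_point, merged):
    """Return trunk revisions after the branch point that are not yet merged to stable."""
    return sorted(r for r in trunk_commits if r > branch_point and r not in merged)

# r8, r10 and r12 touched stable or tags, so they are not trunk commits.
trunk_commits = {1, 2, 3, 4, 5, 6, 7, 9, 11, 13}
branch_point = 7  # stable was copied from trunk at r7

merged = {9}  # Bob's merge in r10, before tagging 1.0.0
before_r14 = available_for_merge(trunk_commits, branch_point, merged)

merged.add(11)  # Bob's merge in r14, before tagging 1.0.1
after_r14 = available_for_merge(trunk_commits, branch_point, merged)

print(before_r14)
print(after_r14)
```

After the r14 merge the only trunk commit left out of stable is r13, which is exactly the one that is not meant to ship in 1.0.1.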

Release branch for maintenance

A few weeks later, a problem is found on the live environment. Since it looks like a serious problem, Bob and Fred both drop what they were doing (working on the 1.1 release) and hook up on IRC to troubleshoot. Fred finds the bug and commits the fix to trunk (r52), tells Bob on IRC, and then continues hacking away at 1.1 (r55). Bob merges the fix to stable (r53) and makes the first 1.1 release (1.1.0, r54) so that QA can verify the bug is fixed. It turns out Fred did fix the bug, so Bob creates a new release branch for the 1.0 series (r56), merges the fix to the 1.0 release branch (r57) and tags a new release 1.0.2 (r58). QA run regression tests on 1.0.2 and tests for the production bug. All seems ok so 1.0.2 is rolled to live.

Interaction with continuous integration

Every commit on trunk may trigger a trunk build. The trunk build has a stable period of just a few minutes. Every successful trunk build may trigger an integration deploy. The integration deploy has a longer stable period, about an hour or two. It is also frequently triggered manually when an integration deploy failed or deployed broken software.

Ideally the integration deploy takes the artifacts from the latest successful trunk build and deploys those, but due to the way maven projects are frequently set up it may have to rebuild trunk before deploying it.

Every merge to stable may trigger a stable build. The stable build also has a stable period of just a few minutes, but it doesn’t run as frequently as the trunk build simply because merges are not done as frequently as trunk commits. The test deploy is not automatic – an explicit decision is made to deploy to the test environment and typically a specific version or svn revision is deployed.
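To make the "stable period" idea concrete, here is a small Python sketch. It assumes a stable period means "wait until no new trigger has arrived for that long, then run once". The function name, the timestamps, and the 5 and 90 minute values are invented for illustration, and real CI servers have their own rules.

```python
def build_times(trigger_times, stable_period):
    """Return the times (in minutes) at which a job starts.

    A job starts once `stable_period` minutes pass without a new trigger,
    so a burst of triggers produces a single run.
    """
    runs = []
    triggers = sorted(trigger_times)
    for i, t in enumerate(triggers):
        next_trigger = triggers[i + 1] if i + 1 < len(triggers) else None
        # This trigger leads to a run only if the next one arrives too late to reset the timer.
        if next_trigger is None or next_trigger - t >= stable_period:
            runs.append(t + stable_period)
    return runs

commits = [0, 2, 3, 30, 31, 90]                       # commit times on trunk, in minutes
trunk_builds = build_times(commits, stable_period=5)  # short stable period
integration_deploys = build_times(trunk_builds, stable_period=90)  # longer stable period

print(trunk_builds)
print(integration_deploys)
```

With these made-up numbers six commits collapse into three trunk builds, and those three builds collapse into a single integration deploy (treating every build as successful), which is the kind of damping the two different stable periods are meant to give you.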

Reflections

Main benefits of this approach

- Reasonably easy to understand (even for the average java weenie that’s a little scared of merging, or the tester that doesn’t touch version control at all).
- Controlled release process.
- Development (on trunk) never stops, so that there is usually no need for feature branches (though you can still use them if you need to) and communication overhead between developers is limited.
- Subversion commit history tells the story of what actually happened reasonably well.

Why just one stable?

A lot of people seeing this might expect to see a 1.0-STABLE, 1.1-STABLE, and so forth. The BSDs and mozilla do things that way, for example. The reason not to have those comes down to tool support – with a typical svn / maven / hudson / jira toolchain, branching is not quite as cheap as you’d like it to be, especially on large crufty java projects. It’s simpler to work with just one stable branch, and you can often get away with it.

From a communication perspective it’s also just slightly easier this way – rather than talk about “the current stable branch” or “the 1.0 stable branch”, you can just say “the stable branch” (or “merge to stable”) and it is not ever ambiguous.

Why a long-lived stable?

In the example above, Bob and Fred have continued to evolve stable as they worked on the 1.1 release series – for example we can see that Bob merged r46,47,49 to stable. When continuously integrating on trunk, it’s quite common to see a lot of commits to trunk that in retrospect are best grouped together and considered a single logical change set. When you identify and merge those change sets early on, the history of stable tells a neat story of which features were code complete when, and QA can be given reasonably stable code drops early on.

This is usually not quite cherry-picking — it’s more likely melon-picking, where related chunks of code are kept out of stable for a while and then merged as they become stable. The more coarse-grained chunking tends to be rather necessary on “agile” java projects where there can be a lot of refactoring, which tends to make merging hard.
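A rough Python sketch of melon-picking, to show the difference with picking single revisions. Only r46, r47 and r49 come from the example; the ticket labels, r48, r50 and the function names are invented for illustration. Commits are grouped into change sets, and a change set is merged as a whole once the team considers it finished.

```python
from collections import defaultdict

# (revision, ticket) pairs on trunk. r46, r47 and r49 come from the text;
# the ticket labels and r48/r50 are invented for illustration.
trunk_log = [(46, "SEARCH"), (47, "SEARCH"), (48, "BILLING"), (49, "SEARCH"), (50, "BILLING")]
finished = {"SEARCH"}  # change sets the team considers code complete

def change_sets(log):
    """Group trunk revisions into change sets keyed by ticket."""
    groups = defaultdict(list)
    for revision, ticket in log:
        groups[ticket].append(revision)
    return dict(groups)

def ready_to_merge(log, finished):
    """Return the revisions of finished change sets, in commit order."""
    groups = change_sets(log)
    return sorted(r for ticket, revs in groups.items() if ticket in finished for r in revs)

print(ready_to_merge(trunk_log, finished))
```

The unfinished change set (r48 and r50 here) simply stays on trunk until it is done, and is then merged to stable in one go.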

Why not just release from trunk?

The simplest model does not have a stable branch, and it simply cuts 1.0.0 / 1.0.1 / 1.1.0 from trunk. When a maintenance problem presents itself, you then branch from the tag for 1.0.2.

The challenge with this approach is sort-of shown in these examples — Fred’s commit r13 should not make it into 1.0.1. By using a long-lived stable branch Bob can essentially avoid creating the 1.0 maintenance branch. It doesn’t look like there’s a benefit here, but when you consider 1.1, 1.2, 1.3, and so forth, it starts to matter.
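The r13 problem can be shown with a small set comparison in Python. As before this is a toy model under assumptions: revisions are integers, r1 to r7 are all taken to be trunk commits, and trunk_snapshot is a made-up helper, not an svn feature.

```python
# Trunk commits from the walkthrough (assuming r1 to r7 were all on trunk).
trunk_commits = {1, 2, 3, 4, 5, 6, 7, 9, 11, 13}

def trunk_snapshot(revision):
    """Changes contained in a tag copied straight from trunk at `revision`."""
    return {r for r in trunk_commits if r <= revision}

# Releasing from trunk: by the time the r11 fix is needed, r13 is already there.
release_from_trunk = trunk_snapshot(13)

# Releasing from stable: the r7 branch point plus the two merges Bob chose (r9, r11).
release_from_stable = trunk_snapshot(7) | {9, 11}

unwanted = release_from_trunk - release_from_stable
print(sorted(unwanted))
```

The only difference between the two candidate releases is r13, the commit that was never meant to go out with 1.0.1.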

The alternative trunk-only approach (telling Fred to hold off committing r13 until 1.0 is in production) is absolutely horrible for what are hopefully obvious reasons, and I will shout at you if you suggest it to me.

For small and/or mature projects I do often go back to having just a trunk. When you have high quality code that’s evolving in a controlled fashion, with small incremental changes that are released frequently, the need to do maintenance fixes becomes very rare and you can pick up some speed by not having a stable branch.

What about developing on stable?

It’s important to limit commits (rather than merges) that go directly to stable to an absolute minimum. By always committing to trunk first, you ensure that the latest version of the codebase really has all the latest features and bugfixes. Secondly, merging in just one direction greatly simplifies merge management and helps avoid conflicts. That’s relatively important with subversion because its ability to untangle complex merge trees without help is still a bit limited.

But, but, this is all massively inferior to distributed version control!

From an expert coder’s perspective, definitely.

For a team that incorporates people that are not all that used to version control and working with multiple parallel versions of a code base, this is very close to the limit of what can be understood and communicated. Since 80% of the cost of a typical (commercial) software project has nothing to do with coding, that’s a very significant argument. The expert coders just have to suck it up and sacrifice some productivity for the benefit of everyone else.

So the typical stance I end up taking is that those expert coders can use git-svn to get most of what they need, and they assume responsibility for transforming their own many-branches view back to a trunk+stable model for consumption by everyone else. This is quite annoying when you have three expert coders that really want to use git together. I’ve not found a good solution for that scenario; the cost of setting up decent server-side git hosting is quite difficult to justify even when you’re not constrained by audit-ability rules.

But, but this is a lot of work!

Usually when I explain this model to a new group of developers, they realize at some point that someone (like Bob) or some people will have to do the work of merging changes from trunk to stable, and that the tool support for stuff like that is a bit limited. They’ll also need extra hudson builds and worry a great deal about how on earth to deal with maven’s need to have the version number inside the pom.xml file.

To many teams it just seems easier to avoid all this branching mess altogether, and instead they will just be extra good at their TDD and their agile skills. Surely it isn’t that much of a problem to avoid committing for a few hours and working on your local copy while people are sorting out how to bake a release with the right code in it. Right?

The resolution usually comes from the project managers, release managers, product managers, and testers. In service-oriented architecture setups it can also come from other developers. All those stakeholders quickly realize that all this extra work that the developers don’t really want to do is exactly the work that they do want the developers to do. They can see that if the developers spend some extra effort as they go along to think about what is “stable” and what isn’t, the chance of getting a decent code drop goes up.

13 thoughts on “Using long-lived stable branches”

jrep says:

February 19, 2010 at 19:15

Great article, love the graphics!

Another thought for your last section: when the project under consideration is part of a larger program, many projects working this way, contributing to a central integration and release, a new factor arises, a sort of “diminishing returns” thing: the acceptable level of process failures (bad commits that either get into the release, or hold up the release) is inversely proportional to the number of contributing projects. If your release process can stand ten such errors a year, and your development process produces five such errors a year — you’re in great shape. But if there are ten projects contributing to your release, then you’re seeing fifty errors, and you’re way way way outside your release-process tolerance. This becomes yet another pressure towards improving the stability of “stable.”
Marshall says:

February 19, 2010 at 23:56

Hi Leo – Can you say what tooling you used to produce those nice pictorial diagrams? Or what tooling you would recommend for this? Thanks!
1. Leo Simons says:
  
  February 22, 2010 at 11:57
  
  Hi Marshall, I used OmniGraffle Pro 5 which is a mac-only program. If you have a mac, well I cannot recommend it more highly — it’s arguably the best semi-structured drawing program in the world!
Jlamande says:

February 23, 2010 at 10:34

Hi, nice shot !
I have a question about a long living project.
Imagine that the trunk is getting filled with code for the new “2.0” version and that some refactoring has been done or features have been reviewed. A problem occurred in the live environment and the fix based on trunk code may not be accurate (sorry, I hope this word means something 🙂 ) for the stable branch. Bob’s job won’t come down to a simple merge and could even be a new development.
I’m sure you have already met this situation ;-). How would you proceed?

You talked about bug fixing but another situation is a new minor feature. If the code is committed to trunk, merging it to stable may be painful, or worse, harmful, if other code in the same components has evolved on trunk. Would you create a feature branch? That is not exactly what was expected when working on trunk and merging to the stable branch.

Thx
1. Leo Simons says:
  
  February 23, 2010 at 15:31
  The “2.0” scenario you’re describing is basically one to avoid when you can. Instead, create a 1.1, 1.2, 1.3, etc. There’s a lot of pretty reasonable ways you can manage adding radical new code or do big refactorings without doing “actual” branching or forking — for example look at Paul Hammant’s branch by abstraction.
  
  When you are in a close-to-a-rewrite situation, you can’t depend on svn to help very much anymore managing the versions. Instead, if you still want stable branches everywhere, you might end up with
  - branches/1.x-stable
  - branches/1.x-trunk
  - branches/2.x-stable
  - 2.x-trunk
  So in this case you (1) make the fix on 2.x, (2) merge the fix to 2.x-stable, (3) make a different fix on 1.x, (4) merge that fix to 1.x-stable. I tend to feel bad when I have to do that 🙂
  
  For the second case, where you have to merge a new feature to stable and it doesn’t merge cleanly, well, historically I would err towards the side of just merging more code (not just the feature) to the point that the change does merge cleanly. If you have a reasonably disciplined development process you can very often do that without problems. If not, finish up that piece of code so that you can cleanly merge it, and then merge it.
fabrice says:

March 2, 2010 at 17:20

Hi

I read an article by Martin Fowler and this is his conclusion:

http://martinfowler.com/bliki/FeatureBranch.html

“On the whole, however, I don’t think cherry-picking with the VCS is a good idea. ”

So prefer to think about a strong design (branch by abstraction):
“Feature Branching is a poor man’s modular architecture, instead of building systems with the ability to easy swap in and out features at runtime/deploytime they couple themselves to the source control providing this mechanism through manual merging. ”

So I am not sure your article about a stable branch and cherry-picking is a good idea either. It seems hard to remember which revisions have been merged to the stable branch, etc…
1. Leo Simons says:
  
  March 2, 2010 at 19:51
  
  Hey fabrice, what can I tell you, Martin is wrong, it happens 🙂
  
  Martin’s article is mostly about being against feature branching and being for continuous integration. But the world is not black and white. The model I described above has continuous integration (on trunk), and it is definitely not feature branching.
  
  Hmm, looking more closely, in fact Martin’s description of the CI model is a bit of a mess – because his CI and PI models still have long-lived branches per developer (instead of just a per-developer working copy). I think that’s a mistake, and I doubt anybody that uses centralized version control actually does that in practice.
  1. fabrice says:
    
    March 3, 2010 at 10:31
    
    Hi Leo
    
    In fact I read your article a second time, and I understand the benefits of a stable branch. You propose a nice mix between a trunk-only approach and feature branches. Yesterday I discussed this approach with some development team leads.
    Nowadays they commit everything to trunk (CI approach, I have a CI server that is triggered by each commit). When they need to release to QA they tag from trunk, and create a maintenance branch ONLY when needed (when parallel development is running on trunk, in order not to interrupt development of a 2nd sprint).
    About the stable branch concept, they approve of it, but it costs too much time: someone may have to work a lot on cherry-picking from trunk to the stable branch.
    You are aware of that, but as you said it is a top-down approach that must be led.
    To finish, I take the example from Jlamande’s comment: you need to fix a production bug and trunk has a lot of code that differs from the production code. You explain that a solution could be to merge a larger chunk of history to the release branch in order to make the merge cleaner. But the release branch must accept only corrections and not other improvements.
    Any idea ?
fabrice says:

March 3, 2010 at 10:32

To complete my previous message: a solution could be to release to production more frequently, so that the production code does not diverge too much from the current code?
1. Leo Simons says:
  
  March 3, 2010 at 13:14
  
  Hey fabrice, I don’t know what “more frequent” means in your case. For relatively new codebases with several full-time devs I find that releasing into production at least once a month is a good idea. If you cannot do that the above model probably breaks down and you need to start running maintenance branches. Managing all of those and the merging between them has a significant cost, which is a cost that’s nice to avoid where possible.
  
  For older codebases with very few changes you will only very infrequently have a must-fix bug affecting current production and you can manage a few months of divergence without headache.
  1. fabrice says:
    
    March 3, 2010 at 19:04
    
    Hmm, OK. So if I take my director’s and customers’ needs into account, why release a new version to production when:
    – No new features are requested
    – No bug fixes are expected
    If I push to production a release that is not needed (except for technical versioning reasons) and it includes some regression, I cannot imagine the consequences 🙂
    
    Second, if a bug is detected in a production release, I think it is faster to fix it directly on the branch and port the fix to trunk, because you can quickly deliver the fixed production release. The other way around, that is not the case.
    So for RELEASE branches, I think it is a good thing to fix on the branch and port the fix to trunk.
    Do you agree?
Leo Simons says:

March 4, 2010 at 1:32

no, I don’t agree 🙂

You should first find out about this thing that you cannot imagine, and contrast it with the available alternatives. My assertion is that life is better when you fail early and fix quickly, and your director and your customers will probably agree (even if they don’t quite understand the choice you present them).
1. fabrice says:
  
  March 4, 2010 at 10:01
  
  I agree with you 100%, however it is a utopian world where you can detect every bug early.
  In my case test coverage on apps is around 0%…. So I agree with you, but a precondition is to force developers to write tests (and force managers to understand that tests are NEEDED)