Repeatable builds and maven?

A repeatable build is one you can re-run any number of times, using the same source and the same commands, and get exactly the same build output every time. The ability to do repeatable builds is a cornerstone of every mature release management setup.
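To make that definition a bit more concrete, here is a minimal sketch of what "exactly the same build output" could mean in practice: hash every file in two build output directories and compare the results. The function names `tree_digest` and `is_repeatable` are made up for this illustration, and the choice of SHA-256 is just an assumption; any stable digest would do.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Return a SHA-256 digest over every file path and file content under root."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sort the files so the result does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        data = path.read_bytes()
        # Feed in the relative path, the content length, and the content itself.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(str(len(data)).encode("ascii"))
        digest.update(b"\0")
        digest.update(data)
    return digest.hexdigest()


def is_repeatable(first_output, second_output):
    """True when two build output directories are byte-for-byte identical."""
    return tree_digest(first_output) == tree_digest(second_output)
```

You would build twice from the same checkout into two separate output directories and pass both to `is_repeatable`. Note that this check is deliberately strict: archives such as jars normally record file timestamps, so two builds that "look the same" can still differ at the byte level.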

Of course, lots of people use a silly, much more limited definition of a repeatable build, and are happy as long as “all tests pass”.

Getting to repeatable builds is nearly impossible for mere mortals using Maven 1 (heck, Maven 1 out of the box doesn’t even work anymore, since the ibiblio repository was changed in a way that breaks it), and it is still prohibitively difficult with Maven 2.

Of course, the people who do repeatable builds really well tend to create big all-encompassing solutions that are really hard to use with the tools used in real life, and that only really help when you either do not have a gazillion dependencies, or do the SCM for all your dependencies, too. For the average Java developer, that all breaks down when you find out you can’t quite bootstrap the Sun JDK very well, or are missing some other bit of important ‘source’.

Don’t get me wrong. Maven can be a very useful tool. Moreover, in practice, if you do large-scale Java, you simply tend to run into Maven at some point, and as a release engineer you cannot always do much about that. You must simply realize that release engineering based around Maven is only sensible if you still really, really pay lots of attention to what you’re doing. Like running Maven in offline mode for official builds. And wiping your local repository before you build releases. And keeping archived local repositories around with your distributions. And so on and so forth.
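As a rough sketch of how those habits might fit together (not a recipe): wipe the local repository, repopulate it from a copy you archived alongside an earlier distribution, and then invoke Maven offline against only that directory. The helper names `restore_repository` and `official_build_command` and the default goals are hypothetical; the `--offline` option and the `maven.repo.local` property are the ones from Maven 2 and later, and `mvn` is assumed to be on the `PATH`.

```python
import shutil
from pathlib import Path


def restore_repository(archived_repo, repo_dir):
    """Wipe repo_dir, then repopulate it from an archived copy of a local repository."""
    repo_dir = Path(repo_dir)
    # Start from a clean slate so nothing stale leaks into the release build.
    if repo_dir.exists():
        shutil.rmtree(repo_dir)
    # copytree creates repo_dir (and any missing parents) itself.
    shutil.copytree(archived_repo, repo_dir)
    return repo_dir


def official_build_command(repo_dir, goals=("clean", "install")):
    """Build a Maven command line that stays offline and resolves only from repo_dir."""
    return ["mvn", "--offline", f"-Dmaven.repo.local={Path(repo_dir)}", *goals]
```

The returned list can be handed to `subprocess.run(..., check=True)` from the project directory. Keeping the command construction separate from its execution makes it easy to log the exact invocation next to the release, which is the sort of thing you will want three years later.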

Not paying attention or not thinking these kinds of tricky release engineering things through just isn’t very sensible, not when you’re doing so-called enterprise stuff where you might have to re-run a build three years after the fact. You cannot afford to count on Maven to just magically do the right thing for you. Historically and typically, it doesn’t, at least not quite.

6 thoughts on “Repeatable builds and maven?”

  1. What I find most astonishing, coming from the GNU/Linux world, is the total lack of comprehension for the importance of access to corresponding source code for a binary artifact in order to be able to rebuild it, offline, etc. It’s amazing how many of the people pushing OSGi, etc. just don’t get what a repeatable build is and how their tools prevent it from happening.

  2. I was confused at first, because there was no correlation from your sequence of events to the actual post. None of the referred posts are about reproducibility, and Gump support is about making completely unreproducible builds 🙂

    You are right about the issues in the local repository, however. I recently wrote a proposal for the future to make it a true cache rather than containing build state:

    This, in combination with strictly versioned dependencies and plugins (which can be enforced with the Maven enforcer plugin), does give you all the pieces to ensure a reproducible build to the extent that you trust the repositories. You have a choice about whether you want to use the remote ones, keep your own unchanged one internally, or check them into source control as you wish.

  3. I think I should not disturb, I would make myself unpopular immediately 🙂

    The right way to package a local Maven repository within Debian is to not package a local Maven repository with Debian. Maven is not readily Debian-packageable; it assumes living in private user space too much. Maven users should have their own user-local installs.

    Then, people should help Brett implement that local repository separation, making sure to tweak it so that it becomes easy to package the package-able bits.

  4. Oh, I agree, and I’d be happy to avoid messing around with Maven’s repositories while they are undergoing a design change for as long as I can. 😉

    The current package of maven2 in Debian isn’t used to build any package yet, afaik, so the thread is really just about using the Maven2 package to provide fully reproducible, fully offline builds solely with the packages for Java libraries provided as packages within Debian, as those are the ones we can maintain, make sure they are free software coming with corresponding source code, can be built & patched if necessary with the existing toolchains, etc.

    The push to get Maven into Debian comes from the need to deal with packages that are already packaged in Debian switching to Maven for their build systems.

    So for the build time one needs a way to:

    a) tell Maven not to go out on the net to grab anything

    b) tell Maven where to find the jars corresponding to the dependencies the build requires on the file system, preferably grabbing all the information it needs to set up such a virtual, build-specific repository on the fly from the metadata of the corresponding Debian packages

    c) tell Maven to forget about it all when it’s done, so that no distro-specific-maven-artifacts linger on for the user for the next invocation of maven.

    So, there we are. Which way to go? Tools to generate POMs on the fly for JARs in /usr/share/java from dpkg output (and gems for raven, and poms for ivy, and osgi bundle descriptors, and JAM files for JSR 277, and whatever else the next big Java modularity/build system repository format is)? Or maybe patching Maven’s Wagon (and Ivy, Raven, …) to grok dpkg as a metadata source?

    Gotta love the state of the art. 😉
