Saturday, December 12, 2009

Versions

Looking at the raging debate about versions again, I can't help but feel that many people out there do not understand what OSGi versions are, or what versions are intended to achieve in general. It is not the syntax that is important; it is the standardization of the semantics.

Versions are a language designed to let two parties communicate over the barrier of time. It is like a Domain Specific Language between artifacts that evolve over time. By having fixed version syntax and semantics, an importer can express how it feels about future changes in the exporter. These semantics tell the exporter what version to use when it evolves. The version indicates to tools how different importers and exporters can be combined in a system.

The simplest solution to versioning is to use a fixed identity. A imports B version 1. If B changes, it becomes version 2 and A must be recompiled, but this changes A, so it must also increment its version, ad nauseam. Such systems only work when all the software is in a single build integrated with the deployment. Every deploy is then based upon a full new build; you're basically always at the latest version. However, when you use third-party packages or you sell your software, latest-version systems tend to be unworkable because minute changes ripple through the whole system.

We need some oil to ease the friction: meet version ranges. A version range allows A to import B from version 1 up to, but not including, version 2. This way B can increase its version to 1.1, 1.2, 1.3, etc. without requiring A to be recompiled, stopping any rippling effects dead in their tracks. By specifying an import range of [1,2), A relies on B to properly version future releases. If a change is backward compatible, B can increase the minor part of the version (the second number) but does not have to increase the major part (the first number). If a change is made that would break existing code, then B increases the major number and resets the minor number. Such a breaking change would get version 2.0.
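To make the range notation concrete, here is a minimal sketch of how a tool might test whether a version falls inside a range such as [1,2). The class and method names are hypothetical and written only for illustration; this is not the OSGi framework API, and it ignores the qualifier.

```java
// Hypothetical sketch; names are illustrative and not part of any OSGi API.
final class RangeSketch {

    // Parses "1", "1.4" or "1.4.2" into {major, minor, micro}.
    // Missing parts default to 0; anything after the micro part is ignored.
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] result = new int[3];
        for (int i = 0; i < parts.length && i < 3; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    // Compares major, then minor, then micro, numerically.
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    // Range syntax: '[' or ']' marks an inclusive bound, '(' or ')' an exclusive one.
    static boolean includes(String range, String version) {
        String r = range.trim();
        boolean lowInclusive = r.charAt(0) == '[';
        boolean highInclusive = r.charAt(r.length() - 1) == ']';
        String[] bounds = r.substring(1, r.length() - 1).split(",");
        int[] v = parse(version);
        int low = compare(v, parse(bounds[0]));
        int high = compare(v, parse(bounds[1]));
        return (lowInclusive ? low >= 0 : low > 0)
            && (highInclusive ? high <= 0 : high < 0);
    }

    public static void main(String[] args) {
        System.out.println(includes("[1,2)", "1.3")); // true: compatible update
        System.out.println(includes("[1,2)", "2.0")); // false: breaking change
    }
}
```

With this check, A keeps resolving against B 1.1, 1.2 and 1.3, and stops resolving the moment B announces a breaking change by moving to 2.0.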

We skimped a bit on backward compatibility; this is not a well-defined concept and is partly in the eye of the beholder. A new method on an interface is backward compatible for code using that interface but breaks an implementer of that interface. How is this difference handled? Well, we can put the semantics on the minor part of the version. Implementers use a range that spans a single minor version instead of a single major version. So if A implemented interfaces from B it would not use [1,2) but [1.0,1.1); if it only used the interfaces in B it would use [1,2).

Backward compatible does not mean forward compatible: if A was compiled against 1.4, then all bets are off if it is bound to an earlier version such as 1.3. It is therefore important that A requires the base version it was compiled against as a minimum. That is, [1.4,2).

However, we make many changes that we want to deploy but that we do not want reflected in the dependency. If we force all dependencies to be the latest version we end up with the same situation we had when we had a single number; any change in version ripples through the whole system again. That is why the OSGi version has a micro number, the third part. It indicates that you made a change but that this change does not affect any clients. It is a small bug fix or minute functional enhancement. Obviously deployments should have the latest fixes installed, but this is not enforced, to keep things manageable in the field. So even though A might be recompiled against 1.4.2, it would put out a version range that did not include the micro part: [1.4,2).
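The rules of the last few paragraphs (floor at the compiled version, drop the micro part, narrow the range for implementers) can be sketched in a few lines. This is only an illustration under those stated rules; the helper name importRange is hypothetical and this is not how bnd or any other tool is actually implemented.

```java
// Hypothetical sketch of the import-range policy described above.
final class ImportRangeSketch {

    // compiledAgainst: the exporter version seen at compile time, e.g. "1.4.2".
    // implementsInterfaces: true if the importer implements the exporter's
    // interfaces, false if it only uses them.
    static String importRange(String compiledAgainst, boolean implementsInterfaces) {
        String[] parts = compiledAgainst.trim().split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;

        // The floor keeps major.minor and deliberately drops micro and qualifier.
        String floor = major + "." + minor;

        // Implementers break on a minor bump, plain users only on a major bump.
        String ceiling = implementsInterfaces
            ? major + "." + (minor + 1)
            : String.valueOf(major + 1);

        return "[" + floor + "," + ceiling + ")";
    }

    public static void main(String[] args) {
        System.out.println(importRange("1.4.2", false)); // [1.4,2)
        System.out.println(importRange("1.4.2", true));  // [1.4,1.5)
    }
}
```

Note that recompiling A against 1.4.3 yields exactly the same range as 1.4.2, which is what keeps bug-fix releases from rippling through the system.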

Last, and in this case also least, there is often a need to know which of two artifacts is the newest, or has a certain state. This is left to the qualifier.
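As a rough illustration of how a qualifier can break a tie, the sketch below orders two version strings by major, minor and micro numerically and only then by qualifier. It assumes the qualifier is compared as a plain string and that a missing qualifier sorts first, which is my understanding of the OSGi ordering; the class name is hypothetical.

```java
// Hypothetical sketch; assumes plain string comparison of the qualifier.
final class VersionOrderSketch {

    // Returns a negative number, zero, or a positive number, like Comparable.
    static int compare(String a, String b) {
        // Limit of 4 keeps any further dots inside the qualifier itself.
        String[] x = a.trim().split("\\.", 4);
        String[] y = b.trim().split("\\.", 4);

        // Numeric comparison of major, minor, micro; missing parts count as 0.
        for (int i = 0; i < 3; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }

        // Only when the numbers tie does the qualifier decide.
        String qx = x.length > 3 ? x[3] : "";
        String qy = y.length > 3 ? y[3] : "";
        return qx.compareTo(qy);
    }

    public static void main(String[] args) {
        // Two builds of 1.4.2 with date stamps as qualifiers.
        System.out.println(compare("1.4.2.20091212", "1.4.2.20091101") > 0); // true
    }
}
```

A date stamp in a fixed-width format is a convenient qualifier here because string order and chronological order then coincide.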

From the previous description it is clear that a simple version syntax can have surprisingly rich semantics. Having these semantics specified and agreed upon allows tools to automate parts of the process of maintaining the versions of the artifacts. This is a good thing because users are quite horrible at maintaining versions. For example, the bnd tool can automatically set the import range depending on the export version and take implements versus uses into account, and I am working on automatically calculating export version changes based on a previous release. Eclipse PDE also has extensive versioning support, and I expect bundlor to do clever things as well. And last, but not least, it looks like Sonatype will also standardize on OSGi versions. Without fixed syntax and semantics, all these tools would have to use proprietary mechanisms.
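The exporter's side of the contract is just as mechanical, which is why it lends itself to tooling. The sketch below picks the next export version from the kind of change made since the previous release. It is an illustration only: the Change categories and the next method are hypothetical, resetting the lower parts to 0 is my assumption, and this is not how bnd calculates anything.

```java
// Hypothetical sketch of the exporter-side version bump rules.
final class BumpSketch {

    enum Change {
        BREAKING,    // breaks existing users: bump major
        COMPATIBLE,  // backward compatible for users: bump minor
        FIX          // no effect on clients: bump micro
    }

    static String next(String previous, Change change) {
        String[] p = previous.trim().split("\\.");
        int major = Integer.parseInt(p[0]);
        int minor = p.length > 1 ? Integer.parseInt(p[1]) : 0;
        int micro = p.length > 2 ? Integer.parseInt(p[2]) : 0;

        switch (change) {
            case BREAKING:
                // Assumption: lower parts are reset to 0 on a major bump.
                return (major + 1) + ".0.0";
            case COMPATIBLE:
                return major + "." + (minor + 1) + ".0";
            default:
                return major + "." + minor + "." + (micro + 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(next("1.4.2", Change.COMPATIBLE)); // 1.5.0
    }
}
```

The hard part a real tool has to solve is deciding which category a change belongs to; once that is known, the new version follows directly from the agreed semantics.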

So yes, it can be fun to use π for version 3.14, but it kills any hope of tools that understand the semantics and take the error-prone chore of maintaining versions out of our hands.

Peter Kriens