Friday, May 6, 2011

Hudson at Eclipse, Versioning, and Semantics

Hudson is moving to Eclipse, and that is great news. I hope they will now seriously adopt the OSGi bundle as their plugin model and leverage the OSGi µservice model. Knowing that Stuart McCulloch is heavily involved makes me confident that this will work out fine.

One of the early discussions that popped up was a mail thread about versioning. The discussion is moving in the right direction (adopting the OSGi version model) and is correct about the syntax, but it seems to assume that compatibility can be discussed bi-laterally. I am afraid they are not alone: most people assume that only two parties are involved in versioning (though in the case of marketing versions the assumption is just one party). However, the key insight we had in OSGi was that it is a tri-lateral problem.

It is a tri-lateral problem because today interface based programming is not only very popular, it is a best practice. The key advantage of interface based programming is that the Provider and Consumer are bound to an API (the interface) but not to each other. That is, interface based programming removes implementation dependencies. However, though this model significantly reduces the coupling, it unfortunately complicates versioning in a rather subtle way. So subtle that its implications are ill understood.

With interface based programming there are 3 independent parties:
  1. C - Consumer
  2. A - API
  3. P - Provider
Both C and P depend on A. However, the key insight is that they depend differently on A! Backward compatibility is always about the relation between C and A. When we make an update to A we are usually careful that any consumers are not broken. However, we do not have this luxury for Providers. Almost any change in A requires a change in P because P implements the contract in A.

So how do we version this? Versions are a mechanism to communicate the evolution path of an entity. If I have an entity X that depends on entity Y, how do I decide that Y is compatible with X's assumptions? If X is a Consumer, backward compatibility is relatively easy to provide. If X is a Provider, backward compatibility is much more restricted. We therefore need to encode the following cases in a version:
  1. Backward compatibility with all
  2. Backward compatibility with Consumers
  3. Backward compatibility with Providers
This model works very well with a 3-part version model:
  1. micro - Changes in the micro part do not affect Consumers nor Providers
  2. minor - Changes in the minor part affect Providers but not Consumers
  3. major - Changes in the major part affect Providers and Consumers
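As a rough illustration of the three-part scheme above, the following sketch maps "who is affected by a change" to the part of the version that gets incremented. The class `VersionBump` and the enum `Impact` are hypothetical names invented for this example, not part of OSGi or bnd, and resetting the lower parts to zero on a bump is an assumed convention.

```java
// Hypothetical helper for illustration only; not an OSGi or bnd API.
final class VersionBump {

    /** Who is affected by a change to an API package. */
    enum Impact { NOBODY, PROVIDERS_ONLY, PROVIDERS_AND_CONSUMERS }

    /** Returns the next {major, minor, micro} triple for a change with the given impact. */
    static int[] next(int major, int minor, int micro, Impact impact) {
        switch (impact) {
            case PROVIDERS_AND_CONSUMERS:
                // Breaks everybody: bump major, reset the rest (assumed convention).
                return new int[] { major + 1, 0, 0 };
            case PROVIDERS_ONLY:
                // Breaks implementers of the API only: bump minor, reset micro.
                return new int[] { major, minor + 1, 0 };
            default:
                // Affects neither Consumers nor Providers: bump micro.
                return new int[] { major, minor, micro + 1 };
        }
    }
}
```

For example, a change to version 1.3.2 that only Providers must react to would yield 1.4.0 under these assumptions.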
In OSGi, we therefore tell people that Providers should import with a version range that limits changes to the micro part only and starts at the version that is compiled against, for example [1.3,1.4). Consumers can be more relaxed: [1.3,2). Note that this is a very exact definition of the parts of a version, one that can largely be mapped to the binary compatibility rules of Java with a bit of help. bnd can remove most of the manual aspects of maintaining these versions.
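The import policy above can be sketched in a few lines of plain Java. This is a minimal illustration, not how bnd or an OSGi framework implements it: the class `ImportPolicy`, the enum `Role`, and both methods are hypothetical names made up for this example.

```java
// Hypothetical helper for illustration only; not an OSGi or bnd API.
final class ImportPolicy {

    /** The role the importing bundle plays with respect to the API package. */
    enum Role { CONSUMER, PROVIDER }

    /** Builds the import range for an importer compiled against major.minor. */
    static String importRange(int major, int minor, Role role) {
        String floor = major + "." + minor;
        String ceiling = (role == Role.PROVIDER)
                ? major + "." + (minor + 1)   // Providers break on a minor change
                : String.valueOf(major + 1);  // Consumers break only on a major change
        return "[" + floor + "," + ceiling + ")";
    }

    /**
     * True if an exported version major.minor.x falls in the importer's range.
     * The micro part never matters here because the floor is major.minor.0.
     */
    static boolean accepts(int compiledMajor, int compiledMinor, Role role,
                           int major, int minor) {
        if (major != compiledMajor) return false;  // a major change breaks everybody
        if (minor < compiledMinor) return false;   // older than what we compiled against
        return role == Role.CONSUMER || minor == compiledMinor;
    }
}
```

With these assumptions, `importRange(1, 3, Role.PROVIDER)` gives `[1.3,1.4)` and `importRange(1, 3, Role.CONSUMER)` gives `[1.3,2)`, matching the ranges in the text. A Consumer compiled against 1.3 accepts an exported 1.4, while a Provider compiled against 1.3 does not.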

The only thing now left is the granularity of the version. Do we put the version on the JAR (bundle) or the interface class? If the JAR contained only API then the JAR would be a good granularity. However, in practice JARs are often not very cohesive: besides their API they also carry implementation classes and convenience dependencies. That is, their fan-out is uncomfortably large, causing Maven to download the internet. Classes are too small because most APIs consist of a set of classes, and versioning these separately is not sensible. The perfect granularity to depend on is therefore the package. A package is a cohesive set of classes and therefore has the right granularity for an API.

I hope that this blog clarifies why it is important to understand versioning in the light of modern programming.

Peter Kriens

3 comments:

  1. Hi Peter,

    I am quite convinced too that proper versioning can help give confidence when updating libraries.

    You might be interested in a small Java library I created to help apply semver principles (http://semver.org) to Java projects. It also includes a Maven plugin to check for backward compatibility during the Maven lifecycle.
    See https://github.com/jeluard/semantic-versioning (documentation http://jeluard.github.com/semantic-versioning).

    Julien

  2. Interesting, though the semantic versioning paper you reference does not make the distinction between the consumer and provider of an API that I tried to explain in this blog. Notice also the work in bnd and in bndtools.

  3. Right. I introduced a similar concept in my API (based on work done by the bndtools guys).
    See Checker#CompatibilityType.
