Friday, March 30, 2012

What happened to pre-release versions?

In RFC 175, we proposed to introduce two new things for Core Release 5. One was a change to the version syntax to support pre-release versions and the other was to introduce a VersionRange class to provide a common implementation of version range operations. When the Core Release 5 spec was completed, it included the VersionRange class but did not include pre-release version support. So, what happened?

Pre-release, aka. snapshot, versions seemed like a good idea when first proposed. From the RFC:
During development of a bundle (or package), there is a period of time where the bundle has not been declared final. That is, the bundle has a planned version number once final, but that version number is not practically consumed until the bundle has been declared final. However, during development of the bundle, it must have a version number. This version number must be larger than the version number of the previous final version of the bundle but less than the version number intended for the bundle once final.
There are several usage patterns for version numbers which have emerged to deal with this problem. For example, some use an odd/even version number for the minor version to differentiate between development versions and final (release) versions. Some also place the build timestamp in the qualifier to distinguish all built versions of a bundle, but there is no clear marking which is the final version so dependents cannot mandate a final version.
So we proposed a change to the version syntax to open up space between version numbers so that before the unqualified version (e.g. 1.2.3) there would be pre-release versions. So 1.2.3-snapshot would be a lower version number than 1.2.3. It would have a total ordering over all versions and be backwards compatible with existing version syntax.

1.2.3- < 1.2.3-x < 1.2.3 = 1.2.3. < 1.2.3.x

However, we also had to work properly with existing version range syntax. For example, is the version 1.2.3-snapshot included in the range [1.2.3,2.0.0)? We defined two rules for this.
  1. If a version range having an endpoint specified without a qualifier (e.g. [1.2.3,2.0.0)) would include a version with a release qualifier of the empty string (e.g. 1.2.3), then the version range must also include that version when identified as pre-release (e.g. 1.2.3-x).
  2. If a version range having an endpoint specified without a qualifier (e.g. [1.2.3,2.0.0)) would exclude a version with a release qualifier of the empty string (e.g. 2.0.0), then the version range must also exclude that version when identified as pre-release (e.g. 2.0.0-x).
All together we had a complete design and I was able to implement the changes to the Version and VersionRange classes and write compliance tests. We even implemented it in Equinox. So from a runtime point of view, things looked OK.

But the big concern come around interactions with existing tooling, management and provisioning systems. These systems would not understand a bundle having a pre-release version string. They would require a lot of changes to properly handle support the pre-release version syntax.

Furthermore, we also become concerned about the mental complexity of pre-release versions. In numerous discussions within CPEG and with peers, people would get confused over the ordering of versions and whether some version was included in some range. If we, the "experts", couldn't keep it straight in our minds, we might expect others to also have a hard time.

So in the end, given the mental complexity and the downstream impact to tools, repositories and management systems, CPEG decided that the benefit of the changes was not sufficient to justify the cost of the change. So we agreed, after some lengthy discussions to discard the pre-release version proposal.

BJ Hargrave

Friday, March 23, 2012

Surprising Services

Services are arguably what OSGi is about but at the same time they are also the least understood. When we set out to design OSGi our key requirement was to create an ad-hoc collaborative programming model. Instead of components that are centrally managed and/or are closely coupled to their siblings, we looked for a model of independent peers. To understand what I mean:


In a peer-to-peer model you need to be able to find out who your peers are, and how you can collaborate with them. In the agent or actor model the peer is identified and messages are exchanged based on peer identity. Peer dependencies suffers from the transitive dependency aggregation, quite soon almost all agents are transitively dependent on each other and there is no more component reuse. This is the big ball of mud problem. To address this problem we came up with the service model.  OSGi services combine package based contracts with an instance broker, to provide a managed conduit between peers. 

That said, then what is actual the value of the services for application developers?

Well, the surprising effect of services is that the contracts can often be significantly simplified over traditional APIs because in most Java APIs the collaborative parts are mixed with instance coupling techniques (hacks?)  (DocumentBuilderFactoryInitialContextFactoryBuilder anyone?) and administrative aspects.

It turns out that with services you rarely need to define contracts for these aspects since they are taken care of by the service model (factories, but much more), or in most cases can stay inside the component. Many administrative aspects can be handled by the implementation. Service contracts can be limited to strictly the collaborative parts. And size does matter, the smaller the contract, the easier it is to use.

For example, assume you need to use a persistent queue that is used by a set of workers. Active MQ, Amazon SQS, etc. have a significant number of API calls about maintaining the queues, setting the properties, and interacting with it. However, virtually all of those aspects can be defined in Configuration Admin, the only collaborative aspects are how does the worker get its task and how do you queue a task?

The by far simplest solution I know is to define a contract where only the worker registers a Worker service and a MessageQueue service:

  @Component(properties="queue=myqueue")
  public class MyWorker implements Worker<MyTask> {
      MessageQueue queue;


      public void work(MyTask task) { 
         ...
         queue.post( AnotherTask.class, another );
      }


      @Reference(target="(queue=myqueue)")
      void setQueue( MessageQueue queue ) {
         this.queue = queue;
      }
  }

This queuing contract is sufficient to allow a wide variety of different implementations. Implementing this with the Amazon Simple Queue Service is quite easy. A puny little bundle can look for the services, uses the queue service property to listen to queues and dispatch the messages. In this case, the web based AWS console can be used to manage the queues, no code required. A more comprehensive implementation can use Configuration Admin to manage queues, or it can create queues on demand. Implementing this on another persistent queue can be done quite different without requiring any change in the components that act as senders and workers.

If there is one rule about simplifying software that works consistently then it is hiding. Something that you could not see can't bug you. OSGi services are by far the most effective way to hide implementation details; minimizing what must be shared. Our industry is predicted to double the amount of code in the next 8 years, we better get our services on or this avalanche will bury us.

 Peter Kriens

Tuesday, March 20, 2012

Coordinator

Last year we introduced the Coordinator in the Compendium 4.3. Unfortunately, this release 4.3 was held up over some legal issues. However, it will soon be available, in the 4.3 Compendium as well as the Enterprise 5.0.

The Coordinator service is a bit my baby. When we started with OSGi almost 14 years ago one of the first things I proposed was to start with a transaction manager. I'd just read in 3days Transaction Processing from Gray & Reuters and was thrilled, that had been the best non-fiction book I ever read. Obviously the ACID properties were interesting, and very informative to see how they could be implemented, but the most exciting part was the coordination aspect of transactions. Transactions, as described in the seminal book, allowed different parties (the resource managers) to collaborate on a task without prior knowledge of each other. Resource managers when called could discover an ongoing transaction and join it. The transaction guaranteed them a callback before the task was finished. This of course is a dream in a component model like OSGi where you call many different parties of which most you have no knowledge of. Each called service could participate in the ongoing task and be informed when the task was about to be finished. When I proposed a transaction manager the guys around the table looked at me warily and further ignored me, transactions in an embedded environment?

We got very busy but I kept nagging and the rest kept looking if I was silly in the head. In 2004 I finally wrote RFC 98, a modest proposal for a transaction manager light. Of course I immediately ran into the situation that, even though few if any had used it, that there was an already existing Java Transaction API (JTA). Prosyst did some work on this since they saw some value but in the end it was moved to a full blown JTA service. This was not what I wanted because JTA is weird (from my perspective then) because it distinguishes too much between the Container people and the application people. OSGi is about peer-to-peer, containers are about control from above. Try to act like a resource manager with XA (which would give the coordination aspects), however, it looks like it was made difficult on purpose.

The it hit me, I always ran into the opposition because I used a name that too many people associated with heavy and complexity. Though a full blown distributed high performance robust transaction manager is to say the least a non-trivial beast I was mostly interested in the coordination aspects inside an OSGi framework, a significantly simpler animal. So choose to change the name! The Coordinator service was born!

The first RFC was a very simple thread based Coordinator service. When you're called in your service you can get the current Coordination (or you can start one). Each thread had its own current Coordination. Anybody could then join the Coordination by giving it a participant. The Coordination can either fail or succeed, after which all the participants are called back and informed of the outcome. Anybody that has access to the Coordination can fail it, the Coordination will also fail with time out if it is not terminated before a deadline.

So how would you use it? The following code shows how a Coordination is started:


  Coordinator  coordinator = ...
 
  Coordination c = coordinator.create("work",0);
  try {
    doWork(c);
  } catch( Throwable t ) {
    c.fail(t);
  } finally {
    c.end(); 
  }
This template is very robust. The Coordination is created with a name and a timeout. The work is then done in a try/catch/finally block. The catch block will fail the Coordination. Calling end on a failed Coordination will throw an exception so the exception does not get lost. A worker would do the following to participate in the Coordination:

 Coordinator coordinator = ... 
 void foo() {
   doPrepare();
   if ( !coordinator.addParticipant(this))
     doFinish();
 }
 
A worker can use the Coordination service to add itself as a participant. It is then guaranteed to get a call back when the current Coordination is terminated.

An example use case is batching a number of updates. Normally you can significantly optimize if you can delay communications by batching a number of updates. However, how you do you know when you had the last update so you can initiate the batch? If there is no Coordinator, the updates are done immediately, with a Coordinator they can be batched until the Coordination is terminated.


During the specification process a number of features were added: direct Coordinations (not thread based), a variable store per Coordination, and a reflective API.

I guess it will take some time before the Coordinator is really taken advantage of since the model is quite foreign to most developers. However, I am convinced that the service is really what OSGi is all about: coordinated collaborative components.

Peter Kriens