Friday, June 25, 2010

To Coordinate in OSGi

Since day one of the OSGi Framework (over 12 years ago) I have been trying to get a lightweight transaction model into the specifications. After several attempts that were either skillfully aborted by others or way too heavy for what I had in mind, I had actually given up. That is, until the last face-to-face meeting in Mountain View. David Bosschaert (Red Hat and EEG co-chair) was looking for a better Configuration Admin solution than the Managed Service (Factory) services provide. One of his key requirements was a composite update. In Configuration Admin, updates are done per PID. A PID is a persistent identity for a Dictionary that contains the configuration properties. David argued that larger enterprise applications need to compose their configuration properties out of a number of smaller dictionaries that each represent one configuration aspect. For example, there could be a com.example.system.config PID for system configuration and a com.example.http PID for configuring an HTTP server.

A very good idea, I think, but he was confronted with a serious problem. Though Configuration Admin nowadays allows the use of multiple PIDs per service, it does not give any timing guarantee other than that each update will be done on a different thread than the setter's. The problem with multiple configuration updates is thus that you could get the parts of your configuration milliseconds apart. That is normally not good, because a configuration change can trigger expensive operations, and each part that arrives separately can trigger them again. After several attempts to find a good solution we realized that the transaction model could solve this problem rather nicely. If Configuration Admin were transaction aware, it could wait with updating the target services and sending out events until the commit part of the transaction.
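To make the cost of piecemeal delivery concrete, here is a minimal sketch that does not use the OSGi API at all. ExpensiveService, updated(), updatedAll() and the restart counter are hypothetical names invented for this illustration, and configuration values are reduced to plain strings. It only shows that two per-PID callbacks cause two expensive reconfigurations where a composite update would cause one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical service whose reconfiguration is expensive (think: server restart).
class ExpensiveService {
    private final Map<String, String> settings = new LinkedHashMap<>();
    int restarts = 0;

    // One callback per PID: every part that arrives triggers a restart.
    void updated(String pid, String value) {
        settings.put(pid, value);
        restart();
    }

    // Composite update: apply all parts first, then restart once.
    void updatedAll(Map<String, String> parts) {
        settings.putAll(parts);
        restart();
    }

    private void restart() {
        restarts++; // stands in for the expensive operation
    }
}

class CompositeUpdateDemo {
    public static void main(String[] args) {
        ExpensiveService perPid = new ExpensiveService();
        perPid.updated("com.example.system.config", "a");
        perPid.updated("com.example.http", "b");

        Map<String, String> parts = new LinkedHashMap<>();
        parts.put("com.example.system.config", "a");
        parts.put("com.example.http", "b");
        ExpensiveService composite = new ExpensiveService();
        composite.updatedAll(parts);

        System.out.println("per-PID restarts:   " + perPid.restarts);
        System.out.println("composite restarts: " + composite.restarts);
    }
}
```

The per-PID service restarts twice and the composite one restarts once; with more PIDs per service the gap grows accordingly.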

Now there is something funny about transactions: they have a weird effect on system developers. The moment you start talking about transactions they seem to gain 10 pounds and age 10 years. Transactions are seen as very heavyweight because of the recovery requirements and the setup. And acting as a resource manager is a non-trivial task with the XA API. Having lost the battle to use real transactions several times (Framework, Dmt Admin, Deployment Admin), I was not prepared to start such a battle again. That is, until it dawned on me that what I was really looking for was a coordination API; transactions provide much more than I needed.

My side of the software fence has been relatively free of persistence problems, so the whole ACID part of transactions was never my favorite part. I liked transactions because they allow coordinated collaboration between different parties. In an OSGi system you never know how the processor flows through different services when you call another party. In this model there are many cases where you could do things more efficiently, or atomically, if you only knew when the task being worked on was finished. The coordination part of transactions was always what I liked so much, because I knew there was going to be a callback at the end of the transaction.

So could the answer be a lightweight coordination API? If Configuration Admin were updated to use this API, it could hold back notifications and updates to managed targets until all the changes were made, that is, until the end of the coordination. So what could this look like (notice: the API is work in progress):
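    Coordinator coordinator = ... ;
    ConfigurationAdmin admin = ... ;

    void updateConfigurations(Map<String, Dictionary> configs) throws IOException {
        Coordination coordination = coordinator.begin("update configurations");
        try {
            for (Map.Entry<String, Dictionary> e : configs.entrySet()) {
                // A coordination-aware Configuration Admin would hold back
                // delivery of these updates until end() is called
                Configuration config = admin.getConfiguration(e.getKey());
                config.update(e.getValue());
            }
            if (coordination.end() == OK)
                return;
            // log!
        } finally {
            coordination.terminate(); // Ensure proper termination
        }
    }
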
So what does this look like for the participants? For example, how would Configuration Admin schedule its updates if it used coordinations? Well, a participant must indicate that it wants to participate in a coordination. The method to start a participation is on the Coordinator service. The code below shows a schedule method that wants to delay running its Runnables until the coordination has ended:
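    final List<Runnable> queue = new ArrayList<Runnable>();

    void schedule(Runnable r) {
        // participate() returns true when a coordination is active
        if (coordinator.participate(this)) {
            synchronized (queue) {
                queue.add(r); // hold the work until the coordination ends
            }
        } else
            executor.execute(r); // no coordination: run right away
    }
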
The Coordinator will call back the participant on either the failed() method or the ended() method. The failed() method can be called concurrently with the initiating thread. The ended() method is always called on the initiating thread.
    // the coordination failed, discard the queued work
    public void failed() {
        synchronized (queue) {
            queue.clear();
        }
    }

    // the coordination ended ok, run the queued work and then clear the queue
    public void ended() {
        synchronized (queue) { // failed() may run concurrently
            for (Runnable r : queue)
                executor.execute(r);
            queue.clear();
        }
    }

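To see the pieces work together without the (still unfinished) OSGi API, here is a self-contained toy version. Everything in it is an assumption made for illustration: ToyCoordinator, Participant, DeferredScheduler and CoordinationDemo are hypothetical names, the coordinator simply keeps one coordination per thread in a ThreadLocal, and, unlike the description above, failed() is called on the initiating thread. It is a sketch of the idea, not of the proposed specification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface, modeled on the failed()/ended() methods above.
interface Participant {
    void failed();
    void ended();
}

// Toy coordinator: at most one active coordination per thread.
class ToyCoordinator {
    private final ThreadLocal<List<Participant>> current = new ThreadLocal<>();

    void begin() { current.set(new ArrayList<>()); }

    // Registers p and returns true if the calling thread has an active coordination.
    boolean participate(Participant p) {
        List<Participant> participants = current.get();
        if (participants == null) return false;
        if (!participants.contains(p)) participants.add(p);
        return true;
    }

    void end()  { finish(true); }
    void fail() { finish(false); }

    private void finish(boolean ok) {
        List<Participant> participants = current.get();
        current.remove();
        if (participants == null) return;
        for (Participant p : participants) {
            if (ok) p.ended(); else p.failed();
        }
    }
}

// Participant that delays Runnables until the coordination ends.
class DeferredScheduler implements Participant {
    private final ToyCoordinator coordinator;
    private final List<Runnable> queue = new ArrayList<>();

    DeferredScheduler(ToyCoordinator coordinator) { this.coordinator = coordinator; }

    void schedule(Runnable r) {
        if (coordinator.participate(this)) {
            synchronized (queue) { queue.add(r); }
        } else {
            r.run(); // no coordination active: run immediately
        }
    }

    public void failed() {
        synchronized (queue) { queue.clear(); }
    }

    public void ended() {
        List<Runnable> work;
        synchronized (queue) {
            work = new ArrayList<>(queue);
            queue.clear();
        }
        for (Runnable r : work) r.run();
    }
}

class CoordinationDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        ToyCoordinator coordinator = new ToyCoordinator();
        DeferredScheduler scheduler = new DeferredScheduler(coordinator);

        scheduler.schedule(() -> log.add("immediate")); // runs at once
        coordinator.begin();
        scheduler.schedule(() -> log.add("deferred-1"));
        scheduler.schedule(() -> log.add("deferred-2"));
        log.add("before-end");
        coordinator.end();                               // queued work runs here

        coordinator.begin();
        scheduler.schedule(() -> log.add("dropped"));
        coordinator.fail();                              // queued work is discarded
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [immediate, before-end, deferred-1, deferred-2]
    }
}
```

The initiator and the scheduler never refer to each other directly; the only thing they share is the coordinator, which is the point of the API.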
The coordination API therefore allows two completely different implementations to synchronize their work on a common task. This API seems incredibly useful to optimize many of our existing admin APIs: from the framework itself to Remote Service Admin. I wish I had realized much earlier not to call it a transaction API ...

Peter Kriens

P.S. This Coordination API is work in progress; there is no promise that this work will ever end up in an official OSGi spec.
