Thursday, April 15, 2010

The Catwalk

I guess there is something in the air at the moment that makes people worried that OSGi is not succeeding quickly enough because there are not 7 million Java programmers using the OSGi API on a day-to-day basis. Kirk Knoernschild gave us the choice between Feast or Famine and SD Times told us OSGi is too complex for the enterprise developer. Well, feasts tend to end in hangovers and I do agree the OSGi API is not very useful for most Java web developers. Is OSGi actually a technology that is used by (web) application programmers? Will web developers have to start using Service References and Bundle objects? I don't think so.

If you develop, for example, a social networking app then you should not be concerned about low-level infrastructure details and OSGi should stay completely out of your face; this is the concept of high cohesion. OSGi should never be visible on the (web) application level. However, if you write a middleware component and need to interact with the applications then you need domain objects that represent these applications. Bundles and Service References are then the perfect domain objects that allow you to interact with that app on a system level. For example, the Spring DM extender bundle leverages the OSGi API to allow a developer to write POJOs in his bundle. Many middleware APIs can be simplified because the OSGi API provides detailed information about the callers, making the APIs significantly more cohesive.

OSGi itself does not simplify application development; it only enables third parties to provide frameworks that can then simplify the life of application developers, or empower them. The function provided by OSGi is the collaborative model that keeps different frameworks from being islands on their own and actually allows them to work together. OSGi defines the collaboration mechanisms, not the nice-to-have convenience functions for web development. What OSGi allows is breaking a huge application problem into smaller collaborating parts. The cumulative complexity of a collection of small parts is less than the complexity of those parts combined into one monolith. However, to enable this collaborative model we must enforce a number of modularity rules: rules that are limiting on the component level in order to create more flexibility on the deployment level.

Unfortunately, those rules break a lot of existing code. We often talk about modularity but in reality we tend to create highly coupled components. When these "components" are run on OSGi they crash against the module boundaries because OSGi enforces these boundaries. Many people forget that a class name encoded in a string in an XML file creates as much coupling as that class used in your code. The only advantage is that these strings do not show up in your automated dependency graph ... OSGi is just the unfortunate messenger of evil hidden coupling.
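
To make the hidden coupling concrete, here is a minimal sketch in plain Java, with no OSGi involved. The class `HiddenCouplingDemo`, its `instantiate` method, and the name `com.example.hidden.Impl` are made up for illustration; the string argument stands in for a class name that would normally sit in an XML file.

```java
import java.util.List;

// Hypothetical demo: a class name kept in a string (as it would be in an XML
// configuration file) is still a dependency, just one that tools cannot see.
class HiddenCouplingDemo {

    // Instantiates the class named in the "configuration" through reflection.
    // Returns null when the class is not visible to this class's loader, which
    // is the situation a module boundary creates.
    static Object instantiate(String classNameFromConfig) {
        try {
            Class<?> type = Class.forName(classNameFromConfig);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Visible class: this works, yet no import or type reference to
        // ArrayList reveals the dependency to a static analysis tool.
        Object list = instantiate("java.util.ArrayList");
        System.out.println(list instanceof List);

        // Made-up class that is not visible: the coupling only surfaces at runtime.
        System.out.println(instantiate("com.example.hidden.Impl"));
    }
}
```

The compiler and a dependency graph see nothing wrong with the second call; only running the code shows that the string was a dependency all along.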

Application servers adopted OSGi because their problem domain is highly complicated and large. So large that strong modularity was their only option to keep things manageable, and OSGi was perfect for this because it already contained some of their domain objects. Most Java application developers develop web apps. Web apps are a highly restricted domain that has been extensively simplified by tools and libraries. Improving on this has a very high threshold. This is the same problem as with the combustion engine and the helicopter; there are better technologies in principle but the incumbents have a huge head start in optimization. Therefore we've adopted the WAR format. WAR files will make it easier to start running web applications on OSGi without having to introduce new tools for the developers: their WARs will run on OSGi unchanged. Over time they can then decompose their WARs into smaller bundles.

There is one innovation in OSGi that I think is highly undervalued: the µServices. µServices are a paradigm shift as important as the move from structured programming to object oriented programming. µServices are the pinnacle of modularity. If they're so good, why does it take so much time before everybody uses them? Well, SD Times provided some insight: they said that a new technology X is irrelevant because developers have been building fantastic systems for a long time without X. It is hard to find a better illustration of why paradigm shifts are so hard and can take multiple decades.

As with OO, there is a chicken-and-egg problem. To take advantage of µServices you need components that provide and consume these µServices. Our problem so far has been that the major adopters (Eclipse/App Servers/Spring) picked OSGi for its class loaders on steroids and treated the µServices as an extra. But things are changing. At the last EclipseCon it was clear that µServices are moving to the front. People who could not have cared less about services have now publicly declared their support for them. Eclipse now provides good tooling for µServices, which will make services more attractive for many Eclipse developers. I am sure this will create the needed network effect.

Kirk notes how our industry is more fashion driven than the fashion industry and both authors complain that OSGi is not visible on the catwalk. And that is correct because OSGi is the catwalk, present in every fashion show picture and sustaining virtually any application that runs on a Java EE Application server based on OSGi, which are actually most of them.

Monday, April 12, 2010

Calling your cake and sending it too

During the last EEG meeting in Mountain View at LinkedIn in March we discussed the next phase in Distributed OSGi: asynchronous messaging. With the Remote Service Admin specification we have an elegant model for handling the distributed topology of a cluster of systems but this model is based on synchronous calls to a service, like:

Baz n = service.foo( bar )

Synchronous function calls are very simple to use because the answer is returned inline on the same thread. This model of computing allows you to store the state on the stack, which is efficient and handy. However, in a distributed environment your thread will block for billions of instructions until the return comes in from the remote system. Threads are relatively expensive resources and it is a pity they go to waste idling. Anyway, if you have to program in a concurrent environment a lot of the advantages of synchronous calling seem to disappear. For example, you must be very careful not to hold locks when you call a remote service, for it is easy to create deadlocks.

The alternative is messaging. With messaging you create a message (some object) and call a send method on some distribution provider. For example, in JMS there is a send() method that can take a Message, where the message object can contain arbitrary data. The receiver of the message can then send zero or more responses back. The sender can receive these through a proprietary callback mechanism or a message queue.
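
As a rough illustration of this style, the following sketch uses nothing but in-memory queues from java.util.concurrent. It is not JMS; `MessagingSketch`, its `Message` class, and the reply texts are all assumptions made up for the example. The queue plays the role of the distribution provider and the receiver sends two responses per message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical in-process stand-in for a distribution provider.
class MessagingSketch {

    // A message is just some object; this one carries text and a reply queue.
    static final class Message {
        final String body;
        final BlockingQueue<String> replyTo;

        Message(String body, BlockingQueue<String> replyTo) {
            this.body = body;
            this.replyTo = replyTo;
        }
    }

    private final BlockingQueue<Message> inbox = new LinkedBlockingQueue<>();

    // Returns immediately; the sender's thread does not wait for the receiver.
    void send(Message message) {
        inbox.add(message);
    }

    // Starts a receiver thread that answers every message with two responses.
    void startReceiver() {
        Thread receiver = new Thread(() -> {
            try {
                while (true) {
                    Message m = inbox.take();
                    m.replyTo.add("received: " + m.body);
                    m.replyTo.add("processed: " + m.body);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        });
        receiver.setDaemon(true);
        receiver.start();
    }

    // Sends one message, then collects the responses from the reply queue.
    static List<String> demo() {
        MessagingSketch provider = new MessagingSketch();
        provider.startReceiver();
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();
        provider.send(new Message("hello", replies));
        // ... the sender is free to do other work here ...
        List<String> received = new ArrayList<>();
        try {
            for (int i = 0; i < 2; i++) {
                String reply = replies.poll(5, TimeUnit.SECONDS);
                if (reply != null) {
                    received.add(reply);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }
}
```

Note that the sender never holds a lock or a stack frame while waiting for the receiver; all state that must survive travels in the message or sits in the reply queue.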

Programs that are based on asynchronous messaging are highly scalable, are easier to make deadlock free, and are more extensible. For example, persistent queues are transparent for the sender and receiver but can provide some very interesting reliability characteristics to the system. In the early OSGi days I wrote an OSGi test framework that used synchronous calls from the GUI to the framework. After struggling with this model for some time I gave up and went to asynchronous messaging, and I remember it felt like a dry warm towel after a heavy water-boarding session.

A big advantage for the OSGi Alliance is that services are a very convenient way to write their specifications using Javadoc. Message-based APIs are not nearly as easy to document. Also, in many cases the synchronous way of calling methods is by far the most efficient when the method is in the same process. With distributed OSGi, we are often not aware in our code that a service is remote. For the best of both worlds we'd like to be able to call a service both synchronously and asynchronously. There are actually different solutions that combine a synchronous call with asynchronous processing of the result value.

The simplest solution is to adapt the API to handle the asynchronous return value. Google Web Toolkit uses this model extensively for its remote calls from the web browser to the backend. The basic API is defined with an interface but in reality the caller passes a callback object in every call. The following example shows two interfaces, first the normal and second the adapted version. The caller of the second declare method passes an object that is called back when the result comes in.


interface Tax {
  Return declare( Declaration decl );
}
interface ServiceAsync {
  void declare( Declaration decl, AsyncCallback<Return> result );
}
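
To show what the caller's side of the adapted interface might look like, here is a self-contained sketch. It does not use GWT itself: the nested `AsyncCallback` interface is a local stand-in with the same two methods as GWT's, and `Declaration`, `Return`, `BackgroundTax`, and the 10% rule are placeholders I made up so the example can run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallbackSketch {

    // Placeholder domain types; the post does not define them.
    static final class Declaration {
        final int income;
        Declaration(int income) { this.income = income; }
    }

    static final class Return {
        final int taxDue;
        Return(int taxDue) { this.taxDue = taxDue; }
    }

    // Local stand-in shaped like GWT's AsyncCallback<T>.
    interface AsyncCallback<T> {
        void onSuccess(T result);
        void onFailure(Throwable caught);
    }

    interface ServiceAsync {
        void declare(Declaration decl, AsyncCallback<Return> result);
    }

    // Hypothetical implementation: computes on another thread, then calls back.
    static final class BackgroundTax implements ServiceAsync {
        private final ExecutorService executor;
        BackgroundTax(ExecutorService executor) { this.executor = executor; }

        @Override
        public void declare(Declaration decl, AsyncCallback<Return> result) {
            executor.execute(() -> {
                Return value;
                try {
                    value = new Return(decl.income / 10); // made-up 10% rule
                } catch (RuntimeException e) {
                    result.onFailure(e);
                    return;
                }
                result.onSuccess(value);
            });
        }
    }

    // The caller passes a callback and continues; the latch only exists so the
    // demo can wait for the answer before returning it.
    static int demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicReference<Return> answer = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        new BackgroundTax(executor).declare(new Declaration(50000), new AsyncCallback<Return>() {
            @Override public void onSuccess(Return r) { answer.set(r); done.countDown(); }
            @Override public void onFailure(Throwable t) { done.countDown(); }
        });
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        executor.shutdown();
        Return r = answer.get();
        return r == null ? -1 : r.taxDue;
    }
}
```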

An alternative is the use of the Java 5 Future interface. A Future is an object that is immediately returned as the result of a synchronous call and can be used to get the result later asynchronously. Futures also require the adaptation of the API to reflect the asynchronous nature:

interface ServiceFuture {
  Future<Return> declare( Declaration decl );
}
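
A possible implementation and caller of such a Future-based interface could look like the sketch below. Only `Future`, `Callable`, and `ExecutorService` come from the JDK; `Declaration`, `Return`, `BackgroundTax`, and the 10% rule are placeholders invented for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureSketch {

    // Placeholder domain types; the post does not define them.
    static final class Declaration {
        final int income;
        Declaration(int income) { this.income = income; }
    }

    static final class Return {
        final int taxDue;
        Return(int taxDue) { this.taxDue = taxDue; }
    }

    interface ServiceFuture {
        Future<Return> declare(Declaration decl);
    }

    // Hypothetical implementation that hands the work to an executor and
    // returns the Future right away.
    static final class BackgroundTax implements ServiceFuture {
        private final ExecutorService executor;
        BackgroundTax(ExecutorService executor) { this.executor = executor; }

        @Override
        public Future<Return> declare(Declaration decl) {
            Callable<Return> task = () -> new Return(decl.income / 10); // made-up 10% rule
            return executor.submit(task);
        }
    }

    static int demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Return> pending = new BackgroundTax(executor).declare(new Declaration(50000));
            // ... do other processing while the declaration is handled ...
            return pending.get(5, TimeUnit.SECONDS).taxDue; // blocks only here
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            return -1;
        } finally {
            executor.shutdown();
        }
    }
}
```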


Though these solutions are simple and provide type safety, they are kind of ugly because they violate one of the important rules of modularity: cohesion. The interface that was very much about the domain now mixes in concerns of distribution. What does declaring taxes have to do with the callback? These two aspects are unrelated and it is usually not a good idea to mix them in API design.

An alternative solution is provided by ECF, the Eclipse Communication Framework. They defined an IRemoteService that takes an IRemoteCall and an IRemoteListener parameter. The IRemoteCall object specifies the remote procedure call with parameters: a Method object, the actual parameters, and a timeout. The Remote Listener is called when the result comes in. This is an intriguing solution because it allows a call to any service, even a local one. The callee is oblivious of the whole asynchronicity; it is always called in a synchronous way. This solution is quite transparent for the callee but very intrusive for the caller because it is too awkward to use from normal code as long as Java does not provide pointers to methods. It is only useful for middleware that is already manipulating Method objects and parameter lists.
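
To give a feel for why this style is awkward in normal code, here is a sketch of the general idea using plain reflection. This is not the ECF API: `ReflectiveCallSketch`, `Call`, `Listener`, and `callAsync` are hypothetical names, and only the shape (a Method object, the parameters, a timeout, and a listener) follows the description above.

```java
import java.lang.reflect.Method;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ReflectiveCallSketch {

    // Hypothetical call descriptor: which method, which arguments, how long to wait.
    static final class Call {
        final Method method;
        final Object[] args;
        final long timeoutMillis;

        Call(Method method, Object[] args, long timeoutMillis) {
            this.method = method;
            this.args = args;
            this.timeoutMillis = timeoutMillis;
        }
    }

    // Hypothetical listener: exactly one of result and error is meaningful.
    interface Listener {
        void handle(Object result, Throwable error);
    }

    // Invokes the method on any object without blocking the caller. The callee
    // is oblivious: it only ever sees a normal synchronous invocation.
    static void callAsync(ExecutorService executor, Object service, Call call, Listener listener) {
        Callable<Object> invocation = () -> call.method.invoke(service, call.args);
        Future<Object> pending = executor.submit(invocation);
        // A second task waits for the outcome, so the caller's thread stays free.
        executor.execute(() -> {
            try {
                listener.handle(pending.get(call.timeoutMillis, TimeUnit.MILLISECONDS), null);
            } catch (ExecutionException e) {
                listener.handle(null, e.getCause());
            } catch (InterruptedException | TimeoutException e) {
                listener.handle(null, e);
            }
        });
    }

    // Calls "osgi".toUpperCase() the reflective way and waits for the listener.
    static Object demo() {
        ExecutorService executor = Executors.newCachedThreadPool();
        BlockingQueue<Object> results = new LinkedBlockingQueue<>();
        try {
            Method toUpper = String.class.getMethod("toUpperCase");
            callAsync(executor, "osgi", new Call(toUpper, new Object[0], 1000),
                    (result, error) -> results.add(error == null ? result : error));
            return results.poll(5, TimeUnit.SECONDS);
        } catch (NoSuchMethodException | InterruptedException e) {
            return e;
        } finally {
            executor.shutdown();
        }
    }
}
```

Looking up a method by name and passing an untyped argument array loses all the type safety of a normal call, which is why this only feels natural in middleware.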

Could we use a synchronous type safe calling model to specify a service but use it in an asynchronous way if the callee (or an intermediate party like a distribution provider) could play along? After some toying with this idea I do think we can actually eat our cake and have it too.

It is always good practice to start with the caller because asynchronous handling of a method call is most intrusive for the caller. Assume Async is the service that can turn the synchronous world asynchronous and the player is a music player that is async aware. That is, the player will finish the song when called synchronously but it will return immediately when called asynchronously. With these assumptions, the code could then look like:

Player asyncPlayer = async.create(player);
URL url = new URL("http://www.sounds.com?id=123212");
Future r = async.call( asyncPlayer.play( url ) );
// do other processing
System.out.println( r.get() );

This looks quite simple and it was actually not that hard to implement (I did a prototype during the meeting). It provides type safety (notice the generic type of the Future). So how could this work?

The heart of the system is formed by the Async service. The Async service can create a proxy to an actual service; this happens in the create method. For each call going through the proxy, the proxy creates a RendezVous object and places it in a Thread Local variable before it calls the actual service.

If the callee is aware of the Async service, for example a middleware distribution provider, then it gets the current RendezVous object. This RendezVous object can then be used to fill in the result when it arrives somewhere in the future.

After the asynchronous proxy has called the actual service, the RendezVous object was either accepted or not touched. If it was not touched, the proxy has the result and fills it in. If the RendezVous object was accepted by the callee, it is left alone; the callee is then responsible for filling in the result.

After the invocation through the proxy, the client calls the call method. This method takes the (null) result of the invoked method. The reason for this signature is that it allows the returned Future to be typed correctly. Though the call method verifies that it gets a null (to make sure it is used correctly), it will use the RendezVous object that must be present in the Thread Local variable. The previous sequence is depicted in the following diagram:

The proposed solution provides a type-safe and convenient way to asynchronously call service methods. It works both for services (or intermediate layers) that are aware of the asynchronicity and for those that are oblivious of it. For me this is a very nice example of how OSGi allows you to create adaptive code. The scenario works correctly in all cases but it provides a highly optimized solution when the peers can collaborate. And best of all, it is actually quite easy to use.
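
For readers who want to see the moving parts, the following is a minimal sketch of the mechanism described above, written against plain Java. It is not the prototype from the meeting: the static methods, the extra Class parameter on create, the String URL, and both player classes are assumptions made to keep the example small and runnable. It also only handles methods that return objects, since the proxy always returns null.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Sketch only: all names below are assumptions, not an OSGi API.
class AsyncSketch {

    // Holds the eventual result of one call made through the proxy.
    static final class RendezVous {
        final CompletableFuture<Object> result = new CompletableFuture<>();
        private boolean accepted;

        // An async-aware callee calls this to take over filling in the result.
        void accept() { accepted = true; }
    }

    // The RendezVous of the latest proxied call made on the current thread.
    private static final ThreadLocal<RendezVous> CURRENT = new ThreadLocal<>();

    // Lets a callee find the RendezVous of the call it is serving (null if none).
    static RendezVous current() { return CURRENT.get(); }

    // Wraps a service in a proxy that records a RendezVous for every call.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> type, T service) {
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type },
            (proxy, method, args) -> {
                RendezVous r = new RendezVous();
                CURRENT.set(r);
                try {
                    Object value = method.invoke(service, args);
                    if (!r.accepted) {
                        r.result.complete(value); // oblivious callee: the proxy fills it in
                    }
                } catch (InvocationTargetException e) {
                    r.result.completeExceptionally(e.getCause());
                }
                return null; // the caller gets the real value through call()
            });
    }

    // Takes the (null) return value of the proxied call purely to infer T.
    @SuppressWarnings("unchecked")
    static <T> Future<T> call(T mustBeNull) {
        if (mustBeNull != null) {
            throw new IllegalArgumentException("not invoked through an async proxy");
        }
        RendezVous r = CURRENT.get();
        if (r == null) {
            throw new IllegalStateException("no pending call on this thread");
        }
        CURRENT.remove();
        return (Future<T>) (Future<?>) r.result;
    }

    interface Player {
        String play(String url);
    }

    // Knows nothing about Async; always plays synchronously.
    static final class ObliviousPlayer implements Player {
        public String play(String url) { return "played " + url; }
    }

    // Accepts the RendezVous when there is one and finishes on another thread.
    static final class AwarePlayer implements Player {
        public String play(String url) {
            RendezVous r = current();
            if (r == null) {
                return "played " + url; // plain synchronous call
            }
            r.accept();
            new Thread(() -> r.result.complete("played " + url + " (async)")).start();
            return null; // return immediately
        }
    }

    static String demo(Player player) {
        Player asyncPlayer = create(Player.class, player);
        Future<String> r = call(asyncPlayer.play("http://www.sounds.com?id=123212"));
        // ... do other processing ...
        try {
            return r.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            return "failed: " + e;
        }
    }
}
```

The caller code in demo is identical for both players; only the aware player actually returns before the work is done. A real implementation would also have to deal with primitive return types and with a RendezVous that is left behind when the client never calls call.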

Peter Kriens

P.S. These are just my ramblings, the post is just one of the many ideas discussed in the OSGi EGs, and it has no official status.