Thursday, February 21, 2008

Research Challenges for OSGi

On my flight back from Berlin this week I listened to a podcast from Software Engineering Radio. They interviewed Dave Thomas (OTI Dave Thomas) and it was a very interesting interview. One part that got me thinking was Dave's remark that PhD students often have a hard time finding good research challenges. That made me wonder what the OSGi research challenges would be. Well, these are the ones I came up with.

The Stale Service Problem


The OSGi Framework provides isolation between bundles using a class loader model. This model is quite effective, especially when security is also turned on. However, it is clearly not perfect because we have something called the stale service problem. Bundles in an OSGi platform exchange objects. These objects are linked through their class to a class loader, which is closely connected to a specific bundle. What happens when this bundle is stopped? Well, the specification sends out service notifications that tell the other bundles to remove their references to any of its objects. If a bundle does not do this, the stale references hang around and the objects they point to cannot be garbage collected. Those objects refer to their class loader, so a single stale reference can pin all the loaded classes of a bundle in memory for as long as it lives.

But the referring bundle can remove any references and everything is ok! So what is the problem? Well, if you really came up with that question I suggest you do some code inspections of open source or in-house software. It is sometimes quite amazing what actually ends up being used. Let's face it, this cleaning up is hard and very error prone.
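To make the cleanup burden concrete, here is a minimal sketch in plain Java. The `Registry`, `Service`, and client classes are hypothetical stand-ins invented for this illustration; they are not the OSGi service API. The careful client drops its reference when notified, the careless one keeps it, and that one retained reference is what would keep the object, its class, and its class loader reachable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class StaleServiceDemo {
    /** Hypothetical stand-in for a service object owned by some bundle. */
    static final class Service {
        private volatile boolean valid = true;

        String greet() {
            if (!valid) throw new IllegalStateException("stale service");
            return "hello";
        }
    }

    /** Hypothetical mini registry (an assumption, not the OSGi service registry). */
    static final class Registry {
        private final List<Consumer<Service>> listeners = new ArrayList<>();
        private Service current = new Service();

        Service get() { return current; }

        void onUnregister(Consumer<Service> listener) { listeners.add(listener); }

        /** Simulates the owning bundle being stopped: notify everybody. */
        void unregister() {
            Service old = current;
            current = null;
            old.valid = false;
            for (Consumer<Service> listener : listeners) listener.accept(old);
        }
    }

    /** Listens for the notification and clears its reference. */
    static final class CarefulClient {
        private Service service;

        CarefulClient(Registry registry) {
            service = registry.get();
            registry.onUnregister(gone -> { if (gone == service) service = null; });
        }

        boolean holdsReference() { return service != null; }
    }

    /** Caches the service once and never looks back. */
    static final class CarelessClient {
        private final Service service;

        CarelessClient(Registry registry) { service = registry.get(); }

        boolean holdsReference() { return service != null; }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        CarefulClient careful = new CarefulClient(registry);
        CarelessClient careless = new CarelessClient(registry);
        registry.unregister();
        System.out.println("careful holds reference: " + careful.holdsReference());   // false
        System.out.println("careless holds reference: " + careless.holdsReference()); // true
    }
}
```

Nothing in the language stops the careless variant from being written, which is exactly why the problem shows up in real code.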

From a specification point of view we have tried to minimize this problem. The service tracker is a direct result of this effort. After Richard S. Hall and Humberto Cervantes pioneered the Service Binder we added the derived Declarative Services to minimize the complexity of handling this stale service problem. However, from a specification point of view there is no way the problem can be handled full (or fool) proof.

The simple solution is to place bundles in different memory spaces. This can be done with one VM per bundle, or more efficiently with isolates from JSR 121. JSR 121 provides a very attractive model for isolation but is very intrusive for the programming model, just like a multiple process model. Once you go to a different memory space all your calls need to be marshalled. Marshalling is the process that moves the parameter objects over the process/isolate boundary. Isolates provide a very efficient socket-like interface between isolates but it still means you need some form of serialization or proxying of all your objects that cross the boundary. This tends to be very intrusive for object oriented code. It also adds a time penalty of several orders of magnitude over normal method calls. There is obviously also a significant memory penalty for running code in multiple processes or isolates.
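The intrusiveness is easy to see in a small sketch. The example below uses standard Java serialization as a stand-in for whatever marshalling an isolate or process boundary would require; the `Order` class and the `marshal` helper are names made up for this illustration. The point is that the callee on the other side works on a copy, so code that relies on shared object identity silently changes behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class MarshallingDemo {
    /** Hypothetical parameter object that must now be Serializable. */
    static final class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> items = new ArrayList<>();
    }

    /** Copies a value the way a boundary crossing would: serialize, then deserialize. */
    static <T extends Serializable> T marshal(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("marshalling failed", e);
        }
    }

    /** The "service method": mutates the order it is given. */
    static void addItem(Order order) { order.items.add("book"); }

    public static void main(String[] args) {
        Order direct = new Order();
        addItem(direct);                 // direct call: caller sees the change
        Order remote = new Order();
        addItem(marshal(remote));        // marshalled call: callee changed a copy
        System.out.println(direct.items.size()); // prints 1
        System.out.println(remote.items.size()); // prints 0
    }
}
```

Every class that crosses the boundary has to be prepared for this, and every call pays for the copy.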

These penalties are such a pity. Java is a language that pioneered (as far as I know) the concept of security at the language level. One of the great promises has always been to run the code from different parties in the same VM and have them protected from each other while collaborating when necessary without cost. I really do believe we have come quite far with a standard VM and adding OSGi brings us tantalizingly close to this goal.

Therefore, the first research challenge I pose to the research community is to create an OSGi platform that provides the advantage of direct method calls but allows a bundle to be safely stopped and removed from the VM, even if other bundles do not cooperate.

OSGi Resolver


Some OSGi installations can have thousands of bundles. During the booting of the framework, these bundles must be wired so that their imports and exports match up in a consistent and optimal way. This task is on the critical path for startup, so performance is at a premium.

Why optimal? Isn't good good enough? Well, no. The hardest part addressed in OSGi frameworks is sharing. Most of the OSGi framework is setting rules about this sharing and enforcing them where possible. However, these rules allow considerable leeway in the solutions that the resolver can use. This matters because many correct solutions cause the class space to be split between bundles. That is, one bundle can no longer use the services from another bundle because the packages they use for the service or used classes/interfaces come from different bundles. For example, one bundle uses version 1.0 and the other bundle uses version 2.0. Once this happens, these bundles live in different worlds and cannot easily collaborate. The resolver must therefore find solutions where packages are shared as much as possible.
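A toy version of the problem can be sketched as follows. This is not how any real framework resolves; it is a brute-force search over a drastically simplified model (one package, integer versions, no uses constraints), and the `Export`, `Import`, and `resolve` names are assumptions made up for the illustration. It picks, among all valid wirings, the one that uses the fewest distinct exporters, which is a crude measure of "shared as much as possible".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class ResolverSketch {
    /** Hypothetical exported package: which bundle offers which version. */
    static final class Export {
        final String bundle; final String pkg; final int version;
        Export(String bundle, String pkg, int version) {
            this.bundle = bundle; this.pkg = pkg; this.version = version;
        }
    }

    /** Hypothetical import with an inclusive version range [min, max]. */
    static final class Import {
        final String bundle; final String pkg; final int min; final int max;
        Import(String bundle, String pkg, int min, int max) {
            this.bundle = bundle; this.pkg = pkg; this.min = min; this.max = max;
        }
        boolean accepts(Export e) {
            return e.pkg.equals(pkg) && e.version >= min && e.version <= max;
        }
    }

    /** Number of distinct exporters a wiring uses; fewer means more sharing. */
    static int distinct(List<Export> wiring) { return new HashSet<>(wiring).size(); }

    /** Returns the valid wiring with the fewest distinct exporters, or null if none exists. */
    static List<Export> resolve(List<Import> imports, List<Export> exports) {
        return search(imports, exports, 0, new ArrayList<>(), null);
    }

    // Exhaustive search: up to exports^imports candidate wirings are visited.
    private static List<Export> search(List<Import> imports, List<Export> exports,
                                       int index, List<Export> chosen, List<Export> best) {
        if (index == imports.size()) {
            boolean better = best == null || distinct(chosen) < distinct(best);
            return better ? new ArrayList<>(chosen) : best;
        }
        for (Export e : exports) {
            if (imports.get(index).accepts(e)) {
                chosen.add(e);
                best = search(imports, exports, index + 1, chosen, best);
                chosen.remove(chosen.size() - 1);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Export> exports = Arrays.asList(
            new Export("B", "log", 2), new Export("A", "log", 1));
        List<Import> imports = Arrays.asList(
            new Import("X", "log", 1, 2), new Import("Y", "log", 1, 1), new Import("Z", "log", 1, 2));
        for (Export e : resolve(imports, exports)) {
            System.out.println(e.bundle + " exports " + e.pkg + " " + e.version);
        }
    }
}
```

In this example a resolver that simply prefers the highest version would wire X and Z to version 2 and Y to version 1, which is valid but splits the class space; the search finds the wiring where all three share version 1. The exhaustive search is of course hopeless for thousands of bundles, which is the whole point of the challenge.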

A couple of months ago we had a big discussion about this problem. Thomas Watson (IBM) raised the question whether resolving is an NP-complete problem. We had some very interesting and long conference calls (software engineers love this stuff) and I do not think we found real proof for this statement. However, we clearly all agreed it is a hard nut to crack.

So I pose the second grand OSGi challenge: Provide an OSGi resolver that can resolve a set of bundles in linear time, or prove that this is impossible.

Multiple Versions


The most attractive property of R4 is clearly the fact that it can support multiple versions of the same package in the same VM. Though one should never design a system with this in mind, it is a life saver when you discover two of your bundles require incompatible versions of the same package. And with today's development model of tens if not hundreds of open source products it is almost guaranteed that you will need it.

To allow bundles to collaborate, packages must be shared. It is therefore very beneficial to maximize the flexibility in imports. This is the reason that one can specify a version range for an import package. This range indicates that this bundle can operate with a set of packages. Great! But, ehh, how does the developer manage this?
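To show what such a range means in practice, here is a small sketch of a range check. It handles only the interval form (for example `[1.0,2.0)`, where a square bracket is inclusive and a parenthesis exclusive) and only numeric major.minor.micro parts; qualifiers and the single-version shorthand are left out. The class and method names are made up for this illustration and are not the framework's own API.

```java
public class VersionRangeSketch {
    /** Parses "major.minor.micro"; missing parts default to 0. */
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] numbers = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    /** Compares two versions part by part; negative, zero, or positive like compareTo. */
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < 3; i++) {
            if (x[i] != y[i]) return Integer.compare(x[i], y[i]);
        }
        return 0;
    }

    /** Returns true if the version lies inside a range such as "[1.0,2.0)" or "[1,2]". */
    static boolean includes(String range, String version) {
        String r = range.trim();
        boolean lowInclusive = r.charAt(0) == '[';
        boolean highInclusive = r.charAt(r.length() - 1) == ']';
        String[] bounds = r.substring(1, r.length() - 1).split(",");
        int low = compare(version, bounds[0]);
        int high = compare(version, bounds[1]);
        boolean aboveLow = lowInclusive ? low >= 0 : low > 0;
        boolean belowHigh = highInclusive ? high <= 0 : high < 0;
        return aboveLow && belowHigh;
    }

    public static void main(String[] args) {
        System.out.println(includes("[1.0,2.0)", "1.5.0")); // true
        System.out.println(includes("[1.0,2.0)", "2.0.0")); // false
        System.out.println(includes("[1,2]", "2"));         // true
    }
}
```

The check itself is trivial; the hard part is knowing which bounds are actually justified.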

Normally, there is only a single version on the classpath during development. Tools like PDE, Maven, or JDT provide no support for adding a version dimension to the classpath. The direct result is that many people compile against the latest version. If they use bnd they automatically get an import statement (provided the exporter is a properly marked up bundle) for this latest version. If this has to be maintained by hand then it is even harder. The developer must be intimately aware of what is using what. Import ranges are really hard to set properly because they explode the number of combinations that must be tested. Trying to do this by hand is painful.

As an aside, I believe life would have been easier if we had had export ranges, because the exporter of a package knows exactly which versions he is compatible with, being the responsible person for that package. If he changes a package, he is the one thinking and deciding about backward compatibility. The importer is almost always guessing. That is, if I make a change to a package called p, I can knowingly say that I am package p version 2, but I am completely compatible with version 1. I can therefore export p;[1,2]. In contrast, if I import package p I have no clue if version 1 and 2 will be compatible except for the convention of version number interpretation. Alas, export ranges did not make it to R4.

The third challenge is therefore support for import version ranges. How should the development tools be altered to support version ranges in such a way that the resolver has maximum flexibility while the import range represents valid, tested configurations?

Application Identity


The cool news with an OSGi framework is that you can run multiple applications on a single VM. The bad news is that you can run multiple applications on a single VM. If you share the VM between applications you run into several issues where you would like to know the identity of the one you are doing the work for. Inside a JEE application server this is relatively straightforward because applications run isolated in silos. In OSGi, bundles that make up the applications are shared and call each other through a myriad of callbacks. None of the Java or OSGi concepts (Threads, Code base, Bundles, etc.) map really well to this identity problem.

The last research challenge is therefore to come up with an identity concept that can be used for security and resource management in a collaborative environment like OSGi.

Conclusion


So, if there are students that are looking for research challenges then the previous problems should keep them occupied. Some of them are really hard problems that do not have clear solutions. However, solving them will solve real problems and enable further possibilities.

Good Luck! Kind regards,

Peter Kriens

1 comment:

  1. The fact that the OSGi resolution problem is NP-complete has been shown in a blog post here

    http://stackoverflow.com/questions/2085106/is-the-resolution-problem-in-osgi-np-complete

    following the proof schema developed in the EDOS projects for package installability in GNU/Linux distributions (see also http://www.mancoosi.org)

    --Roberto Di Cosmo
