Thursday, December 18, 2008

Project Jigsaw #2

Alex Buckley gave a presentation at Devoxx in Antwerp last week (December 2008) that is the next step in the design of Java modularity. Alex sent me the link to his presentation; you can find it on his blog.

This presentation is the next step in the saga of Java modularity. Sun’s history of Java modularity is littered with blogs and presentations, but it shows no proper requirements or design documentation whatsoever. Nor have any of these blogs and presentations been discussed in, or presented to, the relevant JSR 294 or JSR 277 expert groups. It therefore feels awkward to react to a design that has neither visible requirements nor a proper design document. How can one judge it, except highly subjectively? By its very nature, a presentation gives an overview of an underlying construct, leaving out all the details that are so important to understand and evaluate a proper design. One can only hope that these design documents exist in Sun's offices somewhere.

However, getting modularity right in the Java platform can make a tremendous difference for hundreds of thousands of companies and millions of individuals. And clearly there is self-interest as well: it is of paramount importance to the OSGi Alliance to get it right. Because there are no discussions on the proper mailing lists, I feel forced to react with the same ill-matched tools of blogs and presentations. My apologies.

The Devoxx presentation begins with sketching the problems that JSR 294 addresses. These can be summarized as:
  • JAR Hell
  • Platform Fragmentation
  • Startup Performance
However, in the remaining parts of the presentation where the solution is sketched, many implicit requirements rear their heads. In the following sections I discuss the problems that I think Sun is trying to address and then address the areas that I think are not on their horizon.

JAR Hell is composed of the following problems (derived from Dependency Hell, DLL Hell, and JAR Hell):
  • (Too) Many Transitive Dependencies. This is related to the dreadful feeling you get when you want to use A, which needs B and C, which need D, E, F, G, ad nauseam. Though a certain amount of coupling is required to enable reuse and extension mechanisms, it often turns out that many dependencies are not strictly necessary. However, in Java there is no way to express this optionality.
    Many popular (open source) libraries have a staggering number of dependencies. For example, use Maven to compile "Hello World" for the first time. It downloads an impressive number of libraries because it has no mechanisms to manage the unnecessary transitive dependencies. OSGi handles this problem with the service model, where only the packages with service interfaces are shared, and by picking packages, which have the smallest possible granularity in the Java VM model, as the unit of sharing.
  • Dependency on multiple versions. Java applications cannot have multiple versions of the same component in a single application. Well, why would you want this? Assume you rely on JAR A and JAR B. Both A and B require JAR C. Unfortunately, A requires version 1 of C and B requires version 2 of C. In Java you are out of luck. The reason is that there is only a linear class path, so either C;version=1 comes first or C;version=2 comes first. During run-time, either A or B is bound to the wrong version, wreaking havoc at unexpected times and places. In OSGi, this problem is addressed by precisely wiring up bundles based on meta-data in the JAR manifest.
  • Unmanaged Dependencies. Java has no way to specify dependencies on other JARs. What gets loaded depends on the class loader hierarchy, the class path, JARs in folders, and magic. This is all done "blind": there is no verification that it matches the assumptions in the JAR files. The effect is that errors happen (too) late and not early. Trying to make a set of JARs work together can therefore be cumbersome and very hard to get right. In OSGi, this is addressed with the manifest in each bundle that specifies the assumptions of the code. This allows the framework to verify the bundle dependencies before the code gets started.
  • Use of private code. All public code in a JAR file is visible to all other JAR files, and public is pervasive because it is required to implement an interface. It is therefore easy to use code that is supposed to be an implementation detail, which can easily break a client in a later version or unnecessarily restrict the provider. In OSGi, this is addressed by explicitly exporting packages. All other packages are private.
  • Stomping. Stomping is the problem that you overwrite one JAR with another because the name is the same though the version is different. This can have very subtle, and much less subtle, unpleasant effects while everything looks OK. OSGi addresses this problem by having a clear install phase that separates the JAR from the internal structure. That is, two JAR files can have the same name and still both be installed in the same VM if they have different versions. This install phase connects very easily with all kinds of repositories and management systems.
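
The "dependency on multiple versions" problem in the list above can be made concrete with a toy model. The sketch below is not how a real class loader or OSGi framework is implemented; the names ClassPathDemo, Jar, resolveOnClassPath and resolveWired are made up for this illustration. It only contrasts a first-match-wins lookup on a linear class path with a lookup where each consumer states the version it needs.

```java
import java.util.List;

/** Toy model of class lookup; all names here are hypothetical, not a real API. */
final class ClassPathDemo {

    /** A pretend JAR: a name, a version, and one class it contains. */
    static final class Jar {
        final String name;
        final int version;
        final String className;

        Jar(String name, int version, String className) {
            this.name = name;
            this.version = version;
            this.className = className;
        }
    }

    /** Linear class path: the first JAR that contains the class wins, whatever its version. */
    static int resolveOnClassPath(List<Jar> classPath, String className) {
        for (Jar jar : classPath) {
            if (jar.className.equals(className)) {
                return jar.version;
            }
        }
        throw new IllegalStateException("class not found: " + className);
    }

    /** Wired lookup: the consumer declares the version it was built against. */
    static int resolveWired(List<Jar> installed, String className, int requiredVersion) {
        for (Jar jar : installed) {
            if (jar.version == requiredVersion && jar.className.equals(className)) {
                return jar.version;
            }
        }
        // Failing here, before any code runs, is the "early error" a manifest makes possible.
        throw new IllegalStateException(className + " version " + requiredVersion + " is not installed");
    }
}
```

With C version 1 listed before C version 2, resolveOnClassPath hands version 1 to both A and B, so B silently gets the wrong one. resolveWired gives each consumer the version it asked for, or fails up front when that version is missing.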
Platform fragmentation is about the sad story of how Java fragmented into an incompatible set of profiles and configurations at the beginning of this millennium. Today we have Java ME and Java SE, which have grown largely incompatible over time. The argument for this fragmentation is that a mobile phone or embedded device could not be expected to run the same VM as a desktop. This is a valid argument, but unfortunately the solution of profiles and configurations was not well thought out. The different profiles are not upward compatible, which requires that a programmer target a specific profile (for example Foundation Profile). This code will likely run into problems on Java SE because there are packages and classes that do not occur on Java SE, and of course vice versa. Worse, packages often have different contents in different profiles, and there are even differences as small as visibility or added and removed methods and fields. And worst of all, there is no meta-data describing what a JAR expects of its environment, so code can start running only to throw an exception somewhere in the future.

The key mistake made with the profiles and configurations was caused by the lack of modularity. If Java had been refactored into a core VM (including java.lang, because everybody must share the Class and Object classes), then sets of packages could have extended this core VM. However, for some bizarre reason it was often not even legal to run a package developed for Java ME on Java SE. The most bizarre JSR that I was ever involved with was JSR 197, which made it legal to run the javax.microedition.io package on Java SE. It is hard to believe that it took us a year to accomplish this minor feat, which was nevertheless important for one of the OSGi specifications.

In OSGi we came from the embedded world, so this was a serious problem for us from day one. Obviously, we could not change Java itself, but at least we could address the sub-setting problem and the unmanaged aspect of it. We came up with the execution environments. An execution environment looks a lot like a profile/configuration but it is not intended to be the final word. It is a description of a set of classes that are a proper subset of all feasible Java environments, that is, the common denominator. Compiling against this subset (we have JARs that contain only the public API for you to do this) ensures that you are not coupled to anything outside the execution environment and, as a consequence, outside any of the sub-setted profiles. For example, we have ee.minimum, which runs on all known profiles, from CDC/FP to Java SE 7. We use this execution environment to target all our APIs. We also have ee.foundation, which is aligned with Java ME Foundation Profile. These execution environments are used by Equinox, Felix, and Knopflerfish to allow their implementations to run on the widest possible set of VMs.

However, we did not stop there. We also designed meta-data in the bundle's manifest that indicates the assumption of the bundle about its environment. A bundle cannot resolve unless the framework can establish that the VM implements one of the required execution environments.
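
As a rough sketch of that resolve-time check: in OSGi R4 the assumption is stated in the Bundle-RequiredExecutionEnvironment manifest header as a comma-separated list of environment names. The class and method names below are invented for this illustration, and a real framework does considerably more during resolution; this only shows the "at least one listed environment must be provided" rule.

```java
import java.util.Set;

/** Hypothetical helper; mimics only the execution environment part of bundle resolution. */
final class ExecutionEnvironmentCheck {

    /**
     * @param requiredHeader comma-separated environment names from the bundle manifest,
     *                       or null when the bundle states no assumption
     * @param providedByVm   the environments the running VM is known to implement
     * @return true when the bundle may resolve as far as its environment is concerned
     */
    static boolean canResolve(String requiredHeader, Set<String> providedByVm) {
        if (requiredHeader == null || requiredHeader.trim().isEmpty()) {
            return true; // no stated assumption, so nothing to verify
        }
        for (String environment : requiredHeader.split(",")) {
            if (providedByVm.contains(environment.trim())) {
                return true; // one match is enough
            }
        }
        return false; // refuse to resolve instead of failing later at run-time
    }
}
```

A bundle that lists "J2SE-1.5, CDC-1.0/Foundation-1.0" resolves on a VM that provides only CDC-1.0/Foundation-1.0, while a bundle that lists only "JavaSE-1.6" does not.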

Performance. This problem is related to the work done in Java 6 where the JRE can be incrementally loaded to reduce start-up time. This speedup was necessary because slow start-up basically made applets impossible to use: the page froze for up to a minute the first time an applet got started. There are two strategies involved in improving performance: lazy and eager. In a lazy model, the code is not activated until there is a direct need. This model works very well where you have a large application where many parts are rarely used and loading code is relatively fast (local disk). Eclipse is a good example where lazy loading is used heavily to minimize startup and footprint. Eager loading is better when it is clear you will need the code soon and loading it is slow (over the net). A good example is the average applet. There are many variations, and anybody who has ever written a serious cache manager knows how tricky this is. Performance from a module system therefore depends on the available meta-data and the initialization/activation model.
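
The difference between the two strategies can be sketched in a few lines. This is a generic illustration, not code from any module system; LazyModule and its eager method are names invented here, and the Supplier stands in for whatever expensive loading step a real system performs.

```java
import java.util.function.Supplier;

/** Hypothetical helper: defers an expensive load until the first call to get(). */
final class LazyModule<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyModule(Supplier<T> loader) {
        this.loader = loader; // nothing is loaded yet
    }

    /** Eager variant: pay the loading cost immediately and hand back a ready value. */
    static <T> Supplier<T> eager(Supplier<T> loader) {
        T ready = loader.get();
        return () -> ready;
    }

    /** Lazy variant: load on first use, then reuse the cached value. */
    @Override
    public synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}
```

The lazy form keeps start-up cheap when most parts are never touched; the eager form moves the cost to a moment when waiting is acceptable, which matters when the load is slow.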

This problem is further clarified later in the presentation when it requires that a module should be able to contain partial packages to speed up downloads: java.lang is split into 3 different modules. This requirement is begging for more complexity and the payoff seems very slim. Splitting packages along performance boundaries will require great foresight to have any performance effects and will in almost all cases be in conflict with minimizing coupling. It is a classic example of coupling two unrelated concepts (performance, low coupling) into a single design concept (module). In practice, it always results in systems that are good at neither performance nor decreasing the coupling.

Though performance is a crucial aspect of the VM (and some fantastic work has been done in Hotspot, even without modularity), it is important not to mix concepts that have very different optimization axes. Every time I see that happen, both axes have to be compromised.

Integration with native packaging systems. One of the less well-defined concepts in the presentation is the integration with native packaging systems. A typical native packaging system is rpm. Packaging systems use dependency graphs and scripts to modify an operating system to a new state where it provides new functionality. There is tremendous experience in packaging systems and nowadays they are quite impressive in how reliably they work.

However, take one step back. The absolute number one value of Java is its platform independence. Native packaging systems should be able to reliably provide modules to the Java platform, but I am fairly confident that we do not want to see any of these native systems inside Java. It is crucial that the Java module system is a well-defined system in Java without having to defer to a platform-dependent module system. That would be anathema to Java. However, the presentation seems to assume that native packaging systems would integrate with the VM. For example, module version syntax is not defined so that it can leverage formats from packaging systems. Alas, if we only had a design document and not a presentation ...

Update 1: Alex Buckley has told me that neither Project Jigsaw nor JSR 294 will prepare for native packaging systems ... So my fear is unjustified.

Package granularity. Package granularity has always been a hot potato in Java. Though packages look like a first class citizen, there is a lot of fudging going on to ignore them. On one side we have a Package class representing a package, we import packages in our source code, and there are clear access visibility rules for classes in the same package. However, in the class file format a package is not visible and this has led many (including the JRE) to treat packages as second class citizens.

In the presentation, it is argued that libraries often consist of multiple packages. Though the OSGi service model shows that constraining the interface to a single package works well for highly decoupled designs, it does make sense to be able to think about a group of packages as a set that belongs together. This is for me the conceptual advantage of a module: a group of packages that tightly belong together. Superpackages anybody?

Multiple module systems. There seems to be a strong implicit requirement that there will be multiple module systems in the VM. These module systems are somehow supposed to handle the run-time class loading in different ways. Sun will provide a "simple" module system for the JDK, but it will allow others for the application level. They are even polite enough to not specify a version syntax so that OSGi can use its own version syntax while Sun can continue with versions that always start with 1.

Multiple implementations of a specification sounds good, doesn't it? Hmm. Let's look at what it means. It means that programmers will have to choose a deployment format because none is specified. As a programmer I will have to choose one or more formats to support because I cannot waste the resources to support all of them. Do I have any gain as a programmer by having a "choice"? Nope, every choice I make makes my code incompatible with modules from other systems.

Specifications are supposed to simplify the life of programmers, not make it harder. By not having an open discussion about deployment formats and creating consensus around one format, the problem is dumped in the lap of millions of programmers, creating confusion and chaos where none should be necessary. Even if interoperability is supported, there will be lots of small problems for no obvious reason.

And there is of course a political aspect. Sun moved the deployment aspects to an OpenJDK project called Jigsaw. The scope of Jigsaw is to create a module system for the JDK and applications. Though I am fairly sure it will not match OSGi's capabilities, it will be part of the JDK. It is hard to ignore the similarity with Microsoft including Internet Explorer in their operating system because they could not compete on functionality with Netscape.

Missing Aspects
The following section details problems that I thought were an intrinsic part of a module system but that are not discussed in the presentation. I think these areas are very important and closely related to Java modularity.

Class Space Consistency. One of the hardest parts in the OSGi R4 specification was class space consistency. Once you allow multiple versions of the same package in a VM you must ensure that the different modules use the right class loaders or you get hard to diagnose class cast exceptions. That is, a class X from class loader A is not compatible with a class X from class loader B. Confusing but true. In OSGi we have the concept of a class space and we maintain consistency in this class space using the "uses" directive, providing information of what implementation dependencies a package has. With this directive, a framework can assign bundles to different class spaces and thereby ensure no collisions happen. The presentation explicitly acknowledges that this can happen in the proposed model, but does not propose to fix this.
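
The "class X from class loader A is not compatible with class X from class loader B" effect can be reproduced with plain class loaders, without OSGi. The sketch below assumes the compiled class file of the nested Payload class can be read from the class path; ClassSpaceDemo, Payload and IsolatingLoader are names invented for this illustration, and the loader deliberately breaks normal parent delegation for one class to imitate two modules that each carry their own copy.

```java
import java.io.IOException;
import java.io.InputStream;

/** Illustration only: two loaders defining the same class bytes yield two distinct classes. */
final class ClassSpaceDemo {

    /** Stand-in for "class X"; public so it can be instantiated reflectively. */
    public static class Payload {
        public Payload() {}
    }

    /** Defines the target class itself and delegates everything else to its parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String target;
        private final byte[] bytes;

        IsolatingLoader(String target, byte[] bytes) {
            super(ClassSpaceDemo.class.getClassLoader());
            this.target = target;
            this.bytes = bytes;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve); // e.g. java.lang.Object
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    c = defineClass(name, bytes, 0, bytes.length); // private copy
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Loads a fresh copy of Payload through a brand-new loader. */
    static Class<?> loadIsolatedCopy() {
        String name = Payload.class.getName();
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = ClassSpaceDemo.class.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            return new IsolatingLoader(name, in.readAllBytes()).loadClass(name);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True only if an instance from one copy is accepted as an instance of the other copy. */
    static boolean copiesAreCompatible() {
        try {
            Class<?> a = loadIsolatedCopy();
            Class<?> b = loadIsolatedCopy();
            Object fromB = b.getDeclaredConstructor().newInstance();
            return a.getName().equals(b.getName()) && a.isInstance(fromB);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both copies report the same class name, yet an object created from one copy is not an instance of the other. Passing such an object across module boundaries is what produces the hard to diagnose class cast exceptions, and it is the situation the "uses" directive is meant to rule out at resolve time.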

This might not be a major problem for a JRE where it is likely that all modules are only a bug fix away from each other. By definition, a JRE is only depending on itself. It is not very likely that a JRE will have the problem of multiple versions of the same packages. However, application programmers that use a large number of open source libraries are rarely that lucky.

Supporting multiple versions in one application is one of the core aspects of JAR hell. Enabling this is therefore good. Not guaranteeing class space consistency will only create module hell.

Plugin/Extensions. The largest missing area in the presentation is a plugin extension model. One of the primary reasons to choose OSGi is to provide an extension model but the presentation assumes all modules are statically wired. Implicit in the model is that we'll be stuck with class path scanning (OK, module scanning) and class loading hacks to make today's applications.

Compatibility. Java is clearly backward compatible and I applaud Sun for the remarkable feat that Java 1 code can still run on a Java 6 VM. However, there is also forward compatibility and that is largely lacking from the JDK perspective. Most javax packages can easily run on earlier VMs. For example, the javax.script package defines how script engines can make themselves available to application code. There is nothing in this package that would make it impossible to run it on a Java 1.2 VM. However, it is only available in Java 6. If you can always run the latest VM (I guess like when you are a Sun employee) this is hardly a problem. However, for the rest of us it does pose a problem to move our code bases to the next version, which usually causes a lag of 2-4 years.

Project Jigsaw in OpenJDK and JSR 277 target Java 7, which is supposed to be out in 2010. How could people on older VMs take advantage of some of the features? Is modularity really only needed on future VMs? Looking at OSGi, it seems unnecessary to focus only on future VMs: it runs as well on Java 7 as on Java 1.2.

Preliminary Conclusion
The presentation unfortunately shows all the aspects of too few eyes from too few perspectives. The interests from the VM perspective have a huge role in the presentation. However, the application programmer's perspective has been more or less ignored.

And then there is the most important aspect of all: multiple module systems. Java is in the extremely fortunate situation of having only one modularity standard today that is well adopted and highly mature. Analyzing the problems as stated in the presentation, I have not seen anything that OSGi could not do better today. Though in any other area competition is good, a module system is the technology that allows parties to compete on better implementations without technological friction. Multiple module systems will reduce the ease with which one can adopt open source libraries or commercial components. The world really does not need more than one module system in Java.

I am hoping that Mark Reinhold and Alex Buckley will bring their requirements to the OSGi Core Platform Expert Group, where we could discuss the problems and have more (some very experienced) eyes on what is really needed. I am pretty sure we can find a consensus. I actually hope we can do this soon: there is an OSGi Expert Group meeting in Boston in January and Sun is a member.

I am fairly sure there is only one requirement we can unfortunately not address: "There shall be no OSGi technology in the solution".

Peter Kriens

Monday, December 15, 2008

Project Jigsaw

As many of you know, there is a new phase in the history of Java modularity. A few weeks ago Alex Buckley contacted me for an urgent phone conference. Through the grapevine I had already picked up that a change was in the air; the exit of Stanley Ho (the poor JSR 277 spec lead) had also been writing on the wall. Unfortunately, the phone conference was canceled just as it was supposed to take place and I had to live in suspense for another two weeks. Last week, while I was in Stockholm chasing a room at 1 am because there was none booked, I finally spoke with Alex and Mark Reinhold. This week, at Devoxx, I got the chance to talk to Mark and Alex in person, though I was severely handicapped by having lost my voice.

History
A bit of historic context for the uninitiated.

1998 Nov. Since 1998 the OSGi Alliance has worked on the development of a formal standard for modularity in Java based on the Java class loader model. This standard works on all Java 2 VMs. The specifications are currently at release 4.2. The standard is very mature and respected, and adoption in the last few years has been exponential.


2003 Oct. JSR 232 Mobile Operational Management, which was really OSGi R4 with mobile services, was approved by the J2ME Executive Committee.

2005 Jun. In spring 2005 Sun filed JSR 277 Java Module System. This JSR clearly covered the area that OSGi did so well and it was therefore not clear to many why this JSR did not start with OSGi. A discussion followed and the JSR was accepted by the JCP Executive Committee in June 2005, although several members remarked in their vote that OSGi should be taken into account. As a highly interested party I tried to get into this JSR, but I was denied because the expert group was full ... Unfortunately, the discussions were also closed to observers.

2005 Oct. JSR 232 (OSGi R4.0.1) published its EDR.

2006 Feb. In response, with the backing of the OSGi Alliance, IBM filed for JSR 291 Dynamic Component Support for Java SE, led by Glyn Normington. This JSR focused on bringing OSGi into the JCP for Java 2 SE as was done earlier for Java 2 ME. Unfortunately, Java split a number of years ago into two increasingly incompatible branches. This JSR was run fully in the open.

2006 Apr. Sun decided to extract the language aspects of modularity from JSR 277 into a separate JSR. This became JSR 294; the intentions were declared in a short (and cryptic) personal blog of Gilad Bracha (the spec lead). At JavaOne in 2006 he explained that the deployment guys could not be trusted with the language. Point taken.

2006 Aug. JSR 232 went final and JSR 291 published its Early Draft Review (EDR).

2006 Oct. JSR 277 publishes its early draft. This draft was unfortunately quite bad and was extensively discussed. For more information on why it was not very good, read my blog. Interestingly, Gilad Bracha, the spec lead for JSR 294, leaves Sun. Alex Buckley takes over as spec lead.

2007 Apr. A commendable decision is made to allow observers on the JSR 277 and 294 mailing lists. However, traffic is quite minimal.

2007 Aug. JSR 291 (OSGi 4.1) goes final.

2007 Nov. JSR 294 publishes their first Early Draft Review. This proposal was based on the original super packages concept and was, ehh, dreadfully complex. I analyzed their proposal in a blog.

2008 Jan. Glyn Normington (JSR 291 Spec lead) files a bug on JSR 277 to ask it to consider interoperability with OSGi. After visible and invisible pressure, Sun accepts this requirement and starts to take a deeper look.

2008 Apr. Stanley Ho publishes an OSGi interoperability model on the JSR 277 mailing list. Unfortunately, this model heavily relies on the second Early Draft of JSR 277 that has not been made public so far. This makes it very hard to judge.

2008 Mar. Alex Buckley posts a message on the 294 mailing list indicating that he had given up on the original superpackages concept, agreeing that it had gotten too complex. He proposed to introduce a new keyword "module". This significantly changed the whole 294 story and was applauded by me in a blog. The module keyword showed a lot of promise.

2008 Apr. JSR 294 is folded back into JSR 277 and Alex and Stanley will act together as spec leads.

2008 May. Stanley Ho proposes a model for versioning in JSR 277. This is met with a lot of resistance because it gratuitously differs significantly from the OSGi versioning scheme, as explained in my blog. Other blogs reacted rather violently to this proposal.

2008 Oct. Stanley Ho leaves Sun.

2008 Dec. Mark Reinhold announces project Jigsaw, JSR 277 is put on hold, and JSR 294 is resurrected.

Today
It is kind of interesting to see how other bloggers have taken the news of project Jigsaw. The bloggers that like OSGi still show a lot of distrust of Sun's motives. And though they might be right, I think Sun genuinely wants to get modularity into Java 7 and has concluded that OSGi will play a role in Java 7, regardless of what they do. Since Alex Buckley took over some of the specification work in 277 and 294, there has been a real conversation going on and I feel trust was built. Where in the past I often disliked what Sun was doing, I never felt it was evil. Most of it could better be explained by their lack of knowledge of what OSGi really was and an understandable desire to keep things their way.

The difference today is that some of the key people like Mark Reinhold and Alex Buckley have made an effort to talk to us. If things were to turn out very differently from what I expect today, then it would be a personal issue, something which it never was before. So I am actually quite positive now that they have promised to participate more actively in the OSGi Alliance to work on modularity.

That said, I obviously have my rather simple wish list for project Jigsaw of only three prioritized items:
  1. Requirements
  2. Requirements
  3. Requirements

This whole mess of the last three and a half years is caused by a process in the JCP that allows people to get started with solutions before they have negotiated the requirements with the stakeholders. Requirements make everything in development go smoother because they scope the work and make it possible to explore different solutions without falling back to the childish "mine is better." Requirements expose the different perspectives the stakeholders have and allow the negotiation to take place before any investment is made.

If Sun wants to do project Jigsaw really well, they should take the next few months to create an OSGi Request for Proposal (RFP). This is not a complex document with a long list of common sense (and therefore useless) requirements numbered in a complex scheme. On the contrary, it is a story about how things are done today that establishes the span and scope and elucidates the problem that needs to be solved.

An RFP establishes a common vocabulary and understanding in the problem domain. Developing such an RFP allows the expert group to understand the perspectives of others before anybody has made a commitment to a solution. This lack of commitment defuses any disagreements. Even better, once the project gets into the solution space, the group can explore alternatives that can be judged against the requirements instead of subjective opinions.

During one of our conversations Mark indicated that he liked simple solutions. It is hard to disagree with that statement. However, simplicity is in the eye of the beholder. What is simple for one, can be simplistic for another. Without requirements, it is very hard to decide if a solution is simplistic or simple.

I do understand that the time frame is short, Java 7 is supposed to be out early 2010. However, if there is one lesson I learned the hard way then it is this: If it is not worth doing it right, it is not worth doing it at all. So Mark, Alex, I am more than willing to help doing it right this time!

Peter Kriens

P.S. This blog was first posted to the EclipseCon blog due to a combination of a heavy cold and a confusing GUI.

Tuesday, December 2, 2008

OSGi DevCon/EclipseCon Submissions

Last Friday the submission system closed for new submissions. Fortunately. I had 53 submissions at closing time, all vying for a very limited number of slots. The fun task that befell me was to make that choice. Well, at least it looks like we will have a very strong program in March 2009!

If you want to see the submissions, go to the OSGi submissions page. Feel free to add comments or send me an offline mail about them. Any feedback is appreciated.

It is good to see the enthusiasm in the market for OSGi. Looking at all the submissions at EclipseCon, it looks like OSGi is hot!

Peter Kriens
