Monday, April 17, 2006

A Strong Lesson About Modularity

Last week I learned a strong lesson about modularity: “Investigate before you take the easy route”. It started with a problem on the Felix mailing list. A Felix user had experienced Class Not Found exceptions for a Swing class that he did not use at all. Inspecting his code showed no sign of that Swing class; the only reference was to a class that had the culprit as its superclass. The OSGi specifications only mandate imports for the classes you use directly.

The suspicion quickly turned to the boot class path. Normally, a VM uses hierarchical delegation. If a class loader gets a request for a class, it should first ask its parent loader, which should ask its parent, etc. The VM’s loader is at the root. This is a simple delegation policy with flaws (but that is another story).
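As a rough illustration of the parent delegation chain described above, the following Java sketch walks from a class loader up to the root. The class name LoaderChain and the label "bootstrap" are made up for this example; in the standard API the VM's own loader is usually represented by null.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {
    /** Lists the loaders a request would pass through, from the given loader up to the root. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader l = loader; l != null; l = l.getParent()) {
            names.add(l.getClass().getName());
        }
        // getParent() returning null stands for the VM's boot loader (label is hypothetical).
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // Prints the chain for the loader that loaded this class.
        System.out.println(chain(LoaderChain.class.getClassLoader()));
    }
}
```

The exact loader names printed depend on the Java version, so no output is shown here.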

However, the OSGi specifications forbid delegation to the parent except for java.* packages. Why? We had found that bundles did not import packages they expected to be on the boot class path. This made it impossible for other bundles to replace or add those packages, because the other bundles were never consulted; the request was delegated to the boot class path before the package lookup was done.

Substitutability is one of the best features that an OSGi framework brings along, so it was clear that we had to deny the automatic parent lookup. This requires frameworks to export any packages that are on the boot class path.

Unfortunately, there is code delivered with the Java VM that implicitly assumes the parent delegation model. When it needs to load a class, it uses whatever class loader is at hand, assuming that the request eventually reaches the boot class path loader and the class is found. It should be clear from the previous description that this is not true for an OSGi framework, which causes seemingly random failures in bundles. A previous version of Java also had a VM bug in the verifier that used the wrong class loader, breaking the modularity, which did not help either. For pragmatic reasons we therefore added the org.osgi.framework.bootdelegation property (Section 3.8.3). This property is a list of packages that require delegation to the boot class path, bypassing the controlled class loading mechanism.
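To make the effect of the property concrete, here is a minimal Java sketch of how a framework might decide whether a package goes straight to the parent loader. It is not taken from Felix or any other framework: the class BootDelegation and its method are hypothetical, and the wildcard handling (a trailing .* matching sub-packages, a lone * matching everything) is a simplified reading of the rules, so check the specification for the exact semantics.

```java
import java.util.Arrays;
import java.util.List;

public class BootDelegation {
    private final List<String> patterns;

    BootDelegation(String propertyValue) {
        // The property value is treated as a comma-separated list of package names.
        this.patterns = Arrays.asList(propertyValue.trim().split("\\s*,\\s*"));
    }

    /** True if a class from this package would be requested from the parent loader directly. */
    boolean delegates(String packageName) {
        if (packageName.startsWith("java.")) {
            return true; // java.* is always delegated
        }
        for (String p : patterns) {
            if (p.isEmpty()) {
                continue; // an empty property yields one empty pattern
            }
            if (p.equals("*")) {
                return true; // everything is delegated: the old parent delegation model
            }
            if (p.endsWith(".*")) {
                // Assumption: "sun.*" matches "sun.misc" but not "sun" itself.
                String prefix = p.substring(0, p.length() - 1); // keeps the trailing dot
                if (packageName.startsWith(prefix)) {
                    return true;
                }
            } else if (p.equals(packageName)) {
                return true;
            }
        }
        return false;
    }
}
```

With a value such as sun.*,com.sun.* a request for javax.swing still goes through the normal import lookup, so a missing import shows up as an error; with * it never does.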

OK, back to the original problem. As you can understand, we thought that the Class Not Found exception was caused by a VM bug or by some class that used the wrong class loader. The standard solution to these problems is to add the package to org.osgi.framework.bootdelegation and be done with it. People on the list put some pressure on Richard Hall, the main Felix core committer. However, he was stubborn. His solution was to add the missing package to the imports of the bundle. I was against this solution because, if this was caused by the VM bug, the standard boot class path delegation was the right fix. Problems should be fixed in the right place; fixing a symptom is bound to bite you later.

Other people joined the discussion, and we found out that one framework vendor had set org.osgi.framework.bootdelegation to * by default, which basically restores the old parent delegation model. This is OK for a deployment but bad during development: a * for boot delegation effectively hides any import problems. This was Richard’s greatest objection: delegating everything to the boot class path would not allow you to find class path problems until you were in the field.

During the discussion I was getting the uneasy feeling that we had screwed up in the specification if these types of problems were popping up. So I asked Richard to send me the offending bundles. Assuming this was a VM bug, I tried to reproduce the problem in a smaller test case with the same sequence. However, I could not get it to fail. I therefore decided to recreate the bundle using my own build environment. This environment uses BTool, which analyzes the bundles and writes the manifest. To my surprise, BTool created an import statement for the class that generated the error; the code must therefore refer to the class despite the fact that it was not directly using it. I immediately realized how stupid I was; this was the same bug we had had last year and that I had solved by extending BTool. I had been sidetracked by the VM bug and boot class path delegation discussion!

Last year I had found that the compiler can insert a reference to a superclass if a method of the superclass is used on a subclass. For example, if you use the HttpServlet class and call getServletInfo, the compiler will insert a reference to the GenericServlet class, even though you do not import it in your source file.
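Such hidden references can only be found by looking at the compiled class, not at the source. The Java sketch below shows one way a tool could do that: it reads the constant pool of a class file and lists every class entry it contains. This is not BTool's actual implementation; the class ClassRefs and its method are hypothetical, and strings are decoded as plain UTF-8, which is a simplification of the class file's modified UTF-8 that is adequate for ordinary class names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;

public class ClassRefs {
    /** Returns internal names such as "javax/servlet/GenericServlet" for every class entry. */
    static Set<String> referencedClasses(byte[] classFile) {
        ByteBuffer in = ByteBuffer.wrap(classFile); // big-endian by default, like class files
        if (in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        in.getShort(); // minor version
        in.getShort(); // major version
        int count = in.getShort() & 0xFFFF;
        String[] utf8 = new String[count];
        int[] nameIndex = new int[count]; // 0 means "not a class entry"
        for (int i = 1; i < count; i++) {
            int tag = in.get() & 0xFF;
            switch (tag) {
                case 1: { // Utf8: length-prefixed string
                    byte[] bytes = new byte[in.getShort() & 0xFFFF];
                    in.get(bytes);
                    utf8[i] = new String(bytes, StandardCharsets.UTF_8);
                    break;
                }
                case 7: // Class: index of the Utf8 entry holding its name
                    nameIndex[i] = in.getShort() & 0xFFFF;
                    break;
                case 8: case 16: case 19: case 20:
                    skip(in, 2);
                    break;
                case 15:
                    skip(in, 3);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(in, 4);
                    break;
                case 5: case 6: // long and double occupy two constant pool slots
                    skip(in, 8);
                    i++;
                    break;
                default:
                    throw new IllegalArgumentException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<>();
        for (int idx : nameIndex) {
            if (idx != 0) {
                names.add(utf8[idx]); // may also be an array descriptor like "[Ljava/lang/String;"
            }
        }
        return names;
    }

    private static void skip(ByteBuffer in, int n) {
        in.position(in.position() + n);
    }
}
```

A manifest generator could turn the package part of each name into an import, which would explain why a superclass that never appears in the source still ends up among the imports.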

So what did I learn from this exercise? First, Richard was right on two counts. He resisted the pressure to make boot class path delegation default to *, and he was right (for the wrong reason!) to import the culprit class in the bundle. After this exercise, he decided to improve Felix by analyzing Class Not Found exceptions and providing a clear message on how to solve them.

However, the most important lesson I learned is that we need a simple tool to verify a bundle. In my sparse spare time I started to write a simple certifier. It is still in its infancy, but I already find lots of errors in freely available bundles with it. Even at this stage it might be useful, so you are free to download the jar (or exe) and use it. Just run it with:

java -jar certifier.jar (filedir)…

Please note that this is not an official OSGi bundle and has no official status; use it at your own risk.

I’ll probably make some updates in the coming weeks. Feedback appreciated!

Peter Kriens

3 comments:

  1. Nice blog..

    But please let me know how to make a plugin look for a package, like the javax one used for XML transforms, in its dependencies rather than in the JRE.

  2. Hi,

    Could you please let me know, in which file to add the property "org.osgi.framework.bootdelegation".

    I am trying to access Webservice from a spring batch job deployed on OSGI. I am getting the below exception as it is not able to connect to webservice( not finding the axis related jars in runtime)

    Any help is highly appreciated. Thanks
    org.springframework.batch.item.adapter.DynamicMethodInvocationException: java.lang.reflect.InvocationTargetException
    at org.springframework.batch.item.adapter.AbstractMethodInvokingDelegator.doInvoke(AbstractMethodInvokingDelegator.java:108)
    Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    Caused by: java.lang.ExceptionInInitializerError
    at org.apache.axis2.util.XMLUtils.toOM(XMLUtils.java:564)
    Caused by: java.lang.IllegalStateException: No valid ObjectCreator found.
    at org.apache.axiom.om.util.StAXUtils$Pool.(StAXUtils.java:44)

  3. I actually do not see this as a good feature, sorry. It forces us to have every .jar archive "in the OSGi way" and does not allow us to decide which parts of our code we need and want to have as plugins and which parts we see no need to change and would prefer to leave 'as is'. It forces us to dive into OSGi head first rather than evaluating it in a controlled way and gradually expanding the OSGi scope once we see that it makes sense and causes no problems. This single restriction forces me to seriously consider alternative plugin systems.
