Friday, June 23, 2006

Signature Analysis for Mobile OSGi

This week I received a mail about the requirements for the JSR 232 Test Case Kit (TCK). We had all the parts except for one. SUN requires a signature verification of the API packages. These are always the fun parts: how to verify the API signatures with the least amount of work. Let me first give you a short explanation of the purpose of this test.

Sun is rightly paranoid about Java fragmentation. That they sometimes cause fragmentation by being paranoia about it is another story, the need for consistency between implementations of an API is clear. During the early Java days Microsoft improved Java by adding extra, windows specific, methods to the java.lang packages. What is wrong with improvements one can ask? Well, developers started to use these features because they were really, eh, improvements. Unfortunately, the use of the improvements made it impossible to run that code on any other environment, like, lets say Solaris. This is a bit painful if one realizes how much cost is involved to provide the Java portability. After this experience, Sun hired a large amount of lawyers to add clauses to all their contracts that APIs could be implemented but not extended. Today, they therefore require that all TCKs verify that the API they test is not improved. So you are warned, it is likely that one day you have promised at least once to Sun never to improve their APIs; I have many times.

Not improving does not mean that the classes in the specification and implementation are identical. It is perfectly legal to re-implement them and add extra (package) private methods and classes. The re-implementation may not change anything in the visible classes, methods or fields. This includes the type, the name, staticness, finalness, or abstractness. Visible means it is public or protected.

So how does one go about checking the equality? The naïve solution is to write one humongous test case where you hard code a test for each field, method, etc. This is of course my nightmare solution because it is horror to maintain. My attention span for repetitive tasks is unfortunately measured in nanoseconds. Writing such a test case would likely make me loose my hair and take ages, neither of which is an attractive option.

I could generate an XML file from the API jar files, write an XML parser and verify the structure of the API based on the information in the XML file. I already had the code to parse the JAR file and turn it into XML. However, frequent readers know my great love for XML solutions, so this was not a desirable option either. What next?

When making a prototype for declarative services ++ (which was the embryo for iPojo), I had been playing with ASM. ASM is a fantastically small library (base is 20k) that makes it very easy to modify class files. If you can modify it, you should also be able to read class files. ASM uses the visitor pattern, which is very powerful but not that well known; probably because it is non trivial to understand. With the visitor pattern, you provide separate an object structure from its algorithms. Instead of the algorithm traversing the object structure, the structure visits each member and calls appropriate methods in the algorithm object. The visitor patterns allows you to be oblivious of the types and structure of your target object but learn about it in a controlled and type safe way.

ASM uses the visitor pattern to traverse the bytecodes of a class. The Class Reader class takes the class bytes and a visitor object. This visitor object is called for all the class information, fields, methods, etc. There is a base class for these Class Visitor objects so you only have to implement the methods you really need. Normally, you also need a Class Writer, which is a visitor itself so you can do a 1:1 transformation by giving the Class Reader a Class Writer.

ASM looked interesting for the signature analysis. So the next step was to get the API. I am proud to say that we have a very clean build in the OSGi build system. We have a JAR file for the Core, Compendium, and MEG specifications. I just copied this JAR to the signature verification bundles (one for each specification). The rest was peanuts, in the OSGi test case I read the JAR and parse all the classes using the Class Visitor. For each found class, I then get the Class object from the OSGi Framework using Class.forName. The rest is just work, I compare each method and field for visibility, presence, or absence. Any discrepancy in the visible aspects of the class are flagged as errors.

The whole bundle is about 300 lines and the extra build lines were 3! Quite satisfactory. However, before I get too snug, I was stuck with one problem: I could not detect extra classes in the implementation. Java unfortunately does not have a method to iterate over the contents of a package. On the PC you can use the java.classpath property but that is not an option in the embedded world; there is no guarantee that the code is stored as directories or JARs. So I gained a battle, but not yet the war. If you can tell me a solution to finding the extra classes (or tell me with authority that Sun is not testing extra classes), then I would really appreciate this!

No comments:

Post a Comment