Monday, January 23, 2012

Objects Revisited

Alan Kay is the inventor of Smalltalk, the first fully object-oriented language. I learned Smalltalk in the early eighties, and almost every day that I use Java I grind my teeth that James Gosling did not steal more ideas from Smalltalk. About 20 years ago, during an OOPSLA, Alan Kay presented the idea that data should always carry its own methods to access that data. His example was a tape (!) that would contain the data as well as the code to interpret that data.

I think this idea was very much at the core of the Java Management standard first proposed around 1997. Each device would have a Java VM on board and the management system could send little management programs that would be executed on the device. However good it sounded at the time (and I tried to push this idea in Ericsson), the idea never became successful. It was just too complex to make it work reliably on a larger scale, where machines have different versions and are implemented in more languages than I could ever learn in this life. It was just too complicated, error prone, and risky. Exchanging, or relying on, arbitrary code between loosely coupled machines turned out to be a surprisingly bad idea. Objects, however useful they are in many places, seem to be getting more and more in the way when you build larger distributed systems.

The reason objects are so ill suited to leaving their process is that crossing the boundary forces them to expose their innards, the very thing objects try so hard to hide. Even if we could encapsulate the data during the transition, as Alan Kay suggested, we would place a huge burden on the receiver to understand (and trust) the code that encapsulates the data. We would also create a huge dependency problem: the code provided with the data must actually run correctly on the receiver.

There has always been an impedance mismatch between persistence and object orientation. JPA does a decent job but there is something fishy when you need such huge, complicated, and performance intensive middleware only to simplify the life of the developer. Recently I've been doing some more thinking about this subject and I think that though objects work beautifully in a single process they are ill suited for anything that involves crossing the process boundary, which obviously includes persistence.

Last week during an OSGi EG conference call the problem came up again during the discussion of a specification: do we support serialization for some of the domain objects or not? What is often not realized is that a serialization format is a public interface, since it is shared with the world; it is not an internal implementation detail. This is the essence of modularity: there is an inside and there is an outside. What stays on the inside can be changed freely. What has escaped from the inside must be evolved carefully (and thus more expensively), since its dependencies are unknown.

The problem is acute with interface based programming. Two systems running a service defined in interface S (maybe separated in time) that need to communicate their domain objects can only do so if the specification for S defines a serialization format. Putting a serialVersionUID in an interface is a total waste of bits (although they do occur!). The only solution that I see is to make the marshalling a first class citizen in the contract, since the data representation is part of the public API.

However, what format should be used? The standard Java serialization format is quite awkward to parse except for implementation classes. There is good old XML, but JSON is increasing in popularity and there are enough other serialization standards out there to fill books. SQL is also a kind of serialization format. Picking one without making others unhappy will be hard.

I've come to the conclusion that the best format is actually ... Java. I started to use what I call data classes. These are classes with only public fields of primitives (or their wrappers), strings, data classes, and collections or arrays of data classes. This subset is very easy to (un)marshal with almost any available marshalling technique using simple rules and reflection. These data classes can act as a very convenient schema for my public interface to other processes, including me in the future (a.k.a. persistence). Since they are part of the Java type system they are easy to use and the compiler can do a lot of sanity type checking. And they can easily be versioned in OSGi.
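To make the idea concrete, here is a minimal sketch of two data classes and a reflective marshaller that turns them into nested maps and lists, a shape that most JSON, XML, or database libraries can write out. The names Person, Address, and marshal are hypothetical and only for illustration, arrays and unmarshalling are left out to keep it short, and it is not the code I actually use.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DataClassSketch {

    // Hypothetical data classes: public fields only, no behavior.
    public static class Address {
        public String street;
        public String city;
    }

    public static class Person {
        public String name;
        public int age;
        public Address address;
        public List<String> emails = new ArrayList<>();
    }

    // Converts a data class into nested maps and lists using reflection.
    // Assumption: values are strings, primitives/wrappers, collections,
    // or other data classes (arrays are not handled in this sketch).
    public static Object marshal(Object value) {
        if (value == null || value instanceof String || value instanceof Number
                || value instanceof Boolean || value instanceof Character) {
            return value; // simple values pass through unchanged
        }
        if (value instanceof Collection) {
            List<Object> list = new ArrayList<>();
            for (Object element : (Collection<?>) value) {
                list.add(marshal(element));
            }
            return list;
        }
        // Anything else is treated as a data class: one map entry per public field.
        Map<String, Object> map = new LinkedHashMap<>();
        for (Field field : value.getClass().getFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue; // constants are not part of the data
            }
            try {
                map.put(field.getName(), marshal(field.get(value)));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Person p = new Person();
        p.name = "Ada";
        p.age = 36;
        p.address = new Address();
        p.address.city = "London";
        p.emails.add("ada@example.org");
        System.out.println(marshal(p));
    }
}
```

The point of the sketch is that the marshaller needs no knowledge of Person or Address; the public fields are the schema, and the compiler still checks every use of them.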

The data classes are a solution to a problem I see becoming prevalent. It is against pure object orientation but I honestly do not see another solution; the shared code model just does not work very well. Sad, but I think it is time to declare defeat. Maybe Java 8 should not steal from Smalltalk but the struct from C?

Peter Kriens

Wednesday, January 11, 2012

Java Generics are a Lemon

After working with Java for almost 15 years, and with deep knowledge of Java generics at the class format level, I learned something very basic the really, really hard way. I knew the collections in Java were not that good in comparison with what you find in other environments (immutable anyone?) but now I learned that even adding all that extra cruft on my classes is useless when you have a major refactoring.

This week I learned that for the collections and maps the get, remove, containsKey, containsValue, and equals methods do not use the generic type parameter; they take a plain Object. This means you can call them with any type and you do not get an error if you call them with a type that is not compatible with the generic type of the collection.

I found this out when I changed many Map types to take another key type, expecting that Eclipse would nicely point out what to change. Well, it does not. The puts and the places where a map is passed as a parameter are nicely pointed out, but a significant amount of code still compiles and then always fails at runtime because the object is no longer found. Fortunately I am saved by having hundreds of solid test cases that tell me where to look.
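A minimal sketch of the trap, assuming a hypothetical map whose key type was just changed from String to Integer. The class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class MapKeyGotcha {

    // Hypothetical map after a refactoring changed the key type from String to Integer.
    public static Map<Integer, String> build() {
        Map<Integer, String> names = new HashMap<>();
        names.put(1, "one");
        return names;
    }

    public static void main(String[] args) {
        Map<Integer, String> names = build();

        // names.put("1", "one");  // does not compile: put takes the key type K

        // The lookups below still use the old String key. They all compile,
        // because these methods are declared to take Object, and they all miss.
        System.out.println(names.get("1"));          // prints null
        System.out.println(names.containsKey("1"));  // prints false
        System.out.println(names.remove("1"));       // prints null

        // Only the lookup with the new key type finds the entry.
        System.out.println(names.get(1));            // prints one
    }
}
```

The compiler flags the put, but the stale get, containsKey, and remove calls go through silently, which is exactly the kind of failure only the tests catch.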

I understand these methods were not generified because things became too hairy. Why that did not raise concerns about the power of the generics at the time beats me.

Well, guess I learned something.

Peter Kriens

Monday, January 2, 2012

Moving On

A bit more than 13 years ago I was asked to go to Linköping, Sweden to help out an Ericsson business unit to get the Java Embedded Server running on their e-box. This single appointment quickly cascaded into an almost full time job managing the OSGi specification process on behalf of Ericsson. In 2001 I switched to the OSGi Alliance to become the Technical Director and in that capacity the editor of the specifications. A hectic decade followed with too much travel, several economic booms and busts, various controversies, working with some really great people, and many rock solid specifications to show for it. When I look at my bookshelf I see a satisfying sight of two shelves with OSGi specifications and books. All said, it was a pretty good decade.

However, it is time to move on. Not because I feel OSGi is not the right answer; on the contrary. I think the OSGi service model is as important for increasing productivity in the software industry as structured programming and object orientation were in the previous decades. The reason to leave is that I see a business opportunity in the gap between the mainstream Java developer and where OSGi is today. Working with the myriad of problems around modularity has given me a solid background to ease the transition of existing applications into more modular software. And after a decade of writing specifications, creating real systems again looks pretty attractive.

I will stay on until after OSGi DevCon 2012 in Reston, Virginia at the end of March. During that time I will finish the upcoming Core and Enterprise specifications that are currently in the pipeline. After that, well, to find out you have to follow me on Twitter (@pkriens) ...

If you want to show that you like what I've done in OSGi, I would appreciate it if you connected with me on LinkedIn and/or provided a recommendation.

Now back to work on my last two OSGi specifications.
Peter Kriens