Tuesday, May 4, 2010

Duct Tape


I discovered software when a high school friend got a programmable TI-57 calculator. This calculator made me fall in love with developing software; a love affair that has not ended yet. Falling in love made me want to read everything about her. In the pre-google era books were used for that purpose and I read all the books I could get my hands on. At that time software books were mostly about structured programming. The basic idea was that you take a function, decompose it in smaller function, and so on. The mantra for this decomposing was:
  • low coupling
  • high cohesion
The idea of coupling was not hard to understand but I had a bit of a problem with cohesion, it was more fuzzy. And I never liked the design magic that structured programming required. Therefore, when I discovered my mistress' objects around '83 I fell in love all over again. In the period between the calculator and the discovery of object oriented programming I had studied, started a company, and developed a number of editorial systems for newspapers. Until that time most with assembly and crafting a lot of code for networking, databases, operating system, and even a GUI (eat your heart out!). As any new generation we worked hard to ignore the lessons from our predecessors and objects seemed to be the perfect vehicle to put the old geezers in their place. Move out of the way Tom DeMarco, Edward Yourdan, and Michael Jackson, we're coming! So the cohesion and coupling mantra's went out of the window and we showed how data encapsulation with inheritance brought you all these goodies.

Most of you know the results. though objects work very well on the medium scale, once the systems grow and evolve they tend to create systems that resemble spaghetti and become hard to maintain. Something got lost along the way. Interestingly, spaghetti code was exactly the problem that drove structured programming.

Problems that we solve on one level appear almost identical on higher levels is a hallmark of fractal problems. The size of systems has increased exponentially since I started and spaghetti problems that we thought we had solved reappear in a different incarnation. So how do we address the spaghetti problem at our current scale? Could the old structured programming mantra help us when we apply it to objects? Lets take a look.

Modules were defined for functions, not for classes but when we took about a module today we talk about a set of classes with restricted accessibility. Now, classes have a tendency to act as duct tape: very useful and darned flexible. As the joke goes: if you can’t fix it with duct tape, you probably just haven’t used enough. However, classes can also be just as sticky and they can get surprisingly entangled when you're not paying attention. So even if OO modules are restricting accessibility, they've lots of classes sticking out eager to get entangled with other classes, preferably from other modules.

Clearly interfaces solve the sticky type problem by type-separating the client and the provider of an object. However, they do not help in getting in getting an instance without getting entangled with the implementation class. We have to solve that within the OO paradigm: Listeners, Factories, and Dependency Injection. However, these patterns never completely got rid of the stickiness of the implementation. Listeners require the client to know the library implementation, Factories require the factory implementation to know the implementations some way and with DI frameworks there is some all-encompassing God-XML that is sticky as hell. The big problem is that in ALL those cases the implementation class somehow leaks out of the module ready to stick. Often as a string instead of a class but the only advantage of that is cosmetically: your dependency graphs look better because the analyzers usually miss these couplings. However, the dependency still is as concrete as a class coupling and just as bad, just looks better.

So our current toolbox of patterns may hide the coupling to the implementation classes but they do not really get completely rid of it. They also have some surprisingly bad characteristics that we somehow got used to. These patterns are extremely biased to the client and forget that the module that provides the instance might have something interesting to say as well. They all force the provider to provide an object at the time they deem right without letting the provider decide if it is ready and if it really wants to provide an object. The provider has no way to signal when it is ready and maybe even offer multiple instances. Again, Factories and DI are surprisingly one sided.

Let us start with context. When you call a factory or use DI then the caller/receiver can provide parameters but the callee has nothing to say. An instance is created out of the blue from one of its classes. If there is any context, it must be stored in static fields, which is an evil software practice. A module that provides an implementation can not easily different instances based it inside knowledge. For example, maybe a provider tracks instantiated machines in the cloud and wants to offer an object per machine. Sorrt, can't do that. The provider module can only provide some manager where the manager then provides listeners, etc. etc. This lack of context makes APIs seriously more complex.

Similarly, a providing module has nothing to say about timing. The caller of the Factory or the DI engine make the decision when to instantiate the class from the module. The providing module has nothing to say about when it is actually ready to provide such an object. This makes startup ordering and dependencies really hard to manage.

We've all been working with these patterns for so long that it is very hard to see how limited they are. However, in a truly modular system the client module and providing module are peers that collaborate. Each module can play the client role and provider role in different times and neither the client role nor the provider role is passive. None of the existing patterns can handle this peer-to-peer module, all fall short in handling type coupling to the implementation class, the dynamic context, and the timing.

In a perfect world, each module should be able to offer an object to other modules. And each module should be able to react to the availability of those objects. This is the Broker pattern. If I need a Log object, I ask for a it (or get injected when it is available). With the Broker pattern, the providing module can decided when to offer, solving the timing problem. The providing module can instantiate as many objects as it likes and offers them, solving the context problem. And last but not least, it is the providing module that creates the object and thereby never exposing the implementation class.

That said, there is one thing that is still fuzzy in this model: who defines the collaboration contract? Is this the provider? Well, that makes it hard to use other providers for the same collaboration. After a lot of hard thinking it seems obvious that modules are not enough, we need to reify this contract in runtime. If we reify the contract then modules can provide this contract and use this contract without being bound to a specific implementation. Both providers and clients would be peers.

So if we talk about software modularity than in an Object Oriented world we need to augment modules with something that reifies the contract between the modules. So far we've struggled with half baked solutions like Factories and the current DI model to address the problems caused by the lack of a reified contract. These contracts are much more important for a design than module.

I strongly feel that today we're in an almost identical situation as in the eighties when objects addressed the shortcomings of structured programming. This was a hard sell because many people looked at the implementation details and not at the design primitives that OO provided: encapsulation of data, inheritance, and polymorphism. Making people see the primitive instead of the vtables and descriptors was hard work.

I believe that today we need a design primitive that allows us to reason about the collaboration of modules. A design primitive that allows us to design large scale systems and still understand the fundamental architecture of what've we created. A primitive that allows us to see a picture of a complex system and understand quickly how it can be extended. OSGi µServices are very close to that primitive I believe.

I must admit that several times in my life I was unfaithful to my mistress. When we first got married she was all Smalltalk and I loved her for it. But over time she developed a lot of C++ behavior and I not so happy about that. Fortunately, she decided to pick herself up and become so much more dynamic with Java. But if I am really honest I was eying other software the last few years, still missing some of her old dynamism. However, since I saw the potential of her µServices I was head over heals in love again.

Peter Kriens