About two years ago I found out that a French OSGi expert (Jerome Moliere) actually lived around the corner. It was quite a surprise because I live in the nick of the woods in a small village in the south of France. However, it is a pretty good combination, the south of France and somebody you can discuss OSGi with. There are a large number of pretty good lunch places within 15 km radios and there about 30.000 wineries in a circle of about 100km. So since that day we regularly meet for lunch and enjoy the good life.
So why should you care? Well, I ran into a link to a Google Shared Spaces. Google shared spaces is one of those powerful but simple ideas. A shared space is a link where people can share information that is then visually aggregated. You can share a map, voting, a drawing, text, shopping list, etc. Anybody that has the link can modify the shared space. I had to think of one of those shared spaces when I saw the Map cluster. Would it not be fun to find out who in your neighborhood has some interest in OSGi? Having lunch or a drink in the local bar might be really helpful. Just think of how much more fun it is to complain about class loaders when you're not alone? Or just think how nice it is to show something cool that you've made to someone that actually has a clue?
For this reason I've created a shared map where you can just add your name and location to the map. I know we developers are not known for our outstanding social skills but it is worth a try! So do not be shy or lazy. And if you get a good chapter together you can always invite me for lunch!
You can find the map here: http://goo.gl/61MKk
Peter Kriens
(Follow on Twitter)
Sunday, July 17, 2011
Monday, June 27, 2011
Negative Qualifiers
Some face time with your colleagues can sometimes be really productive. Last week we had an EG meeting at IBM in my home town Montpellier. As I've known BJ Hargrave (IBM) and Richard Hall (Oracle/Apache Felix) for so many years I'd invited them to stay in our guest house. After arriving around noon we took the afternoon off to solve world hunger and climate change over a glass of rosé. As too often happens in these important discussions, we were seriously distracted with versioning strategies.
The OSGi version is defined as linearly increasing where the floor is the shortest form of the version. That is, 1.2.3 is less than any other version that starts with 1.2.3. A side effect of this model is that it is hard to model pre-release versions. A common approach is to use a snapshot version for artifacts that are on their way to becoming a release version. For example, 1.2.3-SNAPSHOT indicates that this artifact is becoming version 1.2.3 but is not there yet.
You can do this in OSGi with the qualifier but it is awkward because you want the final release version to be sort highest. Like 1.2.3.A1, 1.2.3.A2, 1.2.3.FINAL. The fact that you cannot just release 1.2.3 but have to use 1.2.3.FINAL generally offends our highly developed aesthetic sense.
The maven solution is to use a magic keyword: -SNAPSHOT. However, we felt a bit uncomfortable with a magic keyword. The qualifier in OSGi is left open for enterprises to implement a release strategy and it would be nice if they could do the same with a qualifier that came before the actual release. So why not have negative qualifiers?
The syntax for the OSGi qualifiers specifies a period as separate between the version and the qualifier, for example 1.2.3.qualifier. If we allow a minus sign instead of the period we can interpret the qualifier as coming before the version instead of after. That is, 1.2.3-qualifier is actually less than 1.2.3. This contains the special case of -SNAPSHOT but enables the use of qualifiers to track build, patch, or other variations.
So what about ranges? Bundles that depend on artifacts normally specify a range like [1.2.3,1.2.4). I think this should include a version like 1.2.3-SNAPSHOT but there was some discussion about this, some wanted snapshot only when the range indicated snapshots, e.g.[1.2.3-,1.2.4-). For me this is awful because it requires rebuilding and retesting after you released. In think [1.2.3,1.2.4) must accept any negative qualifiers on 1.2.3. The discussion is ongoing.
Negative qualifiers would really help bndtools. We could always build with a 1.2.3-${tstamp} to capture the build time. If one day the artifact is released, we take the JAR and convert the minus in the version to a period. For example, the artifact has version 1.2.3-201106261020 if it comes from the development tree/repository. Once it is released, we change the manifest to version 1.2.3.201106261020 inside the release repository. No need to recompile or retest.
There are some loose ends with version ranges but I am sure we can work this out. However, the fact that got excited about this idea (world hunger has to wait another meeting) does of course not mean that the OSGi will start using this model tomorrow. BJ is working on an RFP to there is hope for the future. If you have any ideas, let us know.
Peter Kriens
The OSGi version is defined as linearly increasing where the floor is the shortest form of the version. That is, 1.2.3 is less than any other version that starts with 1.2.3. A side effect of this model is that it is hard to model pre-release versions. A common approach is to use a snapshot version for artifacts that are on their way to becoming a release version. For example, 1.2.3-SNAPSHOT indicates that this artifact is becoming version 1.2.3 but is not there yet.
You can do this in OSGi with the qualifier but it is awkward because you want the final release version to be sort highest. Like 1.2.3.A1, 1.2.3.A2, 1.2.3.FINAL. The fact that you cannot just release 1.2.3 but have to use 1.2.3.FINAL generally offends our highly developed aesthetic sense.
The maven solution is to use a magic keyword: -SNAPSHOT. However, we felt a bit uncomfortable with a magic keyword. The qualifier in OSGi is left open for enterprises to implement a release strategy and it would be nice if they could do the same with a qualifier that came before the actual release. So why not have negative qualifiers?
The syntax for the OSGi qualifiers specifies a period as separate between the version and the qualifier, for example 1.2.3.qualifier. If we allow a minus sign instead of the period we can interpret the qualifier as coming before the version instead of after. That is, 1.2.3-qualifier is actually less than 1.2.3. This contains the special case of -SNAPSHOT but enables the use of qualifiers to track build, patch, or other variations.
So what about ranges? Bundles that depend on artifacts normally specify a range like [1.2.3,1.2.4). I think this should include a version like 1.2.3-SNAPSHOT but there was some discussion about this, some wanted snapshot only when the range indicated snapshots, e.g.[1.2.3-,1.2.4-). For me this is awful because it requires rebuilding and retesting after you released. In think [1.2.3,1.2.4) must accept any negative qualifiers on 1.2.3. The discussion is ongoing.
Negative qualifiers would really help bndtools. We could always build with a 1.2.3-${tstamp} to capture the build time. If one day the artifact is released, we take the JAR and convert the minus in the version to a period. For example, the artifact has version 1.2.3-201106261020 if it comes from the development tree/repository. Once it is released, we change the manifest to version 1.2.3.201106261020 inside the release repository. No need to recompile or retest.
There are some loose ends with version ranges but I am sure we can work this out. However, the fact that got excited about this idea (world hunger has to wait another meeting) does of course not mean that the OSGi will start using this model tomorrow. BJ is working on an RFP to there is hope for the future. If you have any ideas, let us know.
Peter Kriens
Friday, June 24, 2011
Ease of Use and Spring
Things should be as simple as possible, but not simpler.
As true as this is, the boundary between simple and simplistic is unfortunately highly subjective. If something works for you, don't change it. Unfortunately, many developers are in a situation where the environment and problems are highly complex and OSGi can simplify such situations.
The primary value that OSGi brings to the table is when you have to maintain a large code base over long periods of time, you have many external dependencies, you need to provide middleware, if you work with multiple developers, or if your development is spread over multiple locations. For those situations OSGi provides a type safe module extension to Java that can significantly reduce your development complexity.
So what is the usability issue? Unfortunately, both the modularity and type safety are at odds with prevailing class loading patterns in Java. OSGi is often the messenger telling you that your baby is neither modular nor really type safe. For example, Dependency Injection in Spring requires access to implementation classes from the configuration, exactly the classes you should really keep private. Worse, dynamic loading is only type safe if there is a single namespace in the universe. A single namespace causes DLL hell aka JAR hell that OSGi prevents by managing multiple class name spaces in a type safe way.
These are not OSGi problems, if Jigsaw one day will be modular and type safe it will have exactly the same issues!
To get the many benefits of modularity and type safety you must therefore change the way you think about Java ... If something is inside your module everything works as before but if you want to work with other modules you need to consider type safety and module boundaries. Object Oriented Design forced you to consider inheritance, encapsulation, and object identity, things that did not exist in Structured Design. Java modularity requires just such a paradigm shift. What we need is an inter module communication mechanism that truly hides implementation classes and enforces the type safety across modules. OSGi services are such a mechanism.
As Spring heavily relies on dynamic class loading it ran head on into the module boundaries and multiple name spaces. The resulting mess is what causes the negativity. Though Spring DM included OSGi services the problem was that they were more supported than embraced, resulting in many Spring DM/Blueprint modules to use the existing dynamic class loading model. However, this results in the import/export of large numbers of implementation classes. Exporting/Importing implementation classes increases the public space, the opposite of modularity. The history would be very different if the Spring libraries had embraced services, setting an example for true modularity.
There are only two solutions to this problem: Change OSGi or change the world. As I do believe there is no value in a little bit of modularity I can only see false advertising value in punching convenient holes in the modularity story of OSGi. I do believe it is worth keep strong modularity and its enforcement. Very large applications like Eclipse, Websphere, JDeveloper, and others demonstrate that at certain scales OSGi becomes a necessity. And as more and more open source projects start to embrace OSGi it will become easier and easier to build OSGi based systems.
That said, as Rod justly points out, there are lots of people that are happy with Tomcat and depend on hard core dynamic class loaders. The OSGi service based programming model could provide benefits to these applications even if it did not enforce the boundaries. We're therefore thinking about a version of OSGi that inherits the class loading architecture of whatever environment it runs in; a JAR on the class path is then treated as a bundle. For example, in a WAR all entries in the WEB-INF/lib directory would be available as bundles. Karl Pauls tested this and it is a testament to modularity that many existing bundles actually worked out of the box. Though this solution obviously does not help when you are in JAR hell, it does provide many of the benefits of OSGi in your own familiar environment. And one day, when JAR hell finally arrives, the OSGi Service Platform is there to serve you.
Peter Kriens
As true as this is, the boundary between simple and simplistic is unfortunately highly subjective. If something works for you, don't change it. Unfortunately, many developers are in a situation where the environment and problems are highly complex and OSGi can simplify such situations.
The primary value that OSGi brings to the table is when you have to maintain a large code base over long periods of time, you have many external dependencies, you need to provide middleware, if you work with multiple developers, or if your development is spread over multiple locations. For those situations OSGi provides a type safe module extension to Java that can significantly reduce your development complexity.
So what is the usability issue? Unfortunately, both the modularity and type safety are at odds with prevailing class loading patterns in Java. OSGi is often the messenger telling you that your baby is neither modular nor really type safe. For example, Dependency Injection in Spring requires access to implementation classes from the configuration, exactly the classes you should really keep private. Worse, dynamic loading is only type safe if there is a single namespace in the universe. A single namespace causes DLL hell aka JAR hell that OSGi prevents by managing multiple class name spaces in a type safe way.
These are not OSGi problems, if Jigsaw one day will be modular and type safe it will have exactly the same issues!
To get the many benefits of modularity and type safety you must therefore change the way you think about Java ... If something is inside your module everything works as before but if you want to work with other modules you need to consider type safety and module boundaries. Object Oriented Design forced you to consider inheritance, encapsulation, and object identity, things that did not exist in Structured Design. Java modularity requires just such a paradigm shift. What we need is an inter module communication mechanism that truly hides implementation classes and enforces the type safety across modules. OSGi services are such a mechanism.
As Spring heavily relies on dynamic class loading it ran head on into the module boundaries and multiple name spaces. The resulting mess is what causes the negativity. Though Spring DM included OSGi services the problem was that they were more supported than embraced, resulting in many Spring DM/Blueprint modules to use the existing dynamic class loading model. However, this results in the import/export of large numbers of implementation classes. Exporting/Importing implementation classes increases the public space, the opposite of modularity. The history would be very different if the Spring libraries had embraced services, setting an example for true modularity.
There are only two solutions to this problem: Change OSGi or change the world. As I do believe there is no value in a little bit of modularity I can only see false advertising value in punching convenient holes in the modularity story of OSGi. I do believe it is worth keep strong modularity and its enforcement. Very large applications like Eclipse, Websphere, JDeveloper, and others demonstrate that at certain scales OSGi becomes a necessity. And as more and more open source projects start to embrace OSGi it will become easier and easier to build OSGi based systems.
That said, as Rod justly points out, there are lots of people that are happy with Tomcat and depend on hard core dynamic class loaders. The OSGi service based programming model could provide benefits to these applications even if it did not enforce the boundaries. We're therefore thinking about a version of OSGi that inherits the class loading architecture of whatever environment it runs in; a JAR on the class path is then treated as a bundle. For example, in a WAR all entries in the WEB-INF/lib directory would be available as bundles. Karl Pauls tested this and it is a testament to modularity that many existing bundles actually worked out of the box. Though this solution obviously does not help when you are in JAR hell, it does provide many of the benefits of OSGi in your own familiar environment. And one day, when JAR hell finally arrives, the OSGi Service Platform is there to serve you.
Peter Kriens
Wednesday, June 15, 2011
Masterclass on OSGi
Neil Bartlett and I are organizing a Masterclass on OSGi in the Stockholm archipelago from August 30 to September 2. In this masterclass we provide a thorough schooling in OSGi (using bndtools) but we also have a chance to explore the areas the attendees are interested in. The last Masterclass was a great success and we intend to repeat this success in Stockholm. One interesting aspect was that experienced people that had been using OSGi for several years were surprised how much easier OSGi had become.
Though we've a places left there is a bit of urgency because the hotel indicated that the rooms are becoming scarce. As the masterclass usually runs into the wee hours of the night it would be a pity if you had to stay in another hotel.
You can find all the details here: Masterclass on OSGi.
Peter Kriens
Though we've a places left there is a bit of urgency because the hotel indicated that the rooms are becoming scarce. As the masterclass usually runs into the wee hours of the night it would be a pity if you had to stay in another hotel.
You can find all the details here: Masterclass on OSGi.
Peter Kriens
Thursday, May 26, 2011
The Unbearable Lightness of Jigsaw
Yesterday Mark Reinhold posted a link to the new requirements on Jigsaw. He states a more ambitious goal to make Jigsaw not just for VM developers only as they said before. The good news is that OSGi is recognized. The even better news is that now there are requirements it is obviously clear that OSGi meets virtually all of them, and then some. The bad news is that despite OSGi meeting the requirements it does not look likely that Jigsaw will go away anytime soon.
As this round will likely get me another 15 minutes of fame, let me use this fame to try to explain why I consider the Jigsaw dependency model harmful to the Java eco-system.
Jigsaw's dependency model must be based on Maven concepts according to requirement #1. This model is the same as Require-Bundle in OSGi. It is based on a module requiring other modules, where modules are deployment artifacts like JARs. To deploy, you transitively traverse all their dependencies and put them in the module system to run. That is, a module for Jigsaw is a deployment module, not a language module.
It is a popular solution because it is so easy to understand. However, Einstein already warned us: "A solution should be as simple as possible, but not simpler." The problem of managing today's code base for large applications is complex and all of us in the Java eco-system will be harmed with a simplistic solution.
The key problem is that if you depend on one part of a JAR you inherit all its transitive dependencies. For example, take the JAR Spring Aspects JAR. Even if my code only uses the org.springframework.beans.factory.aspectj package in this JAR, the Spring Aspects JAR will unnecessarily provide me with on dao, orm, jpa, as well as the transaction package. If I deploy my application I am forced to drag in the dependencies of these packages, which drag in other JARs, which each have additional dependencies, etc. etc. During deployment tens (if not hundreds) of JARs are dragged in even though those packages can never be reached from my code. Not only is this wasteful in memory, time, and complexity, it also inevitably causes problems in compatibility. For example, if the jpa dependency and the orm dependency transitively depend on different versions of the log4j JAR I cannot even run my code for no real reason whatsoever. In practice this problem is often ignored, for example maven will take the first log4j version it encounters, regardless if this is a suitable version. The value of type safety is questionable if that kind of fudging is acceptable.
Maven users are already all too familiar with this problem: many are complaining that maven downloads the Internet. This is not bashing maven, maven only does what it is told to do in the poms. In Tim O'Brien's blog he explains that Maven does not download the Internet, People download the Internet. Though he rightly blames the poms, he ignores the fact that the dependency model makes it hard to work differently. It is like you want a candy bar but you can only get the whole supermarket. He advises each JAR to be self contained and completely modularized to not drag in extraneous dependencies. Agree. Unfortunately this means you get lots and lots of JARs. And with lots of JARs on your module path without side by side versioning the compatibility issues like the above log4j example explode. And Jigsaw is not likely to have simultaneous multiple versions in run-time when critically reading the requirement.
In Unix dependency managers the transitive aspect is not such a big deal because dependencies are relatively coarse grained and there is a reason why you usually have to build stuff to make it compatible with your platform. Java is (thank God) different. Java dependencies are not only much more fine grained, they are binary portable to any platform, most important of all, often they have no dependency on an implementation but use a dependency on a contract instead. Unix was brilliant 40 years ago and is still impressive today but let's not undervalue what Java brought us. We must understand the differences before we revert to a 40 year old technology from an operating system to solve the problems of a modern and unique programming environment like Java.
The key advantage that Java brings to the dependency landscape is that each class file encodes all its dependencies. Give me your class file and I will tell you what packages you depend on. For example, when I compile my JAX-RS class against the jersey JAR(s) you will find that the class file only depends on the javax.ws.rs package even though the Jersey file contains lots of implementation packages. However, in the Jigsaw world I must select a module JAR that provides that package. Best case is to pick the JSR 311 API JAR for compilation to ensure I do not accidentally pick up implementation classes. However, I do not want to make the life of the deployer hard by unnecessarily forcing a specific implementation in run-time. So what module should I pick to depend on?
Despite common believe, we do not need fidelity across all phases between the build time and run-time; we need to compile against API and run against implementations that are compatible with that API. A good Java module system must allow me to depend on an API module at compile time and select an appropriate implementation at deployment-time. Forcing the compile time dependency to be the same as the run-time dependency creates brittle systems, believe me. The problem can be solved by separating the concept of a deployment module (the JAR) and the language module.
We could define the concept of a language module in Java and allow the JAR to contain a number of these modules. Surprisingly (for some), Java already provides such a language module, they are called packages. Packages provide a unique name space, privacy, and grouping, all hallmarks of modularity. The JLS introduction even names them as similar to Modula's modules. Actually, module dependencies on packages are already encoded in the class files. The only thing we need to add to the existing system is versioning information on the package and record this version in the class file. During deploy-time a resolver can take a set of JAR's (which can be complete) and calculate what modules should be added depending on the actual environment. E.g. if it runs on a Mac it might deploy another set of additional JAR's then if it runs on the PC or Linux.
To manage real world dependencies we need this indirection to keep this manageable. From 2010 to 2017 the amount of software will double. If we want to have a fighting chance to keep up we must decouple compile time from deployment time dependencies and have a module system that can handle it.
It seems so obvious to build on existing concept in the Java language that it begs an explanation why the transitive dependency model on JARs is so popular. The only reason I can think of is that is easier for the module system providers to program. In the Jigsaw model traversing the dependencies is as easy as taking the module name + version, combining it with a repository URL, doing some string juggling and using the result as a URL to your module. Voila! Works with any web server or file system!
Package dependencies require an indirection that uses a mapping from package name to deployment artifact. The Jigsaw/Maven model is like having no phone number portability because the operators found a lookup too difficult. Or worse, it is like having a world without Google ...
So for the convenience of a handful of module system providers (in the case of Jigsaw just one because it is not a specification) we force the rest of us to suffer? Though people have learned to use dynamic class loading and not telling the truth in their poms to work around dependency problems inherent in the dependency model we should not encode this in my favorite language. Should we not try to address this problem when a new module system is started? Or better, use an already mature solution that does not suffer the impedance mismatch between compile-time and run-time?
At JavaOne 2007 Gilad Bracha, then Sun's man in charge of JSR 277, told us he had started JSR 294 because he did not trust the deployers to handle the language issues. At that particular time this remark puzzled me.
Peter Kriens
P.S. Opinions stated in this blog are mine alone and do not have to agree with the OSGi Alliance's position
As this round will likely get me another 15 minutes of fame, let me use this fame to try to explain why I consider the Jigsaw dependency model harmful to the Java eco-system.
Jigsaw's dependency model must be based on Maven concepts according to requirement #1. This model is the same as Require-Bundle in OSGi. It is based on a module requiring other modules, where modules are deployment artifacts like JARs. To deploy, you transitively traverse all their dependencies and put them in the module system to run. That is, a module for Jigsaw is a deployment module, not a language module.
It is a popular solution because it is so easy to understand. However, Einstein already warned us: "A solution should be as simple as possible, but not simpler." The problem of managing today's code base for large applications is complex and all of us in the Java eco-system will be harmed with a simplistic solution.
The key problem is that if you depend on one part of a JAR you inherit all its transitive dependencies. For example, take the JAR Spring Aspects JAR. Even if my code only uses the org.springframework.beans.factory.aspectj package in this JAR, the Spring Aspects JAR will unnecessarily provide me with on dao, orm, jpa, as well as the transaction package. If I deploy my application I am forced to drag in the dependencies of these packages, which drag in other JARs, which each have additional dependencies, etc. etc. During deployment tens (if not hundreds) of JARs are dragged in even though those packages can never be reached from my code. Not only is this wasteful in memory, time, and complexity, it also inevitably causes problems in compatibility. For example, if the jpa dependency and the orm dependency transitively depend on different versions of the log4j JAR I cannot even run my code for no real reason whatsoever. In practice this problem is often ignored, for example maven will take the first log4j version it encounters, regardless if this is a suitable version. The value of type safety is questionable if that kind of fudging is acceptable.
Maven users are already all too familiar with this problem: many are complaining that maven downloads the Internet. This is not bashing maven, maven only does what it is told to do in the poms. In Tim O'Brien's blog he explains that Maven does not download the Internet, People download the Internet. Though he rightly blames the poms, he ignores the fact that the dependency model makes it hard to work differently. It is like you want a candy bar but you can only get the whole supermarket. He advises each JAR to be self contained and completely modularized to not drag in extraneous dependencies. Agree. Unfortunately this means you get lots and lots of JARs. And with lots of JARs on your module path without side by side versioning the compatibility issues like the above log4j example explode. And Jigsaw is not likely to have simultaneous multiple versions in run-time when critically reading the requirement.
In Unix dependency managers the transitive aspect is not such a big deal because dependencies are relatively coarse grained and there is a reason why you usually have to build stuff to make it compatible with your platform. Java is (thank God) different. Java dependencies are not only much more fine grained, they are binary portable to any platform, most important of all, often they have no dependency on an implementation but use a dependency on a contract instead. Unix was brilliant 40 years ago and is still impressive today but let's not undervalue what Java brought us. We must understand the differences before we revert to a 40 year old technology from an operating system to solve the problems of a modern and unique programming environment like Java.
The key advantage that Java brings to the dependency landscape is that each class file encodes all its dependencies. Give me your class file and I will tell you what packages you depend on. For example, when I compile my JAX-RS class against the jersey JAR(s) you will find that the class file only depends on the javax.ws.rs package even though the Jersey file contains lots of implementation packages. However, in the Jigsaw world I must select a module JAR that provides that package. Best case is to pick the JSR 311 API JAR for compilation to ensure I do not accidentally pick up implementation classes. However, I do not want to make the life of the deployer hard by unnecessarily forcing a specific implementation in run-time. So what module should I pick to depend on?
Despite common believe, we do not need fidelity across all phases between the build time and run-time; we need to compile against API and run against implementations that are compatible with that API. A good Java module system must allow me to depend on an API module at compile time and select an appropriate implementation at deployment-time. Forcing the compile time dependency to be the same as the run-time dependency creates brittle systems, believe me. The problem can be solved by separating the concept of a deployment module (the JAR) and the language module.
We could define the concept of a language module in Java and allow the JAR to contain a number of these modules. Surprisingly (for some), Java already provides such a language module, they are called packages. Packages provide a unique name space, privacy, and grouping, all hallmarks of modularity. The JLS introduction even names them as similar to Modula's modules. Actually, module dependencies on packages are already encoded in the class files. The only thing we need to add to the existing system is versioning information on the package and record this version in the class file. During deploy-time a resolver can take a set of JAR's (which can be complete) and calculate what modules should be added depending on the actual environment. E.g. if it runs on a Mac it might deploy another set of additional JAR's then if it runs on the PC or Linux.
To manage real world dependencies we need this indirection to keep this manageable. From 2010 to 2017 the amount of software will double. If we want to have a fighting chance to keep up we must decouple compile time from deployment time dependencies and have a module system that can handle it.
It seems so obvious to build on existing concept in the Java language that it begs an explanation why the transitive dependency model on JARs is so popular. The only reason I can think of is that is easier for the module system providers to program. In the Jigsaw model traversing the dependencies is as easy as taking the module name + version, combining it with a repository URL, doing some string juggling and using the result as a URL to your module. Voila! Works with any web server or file system!
Package dependencies require an indirection that uses a mapping from package name to deployment artifact. The Jigsaw/Maven model is like having no phone number portability because the operators found a lookup too difficult. Or worse, it is like having a world without Google ...
So for the convenience of a handful of module system providers (in the case of Jigsaw just one because it is not a specification) we force the rest of us to suffer? Though people have learned to use dynamic class loading and not telling the truth in their poms to work around dependency problems inherent in the dependency model we should not encode this in my favorite language. Should we not try to address this problem when a new module system is started? Or better, use an already mature solution that does not suffer the impedance mismatch between compile-time and run-time?
At JavaOne 2007 Gilad Bracha, then Sun's man in charge of JSR 277, told us he had started JSR 294 because he did not trust the deployers to handle the language issues. At that particular time this remark puzzled me.
Peter Kriens
P.S. Opinions stated in this blog are mine alone and do not have to agree with the OSGi Alliance's position
Tuesday, May 24, 2011
What If OSGi Ran Your Favorite Language?
If you look at the OSGi API then you see that the class loader modularity is strongly tied to Java. I actually do not know any other language that provides a way to intercept class loading in the flexible way that Java allows this to be done. It is also clear that most other languages out there are not type safe, something the OSGi has been fanatic to maintain and support.
However, the most important part of the OSGi programming model has always been the µservices, an Inter Module Communication (IMC) mechanism, that has virtually no overhead but is a surprisingly powerful design primitive; µservices encode so many important design principles.
OSGi µServices are also the cornerstone of Distributed OSGi. As so much code is not written in Java, how could we provide the service model to other languages? Obviously JVM languages can usually use the Java API but that is likely not optimal for a language, you like to take advantage of the unique possibilities each language provides.
Now, I could sit down and dust off my Smalltalk and write a service registry with the OSGi semantics for Smalltalk. With a bit of effort I could probably do it for C(++), maybe even Basic? However, it is clear that most language experts do not live in my house. So the question to you is, how would you make the µservice model look in another language while remaining compatible to the OSGi semantics?
Write a blog or google document how your µservice API would look in your favorite language and place a comment on this blog with a link. I am really curious how those APIs would look like. Give me your Python, Ruby, Haskell, Scheme, Clojure, Smalltalk, Javascript, ActionScript, Scala, Cobol, Perl, PHP, .NET, Beta, ML, Modula, Pascal, Fortran, Forth, Erlang, Postscript, Bash, and any of those other languages you like!
Peter Kriens
P.S. This is something I am highly interested in which does not necessarily imply that the OSGi Alliance is interested in pursuing this.
Friday, May 20, 2011
What You Should Know about Class Loaders
The more (open) source code I see the more I realize that so many developers do not understand the implications of class loaders and just try different things until it seems to work. Not that the Class Loader API is that hard but the implications of Class loaders are often not understood. In a modular environment, class loader code wreaks havoc.
Unfortunately, Class Loaders have become popular as the Java extension mechanism. Developers love the idea that you can take a String and turn it into a Class, there are so many hacks you can do with this! However, what is often not realized that in a modular world this just cannot work. The implicit assumption is that each Class object can be identified with a unique name. It should not take much more than 2 seconds to realize that in a modular world we also require the name of the module because the Class name is no longer unique. That is, class unique.X can actually occur in multiple modules. So given the string "unique.X" there is no way to identify from which module to load this class. In a modular world there are multiple class name spaces!
What is the advantage of having multiple class name spaces? Well, it solves a huge problem in larger applications that are built from many parts that come from different sources. Think open source. Take an example of logging. Logging is so popular that everybody does it, unfortunately with different libraries. In practice, it becomes hard to ensure that all your dependencies use the same logging library. In a modular world you can connect each library to its correct logging module. In a non modular world, the first one wins because libraries are placed on a linear search path; you often have no clue who won until you get an error. Obviously this problem is not limited to logging. It is actually kind of flabbergasting that a weak mechanism like the class path is acceptable for professional applications.
So why are developers using strings so often to get a class? Well, the core reason is that Java does not have an extension mechanism. No, the Service Loader is not an extension mechanism, it is encoding of a bad convention leaking its implementation out of all its methods. Extensions through class loading are highly popular, very often in Factory patterns. Sadly, extensions are exactly the place where you want to cross module boundaries. What is the use of an extension that happens to be in your own module? So these strings often contain the name of an implementation class that implements a collaboration interface. The three most important rules of modularity are: Hide, Hide, and Hide. Do you really want to expose the implementation class from a module? Isn't that the antithesis of modularity?
In spite of these problems, the pattern to use class loaders for extension mechanisms has become highly popular. Today we even have generations that actually do not realize how bad it is to use global variables (class names) to access context through static variables. My programming teacher in 1979 would have set me back a class when I would have proposed such a solution. However, as you know, you go to war with the army you have, not the army you might want or wish to have at a later time. So the class loading hack has become highly popular in lieu of an alternative. So popular that it is used in places where it is not even needed. Worse, it will bite you whenever you move your code base to a modular world.
So here are number of rules/hints you can use to prepare your software for a modular world (and simplify your life at the same time).
Peter Kriens
Unfortunately, Class Loaders have become popular as the Java extension mechanism. Developers love the idea that you can take a String and turn it into a Class, there are so many hacks you can do with this! However, what is often not realized that in a modular world this just cannot work. The implicit assumption is that each Class object can be identified with a unique name. It should not take much more than 2 seconds to realize that in a modular world we also require the name of the module because the Class name is no longer unique. That is, class unique.X can actually occur in multiple modules. So given the string "unique.X" there is no way to identify from which module to load this class. In a modular world there are multiple class name spaces!
What is the advantage of having multiple class name spaces? Well, it solves a huge problem in larger applications that are built from many parts that come from different sources. Think open source. Take an example of logging. Logging is so popular that everybody does it, unfortunately with different libraries. In practice, it becomes hard to ensure that all your dependencies use the same logging library. In a modular world you can connect each library to its correct logging module. In a non modular world, the first one wins because libraries are placed on a linear search path; you often have no clue who won until you get an error. Obviously this problem is not limited to logging. It is actually kind of flabbergasting that a weak mechanism like the class path is acceptable for professional applications.
So why are developers using strings so often to get a class? Well, the core reason is that Java does not have an extension mechanism. No, the Service Loader is not an extension mechanism, it is encoding of a bad convention leaking its implementation out of all its methods. Extensions through class loading are highly popular, very often in Factory patterns. Sadly, extensions are exactly the place where you want to cross module boundaries. What is the use of an extension that happens to be in your own module? So these strings often contain the name of an implementation class that implements a collaboration interface. The three most important rules of modularity are: Hide, Hide, and Hide. Do you really want to expose the implementation class from a module? Isn't that the antithesis of modularity?
In spite of these problems, the pattern to use class loaders for extension mechanisms has become highly popular. Today we even have generations that actually do not realize how bad it is to use global variables (class names) to access context through static variables. My programming teacher in 1979 would have set me back a class when I would have proposed such a solution. However, as you know, you go to war with the army you have, not the army you might want or wish to have at a later time. So the class loading hack has become highly popular in lieu of an alternative. So popular that it is used in places where it is not even needed. Worse, it will bite you whenever you move your code base to a modular world.
So here are number of rules/hints you can use to prepare your software for a modular world (and simplify your life at the same time).
- Do not use class loaders - If you feel the urge to use a class loader think again. And again. Class loaders are highly complex beasts and your view of the world might not match the world your code is actually used in.
- Do not use class loaders yet - Only for the experts.
- Extensions - If you have to connect to other modules use a Inter Module Communication (IMC) mechanism. A good IMC provides access to instances and not classes. If you do not have one, just make one yourself it is that not much code. Even better, use the OSGi service registry. Best is a real OSGi framework but if you're not yet ready for that step use something like PojoSR from Karl Pauls.
- Class.forName is Evil - Class.forName will pin classes in memory forever (almost, but long enough to cause problems). If you're forced to do dynamic class loading use ClassLoader.loadClass instead. All variations of the Class.forName suffer from the same problem. See BJ Hargrave's blog about this.
- Understand the Context - I've just debugged GWT code that was unnecessarily mucking around with class loaders when it needs to deserialize the request. The first token was a GWT class but the remainder of the request was the interface and parameters related to the delegate (the object implementing the RPC method.) However, for some reason they used the Thread Context Class Loader (you're in deep s**t if that one becomes necessary), and Class.forName. Completely unnecessary because they had access to the delegate object and thus its class loader. The class loader of this object must have access the RPC interface as well as all its parameters. There is a similar situation for Hibernate. You can name the classes in an XML file or you can give Hibernate the class objects for the entity classes. If you give it the class objects Hibernate works fine in a modular world. Give it XML and it chokes because in a modular world there is no 1:1 relation ship to class name and Class object. Reasoning about the context in which a class is loaded can prevent problems.
- Centralize - If you really need class loaders centralize and isolate your code in a different package. Make sure your class loading strategies are not visible in your application code. Module systems will provide an API that allows you to correctly load classes from the appropriate module. So when you move to a module system you only have to port one piece of code and not change your whole code base.
- Never assume that a class name identifies one class - The relation between class name and Class object is 1:n, not 1:1. You might desperately want it to be true but in a modular world this does not make it true.
- DynamicImport-Package and buddy class loading are not an alternative - If you browse the web you find lots of well meaning people pointing to these "solutions". Unfortunately, these "solutions" move you right back all the problems of the class path.
Peter Kriens
Subscribe to:
Posts (Atom)