Monday, June 4, 2007

OSGi and Hibernate

A long-time Dutch friend of mine, Petr van Blokland, and I are working on yet another web framework. I will not go into the reasons for YAWFW, except that his background is graphic design, which means aesthetics play a large role in this framework. However, this focus does not excuse us from having to handle mundane details like web requests, models, views, and databases. I am participating because it is a good way to get some enterprise systems experience. Obviously the whole infrastructure is based around OSGi.

To get the picture, here is a short sketch of the architecture. Incoming requests are dispatched by their Host header to a servlet that is registered as a service. This makes it possible to handle multiple sites in the same OSGi framework. We call this servlet the site servlet. The servlet we dispatch to can be anything, but so far we are using a handmade servlet that uses Groovy classes as pages. The first part of the path is the class name, the second part is the method name, and the remainder are parameters. That is, a request like http://www.acme.com/home/index is a call to the index() method in the home class.
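The path convention described above can be illustrated with a minimal sketch in plain Java. This is not the framework's servlet: the Home class, the dispatch helper, and the choice to pass parameters as a String array are all assumptions made for illustration, and a real page would be a Groovy class.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class PathDispatch {
    // Hypothetical page class; in the framework this would be a Groovy class.
    public static class Home {
        public String index(String... params) {
            return "home/index " + Arrays.toString(params);
        }
    }

    // Splits "/home/index/a/b" into page name, method name, and parameters,
    // then calls the method reflectively on a fresh page instance.
    public static String dispatch(String path, Class<?> pageClass) {
        String[] parts = path.replaceAll("^/+", "").split("/");
        if (parts.length < 2 || !pageClass.getSimpleName().equalsIgnoreCase(parts[0])) {
            throw new IllegalArgumentException("No page for " + path);
        }
        String[] params = Arrays.copyOfRange(parts, 2, parts.length);
        try {
            Method method = pageClass.getMethod(parts[1], String[].class);
            Object page = pageClass.getDeclaredConstructor().newInstance();
            return (String) method.invoke(page, (Object) params);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call " + path, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch("/home/index/a/b", Home.class));
    }
}
```

Because any public method with a matching signature becomes reachable from a URL, a scheme like this needs input validation before it is exposed to real traffic.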

The Groovy class is a normal class that does not have to extend or implement anything. When we create it, we assign it a special meta class which we call the Builder. It is a bit different from a Groovy builder, but tries to achieve the same goal: merge code and HTML. The Builder is set up by the site servlet. Each site can have its own specialized builder. The builder can provide methods to the page class, which makes it easy to provide high-level tags to the page. For example, you can make your own tag for a navigation bar.

Builders in Groovy are a marvelous concept: they merge the power of a full-blown language with the power of declarative programming à la HTML. They achieve this goal using closures. Closures are blocks of code that have access to their surrounding context but that can be manipulated as objects.

def x = { print "hello $it" }   // "it" is the implicit closure parameter
x("world")                      // Call the block with an argument

As a Smalltalker, the absence of closures in Java is one of the most painful experiences in my professional life. Anyway, closures allow you to parametrize functions. In Groovy, closures allow you to write code like:

def index() {
    // Named pageTitle so the local variable does not shadow the title tag
    def pageTitle = "Hello World"
    html {
        head {
            title { pageTitle }
        }
        body {
            h1 { pageTitle }
        }
    }
}

This is actually quite an interesting approach but it is not the purpose of this blog. We also needed a database for our web framework ... Groovy provides some very interesting SQL capabilities and I was really tempted. However, Groovy uses some weird trick to recompile the source code at run time to prevent one from having to write SQL statements (see findAll). A lofty goal, but in this case not worth the expense. The code generated weird errors when the source code was not available at run time. So the only alternative seemed to be Hibernate.

Hibernate is an Object Relational Mapper (ORM). It allows you to write simple Java objects that are persisted in a database. The objects are free of any markup, required super classes, or implemented interfaces. All the mapping information is detailed in XML files that are adjacent to the class files. That is, when Hibernate must map a class to a relational database, it uses the class name to create a resource path and loads that resource path as its mapping file. However, such a mapping file must be read before the database connection is created.
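As a small illustration of that naming convention, the sketch below derives the resource path from a class name and checks whether a given class loader can see it. The MappingLookup class and its method names are made up for this example; the ".hbm.xml" suffix is Hibernate's usual convention for per-class mapping files. The second method hints at why class loaders matter: the lookup only succeeds through a loader that can see the domain bundle's resources.

```java
import java.io.IOException;
import java.io.InputStream;

public class MappingLookup {
    // xierpa.impl.pw.User -> xierpa/impl/pw/User.hbm.xml
    public static String mappingResource(String className) {
        return className.replace('.', '/') + ".hbm.xml";
    }

    // True if the given class loader can see the mapping file for the class.
    public static boolean hasMapping(ClassLoader loader, String className) {
        try (InputStream in = loader.getResourceAsStream(mappingResource(className))) {
            return in != null;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(mappingResource("xierpa.impl.pw.User"));
        ClassLoader loader = MappingLookup.class.getClassLoader();
        System.out.println(hasMapping(loader, "xierpa.impl.pw.User"));
    }
}
```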

Hibernate manipulates the classpath, and programs like that usually do not work well together with OSGi based systems. The reason is that in many systems the class visibility between modules is more or less unrestricted. In OSGi frameworks, the classpath is well defined and restricted. This gives us a lot of good features but it also gives us pain when we want to use a library that has aspirations to become a classloader when it grows up.

We needed a database, so in the past few weeks I created a bundle that mediates between bundles with domain objects, connections to the database, and bundles that want to use a database. It was actually quite tricky to get it all to work and I am not sure I found the best solution. Let's take a look at what I did, and let me know what you think.



To work with Hibernate, you need a Session object. You can get a Session object from a SessionFactory. To get the SessionFactory, you need to create it with a Configuration object. The Configuration object is created from a configuration XML file. By default, Hibernate loads this from the root of your JAR file. However, you can add classes manually to the configuration if so desired.

This gave us a problem. We had several bundles that needed to use the database but could not a priori decide which bundles would be available at any given moment. They could all have used their own Hibernate Session Factory but that would significantly complicate the configuration management and it would cost performance.

This looked like a clear extender problem. Bundles should be able to declare the classes they contribute to a Hibernate session.

For this model I architected the following manifest header:

Hibernate-Contribution ::= default; \
classes="xierpa.impl.pw.User,xierpa.impl.pw.Role"

The name field in the header (default) is the name of the database contribution. The classes attribute contains a comma-separated list of classes. These classes are loaded from the bundle and added to the configuration that uses this contribution. Simple. Any bundle can now provide domain classes to Hibernate by just declaring this header.
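A rough sketch of how an extender might parse the value of such a header is shown below. It is not the actual extender code: the ContributionHeader class is hypothetical, and the parser is deliberately naive (it splits on ";" and would break on quoted values that contain a semicolon). In a real extender, each resulting class name would presumably then be loaded through the contributing bundle, for example with Bundle.loadClass.

```java
import java.util.ArrayList;
import java.util.List;

public class ContributionHeader {
    public final String name;          // logical contribution name, e.g. "default"
    public final List<String> classes; // fully qualified domain class names

    ContributionHeader(String name, List<String> classes) {
        this.name = name;
        this.classes = classes;
    }

    // Parses a header value such as: default; classes="a.B,c.D"
    public static ContributionHeader parse(String value) {
        String[] parts = value.split(";");
        String name = parts[0].trim();
        List<String> classes = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().equals("classes")) {
                // Strip the surrounding quotes, then split the list
                String list = kv[1].trim().replaceAll("^\"|\"$", "");
                for (String c : list.split(",")) {
                    if (!c.trim().isEmpty()) {
                        classes.add(c.trim());
                    }
                }
            }
        }
        return new ContributionHeader(name, classes);
    }

    public static void main(String[] args) {
        ContributionHeader h =
            parse("default; classes=\"xierpa.impl.pw.User,xierpa.impl.pw.Role\"");
        System.out.println(h.name + " -> " + h.classes);
    }
}
```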

However, where does the Session Factory Configuration come from? I decided to use Configuration Admin for this purpose. For each configuration, one can create a factory configuration. The properties for this configuration describe the connection parameters. The configuration was so large that I had to extend FileInstall to not only install bundles but also handle configuration files. The normal interaction with the command line was too basic.

So, the Hibernate extender receives the configurations and tracks started bundles. With this information it matches up the contributions from any bundles to configurations for Session Factories.

The bundle that wants to get a Hibernate SessionFactory object just gets the HibernateDomain service and requests a session any time it needs a database transaction. This session comes from a factory that is automatically refreshed whenever one of the contributions or the configuration parameters change.
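The refresh behaviour can be sketched without Hibernate or OSGi at all. In the hypothetical ContributionTracker below, a plain list of class names stands in for the SessionFactory, and bundleStarted/bundleStopped stand in for the extender's bundle tracking. The only point is the pattern: any change to the contributions invalidates the current snapshot, and the next request rebuilds it lazily.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ContributionTracker {
    // bundle id -> class names that bundle contributes
    private final Map<Long, List<String>> contributions = new TreeMap<>();
    private List<String> snapshot; // stands in for a built SessionFactory
    private int builds;            // how often the snapshot was rebuilt

    public synchronized void bundleStarted(long id, List<String> classes) {
        contributions.put(id, new ArrayList<>(classes));
        snapshot = null; // force a rebuild on next use
    }

    public synchronized void bundleStopped(long id) {
        if (contributions.remove(id) != null) {
            snapshot = null;
        }
    }

    // Returns the current snapshot, rebuilding it only after a change.
    public synchronized List<String> current() {
        if (snapshot == null) {
            List<String> all = new ArrayList<>();
            for (List<String> classes : contributions.values()) {
                all.addAll(classes);
            }
            snapshot = Collections.unmodifiableList(all);
            builds++;
        }
        return snapshot;
    }

    public synchronized int builds() {
        return builds;
    }

    public static void main(String[] args) {
        ContributionTracker tracker = new ContributionTracker();
        tracker.bundleStarted(1L, Arrays.asList("xierpa.impl.pw.User", "xierpa.impl.pw.Role"));
        System.out.println(tracker.current());
        tracker.bundleStopped(1L);
        System.out.println(tracker.current());
    }
}
```

A real implementation would also have to decide what happens to the old factory and to sessions that are still open when a rebuild occurs; this sketch ignores that.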

Overall this model turns out to work very well. It is easy for bundles to provide contributions and it is easy for bundles to use a fresh Hibernate Session, without having to track the configuration and contributions.

Of course, I also ran into one nasty class loader problem. Hibernate created a proxy using classes from the Hibernate bundle. Unfortunately, it asked the class loader of the domain objects for this Hibernate-specific class. "Oh, what tangled webs we weave when first we practice to classload ..."

Obviously I always use bnd, and bnd only inserts Import-Package statements for the packages that the code really uses. One of the great advantages of Hibernate is that it allows you to use objects not coupled to Hibernate at all. Obviously, bnd can therefore not insert any references.

The short-term solution to this problem was Require-Bundle, which is of course not a good solution, as readers of this blog can testify. It solves the problem in this case but it creates many other problems in the long run. In the next release of the OSGi specification we must find a solution to this problem. It must be possible to create uncluttered bundles that only import what they need but can still load classes that other bundles require them to load. This problem has already been raised in CPEG and it has high attention because the same pattern appears for other libraries as well.
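For illustration only, here is a sketch of a different technique from the Require-Bundle workaround described above: a bridging class loader that first asks one loader (say, the domain bundle's) and falls back to a second (say, the library's). The BridgeClassLoader name and the canLoad helper are hypothetical, and whether a library can be persuaded to use such a loader for its proxies depends on the library and version, which is not verified here.

```java
public class BridgeClassLoader extends ClassLoader {
    private final ClassLoader fallback;

    // primary is consulted first (as the parent); fallback only on a miss.
    public BridgeClassLoader(ClassLoader primary, ClassLoader fallback) {
        super(primary);
        this.fallback = fallback;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Reached only when the primary loader could not find the class
        return fallback.loadClass(name);
    }

    // Helper for the demo: can this loader resolve the named class?
    public static boolean canLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader app = BridgeClassLoader.class.getClassLoader();
        // A loader with no parent sees only the JDK's bootstrap classes
        ClassLoader isolated = new ClassLoader(null) {};
        String name = BridgeClassLoader.class.getName();
        System.out.println(canLoad(isolated, name));
        System.out.println(canLoad(new BridgeClassLoader(isolated, app), name));
    }
}
```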

I am very interested in how other people solve the problem of using libraries like Hibernate in an OSGi Service Platform. Please provide feedback.

Peter Kriens

10 comments:

  1. Hi Peter,
    Thanks for articulating your design so clearly. Always interesting to see what others are doing with the technology. Some points to raise:

    1. Security would concern me about the possibility of invoking any Groovy method from a URI typed in a web browser. Large scale systems developed by many people will likely have some code (that manipulates the DB) that I would not like people to be able to remotely invoke so easily without at least being able to validate the input (arguments to the method) first.

    2. Your HibernateService should support getting the current session instead of getting "fresh" sessions. This allows code to be written with no knowledge of the transaction logic. See:
    http://blog.hibernate.org/cgi-bin/blosxom.cgi/2005/09/30/
    This would require providing a Hibernate CurrentSessionContext implementation that uses your HibernateService.

    2.5 It was not clear from your description if multiple database contributions could be specified in the manifest. If not then that seems limiting. If so then the database contribution name has to be hardcoded as well in some code. The database contribution name smells to me like the name of a Spring bean that provides a SessionFactory. Why not just use a Spring 2.0 namespace extension? Perhaps Spring is too heavy-weight on dependencies? Costin from the Spring OSGi newsgroup linked to this blog but I am interested in hearing why you are not integrating with Spring.

    3. Bundles defining database contributions in MANIFEST.MF, while flexible, seems suited only for the scenario where the DB is treated like "some place for applications to store stuff". In many large businesses, the DB is the most vital application they run. Allowing any deployer of an OSGi bundle to the runtime to declare classes (or heaven forbid update the schema) seems irrelevant. The SessionFactories are managed for the entire OSGi runtime independent of what the applications want (see J2EE). Likely JMX would be easier to use than deploying OSGi bundles with metadata. I would take the Spring approach on this - the deployed application simply depends on a SessionFactory. It shouldn't be in the business of declaring the SessionFactory at runtime.

    4. I would be interested to know your thoughts on why
    DynamicImport-Package: *
    is not ideal for solving your classloading problem. It is a bit slower for resolving classes but Hibernate proxy class creation happens so infrequently (only at SessionFactory creation time?) that it can't matter that much (or am I wrong about that)?

    Thanks for your thoughts on this subject...
    -mike

  2. 1. This is a standard security issue and the language is irrelevant. Input validation and escaping are crucial in all cases.

    2. Fresh session. Well, what I meant is that it gets the "current" fresh session. You actually get a fresh SessionFactory when the underlying configuration or contributions are changed. Obviously, all participants in the transaction must share the same session or things will horribly fail. It was one of the design goals of this bundle to share the same session without having ahead-of-time knowledge of all the possible combinations.

    2.5 I started hardcoding the configuration name, figuring that in most cases there would only be one. However, my friend made it clear each site has its own db. I therefore do an indirection. Bundles contribute to a logical name and the database "uses" that logical name. This is quite flexible and prevents hardcoded configuration names.

    I do not use Spring because I want to avoid a God file that has to know all. The purpose of the exercise is let the available bundles define the configuration instead of a central configuration file. That is, if bundle A is present, it can participate, if not, it can not, without changing any configuration data.

    3. I think the configuration defines all the issues you worry about. The bundles with domain objects only provide contributions to a logical name. It is up to the configuration to name them. Obviously for an industrial-strength implementation security permissions must be added. However, if you do not know what bundles are running in your system, you're in deep trouble anyway, I think :-)

    Richard Hall once said: "The set of installed bundles is your configuration." This has always been very true for OSGi based systems, it was therefore a bit of a surprise to see enterprise people thinking more in traditional one-process applications with a central control. So far, I really like this model that the system adapts to what I install.

    4. DynamicImport-Package is a last resort attempt. We spent an incredible effort to get rid of many classloading problems caused by a linear classpath. DynamicImport-Package is back to this class loading hell when used on a large scale. We added it because it is sometimes the only (sad) solution but when used on a larger scale you kill all the advantages of modularity. For example, DynamicImport-Package can not handle versions in any reasonable way. (I know you can add the attribute, but if you know the version, why shouldn't you know the class?)

    Kind regards,

    Peter Kriens

  3. Hi Peter,
    I'm glad smart people like you are trying to figure out how to get libraries like Hibernate to work with OSGi. A similar tool is Oracle Toplink which may become a part of an Eclipse project named EclipseLink. They wish to build on top of OSGi. You can find out more here:
    http://www.eclipse.org/proposals/eclipselink/

    cheers,
    Cameron

    1. Hi Cameron,

      EclipseLink is good but it is tied to the Equinox implementation. For example, to load model classes from different bundles I have to use org.eclipse.osgi.internal.composite.CompositeClassLoader, which is inside the framework implementation (org.eclipse.osgi)!

      I don't know of alternative solutions yet. Do you have any advice for this?

      Kind regards,
      Thanh Le

  4. Hi Peter,
    Thanks for an excellent article.

    In your comment you said,
    2. Fresh session. Well, what I meant it gets the "current" fresh session. You actually get a fresh SessionFactory when the underlying configuration or contributions are changed.

    Since SessionFactory objects are immutable, you must be building a new one. How did you ensure the existing SessionFactory was properly disposed of (i.e. closed) and that all of the Sessions it had created were not still in use?

    Regards,
    Adam

  5. I think it is no problem that existing sessions continue to be used. If they could use the session a microsecond before, they can use it a microsecond after the configuration changed. Hibernate specifically indicates that sessions are relatively cheap entities that should be used on the level of a web request or other high level "request" concepts.

    This of course leaves the problem of garbage collecting the SessionFactory ... I leave that as a convenient secret :-), which is another way of saying I have not addressed this in the prototype.

    Kind regards,

    Peter Kriens

  6. Yes, it's the issue of the garbage collection of the SessionFactory that I'm raising. The Javadoc for SessionFactory.close() states:


    Destroy this SessionFactory and release all resources (caches, connection pools, etc). It is the responsibility of the application to ensure that there are no open Sessions before calling close().


    Hence, calling close() could interfere with transactions in progress on the existing Sessions. Not calling close() could lead to leaked resources...I think! :)

    I've been looking into this same sort of design, and I'm not sure how well suited Hibernate is in its current incarnation.

    Regards,
    Adam

  7. The easiest solution is to return a proxy to the Hibernate SessionFactory and then count the number of sessions you hand out and get back. It is a solvable problem I think.

    Kind regards,

    Peter Kriens

  8. Hi Peter,

    I have been working with Apache Felix + Apache Cayenne for the past few months. Regarding your various concerns, I am certain that Apache Cayenne should serve your purpose.

    Otherwise, GPL db4o will work fine too.

    Regards,
    James Yong

  9. Hi Peter and others,

    I have implemented a sample OSGi/Hibernate/Spring DM/Spring project loosely based on the pattern described above. I'm not using the extender model yet, but the Hibernate SessionFactory is dynamically updated as bundles are started and stopped.

    The project is described here:

    http://code.google.com/p/voluble/wiki/OsgiHibernateSpringSpringDMSample

    The linked page includes instructions for getting and running the code. If anyone gets a chance to look at it, I would love to get some feedback while the example gets built out a bit further!
