Wednesday, December 11, 2013

Attributes, attributes, and attributes

Many thanks to the people who sent me some persistence files. It was a bit humbling that I got fewer than 10 replies, but those good people are highly appreciated! If you feel guilty now, don't hesitate to still send some files to my personal email!

Conclusion? Well, there is quite a variety out there, with lots of experimentation and lots of struggling going on. Let me reflect on the most striking issue I noticed: meta-data.

Persistence means your data is going to make a trip in a time capsule. Java was designed as an in-memory, in-process programming language, and as a consequence you need to provide some extra information about how that trip should be made: which classes are persisted, where they are persisted, with what data types, under what name, etc. The most common approach is to use the javax.persistence annotations to mark up a persistent object.
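To make the idea of annotation-based meta-data concrete, here is a minimal sketch. It does not use the real javax.persistence annotations; `@Stored` and `@StoredColumn` are hypothetical stand-ins declared locally so the example compiles without a JPA dependency, and `Customer` and `MappingReader` are illustrative names as well. What it shows is how a persistence layer could combine the annotation values with the Java type information it reads through reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Reads the mapping meta-data back from an annotated class.
class MappingReader {
    static List<String> describe(Class<?> type) {
        Stored stored = type.getAnnotation(Stored.class);
        if (stored == null) {
            throw new IllegalArgumentException(type.getName() + " is not marked @Stored");
        }
        List<String> lines = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            StoredColumn column = field.getAnnotation(StoredColumn.class);
            if (column == null) {
                continue; // fields without the annotation are not persisted
            }
            // The column name defaults to the Java field name, so it is not repeated.
            String name = column.name().isEmpty() ? field.getName() : column.name();
            lines.add(stored.table() + "." + name + " <- "
                    + field.getType().getSimpleName() + " " + field.getName());
        }
        // Reflection does not guarantee field order, so sort for stable output.
        Collections.sort(lines);
        return lines;
    }

    public static void main(String[] args) {
        describe(Customer.class).forEach(System.out::println);
    }
}

// Hypothetical stand-in for an entity-level persistence annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Stored {
    String table();
}

// Hypothetical stand-in for a column-level persistence annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface StoredColumn {
    String name() default ""; // empty means: use the Java field name
    int length() default 255;
}

// Illustrative persistent object.
@Stored(table = "CUSTOMER")
class Customer {
    @StoredColumn(length = 80)
    String name;

    @StoredColumn(name = "BIRTH_YEAR")
    int birthYear;

    String notPersisted; // no annotation, so it is skipped
}
```

Note that the field type and, by default, the field name come from the Java declaration itself; only the information Java cannot express is added in the annotation.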

For smaller projects this seems to work well, but I noticed that the larger projects went outside Java and escaped to text files. I have seen Hibernate XML mapping files, proprietary XML files, and one project that uses a Domain Specific Language (DSL) they developed with the Eclipse Modeling Framework (EMF).

The advantages of annotations should be clear:

  1. Minimizes redundancy since they interact with the Java type information
  2. Can be safely refactored in all IDEs
  3. Can use small names because of Java's package scoping
  4. Type safe

However, one of the projects that used XML did not restrict itself to the persistence problem alone. The attributes of the persistent entities are not only used in Java for persistence; they are also reflected as forms in the user interfaces and must be validated before they are stored. This meta-data is used in the GUI (potentially in another language), the core code, and the database (SQL schema). Each of these domain attributes in turn has quite a large number of attributes of its own that must be defined by the developer. Using a proprietary file format or a DSL obviously provides ample space to capture this information. Generators can then be used to create the annotated data objects, SQL Data Definition Language (DDL) files, HTML forms, JavaScript validation, etc. This, of course, has the big disadvantage of requiring an extra step in the build.
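As a rough illustration of the single-definition idea, the sketch below defines each domain attribute once and derives a DDL statement, an HTML form, and a validation check from it. It is not taken from any of the projects I received: the class names (`Attribute`, `EntityModel`, `Generators`) and the tiny set of properties (name, maximum length, required) are assumptions made for this example, and the model is written directly in Java instead of being parsed from an XML file or a DSL.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical generators that all work from the same attribute definitions.
class Generators {
    // SQL DDL for the entity; every attribute becomes a VARCHAR column.
    static String toDdl(EntityModel entity) {
        StringBuilder sb = new StringBuilder("CREATE TABLE " + entity.name + " (");
        for (int i = 0; i < entity.attributes.size(); i++) {
            Attribute a = entity.attributes.get(i);
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(a.name).append(" VARCHAR(").append(a.maxLength).append(")");
            if (a.required) {
                sb.append(" NOT NULL");
            }
        }
        return sb.append(");").toString();
    }

    // HTML form with one text input per attribute.
    // A real generator would also have to escape the names.
    static String toHtmlForm(EntityModel entity) {
        StringBuilder sb = new StringBuilder("<form>");
        for (Attribute a : entity.attributes) {
            sb.append("<input type=\"text\" name=\"").append(a.name)
              .append("\" maxlength=\"").append(a.maxLength).append("\"");
            if (a.required) {
                sb.append(" required");
            }
            sb.append(">");
        }
        return sb.append("</form>").toString();
    }

    // Server-side validation driven by the same definition.
    static boolean isValid(Attribute a, String value) {
        if (value == null || value.isEmpty()) {
            return !a.required;
        }
        return value.length() <= a.maxLength;
    }

    public static void main(String[] args) {
        EntityModel person = new EntityModel("person", Arrays.asList(
                new Attribute("name", 80, true),
                new Attribute("nickname", 40, false)));
        System.out.println(toDdl(person));
        System.out.println(toHtmlForm(person));
    }
}

// One attribute of a domain entity, defined exactly once.
final class Attribute {
    final String name;
    final int maxLength;
    final boolean required;

    Attribute(String name, int maxLength, boolean required) {
        this.name = name;
        this.maxLength = maxLength;
        this.required = required;
    }
}

// A domain entity is a name plus its attribute definitions.
final class EntityModel {
    final String name;
    final List<Attribute> attributes;

    EntityModel(String name, List<Attribute> attributes) {
        this.name = name;
        this.attributes = attributes;
    }
}
```

Adding a property such as a label or a pattern means changing `Attribute` in one place and letting each generator decide what to do with it; the price is that the generators have to run as part of the build.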

I must admit that I find this model quite attractive. Most enterprise applications must manage a very large number of attributes in a large number of entities, and a single definition of the domain model can prevent a lot of redundancy. The surprise is of course that there is no dominant syntax for this information. The existing data description languages (ASN.1 or XML Schema anyone?) look utterly unusable for this purpose; these languages allow such complex data to be specified that generating Java, JavaScript, HTML, etc. will be awfully complicated.

So what do you think? Just annotations on the Java entity objects (potentially with extra validation annotations that can be used by the HTML5 GUI), or a special domain object specification file?

Peter Kriens @pkriens


  1. It was a bit unclear to me what you actually needed, so that is why I did not send anything. We are using annotations at the moment and have used a model with code generation in the past.

    My concern with JPA annotations and the current state of persistence engines is the black box. What I mean is this:

    There are many annotations that describe the relationship between files. This will do something that works and is likely optimized given the defined annotations. However, it might not be the ideal path for your performance or for your storage. My experience is that it takes a long time to understand all the subtle effects an annotation has on the database, and by the time you discover a performance penalty or a misplaced annotation you could be well into production, because it may only start to show when there are thousands of objects in the database.

    The current state of the JPA annotations is far too complex for the normal tech to grasp. The real effects on the database are not described. The annotations should be much clearer in their specification towards the effect on the database. Not only this but the persistence engine also has to support i-don't-know-how-many databases and each of them has its own special quirks that the persistence engine needs to know about. The result probably is a large and complex codebase that can never catch all corner cases.

    In my opinion we have to go back one step. Techies always have the urge to automate everything in their field just because it can be done. The result is that the average programmer has no clue what goes on under the covers and can therefore not be effective in issue resolution and optimization.

    The programmer is not aware of the database and therefore data access will not be optimized.

    When programming with JDBC and SQL directly the programmer has full awareness of the database. All queries are constructed manually and the programmer sees a direct effect of his action on the database. The database is constructed manually.

    I am not saying we have to go back to that stone age, but maybe our quest for simplicity has led to more complexity.

  2. @Peter
    Using some kind of mechanism to generate screens etc. only works for very specific applications. These are the same type of applications that are often implemented with 4GL tools (yes, they are horrible). In some frameworks, such as Grails and JBoss Forge, there is also scaffolding support that does this in a generic way.

    Although there certainly are applications where this works, it is NOT a generally applicable mechanism. I have seen many projects get into trouble by starting out like this and ending up maintaining their custom-built metadata/generation platform instead of doing any actual development.

    I have seen this many times as well: JPA being used as an excuse to not understand the database, followed by complaints about the performance problems that JPA causes.
    This shouldn't be solved by using bad abstractions like JDBC. Making someone think by taking tools away is not the right way to educate a team IMO. I believe it's the responsibility of team leads and architects to make this clear in code reviews and to make other developers aware of the problems.

    @Anyone ;-)
    A difficult architectural problem with relational databases in a modular system is data ownership. A service should be responsible for its own data, but this is tricky if there are all kinds of relations between tables. I'm obviously not saying that relational databases can't be used in a modular system, but there are some difficult design decisions to make.

  3. I wanted to post a reply yesterday but I was too busy. I am happy to see that others have already said exactly what I wanted to say.

    We tried to build a framework in the past that did the "magic". We read many articles on this topic. We worked a lot, and in the end we were really happy that we only needed to write a couple of lines of code to add another CRUD screen. However, the customer came with a new requirement that we had not thought of, and a simple change took days or weeks of development. I must say we failed.

    We tried it again. We read a lot of forums and articles on how our patterns could be even more flexible. We created a very complex solution that created screens with validation and persistence by just writing a couple of lines of code. Well, a customer came again and wanted just one new feature that we had not thought of. A four-hour job became weeks. We failed again.

    I do not believe in frameworks that wire all the layers together. I do not believe in solutions where "writing just a couple of lines of code and annotating" can satisfy a complex requirement. Such a solution can satisfy similar requirements well, but when a new idea comes along, the developer starts suffering from having to fit the new logic into the "magic" framework.

    My opinion about JPA is almost the same as the one @Wim and @Paul have. The problem is that JPA is supported by Java EE and the concept looks cool from the outside. This is the reason that it must be supported by OSGi and enRoute (sadly). This will not change until the OSGi community becomes stronger than the Java EE community, or one bigger company (like Oracle or JBoss) stands up and says that it will not continue supporting JPA (or JSF or other complex technologies).