Monday, May 18, 2009

Processes?

The last few weeks have been quite hectic, closing the Core specifications and working hard on the Compendium. The Compendium contains two major new specifications: Remote Services and Blueprint. The process of creating these specifications seems to have surprised many, and angered some. There is even a blog that makes us look like fools who are just making up a process as we go along. Maybe that is understandable: we are trying to run the specification process on a basis of trust and consensus, and that goes better in an informal atmosphere. People who trust each other simply work more effectively and are more productive.

However, we do have a formal process, documented in RFC 75, and we try to follow it meticulously. This process document has two purposes. First, it defines for newcomers the model of how the work will be executed. Second, it provides clear rules for the worst case, when consensus cannot be reached. Unfortunately, reading it is of course just one of the hundreds of things people have to do, and most assumed that the OSGi process was like that of most other standards organizations. They never read the process document, never really understood the early presentations about the process, and never read one of Eric Newcomer's blogs.

The (seemingly) unique aspect of the OSGi is that it recognizes the role of a Specification Document Editor (SDE). The SDE is paid by the OSGi; it is not an employee of a member company. So far, I have had the honor of playing this role for releases 2, 3, and 4.

This makes the RFCs what their abbreviation says they are: Requests For Comments. They are not the final specification. RFCs are input to the specification writing process, documents used to reach consensus about a technical design between disparate parties. They can be compared to a design document. And don't we all know very well how the final product changes during development? The vote for the RFC indicates a technical agreement among the EG members; it is not a vote for the final specification. And frankly, I am a bit disappointed that there is confusion, because in my eyes the OSGi specification documents look very different from the RFCs ...

So let me quote the applicable chapter that describes the current phase (TSC is the technical steering committee that consists of the EG chairs and the Technical Director (me)):

5.6.1 Input
The input to this process shall be the RFC(s) and RFP(s) and any supporting documentation from the EG. This shall be provided to the SDE by the TSC. It is possible that a single Specification may incorporate more than a single RFC and RFP. This integration shall be at the direction of the TSC.

5.6.2 Actions
The SDE shall create the appropriate documents based on the content of one or more RFCs. Under ideal circumstances the formulation of the Specification would be a mechanical process but it is expected that the SDE will uncover inconsistencies or other issues in the RFC(s) which require clarification by the appropriate EG. In this case the SDE shall liaise with the appropriate EGs directly to resolve the issue. The SDE shall have at least one and preferably several review cycles with the appropriate EGs to ensure accuracy prior to completion of the Specification.

5.6.3 Output
When a Specification has been completed it shall be electronically signed using the OSGi Alliance certificate and then voted upon by the EG as stated in section 5.5.7.2. If the document is rejected then it is returned to the SDE together with a written explanation as to the problems with it. The SDE, in conjunction with the EG, shall then modify the document as needed to address the EGs concerns before the document re-enters the formal process.

I think this phase of the process is in large part the reason that our specs have so few errata. The process of taking documents from an EG and explaining the contents in a consistent way tends to uncover a lot of issues: inconsistencies with other parts of the OSGi specs, hidden compromises that do not make sense when looking at things as a whole, hidden assumptions and knowledge, overlaps with other parts of the spec, etc. However good the RFC editors are, it is hard to create a good technical design and at the same time understand and take into consideration the overall context in which it will be placed. RFC editing is a secondary responsibility for them, while the SDE does this as a primary responsibility. Then again, the SDE has no power whatsoever: any change, as well as the final specification, must be approved by the EG.
I do not think any RFC has come through this Specification Writing phase unscathed. However, nobody has ever denied that the specification was better than the input RFC(s) due to this phase, well so far at least.

The second, related frustration of last week was a bug report in Eclipse complaining about my reading schedule because there are API changes in an RFC that they based their product on. For about two years, we have addressed public concerns that we were too closed by publishing interim drafts of the RFCs as well as of the specifications. Obviously, we made it crystal clear that there are no guarantees about the final specification. Worse, we have had virtually no feedback on the RFCs so far, and now we are being banged on the head for fixing issues that come up late during specification writing. It ain't over 'til the fat lady sings ...

That said, I am not denying we have a problem. Though the core went out on the planned date, the compendium is 4 weeks delayed, and it is not completely clear that Remote Services and Blueprint will be finished in that time frame. Part of the problem is that the OSGi Alliance has only one SDE, and that person (me) is only being paid part-time to work on it. The EGs have been very active lately and this has significantly increased the workload. We need to fix this somehow. However, I do think we should keep the SDE role for the sake of the final quality.

However, in my opinion, a specification is not cost free for a community; a bad specification can actually be quite expensive. I will not use any names here. I pride myself on working for an organization that actually wants to publish high quality specifications and is willing to pay the (sometimes) steep price. Tim Diekmann's (co-chair of the Enterprise EG) mail signature is:
"There is never enough time to do it right, but there is always enough time to do it over" -- Murphy's Law
Well, so far, we actually always tried to do it right because specs cannot be done over. Even if it sometimes really hurts.

Peter Kriens

P.S. Processes rate slightly above licenses on my list of favorite subjects ... Back to RFC 119 and RFC 124.

3 comments:

  1. Peter,

    As the author of the bug report that complains about your reading schedule, I should say: That was not intended to implicitly blame you personally for what's apparently happened WRT RFC 119...and I do understand overworked and underpaid WRT your own resources (believe me). My apologies if it reads as being insensitive or a personal attack (as it apparently does).

    But I do think that there are major problems with any standards process (and the org/EE) that allows stuff like this to happen. I also think that when this happens, it damages everyone...as it reduces the likelihood that RFC119 specifically and OSGi standardization generally will be used and accepted. And although I agree correctness is important for standardization, I believe it's also important to meet 'customer' needs for standardization in a timely manner...as the standards world is sadly littered with examples of unused and irrelevant 'perfect' standards.

    It's great to see that you admit things need to be different but I would encourage you (and perhaps more importantly the OSGi membership) to do more than remind everyone that an RFC isn't a finalized spec. This isn't a valid defense of what's happened, IMHO...because although strictly correct, it violates valid and common assumptions about how 'normal' enterprise sw standards efforts operate. In other words, *everyone* is surprised in a negative way.

    If you or others need resources/help to complete/finalize a spec without this happening, then it seems to me it's incumbent upon you to make that clear to the OSGi Alliance membership, and then incumbent on the membership to provide that help.

  2. If somebody looks at an OSGi 4.2 spec 2 years from now, I'd like to acknowledge that I worked on it. Explaining that we punted on issues because of a deadline makes you look with egg on your face when everybody has long forgotten that crucial deadline.

    Anyway, though it is bad that we could not make Eclipse's deadline (we really tried hard and we made most), the actual changes we talk about are hardly architectural and it would surprise me if they could not be adopted with a few hours of refactoring and search/replace. I really think that this is a small price to pay for consistency, correctness, and in this case clarity.

    Then again, maybe I am becoming too old fashioned to do this work, believing that the quality is in the end what defines success.

    Kind regards,

    Peter Kriens

  3. quote
    If somebody looks at an OSGi 4.2 spec 2 years from now, I'd like to acknowledge that I worked on it. Explaining that we punted on issues because of a deadline makes you look with egg on your face when everybody has long forgotten that crucial deadline.

    scott
    I disagree with your characterization of the situation as 'punting on issues'. I don't think everyone agrees with you that the issues are crucial.

    quote
    Anyway, though it is bad that we could not make Eclipse's deadline (we really tried hard and we made most), the actual changes we talk about are hardly architectural and it would surprise me if they could not be adopted with a few hours of refactoring and search/replace.

    scott
    The amount of work involved in the refactoring isn't the point. The point is that we now can't give API guarantees with our/Galileo major release...not even 'provisional' API...and so, we've had to mark all OSGi API as both x:-internal/provisional AND deprecated (see bug and comments).

    quote
    Then again, maybe I am becoming too old fashioned to do this work, believing that the quality is in the end what defines success.

    scott
    Although that's a viewpoint that I generally agree with, I think when it comes to sw standards it's perhaps a little naive (with apologies). And, I'm sure you are familiar with this quote:

    "The perfect is the enemy of the good". - Voltaire

    I also reject the implicit assertion that this represents a pure choice between quality and timeliness. I think it's likely that people (other than me) disagree with you that RFC 119 in its current form is of low quality.

    And one further point: if RFC119 IS of sufficiently low quality to justify relatively large changes at this point, why wasn't this discovered and discussed/fixed before the EE vote in Dec 2008?

    btw, sorry about the cumbersome quote/scott above, but this comment editor isn't the greatest.
