This is the first installment of the enRoute project's principles, as I announced last week. In the principles section I am trying to lay down first principles. This first section is about what I believe to be the root of much of software's complexity: time. The purpose of sharing this text is to get feedback so that we can arrive at a common set of principles shared by many people. I am doing this in installments since I am told that my blogs are already too long. The whole text can be found at GitHub, where you can also report any ideas, issues, and proposals.
Enjoy.
Time
Trying to explain our industry to lay people is hard. It is hard because what we software developers are doing has remarkably little to do with the concrete world; cyberspace is truly a different realm. We use words like build, object, and framework that are defined in a concrete world but have much more ephemeral semantics in our virtual world. You build a house from concrete, stones, and wood, a far cry from flipping bits on a hard disk in what we call the build process. Objects are, yeah, what are objects actually? And where a real framework is touchable, our frameworks are intangible. No wonder that many of our partners are at a loss when we try to describe what it is that we do. We tend to utterly confuse them with these inadequate metaphors.
Of all this, the hardest aspect of our work to explain is the volatility. The baker bakes bread, and the bricklayer builds buildings. They deliver a concrete result to their customers and the next day they bake or build something brand new, unrelated to yesterday's work. Software engineers 'build' their 'software' several times a day, but they seem to deliver largely the same thing over and over to their customers. We seem to be working on something that is continuously evolving but is still called the same. The closest metaphor is maybe a city. A city is a continuously evolving entity that never stands still, yet we continue to give it the same name. Julius Caesar would not recognize Rome today, yet it is still the same city that he once knew.
It is interesting to see how we lack proper terminology in our industry. In Maven we talk about an artifact, but it is not clear whether that refers to the bits on disk (the JAR file), the project that builds it, or maybe even something else. In this document we use program for the combination of groupId and artifactId in Maven, and revision when a specific version is added. The term project defines the concept of a set of programs that can be used to build a revision.
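To make this terminology a bit more tangible, here is a minimal sketch in Java. The class and record names (`Coordinates`, `Program`, `Revision`) and the coordinates `org.example:demo` are made up for illustration; they are not part of Maven or of any enRoute API. The sketch assumes Java 16 or later for records.

```java
public class Coordinates {
    /** A program: the evolving thing that keeps its name (groupId + artifactId). */
    record Program(String groupId, String artifactId) {
        /** Pin this program to one specific version, yielding a revision. */
        Revision at(String version) {
            return new Revision(this, version);
        }
    }

    /** A revision: one immutable snapshot of a program, identified by a version. */
    record Revision(Program program, String version) {
        @Override
        public String toString() {
            return program.groupId() + ":" + program.artifactId() + ":" + version;
        }
    }

    public static void main(String[] args) {
        Program program = new Program("org.example", "demo");
        Revision first = program.at("1.0.0");
        Revision second = program.at("1.1.0");

        System.out.println(first);   // org.example:demo:1.0.0
        System.out.println(second);  // org.example:demo:1.1.0
        // Two different revisions, still the same program.
        System.out.println(first.program().equals(second.program())); // true
    }
}
```

The point of separating the two types is that the program is the name that survives over time, like the city, while each revision is a fixed snapshot that never changes once it exists.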
The difficulty of describing these core development processes clarifies why explaining to the uninitiated what you do day in and day out is hard. The core of our business is a long-lasting process of reshuffling bits so that when they are combined with computers and users we achieve the results we promised. We call this process 'maintenance' but it has very little to do with maintenance in the real world. In the real world, products deteriorate. A car needs an oil change, or certain parts are designed to wear out over time and need to be replaced before they pass a breaking point, causing great damage. Bizarrely, in software we theoretically do not have wear and tear, since a bit is a bit and bits do not change by happenstance. A revision is immutable for all time. What we call 'maintenance' is actually a different process. In this process we:
Though bugs can just be stupidities, quite often they are caused by the coder's assumptions about the environment. When this environment changes, the assumptions are no longer met and the code fails. This is also called bitrot. It is the weird effect that programs that are not maintained will start to fail over time.
It should therefore be clear that a large part of our work is addressing the effects of time. The context changes, which requires us to change the software, which changes the context for others. When we develop software we should be aware at all times that we are not really building anything but that we are in a continuous re-shaping process. It is crucial to be aware that any real world system lives in an ever evolving context where our own changes contribute to this changing context.
There are many practices in our industry that would be perfectly ok if change were not continuous, but that have unexpected consequences in a world that never stops changing.
A surprising example is aggregating, putting parts together into a greater whole. For example, you repackage a number of JARs into a single JAR. Every time you aggregate a set of parts, you create an additional responsibility because the underlying artifacts, the dependencies, will each change over time at their own rate. Each of these changes will add maintenance costs to rebuild the aggregate. Also, you will have to make the aggregation evolve at the rate of its fastest evolving part, or the clients of the fastest moving part will be upset. Therefore, by aggregating you increase the entropy of the build.
Last but not least, you now also constrain the revisions of the constituents to those that are in the aggregate. Clients that need a different set of constituents are out of luck.
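The two costs of aggregation described above can be sketched in a few lines of Java. Everything here is hypothetical: the `Aggregate` record, the constituent names `parser` and `logging`, and the version numbers are invented for illustration. The rebuild count also rests on a simplifying assumption, namely that the aggregate is re-released for every single constituent release and that no two constituent releases coincide, so treat it as an upper bound.

```java
import java.util.Map;

public class AggregateSketch {
    /** An aggregate pins exactly one version of each constituent it repackages. */
    record Aggregate(String version, Map<String, String> pinned) {
        /** True only if every constituent the client needs is pinned at exactly that version. */
        boolean satisfies(Map<String, String> needed) {
            return needed.entrySet().stream()
                    .allMatch(e -> e.getValue().equals(pinned.get(e.getKey())));
        }
    }

    /**
     * Assumption: each release of any constituent forces one new release of the
     * aggregate, so the aggregate changes at least as often as its fastest part.
     */
    static int rebuildsNeeded(Map<String, Integer> releasesPerConstituent) {
        return releasesPerConstituent.values().stream()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        Aggregate all = new Aggregate("1.0.0",
                Map.of("parser", "2.1.0", "logging", "1.4.0"));

        // A client that is happy with the pinned parser revision is fine.
        System.out.println(all.satisfies(Map.of("parser", "2.1.0"))); // true
        // A client that needs a newer parser is out of luck.
        System.out.println(all.satisfies(Map.of("parser", "2.2.0"))); // false

        // Six parser releases plus two logging releases: eight aggregate rebuilds.
        System.out.println(rebuildsNeeded(Map.of("parser", 6, "logging", 2))); // 8
    }
}
```

Used separately, each constituent would have been released only as often as it actually changed, and each client could have picked the revision it needed.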
The problems around aggregation are illustrated by the concept of profiles. A profile is a set of API revisions put together so that end users can have a single JAR to compile against. In the Java world there are a number of J2ME profiles, and of course Java SE and Java EE can also be seen as profiles if you squint a bit. Developers in general love them because they seem to simplify their lives considerably. That is, until they find out there is a newer version of a profile's constituent that they absolutely need, or until it is time to add new parts and they find that the process of maintaining the profile is highly politicized, since there are now many different interests to take into account. In the 1990s Ericsson and HP had TMOS, a Telecom Management Operating System, that imploded because they found it impossible to release a revision that satisfied the needs of all their users.
Though an aggregate or repackaging can have benefits, see [Modularity], the drawbacks of increasing the rate of evolution and the additional constraints between the parts do have a cost that is caused by this continuously changing world. These costs are often ignored because they lie in the future when the first decision is taken; however, they should be taken into account more. We should reflect on our way of working not with an eye towards processes in the real world, but with an acute awareness of the effects of a continuously changing world.
With respect to time, we should then take the following principles into account:
- Versioning – Ensure that independent parts are versioned so that we (and the computer) know what revision we are talking about.
- Prepare for change – Ensure that the code base is always optimal for additional changes since they will happen.
- Minimize the cost of change – Since things will change, ensure that when change happens the impact, and thus the cost, is minimal.