Wednesday, February 18, 2009

Avoiding duplication and increasing flexibility in software: what about version control?

These thoughts were catalysed by Section 4.2.4 "Once and only once" of PHP in Action where they preach the evils of duplicated code. This is something I strongly believe in. (My best exorcism to date is this Moodle commit.)

Unfortunately, the example they pick to illustrate their point utterly fails. Their example of what not to do is making a custom version of an application in a hurry by copying the whole thing and editing a few lines. Well, duh! Put the code in git and make a branch for the custom version. And, of course, when you do have time, come back and refactor, at which point git diff/merge/rebase ... will probably be a big help.

Now, this is a form of software engineering that open source developers do all the time (while having wild flame wars about which version control system is best). But, when computer scientists dream about paradigms like object orientation, design patters, refactoring, and so on, as ways to reduce the need for code duplication and increase the flexibility of software designs, do they consider the role of version control systems? I don't recall reading anything.


The point about flexibility is particularly interesting in the context of opens source software written in in an interpreted language. Given my background, I am of course thinking about Moodle, which is a PHP web application. Suppose, by way of example, you want to change some part of the processing that occurs when a student submits a quiz, and Moodle works out what score they deserve. The 'proper' design patterns way to do this is:
1. Check: Is the processing algorithm factored into a separate class, as recommended by the Strategy pattern? If not, refactor.
2. Subclass the default Strategy to implement your customisations.
3. Configure things so the factory methods instantiate your Strategy class, rather than the default one.

Alternatively, you could just hack the code. Preferably as a custom branch in your version control system. When a new Moodle version is released, just upgrade and merge (or pull and rebase). One could even try to claim that this way is better, because if the code you hacked changes between releases, the merge may fail, helping you realise that you need to rework your customisation. If you don't notice the problem until you get to testing (which may happen anyway) it will be much harder to find and diagnose.


Another example is this post in the Moodle quiz forum. Someone wanted to tweak something in Moodle, they were told to change the definition of a constant in a library file. Note that because there is no duplication of magic numbers throughout the code, it was only necessary to change the number in one place, so there is some proper design going on here.

One could, of course, have a separate configuration file for all the settings like this in Moodle, but does that really make anyone's life easier? If you think of all the things people might want to tweak, that would be one big file. Also, it would move the constant definition further from where it is used, reducing the cohesion in the code. You could also handle this by making this number an admin setting, stored in the database, and editable through the web interface. In this case, I would argue that that is bad UI design. 3 is a perfectly good default for almost everyone, and Moodle already has more than enough configuration settings. Sufficiently few people need to adjust this, that telling those that do to edit the code is an adequate interface.

You could call this approach "The whole code is a configuration file". In the abstract, you would not say it was good design, but for obscure configuration options like this, it may be the best way.


So, in a world with version control and interpreted languages, what is flexible and easy-to-modify software?


(To head off the obvious comments, I had better re-enforce that I do like nice clean software design, and design patterns, and so on; but I also spend most of my life working on Moodle where parts of the code are not like that; and somehow, most of the time, it just does not seem to matter. Millions of people around the world happily use and customise Moodle despite the lack of design patterns in the code. Should I be sad, or happy?)