Wednesday, February 18, 2009

Avoiding duplication and increasing flexibility in software: what about version control?

These thoughts were catalysed by Section 4.2.4 "Once and only once" of PHP in Action where they preach the evils of duplicated code. This is something I strongly believe in. (My best exorcism to date is this Moodle commit.)

Unfortunately, the example they pick to illustrate their point utterly fails. Their example of what not to do is making a custom version of an application in a hurry by copying the whole thing and editing a few lines. Well, duh! Put the code in git and make a branch for the custom version. And, of course, when you do have time, come back and refactor, at which point git diff/merge/rebase ... will probably be a big help.

Now, this is a form of software engineering that open source developers do all the time (while having wild flame wars about which version control system is best). But, when computer scientists dream about paradigms like object orientation, design patterns, refactoring, and so on, as ways to reduce the need for code duplication and increase the flexibility of software designs, do they consider the role of version control systems? I don't recall reading anything.

The point about flexibility is particularly interesting in the context of open source software written in an interpreted language. Given my background, I am of course thinking about Moodle, which is a PHP web application. Suppose, by way of example, you want to change some part of the processing that occurs when a student submits a quiz, and Moodle works out what score they deserve. The 'proper' design patterns way to do this is:
1. Check: Is the processing algorithm factored into a separate class, as recommended by the Strategy pattern? If not, refactor.
2. Subclass the default Strategy to implement your customisations.
3. Configure things so the factory methods instantiate your Strategy class, rather than the default one.
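
The three steps above can be sketched roughly as follows. This is an illustration only, written in Python for brevity rather than Moodle's PHP, and it is not how the Moodle quiz code is actually organised. Every name here (DefaultGradingStrategy, PenaltyGradingStrategy, CONFIG, make_grading_strategy, grade_submission) and the penalty rule itself are assumptions made up for the example.

```python
# Illustrative sketch only: all class, function and setting names are hypothetical.

class DefaultGradingStrategy:
    """Step 1: the scoring algorithm lives in its own class."""

    def grade(self, responses, answers):
        # One mark for each response that matches the expected answer.
        return sum(1.0 for r, a in zip(responses, answers) if r == a)


class PenaltyGradingStrategy(DefaultGradingStrategy):
    """Step 2: a subclass holding the local customisation."""

    PENALTY = 0.25  # assumed deduction per wrong (not blank) response

    def grade(self, responses, answers):
        correct = super().grade(responses, answers)
        wrong = sum(
            1 for r, a in zip(responses, answers) if r is not None and r != a
        )
        return max(0.0, correct - self.PENALTY * wrong)


# Step 3: configuration decides which class the factory instantiates.
STRATEGIES = {
    "default": DefaultGradingStrategy,
    "penalty": PenaltyGradingStrategy,
}
CONFIG = {"grading_strategy": "penalty"}


def make_grading_strategy(config=CONFIG):
    """Factory method: look up the configured strategy and build it."""
    return STRATEGIES[config.get("grading_strategy", "default")]()


def grade_submission(responses, answers, strategy=None):
    """The calling code never needs to know which strategy is in use."""
    strategy = strategy or make_grading_strategy()
    return strategy.grade(responses, answers)
```

With answers ["a", "b", "c", "d"] and responses ["a", "x", None, "d"], the default strategy gives 2.0 and the penalty strategy gives 1.75. Note how much scaffolding is needed before the one line that actually differs can be written.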

Alternatively, you could just hack the code. Preferably as a custom branch in your version control system. When a new Moodle version is released, just upgrade and merge (or pull and rebase). One could even try to claim that this way is better, because if the code you hacked changes between releases, the merge may fail, helping you realise that you need to rework your customisation. If you don't notice the problem until you get to testing (which may happen anyway) it will be much harder to find and diagnose.

Another example is this post in the Moodle quiz forum. Someone wanted to tweak something in Moodle, and they were told to change the definition of a constant in a library file. Note that because there is no duplication of magic numbers throughout the code, it was only necessary to change the number in one place, so there is some proper design going on here.
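
As a rough picture of what "only one place" means, here is a minimal sketch, again in Python rather than PHP. The constant name MAX_ITEMS_SHOWN and both functions are hypothetical; nothing here says which Moodle constant the forum thread was about. Only the value 3 comes from this post.

```python
# Hypothetical library file: the constant name and the functions are invented.

MAX_ITEMS_SHOWN = 3  # the single place a site would edit to customise this


def visible_items(items):
    """Return only as many items as the constant allows."""
    return items[:MAX_ITEMS_SHOWN]


def summary_line(items):
    """Describe the list, noting how many items were left out."""
    shown = ", ".join(visible_items(items))
    hidden = max(0, len(items) - MAX_ITEMS_SHOWN)
    return shown if hidden == 0 else f"{shown} (+{hidden} more)"
```

Both functions read the same named constant, so editing that one line changes the behaviour everywhere. Had the literal 3 been typed into each function, the person customising the code would have had to find every copy.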

One could, of course, have a separate configuration file for all the settings like this in Moodle, but does that really make anyone's life easier? If you think of all the things people might want to tweak, that would be one big file. Also, it would move the constant definition further from where it is used, reducing the cohesion in the code. You could also handle this by making this number an admin setting, stored in the database, and editable through the web interface. In this case, I would argue that that is bad UI design. 3 is a perfectly good default for almost everyone, and Moodle already has more than enough configuration settings. So few people need to adjust this that telling those who do to edit the code is an adequate interface.

You could call this approach "The whole code is a configuration file". In the abstract, you would not say it was good design, but for obscure configuration options like this, it may be the best way.

So, in a world with version control and interpreted languages, what is flexible and easy-to-modify software?

(To head off the obvious comments, I had better reinforce that I do like nice clean software design, and design patterns, and so on; but I also spend most of my life working on Moodle where parts of the code are not like that; and somehow, most of the time, it just does not seem to matter. Millions of people around the world happily use and customise Moodle despite the lack of design patterns in the code. Should I be sad, or happy?)


  1. Very interesting point. My (limited) experience with computer scientists is that source control is rarely even considered. Though, what about open source software! Surely this is the real winning tool here!

    But I think the contrast of Drupal vs Moodle helps the counter example here. With Drupal, everything is a hook and this strength is used to build a flexible general purpose CMS which can be changed in more ways than can be imagined. Even on my own basic internal company site we use a set of 5/6 modules which often hook into the same points and do powerful things which end up integrating seamlessly, and cleanly.

    Moodle lacks a massively flexible system of hooks; because of this defect, people generally tend to find it more easily 'hackable' to do as they wish. But Moodle's scope of 'hacking' is limited and in many ways its enforced inflexibility is its strength.

    I don't think that even a powerful DVCS like git would help you apply multiple custom modules which interact with each other at the same point - it'd be conflicts central.

    Perhaps this is a gauge we should consider when designing where flexibility is required.

  2. "The whole code is a configuration file" - I like this idea Tim. I think it's especially true of interpreted languages like PHP, that are (relatively) easy to read and understand.

    Is the problem with design patterns that they give too much weight to design? Practitioners know that code is as much evolved as designed. Or maybe that design patterns and OO (although incredibly useful) are now falling behind the curve of where practitioners are at?

    A coder may have an instinct of what is right for their particular project, given its particular language, particular architecture, stack, likely deployment, etc. This may be tacit knowledge for the developer, i.e. they don't know that they know it.

    They may sense good patterns that do not map onto existing theory.

    >>"Millions of people around the world happily use and customise Moodle despite the lack of design patterns in the code. Should I be sad, or happy?) "
    Be happy! A beautiful piece of code that never sees the light of deployment is about as much use as a chocolate kettle :)
    - Eamon

  3. A rather late reply. I agree that good software design is evolved. However, I think that one of the ways it evolves is when you refactor the code to a known pattern. I do find patterns helpful when thinking about design, but they should be used in combination with a healthy dose of the YAGNI principle.

  4. About giving configuration options a UI of their own: yes, when thinking about the actual UI of an application, too many obscure options are harmful. Essentially they just show the developer failed to make a UI decision and tried to push the responsibility for the decision to the user when it should have been pushed to a User Experience (UX) designer.

    However, when you approach the point from configuration, no one is stopping you from using progressive disclosure. As long as the way to access it does not distract the users who don't need it, a less prominent configuration screen may, depending on the context, increase flexibility. It just depends on the needs of the key personas involved whether or not this is appropriate.

    But the point about having constants right in the code to keep them close to where they are used was interesting. I bumped into this problem when I was still coding more actively and didn't see that way of thinking.