Tim's blog: 2010

Tuesday, November 2, 2010

How (not) to sell your open source development services

It must be terrible trying to sell software development services if you work in Open Source. The restrictions of the GPL appear to make it almost impossible to demonstrate your previous work to prospective clients. I would like to offer the following advice, which is my attempt to distil what appears to be current best practice in dealing with this tricky situation.

Do not make it possible for your potential client to find any examples of code you have written. If you cannot avoid it completely, make it as difficult for them as possible. For example, if you have created a certain sort of plugin, just say "We have created a new Moodle block." Do not give any clues as to which block that might be. If the open source project provides a convenient place to upload and share your contributions, try to avoid making any code you have produced easy to find there. Even if you have been forced to share your code in these places, do not on any account provide your potential client with a link to the example of work that you are most proud of. It is fairer to let them search and find a representative sample of your work, if they are clever and patient enough to do so.

Your development staff are, of course, all faceless drones who are nothing to be proud of. Your potential client does not care who will actually be doing the work they are requesting. This is particularly important when the open source project has a strong community. It must not be possible to identify your staff in the recognised list of project contributors; nor should it be easy to discover how they have contributed to the on-going development of the project's code by looking in the project's issue tracker.

It is important to dress your proposal up in meaningless marketing-speak. If your client cannot complete their buzzword bingo card while reading your document, they just won't hire you. Including a PowerPoint presentation with impressive diagrams and more platitudes about your company can create a particularly strong impression.

Wednesday, October 20, 2010

The new question engine - how it works

In my last blog post I promised more details of the new question engine "in a week or so". Unfortunately, things like work (mainly fixing minor bugs in the aforementioned question engine); rehearsing for a rather good concert that will take place this Friday; and buying curtains for my new flat, have been rather getting in the way. Now is the time to remedy that.

Last time I explained roughly what a question engine was, and that I had made big changes to the one in Moodle. I now want to say more about what the question engine has to do, and how the new one does it.

The key processes

There are three key code-paths:

To display a page of the quiz:

1. Load the outline data about the student's attempt.
2. Hence work out which questions are on this page.
3. Load the details of this student's attempt at those questions within this quiz attempt.
4. Get the display settings to use (should the marks, feedback, and so on be visible to this user).
5. Taking note of the state of each question (from Step 3) update the display options. For example, if the student has not answered the question yet, we don't want to display the feedback now, even if we will later.
6. Using the details of the current state of each question, and the relevant display options, output the HTML for the question.

The bits in italic are the bits done by the quiz. The rest is done by the question engine.
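
As a rough illustration of the step that updates the display options from the question state, here is a minimal sketch in Python (Moodle itself is written in PHP). The DisplayOptions class and the options_for_question function are hypothetical names invented for this example, not the real Moodle code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DisplayOptions:
    """Hypothetical per-user display settings taken from the quiz."""
    marks: bool
    feedback: bool


def options_for_question(quiz_options: DisplayOptions, answered: bool) -> DisplayOptions:
    """Narrow the quiz-level options for one question, based on its state."""
    if not answered:
        # Nothing has been answered yet, so hide the feedback for now,
        # even if the quiz settings will allow it later.
        return replace(quiz_options, feedback=False)
    return quiz_options
```

The quiz-level options are left untouched, so the same settings can be reused for every question on the page.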

To start a new attempt at the quiz:

1. From the list of questions in the quiz, work out the layout for this attempt. (This is only really interesting if you are shuffling the order of the questions, or selecting questions randomly.)
2. Create an initial state for each question in the quiz attempt. (This is when things like the order of multiple choice options are randomised.)
3. Write the initial states to the database.

To process a student's responses to a page of the quiz:

1. From the submitted data, work out which attempt and questions are affected.
2. Load the details of the current state of those question attempts.
3. Sort all the submitted data into the bits belonging to each question. (The bits of data have names like 'q188:1_answer'. The prefix before the '_' identifies this as data belonging to the first question in attempt 188, and the bit after the '_' identifies this as the answer to that question.)
4. For each question, process its data to work out whether the state has changed and, if so, what the new state is. This is really the most important procedure, and I will talk more about it in the next section.
5. Write any updated states to the database.
6. Update the overall score for the attempt, if appropriate, and store it in the database.
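
The sorting step can be sketched in a few lines of Python. This is only an illustration built on the 'q188:1_answer' naming described above: the function name, the regular expression, and the choice to split at the first '_' are assumptions, and the real Moodle code (which is PHP) may well differ.

```python
import re
from collections import defaultdict

# Assumed shape of a field name: 'q<attempt id>:<question number>_<variable>'.
FIELD_NAME = re.compile(r"^q(\d+):(\d+)_(.+)$")


def group_by_question(submitted):
    """Sort submitted form data into the bits belonging to each question.

    Returns a dict keyed by (attempt id, question number); each value is a
    dict mapping the variable name (the part after the first '_') to its value.
    """
    grouped = defaultdict(dict)
    for name, value in submitted.items():
        match = FIELD_NAME.match(name)
        if match is None:
            continue  # not question data, so some other part of the form
        attempt_id, number, variable = match.groups()
        grouped[(int(attempt_id), int(number))][variable] = value
    return dict(grouped)
```

For example, a form containing 'q188:1_answer' and 'q188:2_answer' yields two entries, keyed (188, 1) and (188, 2), each holding its own 'answer' value.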

These outlines are, of course, oversimplifications. If you really want to know what happens, you will have to read the code.

There are other processes which I will not cover in detail. These include finishing a quiz attempt, re-grading an attempt, fetching data in bulk for the quiz reports, and deleting attempts.

The most important procedure

This is the step where we take the data submitted by the student for one particular question and use it to update the state of that question.

Moodle has long had the concept of different question types. It can handle short-answer questions, multiple-choice questions, matching questions, and so on. Naturally, what happens when updating the state of the question depends on the question type. That is true for both the old and new code.

Now, however, there is a new concept in addition to question types. The concept of 'question behaviours'.

In previous versions of Moodle, there was a rather cryptic setting for the quiz: Adaptive mode, Yes/No. That affected what happened as the student attempted the quiz. When adaptive mode was off, the student would go through the quiz entering their response to each question. Those responses were saved. At the end, they would submit the quiz and everything would be marked, all at once. Then the student could review their attempt (either immediately, or later, depending on the quiz settings) to see their marks and the feedback. When adaptive mode was on, the student could submit each question individually during the attempt. If they were right first time, they got full marks. If they were wrong, they got some feedback and could try again for reduced marks.

The problem with the previous version of Moodle was the way this was implemented. There was a single process_responses function that was full of code like "if adaptive mode, do this, else do that". It was a real tangle. It was very common to change the code to fix a bug in adaptive mode (for example), only to find that you had broken non-adaptive mode. Another problem was the essay question type, which has to be graded manually by the teacher. It did not really follow either adaptive or non-adaptive mode, but was still processed by the same code. That led to bugs.

A very important realisation in the design of the new question engine was identifying this concept of a 'question behaviour' as something that could be isolated. There is now a behaviour called 'Deferred feedback' that works like the old non-adaptive mode; there is an 'Adaptive' behaviour; and there is a 'Manually graded' behaviour specially for the essay question type. Since these are now separate, you can alter one without risking breaking the others. Of course, the separate behaviours still share common functions like 'save responses' or 'grade responses'. We now also have a clean way to add new behaviours. I made a 'certainty-based marking' behaviour, and a behaviour called 'Interactive', which is a bit like the old Adaptive mode but modified to work exactly how the Open University wants.

It takes two to tango, and three to process a question

In order to do anything, there now has to be a three-way dance between the core of the question engine, the behaviour and the question type. Does this just replace the old tangle with a new tangle (of feet)? Fortunately there is a consistent logic. The request arrives at the question engine. The question engine inspects it, and passes it on to the appropriate behaviour. The behaviour inspects it in more detail, to work out exactly what needs to be done. For example, is the student just saving a response, or are they submitting something for grading? The behaviour then asks the question type to do that specific thing. All this leads to a new state that is passed back to the question engine.

So, the flow of control is question engine -> behaviour -> question type except, critically, in one place. When we start a new attempt, we have to choose which behaviour to use for each question. At this point the question engine directly asks the question type to decide. Normally, the question type will just say "use whatever behaviour the quiz settings ask for", but certain question types, like the essay, can instead say "I don't care about the quiz settings, I demand the manual grading behaviour."

If you like software design patterns, you can think of this as a double application of the strategy pattern. The question engine uses a behaviour strategy, which uses a question type strategy (with the subtlety that the choice of behaviour strategy is made by the question type).
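
To make the double strategy pattern concrete, here is a minimal sketch in Python rather than Moodle's PHP. Every class, method and action name in it (State, process, choose_behaviour, "save", "finish" and so on) is invented for illustration and should not be read as the real question engine API. It shows only the shape of the dance: the engine delegates to a behaviour, the behaviour delegates to a question type, and the question type gets the final say on which behaviour is used.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    response: Optional[str] = None
    fraction: Optional[float] = None  # None means "not graded yet"


class Behaviour(ABC):
    @abstractmethod
    def process(self, qtype: "QuestionType", state: State,
                action: str, response: Optional[str]) -> State:
        """Work out what the request means, then delegate to the question type."""


class QuestionType(ABC):
    def choose_behaviour(self, preferred: Behaviour) -> Behaviour:
        # Default: use whatever behaviour the quiz settings ask for.
        return preferred

    @abstractmethod
    def grade(self, response: str) -> float:
        """Return the fraction of the marks earned, between 0.0 and 1.0."""


class DeferredFeedback(Behaviour):
    def process(self, qtype, state, action, response):
        if action == "save":
            return replace(state, response=response)
        if action == "finish":
            # Everything is graded at the end of the attempt.
            return replace(state, fraction=qtype.grade(state.response or ""))
        return state


class ManuallyGraded(Behaviour):
    def process(self, qtype, state, action, response):
        if action == "save":
            return replace(state, response=response)
        return state  # never auto-graded; a teacher supplies the mark later


class ShortAnswer(QuestionType):
    def __init__(self, right_answer: str):
        self.right_answer = right_answer

    def grade(self, response):
        same = response.strip().lower() == self.right_answer.lower()
        return 1.0 if same else 0.0


class Essay(QuestionType):
    def choose_behaviour(self, preferred):
        # Ignore the quiz settings and demand manual grading.
        return ManuallyGraded()

    def grade(self, response):
        raise NotImplementedError("essays are graded by a teacher")


class QuestionEngine:
    def start(self, qtype: QuestionType, preferred: Behaviour) -> Tuple[Behaviour, State]:
        # The one place where the engine asks the question type directly.
        return qtype.choose_behaviour(preferred), State()

    def process(self, behaviour, qtype, state, action, response=None):
        # Normal flow: engine -> behaviour -> question type.
        return behaviour.process(qtype, state, action, response)
```

In this sketch, starting a ShortAnswer question with DeferredFeedback as the preferred behaviour keeps that behaviour, while starting an Essay with the same preference comes back with ManuallyGraded. Adding a new behaviour means adding one more Behaviour subclass, without touching the others.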

Summary

So that is roughly how it works. A clear separation of responsibility between three separate components. Each component focussing on doing one aspect of the processing accurately, which makes the system highly extensible, robust and maintainable. Of course, everyone says that about the software they design, but based on my experiences over the last year, first of building all the parts of the system, and then of fixing the bugs that were found during testing, I say it with some confidence.

Tuesday, October 5, 2010

Introducing the new Moodle question engine

The rest of the (Moodle) world is eagerly anticipating Moodle 2.0, but I would like to tell you about what I have been doing for most of the last year, but which you won't be able to have until Moodle 2.1 - unless, that is, you are a student or teacher with the Open University, in which case you will be using it from this December.

What I have done is to rewrite a large chunk of the Moodle quiz system. What chunk is that? Well, first you can split a quiz system into two main parts. There is the quiz part, which says, "This quiz comprises these questions, and will be open to students between these dates". It tracks the student as they attempt the quiz, and stores their total score. Then there is the part that deals with the details of the individual questions within each quiz.

The question part can again be split in two. There is the question bank which lets the teacher create and store questions. For example "This is a multiple choice question where the student must select one of these three options, and it is an 'Elementary maths' question." Then there is the code that controls what happens when a student attempts a question "The student sees three radio buttons and a Submit button, and when they click the button we compute a score as follows and show this feedback." That second bit is what I call the question engine, and that is what I have rewritten.

However, you cannot just change the question engine in isolation. There are knock-on effects. For example, the quiz module still maintains overall control of things, even though it delegates a lot of the details to the question engine. So there are places where the quiz says things like "Dear question engine, please display this question now", or "Dear question engine, the student submitted this data, please process it", or "Dear question engine, the teacher wants to see all students' responses to all questions in this quiz, give me the data to display." All those places have to change when the question engine changes.

There were also small changes required to the question bank, mainly because the new question engine has some new features that need some extra options stored with each question. So, the question bank needs to store the new options; let teachers edit them; back them up and restore them; import them and export them; and so on.

Altogether, my year's work added about 52,000 lines of new code and removed about 25,000 lines of old code (or, if you prefer, added 27,000 lines and altered 25,000 lines). At least that is the size of the change that I committed to the OU's CVS server last Friday, just in time to make the feature-freeze for the December update of our VLE. For comparison, the whole of Moodle 2.0 is about 1,600,000 lines of code, although that includes several large third-party libraries.

I am sure that there will be some minor bugs still to be found and fixed, but this new code has already had extensive testing from my colleagues Phil Butcher and Paul Johnson, so I am confident that the remaining bug-fixes will be minor.

There is much more I want to write about the new question engine, but I think this introductory post is already long enough. Therefore, I will split the remainder of what I want to say into separate posts which I hope to publish over the next week or so.

Tuesday, September 14, 2010

On the other side of the fence

I've been having problems with my teeth recently. Fortunately, none of it has been painful, but cycling home from yet another dentist's appointment this afternoon, I suddenly had a thought that the experience I was having with my dentist might be a bit like the experience of a non-developer who encounters a bug in the Moodle quiz. Allow me to explain.

I think it has been a total of four trips to the dentist over the summer. First I went for a regular check-up. That revealed that one of my fillings was cracked, so there was another appointment to drill it out and redo it. Then a filling in another tooth fell out one Friday evening, so I could not get anything done about it until the following week, and over the weekend a bit of the tooth next to the hole broke off, which was really worrying. (Luckily, as I said above, it was not painful.) So anyway, that required a big filling to fill the resulting hole. Then part of the new filling broke off, so today's appointment was to re-do the missing bit. Hopefully that remedial fix will work. If the new bit breaks off again, I will have to go back to have the whole filling drilled out and redone.

With all this going on, I fear I have been starting to have unkind thoughts like: "Is my dentist competent?" "Should so many things go wrong over a few months?" "How do I know if this is normal?". This is exacerbated by what appears to me to be a slightly casual attitude on her part. I expect that these really are routine problems, and rather boring to her. I, however, am worried, so I would have appreciated a more concerned-seeming bedside manner.

As I say, it was just after I had climbed on my bike to go back to work this afternoon, that I had the epiphany that this is probably how someone feels when they come to the Quiz forum after encountering a problem in one of their Moodle quizzes. To them it seems like some terrible problem that has them really concerned. They describe their symptoms, and I read it and think "oh yes, the problem is probably in that bit of code, let me do a quick fix." And then maybe I screw up and introduce a regression, but when that is pointed out, it too is easily fixed. To me it seems like some routine and minor matter, but I have never really thought how the process of fixing bugs feels to someone who is not a software developer, and who does not really understand what is going on. Will they still trust Moodle?

Now that I have thought about it, will I do anything? I fear I am unlikely to change my 'bedside manner'. My time seems to be too taken up with actually doing the bug fixes, and other development, to spend much time being nice to people. Still, I will try to make a bit more effort in future, now I know how it feels to be on the other side of the situation.

Before finishing, allow me to point out (particularly to American readers) that all these trips to the dentist have cost me very little or nothing. In particular, the two recent appointments to replace the filling that had (partly) fallen out cost me nothing. Thank you NHS. I hope the new government does not cut you to death.

While I am writing, I will also share the news that I finally bought an apartment of my own over the summer, after years of renting. It is very nice, and worth the hassles of dealing with solicitors. Hopefully now that the stresses of dealing with the move are behind me I can concentrate more on Moodle development, although there are still a few minor things to deal with like getting some more furniture. Over the weekend I ordered some new sofas, including some bright orange cushions. I can't think where I got the idea for such a daring colour ;-)

Friday, July 9, 2010

Book review: Moodle 1.9 Extension Development

It is a book that the community has needed for a long time, a book that tells you how to write Moodle code. Now it is finally here. Does it live up to expectations?

Yes, I think it does. The authors, Mike Churchward and Jonathan Moore, are two experienced Moodle developers (they both work for Canadian Moodle Partner Remote Learner) so they can write authoritatively on the subject.

One issue with a book like this is that the examples given are, necessarily, fairly basic. To illustrate key techniques and ideas, a book must explain using the simplest example that makes the point. The question is, when you come to solving real problems, will the techniques you have learned expand to cope? Well, this is where the experience of the authors counts. They are telling you the right way to do things that works for real applications, even if they are only using simple examples to illustrate them.

The book does a thorough job of covering just about every type of Moodle plugin there is. Of course, some plugin types get more space than others, with the two most important, blocks and activity modules, getting the most space. Therefore, some other plugin types, like question types and gradebook plugins, are covered rather briefly. Between the chapters on the different types of plugins are chapters on more general topics like security, accessing the database, and so on.

Anyone who has had code reviewed by me will know that I get really pedantic when I review something. As I was reading the book I made a list of minor errors, or points where I disagreed with the authors. From the 300 pages of the book, I only found 22 things to put on my list, and none of them are interesting enough to mention here. (I did send the list to Jonathan.) So, I think this book has a very high standard of accuracy.

This book does assume you already know how to program in PHP, and write HTML and CSS. I think that was the right decision. There are plenty of excellent books out there that will teach you to write general web applications in PHP, and it would be silly to duplicate those in a book that is uniquely about writing code for Moodle.

It is unfortunate timing that this book was released only a few months before Moodle 2.0. Moodle 2.0 does change quite a lot of the rules for how to do Moodle development, and so a lot of the details in the book will soon be out of date. However, don't let that stop you from getting this book. We have just talked about how this book helps you make the jump from being a general PHP developer to being specifically a Moodle (1.9) developer. Well, from there to being a Moodle 2.0 developer is just another small step. You won't be wasting much time if you learn about Moodle 1.9 first, and anyway, some people will still be running Moodle 1.9 for some time to come, and it will be a while before there is a book about Moodle 2.0 development on sale.

This should go without saying, but programming is an activity that you actually need to do to understand. You won't become an expert Moodle programmer just by reading a book. You will become a Moodle programmer by actually trying to write Moodle code, and learning from your own mistakes, and from the code other people have written in the past. What a book like this will do for you is that it will help you avoid a lot of the really basic mistakes, and it will set you off on the right path. So it will make your own learning-by-doing much more efficient, but it cannot replace the doing. Also, I would like to point out that while this is the only book about Moodle development, it is certainly not the only resource to help you learn Moodle development. If you are interested in this book, you should also look at the Developer documentation on Moodle Docs and the Introduction to Moodle Programming course on http://dev.moodle.org/.

Overall, if you want to learn Moodle development, this is a good book to help you attain your goal. Sure, you can get a lot of the information for free online, but in this book the authors set it out clearly and in a logical order. The information in this book has been written by expert Moodle developers and then carefully reviewed, so you can read the book without being on your guard for misinformation. You would have to be more careful just using the information Google finds for you online. So, as I say, this book lives up to expectations. If you want a book on Moodle development, get this one, and don't worry too much about Moodle 2.0 making it out-of-date.

Wednesday, June 9, 2010

Computational Knowledge

I spent today at the London Computational Knowledge Summit. A rather pretentious name for a meeting organised by Wolfram about Wolfram|Alpha, Mathematica, and what these sorts of tools can do. The way they see it is that if Google is a means to retrieve facts, Wolfram|Alpha aspires to be a way to use information. For example, governments have started dumping a lot of data into the public domain, but how easy is it for a citizen to extract meaningful information from that data?

For someone used to Moodle and open source, I was deeply struck by how Cathedral (as opposed to Bazaar) the whole Wolfram/Mathematica world is. That may be, in part, a natural consequence of when they started. Wolfram has been working on Mathematica for twenty years. That is an impressive achievement, and it means they have been working on it since before Linux was a gleam in Linus Torvalds's eye. They started at a time when the commercial model was how most software was written, and they are clearly still an American commercial software development business at heart.

It is interesting to ask what would have happened if Mathematica had been GPL. Would it have developed more or less than it has as a commercial project? Wolfram would not have been able to pour as much money into it, but would a community have done as much, or more? I am not sure. Clearly the commercial model has been highly successful for driving the development of Mathematica. It is a very cool, and very sophisticated tool. On the other hand, one can dream about a world where the Mathematica engine is free, and as a result shipped on every OLPC XO. What effect would that have on the world over the next twenty years? (Quiz: can you name Mathematica's closest open source rival? I can, but only because it is used in STACK. Speaking of which, it was nice to renew my acquaintance with Chris Sangwin today.)

Another thing that struck me about the Wolfram world is how insular it is. Some of the speakers we heard clearly spend all their time thinking about Mathematica. They have a really amazing tool, and, having what they hope is the ultimate sledge-hammer, they seem to view most problems as nuts to be cracked. It made me worry about whether the Moodle community seems that insular from the outside.

However, the day was not just about Maths and Mathematica, but more about the democratisation of knowledge. How can a concerned citizen reach meaningful conclusions about the world using the data that is now available? Conrad Wolfram set the scene by talking about this mission. He started with two slides, one showing a page from Principia Mathematica, and then a page from a modern mathematical paper. They both comprised text, diagrams and equations. Certainly, modern type-setting is better, and the diagrams are now in colour, but that is not much to show for 350 years' progress. Is this the best way to publish scientific knowledge in the 21st Century? No, it isn't. For instance, it is silly to show a static graph. It should be a graph that the reader can manipulate, and the raw data should be hidden behind it in a form that you can extract to perform your own analyses on if you choose. Is the paper about a model? If so the model should be embedded there, and you should be able to experiment with varying the inputs and the assumptions to see what happens. If you are interested, you should be able to get at the source code to see exactly what was implemented. So, rather than a document being a static, one-way, low-bandwidth form of information transmission, it should be an interactive thing that allows the "reader" to engage in a two-way dialogue.

Of course, Mathematica is the perfect tool for authoring such documents, or so Wolfram would like you to think. To encourage this, they are talking about a CDF - computational document format - which would be an 'open' standard based on a subset of Mathematica's capabilities, with a 'free' player for all common platforms. Think PDF and Adobe. Will it take off? Well, does the rest of the world trust Wolfram enough to adopt their format? Can they offer enough more than HTML5? I am not sure.

The next talk was by Andrew Dilnot from Oxford University. He talked very engagingly about the difficulty with statistics and gave some nice topical examples of the need to interpret numbers in context. Presented with a numerical fact, you need to ask some basic questions: is this big or small? going up or down? where did it come from? For example, the smoking ban has saved the NHS £8million. Well, that is small. The NHS annual budget is £100billion. Why are the media making such a big deal of the story? His talk could be summarised as: All numbers are wrong, but using numbers is still much better than not using numbers.

There were a couple of talks that were more pretty than interesting. John Barrow from Cambridge talked about the history of scientific images, and Alan Joyce showed some of the cool things you can do with Mathematica. There were also some glimpses behind the scenes of Wolfram|Alpha. As I say, very Cathedral. There is no web spidering or crowd-sourcing here. At one end it is about Wolfram people going out and finding the most authoritative data sources they can get, and loading them into a format their Mathematica code can use. At the other end they are training the Alpha system to take natural language queries, decide which methods to apply to what data, and select visualisations to display the results, in order to automatically give the best possible answer to that question. They gave an interesting parallel for this. In the past, all actors had to appear on stage to whatever live audience could fit into the theatre. Today, film and television allow most people to see the best actors in the world. Can the expertise of the best thinkers in different fields be made accessible in a similar way? Well, that is their goal. The initial results are impressive, but these are only the very early stages of a highly ambitious project and there is a lot more to do.

In the afternoon, Conrad Wolfram talked about his thoughts on maths education. This has traditionally been focussed on teaching students how to perform computations by hand. But, said Conrad, this is only one part of what you need to know to be a mathematician. The full steps are:

1. Ask an interesting question about the world.
2. Translate that real world question into mathematics.
3. Compute the solution.
4. Translate that solution back into real-world terms, and find some way to validate whether your answer makes sense.

(OU Maths students, note the similarities to the Modelling cycle as taught in OU courses.) Historically, maths education has focussed on Step 3. However, these days computers are just better than humans at that. At the same time, there has never been a greater demand for people who are good at the other three steps (in business, to guide public policy, and so on). So we need to rethink the maths curriculum. To put it another way, is it cheating if you use Wolfram|Alpha to do your maths homework? Well, if it is, then we are probably assessing the wrong things.

At the end, Stephen Wolfram gave a presentation by video link summarising some of the themes of the day.

Overall, it was a very interesting day. While I don't think it quite lived up to its pretentious title, it was certainly more than just a Wolfram marketing exercise. I would like to thank the Open University for sending me. I can see myself using Wolfram|Alpha in future for certain sorts of queries, but it will not replace Google as my day-to-day search engine. (I don't think it aspires to.)

Thursday, March 25, 2010

When do students submit their online tests?

I am currently studying an Open University course (M888 Databases in Enterprise systems). There was an assignment due today, and like many students, I submitted only an hour before the deadline.

That got me thinking, are all students really like that? Well, I don't have access to our assessment submission system, but I do work on our Moodle-based VLE, so I can give you the data from there.

This graph shows how many hours before the deadline students submit their Moodle quizzes (iCMAs in OU-speak).

That is not exactly what I was expecting. Certainly, there is a bit of a peak in the last few hours, but there is another peak almost exactly 24 hours before that, with lesser peaks two and three days before.

Note that all our deadlines are at noon (it used to be midnight, but that changed a few months ago). The graph above is consistent with our general pattern of usage. The following graph shows what time of day students submitted their quiz attempts. It is the same shape as our general load graph for most OU online systems.

I don't know what, if anything, this means, but I thought it was interesting enough to share.

By the way, if you want to compute these graphs for your own Moodle, here are the database queries I used:

-- Number of quiz submissions by hour before deadline
SELECT 
    (quiz.timeclose - qa.timefinish) / 3600 AS hoursbefore,
    COUNT(1)

FROM mdl_quiz_attempts qa
JOIN mdl_quiz quiz ON quiz.id = qa.quiz

WHERE
    qa.preview = 0 AND
    quiz.timeclose <> 0 AND
    qa.timefinish <> 0

GROUP BY
    (quiz.timeclose - qa.timefinish) / 3600

HAVING (quiz.timeclose - qa.timefinish) / 3600 < 24 * 7

ORDER BY
    hoursbefore;

-- Number of quiz submissions by hour of day
SELECT 
    DATE_PART('hour', TIMESTAMP WITH TIME ZONE 'epoch' + timefinish * INTERVAL '1 second') AS hour,
    COUNT(1)

FROM mdl_quiz_attempts qa

WHERE
    qa.preview = 0 AND
    qa.timefinish <> 0

GROUP BY
    DATE_PART('hour', TIMESTAMP WITH TIME ZONE 'epoch' + timefinish * INTERVAL '1 second')

ORDER BY
    hour;
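
If you would rather post-process exported timestamps than run SQL, the bucketing done by the first query can be sketched in Python. This is only an illustration: the function name and the input shape (pairs of Unix timestamps for timeclose and timefinish, with preview attempts already filtered out) are assumptions. It also assumes the two columns are integers, so that the division in the query truncates towards zero as PostgreSQL integer division does.

```python
from collections import Counter


def hours_before_histogram(attempts, max_hours=24 * 7):
    """Count finished attempts by whole hours before the quiz deadline.

    attempts: iterable of (timeclose, timefinish) pairs in Unix seconds.
    Returns a list of (hoursbefore, count) pairs sorted by hoursbefore.
    """
    counts = Counter()
    for timeclose, timefinish in attempts:
        if timeclose == 0 or timefinish == 0:
            continue  # no deadline set, or attempt never finished
        # int() truncates towards zero, matching the assumed SQL behaviour.
        hours = int((timeclose - timefinish) / 3600)
        if hours < max_hours:
            counts[hours] += 1
    return sorted(counts.items())
```

Note that, as in the query, attempts finished after the deadline are not excluded; they end up in bucket zero or in negative buckets.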