Thursday, October 15, 2009

The Open University's approach to plagiarism

I spent the day at an OU internal conference on plagiarism. Until now, most of my engagement with this subject has been people coming to the Moodle Quiz Forum expecting technological magic bullets. There seems to be a depressing assumption in some quarters that preventing students from copying or pasting during a quiz attempt, or from using a spell checker, somehow improves the validity of an assessment. And using Google during a quiz attempt is surely cheating; it couldn't possibly be a valuable study skill for the 21st century?

Anyway, it is just not possible to enforce these kinds of restrictions from a web server application like Moodle. Imagine, for a moment, what would happen to the web if any advertiser could stop you leaving their horrible web site and going to Google to find something better. If you really want to set up a restricted environment (akin to the traditional invigilated/proctored exam hall), you need client-side software installed on the computer the student will be using. If you are looking for something like this, take a look at Safe Exam Browser, an open-source effort built on Firefox, which is now integrated into Moodle 1.9.6 (due for release soon). See MDL-19145.

I am pleased to say that the OU approach is diametrically opposed to this wishful chasing after technological magic bullets. The conference today started with two keynotes by Jude Carroll from Oxford Brookes University, who has been thinking about these issues for the last nine years, and who is an excellent speaker. She gave a very clear summary of the issue: universities have the responsibility/duty/privilege to accredit students' learning. They give the student a bit of paper saying that they have learned something, and in our society that bit of paper has a certain amount of value. So universities need to measure learning, and that is where assessment comes in. The real problem with plagiarism is that it breaks the link between what the student submits for assessment and what they have learned. If a student writes something in their own words, then that writing is good evidence of the extent to which they have taken in the ideas of a course. If they have just copied and pasted someone else's words, you cannot assess what they have learned. Also, teachers do not set assessment tasks just to measure students. We hope that performing the tasks helps the student to learn.

Jude also made the point that exactly what counts as plagiarism depends on the context: for example, the level of experience of the student, the learning that is being assessed, and what you are allowed to assume as common knowledge. Also, not all plagiarism is cheating, and not all cheating is plagiarism. Dealing with plagiarism must be done in an appropriate and proportionate way. To start with, understanding issues of academic honesty is an important learning outcome of a degree course, but it needs to be taught; it is not innate. It cannot be taught as an optional add-on ("the plagiarism lecture") but instead must be embedded in the context of the subject early in a degree program. (By and large the OU does this in its introductory courses.) When plagiarism happens, the response has to be proportionate. Early in a program of study, students need to be given feedback to help them develop good academic practices. Later on, students should be expected to understand these issues, and plagiarism becomes a disciplinary matter, but there still needs to be a range of fair and proportionate penalties.

The OU has clearly had a group of people thinking hard about this over the last few years, and that has resulted in a new plagiarism policy that was adopted last June. This conference was part of disseminating that policy more widely. There is now a range of penalties for various types of plagiarism, and a new role, 'Academic Conduct Officer', whose holders can deal with most issues of plagiarism that go beyond what a student's individual tutor can handle. This lets many cases be dealt with quickly, so that only the most serious cases have to be referred to the Central Disciplinary Committee.

Another part of the OU's response to this issue is a new web site, Developing good academic practices, which I am afraid you will only be able to see if you have some sort of OU login, student or staff. It is a Moodle site in our VLE: a mixture of text to read and a few audio clips, with a quiz at the end of each section. There is also a summary quiz at the end which draws questions randomly from all the other quizzes. The estimated study time for the whole site is 2 hours. It seems to be well done, and for me it is pleasing to see the quiz module used well, even if it is only multiple-choice and matching questions. The course has about 2300 quiz attempts so far.

Finally, the OU does use two technological means to detect plagiarism, CopyCatch and Turnitin, on a proportion of submitted work. However, these only flag up possible cases, which are then reviewed by a human. The third form of detection is that tutors marking the students' work can often spot problems. Overall, though, the amount of plagiarism detected at the OU seems to be lower than that reported elsewhere. It is interesting to speculate why that might be.

Wednesday, October 14, 2009

What I'm working on

I've been meaning to post here for a while about what I am working on, but I keep posting in the Moodle Quiz Forum instead, because what I have to say seems more relevant there.

Anyway, the short answer is that I am rewriting the Moodle Question Engine. That document explains the what and the why. I want to post here about the how.

The question engine is the part of Moodle that processes what happens as a student attempts questions as part of a quiz (or other activity). What makes this a relatively difficult problem is that there are two independent degrees of variation. A question may be one of many different question types (multiple choice, short-answer, essay, ...) and there are various ways that a student may interact with it (adaptive mode, non-adaptive mode, manually graded, ...). Plus, people seem to think that when quizzes are used for summative assessments, it is quite important that there are no bugs and that performance is good ;-).

The complexity of the problem, and my natural bias, means that I am taking a very object-oriented approach, more so than is normally used in Moodle. Separate objects, with clearly defined responsibilities, help me understand the problem and give me confidence that what I am doing will work.

If you read the document linked to above, you will learn that I have been dissatisfied with this part of the quiz code for years; however, it is really only within the last year that I have worked out the overall approach to solving this problem nicely. The key realisation is that you need to apply the strategy pattern twice: once for the bits that vary with the question type, and once for the bits that vary with the interaction model. The other important part of the plan is a cleaner database structure that maps more closely to the data we need to store.
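To make that concrete, here is a minimal sketch in PHP of what applying the strategy pattern twice looks like. The names here (question_type_strategy, interaction_model, and the methods on them) are illustrations I have made up for this post, not the real code.

    // A question_attempt delegates to two independent strategies: one that
    // varies with the question type, and one that varies with the interaction
    // model. The names and methods here are illustrative only.
    interface question_type_strategy {
        // Grade a response and return a fraction between 0 and 1.
        public function grade_response(array $response);
    }

    interface interaction_model {
        // Decide what happens when the student submits data to this question.
        public function process_action(question_attempt $qa, array $submitteddata);
    }

    class question_attempt {
        private $qtype;       // strategy 1: question-type-specific behaviour
        private $interaction; // strategy 2: interaction-model-specific behaviour

        public function __construct(question_type_strategy $qtype,
                interaction_model $interaction) {
            $this->qtype = $qtype;
            $this->interaction = $interaction;
        }

        public function process_action(array $submitteddata) {
            // All the varying logic is delegated to the two strategies.
            $this->interaction->process_action($this, $submitteddata);
        }

        public function get_question_type() {
            return $this->qtype;
        }
    }

The point is that a new question type, or a new interaction model, should each be a new class plugged into this structure, without the two dimensions of variation getting tangled up with each other.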

However, to implement all that requires quite a lot of code. I need to build the core system that can support all the different interactions and question types. Then I need to create all the interactions, to replicate Moodle's existing functionality and to add the new features we want (Certainty Based Marking, and what we are calling the Interactive mode). Then I need to modify all the existing Moodle question types to fit into the new system. And I need to do all this with robust, bug-free code that we can confidently use for summative assessment.

Over the last year I have become increasingly converted to test-driven development. Certainly I wrote a lot of tests while eating my elephant, and this time round I am being very thorough with testing, although I must admit I still don't always write the tests first.

I am also proceeding very iteratively. I started with a test that took a single true-false question through being attempted according to the simplest of the interactions, and wrote some code that used roughly the right classes to make that test pass. Then for about a week I did little more than refactor that code.
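For the curious, that first test looked something like this. This is just the flavour of it, written in the SimpleTest style that Moodle 1.9 uses; the engine methods and the helper (make_a_true_false_question, deferred_feedback_model, start, finish, get_fraction) are placeholders invented for this post, not the actual API.

    // A sketch of the first test: walk a true-false question through the
    // simplest interaction and check the final grade.
    class simplest_interaction_test extends UnitTestCase {
        public function test_true_false_attempt_graded_correct() {
            // Start an attempt at a true-false question whose answer is 'true'.
            $question = make_a_true_false_question();
            $qa = new question_attempt($question, new deferred_feedback_model());
            $qa->start();

            // The student selects 'true' and submits at the end of the quiz.
            $qa->process_action(array('answer' => 1));
            $qa->finish();

            // The attempt should now be graded correct, with full marks.
            $this->assertEqual(1.0, $qa->get_fraction());
        }
    }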

It turned out that most of my variables and classes had the wrong names. For example, at one point I had classes called question_state and question_states, which was clearly a recipe for confusion. It was only after I had working code that I was able to think of the name question_attempt_step (one step in a question_attempt) for one of them. I also came to realise that grades were stored in three separate ways in different places, and so I ought to make sure I was using a consistent naming scheme for my variables. Then I did a bit of moving methods between classes as the exact boundaries of responsibility of each class became clearer. Should I have been able to get all that right first time? Well, I think that would be impossible.
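In case the names do not speak for themselves, here is roughly how I now picture the relationship between the two classes. The fields and methods shown are illustrative, not the final code.

    // One question_attempt is made up of a sequence of question_attempt_steps,
    // one step for each thing that happened during the attempt.
    class question_attempt_step {
        private $data = array(); // what was submitted at this step, e.g. array('answer' => 1)
        private $fraction = null; // the mark awarded at this step, if any, as a fraction
        private $timecreated;

        public function __construct(array $data = array(), $timecreated = null) {
            $this->data = $data;
            $this->timecreated = $timecreated ? $timecreated : time();
        }

        public function set_data($name, $value) {
            $this->data[$name] = $value;
        }

        public function get_data($name = null) {
            if (is_null($name)) {
                return $this->data;
            }
            return isset($this->data[$name]) ? $this->data[$name] : null;
        }

        public function get_timecreated() {
            return $this->timecreated;
        }
    }

    // Inside question_attempt, each action just appends another step:
    //     $this->steps[] = new question_attempt_step($submitteddata);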

I also started gradually removing simplifying assumptions that worked for my initial true-false test question, but not more generally. That was the point at which I started extending my code to other question types and interactions. For example, I wrote a test case walking an essay question through the manual grading interaction, which exposed some new issues that had to be accounted for; and just today I introduced multiple-choice questions, which randomise the order of the choices when you start an attempt, so you have to store that random order somewhere so it can be used throughout the attempt.
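One natural way to handle that ordering, and the one I have in mind, is to record the shuffled order in the data of the very first step of the attempt, so it can be read back at every later step and when rendering the question. A sketch, reusing the illustrative question_attempt_step class from above (the multichoice_question class, its fields and the '_order' key are made-up names, not the real question type code):

    // The shuffled order is decided once, when the attempt starts, and stored
    // in the first step so that every later step sees the same order.
    class multichoice_question {
        public $choiceids = array(11, 12, 13, 14); // ids of the choices

        public function start_attempt(question_attempt_step $firststep) {
            $order = $this->choiceids;
            shuffle($order);
            $firststep->set_data('_order', implode(',', $order));
        }

        public function get_order(question_attempt_step $firststep) {
            return explode(',', $firststep->get_data('_order'));
        }
    }

I have also already implemented the certainty based marking interaction.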

There is still much to do. My code that outputs the HTML for a question is still a rough hack of the current code that lets the tests pass; I need several rounds of refactoring before that is cleaned up to my satisfaction. There are still five more interactions to write, although each one is getting quicker than the one before. Then I have to convert all the existing question types. I need code that stores the current state of everything in the database and later reads it back. Finally, the nasty bits: upgrading from the current database structure to the new one, and re-implementing backup and restore, including restore of old backups.

If you were paying attention in that last paragraph, you will have deduced that my code does not yet store anything in the database! I am taking an approach where the domain objects are completely unaware of the database (which makes testing very easy), and I am planning to use the data mapper pattern to coordinate loading and saving the necessary data efficiently. I am pretty sure I know how to do that; however, I have not tried to write it yet because I want to re-read the relevant chapters of Patterns of Enterprise Application Architecture, and my copy only recently got back from Australia, and I won't be able to retrieve it from my parents' house until this weekend.
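Roughly, the idea is that a mapper class sits between the domain objects and the database, so the domain objects never see SQL. Here is a sketch under my assumptions: the table name and columns are hypothetical, and insert_record/get_records are Moodle 1.9's data-access functions.

    // A data mapper for steps: it alone knows about the (hypothetical) table,
    // and converts between database rows and question_attempt_step objects.
    class question_attempt_step_mapper {
        public function save(question_attempt_step $step, $attemptid, $sequence) {
            $record = new stdClass();
            $record->questionattemptid = $attemptid;
            $record->sequencenumber = $sequence;
            $record->timecreated = $step->get_timecreated();
            $record->data = serialize($step->get_data());
            return insert_record('question_attempt_steps', $record);
        }

        public function load_for_attempt($attemptid) {
            $steps = array();
            $records = get_records('question_attempt_steps',
                    'questionattemptid', $attemptid, 'sequencenumber');
            foreach ((array) $records as $record) {
                $steps[] = new question_attempt_step(
                        unserialize($record->data), $record->timecreated);
            }
            return $steps;
        }
    }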

In terms of time-scales, I am really hoping to get this done by about Christmas, and in time to go into Moodle 2.0, since it breaks a number of APIs. However, I first have to implement this in the OU's Moodle 1.9 code-base and then port it to Moodle 2.0, so no promises. This may have to wait until Moodle 2.1.

Anyway, that is what I have been doing, and so far it has been very enjoyable. Writing tests, making them pass, and then refactoring aggressively with the tests as a safety net is an approach that is working really well for me. These are quite big, scary changes, but right now I am feeling confident that it will all work out, and we will end up with a really solid question engine upon which to base an enhanced quiz and other Moodle activities.