
Thursday, March 2, 2017

The online assessment Turing test

In the back-channel for yesterday's Transforming Assessment webinar (which I would recommend) Geoff Crisp asked me:

"Tim - what about the Turing test - what if a student could not tell the difference between a computer giving them feedback and the teacher?"

I think this is a really nice question. Food for some quite wide-ranging thoughts about what online assessment should be.

On the whole, I stand by the snap answer I gave at the time. Computers and human markers (at least currently) have different strengths. The computer (having been set up by a human teacher) can be there at any time the student wants, able to give immediate feedback on a range of more or less basic practice activities. A human teacher is only available at certain times, but is able to give feedback in a more holistic way. They may know the student, and have some concept of how their subject is best learned, and on that basis give the student some really meaningful advice about how best to improve.

I know there is adaptive learning hype about computers being able to know the students and therefore offer contextual advice, but I will believe that when (if) I see it.

If you are thinking about designing a course today, you are much better off understanding the strengths and weaknesses of both computer-marked and conventional assessment, and using each for where they work best. There is currently nothing to be gained by trying to hide where you are using computer marking.

I think a reasonable analogy is with searching for information. You might do a Google search, which will give you the kind of results that a computer can give. Alternatively, you could ask a friend who knows more about the subject, and they will give you a different sort of advice about what to read. Neither is necessarily better. In some cases one of the two approaches might be clearly more appropriate. In other cases either would do. If you really want to understand something in depth, you probably want to use both approaches, and it is an advantage that each will give different results that help in different ways.

If we are trying to create self-regulating learners, then it can be a merit that a computer only gives basic templated feedback, which could be as little as just right/wrong. The learner needs to do more work themselves to get from the feedback to an action they can take to improve. This is not always a benefit, but it could be.

So, while the idea of an assessment Turing test is usefully thought provoking, I don't think it is educationally useful, at least not for the foreseeable future. Having said that, the nicest thing anyone said about an online assessment system I helped build is still

"It's like having a tutor at your elbow."

The key word there is "like", which is not the same as "indistinguishable from".

Thursday, June 25, 2015

The Assessment in Higher Education conference 2015

I am writing this on a sunny evening, sitting in a pub overlooking Old Turn Junction, part of the Birmingham Canal Navigations, with a well-earned beer after two fascinating and exhausting days at the Assessment in Higher Education conference.

It was a lovely conference. The organising committee had set out to try to make it friendly and welcoming and they succeeded. There was a huge range of interesting talks and since I could not clone myself I was not able to go to them all. I am not going to describe individual talks in detail, but rather draw out what seemed to me to be the common themes.

A. It is all just assessment

The first keynote speaker (Maddalena Taras) said this directly, and there were a couple of other things along the same lines: the split between formative and summative assessment is a false dichotomy. If an assessment does not actually evaluate the students (give them a grade, hence summative) then it misses the main function of an assessment. This is not the same as saying that every assessment must be high stakes. Conversely, in the words of a quote Sally reminded me of:

“As I have noted, summative assessment is itself ‘formative’. It cannot help but be formative. That is not an issue. At issue is whether that formative potential of summative assessment is lethal or emancipatory. Does formative assessment exert its power to discipline and control, a power so possibly lethal that the student may be wounded for life? … Or, to the contrary, does summative assessment allow itself to be conquered by the student, who takes up a positive, even belligerent stance towards it, determined to extract every human possibility that it affords?” (Boud & Falchikov (2007) Rethinking Assessment in Higher Education: Learning for the Longer Term)

The first keynote was a critique of Assessment for Learning (AfL). Not that assessment should not help students learn. Of course it should. Rather, the speaker questioned some of the specific recommendations from the AfL literature in a thought-provoking way.

The 'couple of other things' were a talk from Jill Barber of the School of Pharmacy at Birmingham, about giving students quite detailed feedback after their end-of-year exams; and Sally Jordan’s talk (which I did not go to since I have heard it internally at the OU) about the OU Science faculty's semantic wranglings about whether all their assessment gets called “summative” or “formative”, and hence how the marks for the separate assignments are added up, without changing what the assessed tasks actually are.

B. Do students actually attend to feedback?

The second main theme came out many times. On the one hand, students say they like feedback and demand more of it. On the other hand, there is quite a lot of evidence that many students don’t spend much time reading it, or that when they do, it does not necessarily help them to improve. So, there were various approaches suggested for getting students to engage more with feedback, for example by

  • giving feedback via a screen-cast video, talking them through their essay while highlighting with the mouse (David Wright & Damian Kell, Manchester Metropolitan University). Would students spend 19 minutes reading and digesting written feedback on an essay? Well, they got a 19-minute (on average) video - one of the few cases where some students thought it was too much feedback!
  • making feedback a dialogue. That is, encouraging students to write questions on the cover sheet when they hand the work in, for their tutor to answer as part of the feedback. That was what Rebecca Westrup from the University of East Anglia was doing.
  • Stefanie Sinclair from the OU religious studies department talked about work she had done with John Butcher & Anactoria Clarke assessing reflection in an access module (a module designed to help students with limited prior education to develop the skills they need to study at Level 1). Again, this was to encourage students to engage in a dialogue with their tutor about their learning.
  • Using peer and self assessment, so that students spend more time engaging with the assessment criteria by applying them to their own and others' work. Also, the suggestion from Maddalena Taras was that initially you give the students' work back without the marks or feedback (but after a couple of weeks of marking) so that they read it with fresh eyes before they get the feedback, and only then the marks.
  • There was another peer assessment talk, by Blazenka Divjak of the University of Zagreb, using the Moodle Workshop tool. The results were along the same lines as other similar talks I have seen (for example at the OU, where we are also experimenting with the same tool). Peer assessment activities do help students understand the assessment criteria. They also help students appreciate more of what teachers do. Students’ grading of their peers, particularly in aggregate, is reliable, and comparable to the teacher’s grade.
  • A case of automated marking (in this case of programming exercises) where students clearly did engage with the feedback because they were allowed to submit repeatedly until they got it right. In computer programming this is authentic. It is what I do when doing Moodle development. (Stephen Nutbrown, Su Beesley, Colin Higgins, University of Nottingham and Nottingham Trent University.)
  • It was also something Sally touched on in her part of our talk. With the OU's computer-marked questions with multiple tries, students say the feedback helps them learn and that they like it. However, if you look at the data or usability lab observations, you see that in some cases students are clearly not paying attention to the feedback they get.

C. The extent to which transparency in assessment is desirable

This was the main theme of the closing keynote by Jo-Anne Baird from the Oxford University Centre for Educational Assessment. The proposition is that if assessment is not transparent enough, it is unfair because students don’t really understand what is expected of them. A lot of university assessment is probably towards this end of the spectrum.

Conversely, if assessment is too transparent it encourages pathological teaching to the test. This is probably where most school assessment is right now, and it is exacerbated by the excessive ways school exams are made high stakes, for the student, the teacher and the school. Too much transparency (and risk aversion) in setting assessment can lead to exams that are too predictable, so students can get a good mark by studying just those things that are likely to be on the exam. This damages validity, and more importantly damages education.

Between these extremes there is a desirable balance, where students are given enough information about what is required of them to enable them to develop as knowledgeable and independent learners, without causing pathological behaviour. That, at least, is the hope.

While this was the focus of the last keynote, it resonated with several of the talks I listed in the previous section.

D. The NSS & other acronyms

The National Student Survey (NSS) is clearly a driver for change initiatives at a lot of other universities (as it was two years ago). It is, or at least is perceived to be, a big deal. Therefore it can be used as a catalyst or lever to get people to review and change their assessment practices, since feedback and assessment is something that students often give low ratings for. This struck me as odd, since I am not aware of this happening at the OU. I assume that is because the OU has so far scored highly in the NSS.

The other acronym floating around a lot was TESTA. This seems to be a framework for reviewing the assessment practice of a whole department or degree programme. In one case, however (a talk by Jessica Evans & Simon Bromley of the OU faculty of Social Science) their review was done before TESTA was invented, though along similar lines.

Finally

A big thank-you to Sue Bloxham and the rest of the organising team for putting together a great conference. Roll on 2017.

Friday, May 1, 2015

eSTEeM conference 2015

eSTEeM is an organising group within the Open University which brings together people doing research into teaching and learning in the STEM disciplines: Science, Technology, Engineering and Maths. Naturally enough for the OU, a lot of that work revolves around educational technology. They have an annual conference for people to share what they have been doing. I went along because I like to see what people have been doing with our VLE, and hence how we could make it work better for students and staff in the future.

It started promisingly enough, in a way. As I walked in to get my cup of coffee after registration, I was immediately grabbed by Elaine Moore from Chemistry, who had two Moodle Quiz issues. She wanted the Combined question type to use the HTML editor for multiple-choice choices (good idea, we should put that on the backlog), and she had a problem with a Pattern-match question which we could not get to the bottom of over coffee.

But, on to the conference itself. I cannot possibly cover all the keynotes and parallel sessions so I will pick the highlights for me.

Assessment matters to students

The first was a graph from Linda Price’s keynote. Like most universities, at the end of every module we have a student satisfaction survey. The graph showed students’ ratings in response to three of the questions:

  • Overall, I am satisfied with the quality of this module.
  • I had a clear understanding of what was required to complete the assessed activities.
  • The assessment activities supported my learning.

There was an extremely strong correlation between those. This is nothing very new. We know that assessment is important in determining the ‘hidden curriculum’, and hence we like to think that ‘authentic assessment’ is important. However, it is interesting to see how much this matters to students. Previously, I would not even have been sure that they could tell the difference.

The purpose of education

Into the parallel sessions. There was an interesting talk from the module team for TU100 my digital life, the first course in the computing and technology degrees. Some of the things they do in that module’s teaching are based around the importance of language, even in science. Learning a subject can be thought of as learning to construct the world of that subject through language, or as they put it, humanities-style thinking in technology education. Unsurprisingly, many students don’t like that: “I came to learn computing, not writing.” However, there is a strong correlation between students’ language use and their performance in assessments. By the end of the module some students do come to appreciate what the module is trying to do.

This talk triggered a link back to another part of Linda Price’s keynote. An important (if now rather clichéd) question for formal education is “What is education for when everything is now available on the web?” (or one might put that more crudely as “Why should students pay thousands of pounds for one of our degrees?”). The answer that came to me during this talk was “To make them do things they don’t enjoy, because we know it will do them good.” OK, so that is a joke, but I would like to think there is a nugget of truth in there.

Peer assessment

On to more specifically Moodle-related things. A number of modules have been trying out Moodle’s Workshop activity. That is a tool for peer review or peer assessment. The talk was from the SD815 Contemporary issues in brain and behaviour module team. Their activity involved students recording a presentation (PowerPoint + audio) that critically evaluated a research article. Then they had to upload their presentations to the Moodle Workshop, and review each other’s presentations as managed by the tool. Finally, they had to take their slide-cast, the feedback they had received, and a reflective note on the process and what they had learned from it, and hand it all in to be graded by their tutor.

Now for OU students (at least) collaborative activities, particularly those tied to assessments, are typically another thing we make them do that they don’t enjoy. This activity added the complexities of PowerPoint and/or Open Office and recording audio. However, it seems to have worked remarkably well. Students appreciated all the things that are normally said about peer review: getting to see other approaches to the same task; practising the skills of evaluating others’ work and giving constructive feedback. In this case the task was one that the students (healthcare workers studying at postgraduate level) could see was relevant to their vocation, which brings us back to visibly authentic assessment, and the student satisfaction graph from the opening keynote.

For me the strongest message from this talk, however, is what was not said. There was very little said about the Moodle workshop tool, beyond a few screen-grabs to show what it looked like. It seems that this is a tool that does what you need it to do without getting in the way, which is normally what you want from educational technology.

Skipping briefly over

There are many more interesting things I could write about in detail, but to keep this post to a reasonable length I will just skim over the posters at lunch.

And some of the other talks:

  • a session on learning analytics, in this case with a neural net, to try to identify early on those students (on TU100 again) who get through all the continuous assessment tasks with a passing grade, only to fail the end of module assessment, so that they could be targeted for extra support.
  • a whole morning on the second day, where we saw nine different approaches to remote experiments from around the world. For example, the Open University's remote control telescope PIRATE. It left me with the impression that this sort of thing is much more feasible and worthwhile than I had previously thought.

Our session on online Quizzes

The only other session I will talk about in detail is the one I helped run. It was a ‘structured discussion’ about the OU’s use of iCMAs (which is what we call Moodle quizzes). I found this surprisingly nerve-wracking. I have given plenty of talks before, and you prepare them. You know what you are going to say, and you are fairly sure it is interesting. Therefore you are pretty sure what is going to happen. For this session, we just had three questions, and it was really up to the attendees how well it worked.

We did allow ourselves two five-minute presentations. We started with Frances Chetwynd showing some of the different ways quizzes are used in modules’ teaching and assessment strategies. This set up a 10-minute discussion of our first question: “How are iCMAs best used as part of an assessment strategy?”. For this, delegates were seated around four tables, with four or five participants and a facilitator at each table. The tables were covered with flip-chart paper for people to write on.

We were using a World Café format, so after 10 minutes I rang my bell, and all the delegates moved to a new table while the facilitators stayed put. Then, in new groups, they discussed the second question: "How can we engage students using iCMAs?" The facilitators were meant to make a brief bridge between what had been said in the previous group at their table, before moving on to the new question with the new group.

After 10 minutes on the second question, we had the other five-minute talk from Sally Jordan, showing some examples of what we have previously learned through scholarship into how iCMAs work in practice. (If you are interested in that, come to my talk at either MoodleMoot IE UK 2015 or iMoot 2015.) This led nicely, after one more round of musical chairs, to the third question: "Where next for iCMAs? Where next for iCMA scholarship?". Finally we wrapped up with a brief plenary to capture the key answers to that last question from each table.

By the end, I really had no idea how well it had gone, although each time I rang my bell, I felt I was interrupting really good conversations. Subsequently, I have written up the notes from each table, and heard from some of the attendees that they found it useful and interesting, so that is a relief. We had a great team of facilitators (Frances, Jon, Ingrid, Anna), which helped. I would certainly consider using the same format again. With a traditional presentation, you are always left with the worry that perhaps you got more out of preparing and delivering the presentation than any of the audience did out of listening. In this case, I am sure the audience got much more out of it than me, which is no bad thing.

Tuesday, December 9, 2014

Is learning design like UML?

A couple of weeks ago, I attended the #design4learning conference, which was conveniently on my doorstep at the Open University. Jenny Gray has already written her summary of the conference (and she thought she was a bit late writing it up!)

I would like to highlight the point the organisers made with the conference name. Calling learning design "learning design" is a misnomer. You cannot design learning. Learning is something that goes on inside the student's head, perhaps most effectively under the support and guidance of a teacher. Therefore, you can only "design for learning", whatever it is you are designing: a course, an activity, a learning community, …. I think this is more than just semantic pedantry. We should all remember this, particularly when thinking about educational technology. There is no magic bullet that guarantees learning will occur, just things that are more or less likely to encourage students to learn. (Having said this, I am going to just write "learning design" in the rest of this post, since it is so much easier!)

The main thought I wanted to share here is, however, something else. After two interesting days at a conference all about learning design, I cannot recall a single diagram shown by any speaker where I thought, "that is a graphical representation of the design of a bit of learning." Was I right to expect to see that? I don't know, but I have seen other presentations about tools like CompendiumLD in the past, so I know it can be done. Pondering this as I cycled home, I got to thinking about the type of design I do know about, the design of software, and thought of an interesting comparison.

Software developers have a well-established way to draw the design of their software, called UML (better description on Wikipedia). Let me say immediately that I am not trying to suggest UML as a way to represent learning designs. Rather, I think it is interesting to think about how developers do (or more often don't) use UML to help their work. Can that tell us anything about how and whether teachers might engage with learning design?

There are two different ways to use UML. There is the quick-and-dirty, back-of-the-envelope way, where you draw part of the system to help explain or communicate a particular aspect of your design. This is the way I use UML, as can be seen, for example, in this documentation I wrote. You include the details that are relevant to making your point, and leave out anything that does not help.

The other way to use UML is much more elaborate. It is called "Model Driven Architecture" (MDA), which I studied as part of OU module M885. In MDA, you try to draw complete diagrams of the design of your system using a very precise dialect of UML, dotting all the 'i's and crossing all the 't's. Then, using a software tool (that you probably had to buy at great expense) you press the magic button, and it creates all your classes and interfaces for you. Then you just need to fill in all the implementations. At least, that is the promise. As I say, I studied this as part of a postgraduate computing course. It was of some academic interest, but I have never seen anyone write software this way (though some people do, if the references in the course are to be believed). I expect more people have bought expensive MDA tools than have actually used them. In a previous generation, the same was true of CASE tools, which also failed to live up to their promises.

So what, if anything, can this tell us about learning design? Well, I can see exactly the same split happening. There will be hype about magic systems where you input your learning outcomes, sketch your learning design, press a magic button, and hey presto, there is your Moodle course. It won't work outside of research labs, but some vendors will try to commercialise it, and some institutions will fall for it and end up with expensive white elephants.

On the other hand, it would be good to see a common notation emerge to represent learning designs. This would help teachers communicate with each other, and perhaps with students, about how their teaching is supposed to work. A good feature of UML is that it is really very natural. Most developers can understand most of a UML diagram without having to be taught a lot of rules. There are several types of diagram to represent different things, but they are the kinds of things people drew anyway before UML was invented. The creators of UML just picked one particular way of drawing each sort of diagram, and endorsed it, in an attempt to get everyone talking (drawing) a common language. If you want to draw highly detailed UML diagrams, you need to learn a lot of rules, but you can get a long way just by copying what you see other people do, which is a sign of an effective language. It would be nice to see such a language for communicating about learning.

Thursday, February 27, 2014

Reflections on listening to conference presentations in German

I am at the MoodleMaharaMoot in Leipzig listening to people talk about Moodle.

First, the good news is that about half the words in English come from the same roots as German, so there are a fair number of words you can recognise, at least if you have time to read them from the screen. For words that seem really key, there is Google Translate. Also, the Germans seem to like using English phrases for eLearning-related things, like Learning Analytics or Multiple Choice.

However, I don’t think I was understanding even 10% of the words. What really makes a difference to intelligibility is what is on the screen. If the speaker just has PowerPoint slides with textual bullet points, that does not help. If the speaker uses the screen to show you what they are talking about - screen grabs or live demos - that is much better. Of course, this is just: show, don’t tell.

It also makes a big difference whether you already know a little bit about what is being said. I talked to some people from University of Vienna two years ago when they started building their offline quiz activity, so I already knew what it was supposed to do. I followed that presentation (which contained many screen-grabs) better than most. What they have done looks really slick, by the way.

Regarding my presentation, I feel vindicated in my plan to spend almost all of the time doing a live demonstration of the question types I was talking about. Of course, I am sure that almost everyone in the audience has better English than I have German. Also, I apologise that I talked for the whole time, and did not leave an opportunity for questions.

Finally, I have been speculating (without reaching any conclusions) about whether the experience of sitting there, failing to understand almost everything that is being said, and just picking up some scraps from the slides, gives me any empathy for people with severe disabilities who need major accessibility support to use software. As I say, these thoughts are inconclusive. What does anyone else think?

By the way, Germans applaud by rapping on the table with their knuckles. Your trivia fact for the day.

Wednesday, July 3, 2013

Assessment in Higher Education conference 2013

Last week I attended the Assessment in Higher Education conference in Birmingham. This was the least technology-focused and most education-focused conference that I have been to. It was interesting to learn about the bigger picture of assessment in universities. One reason for going was that Sally Jordan wanted my help running a 'masterclass' about producing good computer-marked assessment on the first morning. I may write more about that in a future post. Also I presented a poster about all the different online assessment systems the OU uses. Again, a possible future topic. For now I will summarise the other parts of the conference, the presentations I listened to.

One thing I was surprised to discover is how much the National Student Survey (NSS) is influencing what universities do. Clearly it is seen as something that prospective students pay attention to, and attracting students is important. However, as Margaret Price from Oxford Brookes University, the first keynote speaker, said, the kind of assessment that students like (and so rate highly in the NSS) is not necessarily the most effective educationally. That is, while student satisfaction is something worth considering, students don't have all the knowledge needed to evaluate the teaching they receive. Also, she suggested that the NSS ratings have made universities more risk-averse about trying innovative forms of assessment and teaching.

The opening keynote was about "Assessment literacy", making the case that students need to be taught a bit about how assessment works, so they can engage with it most effectively. That is, we want the students to be familiar with the mechanics of what they are being asked to do in assessment, so those mechanics don't get in the way of the learning; but more than that, we want the students to learn the most from all the tasks we set them, and assessment tasks are the ones students pay the most attention to, so we should help the students understand why they are being asked to do them. I dispute one thing that Margaret Price said. She said that at the moment, if assessment literacy is developed at all, that only happens serendipitously. However, in my time as a student, there were plenty of times when it was covered (although not by that name) in talks about study skills and exam technique.

Another interesting realisation during the conference was that, at least in that company (assessment experts), the "Assessment for learning" agenda is taken as a given. It is used as the reason that some things are done, but there is no debate that it is the right thing to do.

Something that is a hot topic at the moment is more authentic assessment. I think it is partly driven by technology improvements making it possible to capture a wider range of media, and to submit eportfolios. It is also driven by a desire for better pedagogy, and assessments that by their design make plagiarism harder. If you are being asked to apply what you have learned to something in your life (for example in a practice-based subject like nursing) it is much harder to copy from someone else.

I ended up going to all three of the talks given by OU folks. Is it really necessary to go to Birmingham to find out what is going on in the OU? Well, it was a good opportunity to do so. The first of these was about an on-going project to review the OU's assessment strategy across the board. So far a set of principles have been agreed (for example affirming the assessment for learning approach, although that is nothing new at the OU) and they are about to be disseminated more widely. There was an interesting slide (which provoked some good discussion) pointing out that you need to balance top-down policy and strategy with bottom-up implementation that allows each faculty to use assessment that is effective for their particular discipline. There was another session by people from Ulster and Liverpool Hope universities that also talked about the top-down/bottom-up balance/conflict in policy changes.

In this OU talk, someone made a comment along the lines of, "why is the OU re-thinking its assessment strategy? You are so far ahead of us already and we are still trying to catch up." I am familiar with hearing comments like that at education technology conferences. It was interesting to learn that we are also held in similarly high regard for policy. The same questioner also used the great phrase "the OU effectively has a sleeper-cell in every other university, in the associate lecturer you employ". That makes what the OU does sound far more excitingly aggressive than it really is.

In the second OU talk, Janet Haresnape described a collaborative online activity in a third level environmental science course. These are hard to get right. I say that having suffered one as a student some years ago. This one seems to have been more successful, at least in part because it was carefully structured. Also, it started with some very easy tasks (put your name next to a picture and count some things in it), and the students could see the relationship between the slightly artificial task and what would happen in real fieldwork. Janet has been surveying and interviewing students to discover their attitudes towards this activity. The most interesting finding is that weaker students comment more, and more favourably, on the collaboration than the better students. They have more to learn from their peers.

The third OU talk was Sally Jordan talking about the ongoing change in the science faculty from summative to formative continuous assessment. It is early days, but they are starting to get some data to analyse. Nothing I can easily summarise here.

The closing keynote was about oral assessment. In some practice-based subjects like law and veterinary medicine it is an authentic activity. Also, a viva is a dialogue, which allows the extent of the student's knowledge to be probed more deeply than a written exam. With an exam script, you can only mark what is there. If something the student has written is not clear, then there is no way to probe that further. That reminded me of what we do in the Moodle quiz. For example in the STACK question type, if the student has made a syntax error in the equation they typed, we ask them to fix it before we try to grade it. Similarly, in Pattern-match questions, we spell check the student's answer and let them fix any errors before we try to grade it. Also, with all our interactive questions, if the student's first answer is wrong, we give them some feedback then let them try again. If they can correct their mistake themselves, then they get some partial credit. Of course computer-marked testing is typically used to assess basic knowledge and concepts, whereas an oral exam is a good way to test higher-order knowledge and understanding, but the parallel of enabling two-way dialogue between student and assessor appealed to me.

This post is getting ridiculously long, but I have to mention two other talks. Calum Delaney from Cardiff Metropolitan University reported on some very interesting work trying to understand what academics think about as they mark essays. Some essays are easy to grade, and an experienced marker will rapidly decide on the grade. Others, particularly those that are partly right and partly wrong, take a lot longer, weighing up the conflicting evidence. Overall though, the whole marking process struck me, a relative outsider, as scarily subjective.

John Kleeman, chair of QuestionMark, UK, summarised some psychology research that shows that the best way to learn something so that you can remember it again is to test yourself on it, rather than just reading it. That is, if you want to be able to remember something, then practice remembering it. It sounds obvious when you put it that way, but the point is that there is strong evidence to back up that statement. So, clearly you should all now go and create Moodle (or QuestionMark) quizzes for your students. Also, in writing this long rambling blog post I have been practising recalling all the interesting things I learned at the conference, so I should remember them better in future. If you read this far, thank you, and I hope you got something out of it too.

Monday, July 1, 2013

Open University question types ready for Moodle 2.5

This is just a brief note to say that Colin Chambers has now updated all the OU question types to work with Moodle 2.5. Note that we are not yet running this code ourselves on our live servers, since we are on Moodle 2.4 until the autumn, but Phil Butcher has tested them all and he is very thorough.

You can download all these question types (and others) from the Moodle add-ons database.

Thanks to Dan Poltawski's Github repository plugin, that is easier than it used to be. Still, updating 10 plugins is pretty dull, so I feel like I have contributed a bit. I also reviewed most of the changes and fixed the unit tests.

I hope you enjoy our add-ons. I am wondering whether we should add the drag-and-drop question types to the standard Moodle release. What do you think? If that seems like a good idea to you, I suggest posting something enthusiastic in the Moodle quiz forum. It will be easier to justify adding these question types to standard Moodle if lots of non-OU Moodlers ask for it.

Friday, June 21, 2013

Book review: Computer Aided Assessment of Mathematics by Chris Sangwin

The book cover

Chris is the brains behind the STACK online assessment system for maths, and he has been thinking about how best to use computers in maths teaching for well over ten years. This book is the distillation of what he has learned about the subject.

While the book focusses specifically on online maths assessment, it takes a very broad view of that topic. Chris starts by asking what we are really trying to achieve when teaching and assessing maths, before considering how computers can help with that. There are broadly two areas of mathematics: solving problems and proving theorems. Computer assessment tools can cope with the former, where the student performs a calculation that the computer can check. Getting computers to teach the student to prove theorems is an outstanding research problem, which is touched on briefly at the end of the book.

So the bulk of the book is about how computers can help students master the parts of maths that are about performing calculations. As Chris says, learning and practising these routine techniques is the un-sexy part of maths education. It does not get talked about very much, but it is important for students to master these skills. Doing this requires several problems to be addressed. We want randomly generated questions, so we have to ask what it means for two maths questions to be basically the same, and equally difficult. We have to solve the problem of how students can type maths into the computer, since traditional mathematics notation is two dimensional, but it is easier to type a single line of characters. Chris precedes this with a fascinating digression into where modern maths notation came from, something I had not previously considered. It is more recent than you probably think.

Example of how STACK handles maths input
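That input problem is worth making concrete. STACK hands the student's one-line typed expression to a computer algebra system (Maxima) and compares it with the teacher's answer for algebraic equivalence, rather than string equality. Here is a minimal sketch of the same idea in Python using sympy; it illustrates the principle only, and is not STACK itself.

```python
import sympy

def equivalent(student_input: str, teacher_answer: str) -> bool:
    """True if two one-line typed expressions are algebraically equal."""
    try:
        student = sympy.sympify(student_input)   # parse e.g. "2*x + 3*x"
    except (sympy.SympifyError, SyntaxError):
        # A syntax error would be reported back to the student to fix
        # before any grading is attempted, which is what STACK does.
        return False
    teacher = sympy.sympify(teacher_answer)
    # Equivalent if the difference simplifies to zero.
    return sympy.simplify(student - teacher) == 0

print(equivalent("2*x + 3*x", "5*x"))        # True: same value, different form
print(equivalent("(x+1)**2", "x**2 + 2*x"))  # False: not equivalent
```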

If we are going to get the computer to automatically assess mathematics, we have to work out what it is we are looking for in students' work. We also need to think about the outcomes we want, namely feedback for the student to help them learn; numerical grades to get a measure of how much the student has learned; and diagnostic output for the teacher, identifying which types of mistakes the students made, which may inform subsequent teaching decisions. Having discussed all the issues, Chris then brings them together by describing STACK. This is an opportune moment for me to add the disclaimer that I worked with Chris for much of 2012 to re-write STACK as a Moodle question type. That was one of the most enjoyable projects I have ever worked on, so I am probably biased. If you are interested, you can try out a demo of STACK here.

Chris rounds off the book with a review of other computer-assisted assessment systems for maths that have notable features.

In summary, this is a fascinating book for anyone who is interested in this topic. Computers will never replace teachers. They can only automate some of the more routine things that teachers do. (They can also be more available than teachers, making feedback on their work available to students even when the teacher is not around.) To automate anything via a computer you really have to understand that thing. Hence this book about computer-assisted assessment gives a range of great insights into maths education. Highly recommended. Buy it here!

Monday, April 8, 2013

Do different media affect the effectiveness of teaching and learning?

Here is some thirty-year-old research that still seems relevant today:

Richard E. Clark, 1983, "Reconsidering Research on Learning from Media", Review of Educational Research, Vol. 53, No. 4 (Winter, 1983), pp. 445-459.

This paper reviews the seemingly endless research asking whether teaching using Media X is inherently more effective than the same instruction in Media Y. Given the age of the paper, you will not be surprised to learn that the research cited covers media like Radio for education (a hot research topic in the 1950s), Television (1960s) and early computer-assisted instruction (1970s). Clark's earliest citation, however, is "since Thorndike (1912) recommended pictures as a labor saving device in instruction." Images as novel educational technology! Well, they were once. The point is that basically the same research was done for each new medium to come along, and it was all equally inconclusive.

Here are some choice quotes that nicely summarise the article:

Based on this consistent evidence, it seems reasonable to advise strongly against future media comparison research. Five decades of research suggest that there are no learning benefits to be gained from employing different media in instruction, regardless of their obviously attractive features or advertised superiority.

Where learning benefits are at issue, therefore, it is the method, aptitude, and task variables of  instruction that should be investigated.

The best current evidence is that media are mere vehicles that deliver instruction but do not influence student achievement any more than the truck that delivers our groceries causes changes in our nutrition

Clark does not miss the fact that the effectiveness of learning is not the only problem in education:

Of course there are instructional problems other than learning that may be influenced by media (e.g., costs, distribution, the adequacy of different vehicles to carry different symbol systems, equity of access to instruction).

Since this paper is a thorough review of a lot of the available literature, it contains a number of other gems. For example:

Ksobiech (1976) told 60 undergraduates that televised and textual lessons were to be (a) evaluated, (b) entertainment, or (c) the subject of a test. The test group performed best on a subsequent test with the evaluation group scoring next best and the entertainment group demonstrating the poorest performance.

Hess and Tenezakis (1973) ... Among a number of interesting findings was an unanticipated attribution of more fairness to the computer than to the teacher.

I wonder how much later research fell into the trap outlined in this paper. I am not familiar enough with the literature, but presumably there were lots of papers about the world-wide web, VLEs, social media, mobiles and tablets for education. I wonder how novel they really were?

Today, computers and the internet have made media cheaper to produce and more readily accessible than ever before. This removes many constraints on the instructional techniques available, but what this old paper is reminding us is that when it comes to teaching, it is not the media that matters, but the instructional design.

Wednesday, June 20, 2012

Interesting workshop about self-assessment tools

About 10 days ago, I took part in a very interesting workshop about the use of assessment tools to promote learning:

Self-assessment: strategies and software to stimulate learning

The day was organised by Sally Jordan from the OU, and Tony Gardner-Medwin from UCL, and supported by the HEA, so thanks to all of them for making it happen.

People talked about different assessment tools (not all Moodle), how they were getting students to use them, and in some cases what evidence there was for whether that was effective.

Parts of the event were recorded, and you can now access the recordings at http://stadium.open.ac.uk/stadia/preview.php?whichevent=1955&s=1. There is a total of 3.5 hours of video there, so you may not want to watch it all. My presentation is in Part 3, which also includes the final discussion, all in 30 minutes, and provides a reasonable summary of the day.

Despite having spent the whole day at the event, and discussed various aspects of self-assessment, I don't think we reached a single definition of what self-assessment is. Actually, I think it is clear that it is not one thing, but that it is a useful way of looking at many different things, from the point of view of what is most useful to help students learn.

One of the tools discussed during the day was PeerWise. If you have not come across that yet, then you should take a look, because it looks like a very interesting tool. There is a good introduction on YouTube.


Tuesday, September 27, 2011

What I want to build next

Earlier this summer I finally finished the new Moodle question engine, which was released as part of Moodle 2.1. As you might expect with such a large change, a number of minor bugs were not spotted until after the release, but I (and others) have fixed quite a lot of them, and we will continue to fix more. I want to say "thank you" to everyone who has taken the time to report the problems they encountered. Pleasingly, some people, including Henning Bostelmann, Tony Levi, Pierre Pichet, Jamie Pratt, Joseph Rézeau and Jean-Michel Vedrine have not only been sending in bug reports, but also submitting bug fixes. I would like to thank them in particular. I don't know whether this means that the new Moodle development processes are working well and encouraging more contributors, or that I released the new question engine full of trivial bugs.

At the moment, apart from fixing bugs, we are about two months away from the end of the OU's one-year project to move from Moodle 1.9 to 2.x and implement a lot of new features at the same time. In the eAssessment area, we had about 30 work-packages to do, of which finishing the question engine was by far the biggest, and we have about 6 left to go. Most of the remaining tasks are at least started, but finishing them is what I, and the developers on my team, will be doing in the near future.

I have, however, been thinking ahead a bit, and I have an idea for what I would like to build, should I be given the opportunity. Honesty compels me to say these are not my ideas. I stole them from other people, and there are proper acknowledgements at the end of this post. I wanted to post about this because: 1. in my experience, if you post about your half-baked ideas, people will be able to suggest ways to make them better; and 2. I am hoping that at least one course-team at the OU will see this and say "we would love to use this in our teaching" because that might persuade the powers that be to let me build this.

Rationale

The Moodle quiz is a highly structured, teacher-controlled tool for building activities where students attempt questions. What I want to create is a more open activity where students can take charge of their learning using a bank of questions to practice some skill where the computer can mark their efforts and give feedback. For the sake of argument, I have been calling this the "Question practice" activity module.

The entry page

When a student goes into a Question practice activity, they see a front screen that lists all the categories in the question bank for this activity.

Next to each category, there are statistics for how the student has performed on that category so far. For example, it might say "recently you scored 19/21 (90%); all time you scored 66/77 (86%)". The categories are nested, and there is a subtotal for each category.
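As a rough sketch of what computing those per-category statistics might involve (none of this is real Moodle code; the attempt log and slash-separated category paths are invented for illustration), something like the following would give each category, and each ancestor category, a recent total and an all-time total:

```python
from collections import defaultdict

def category_stats(attempts, recent_n=30):
    """attempts: list of (category_path, score, max_score), oldest first.
    Returns {category: [recent_scored, recent_max, total_scored, total_max]},
    with every ancestor category accumulating a subtotal."""
    totals = defaultdict(lambda: [0, 0, 0, 0])
    recent = set(range(max(0, len(attempts) - recent_n), len(attempts)))
    for i, (path, score, out_of) in enumerate(attempts):
        parts = path.split("/")
        # Credit the category itself and every ancestor, so nested
        # categories get subtotals.
        for depth in range(1, len(parts) + 1):
            key = "/".join(parts[:depth])
            if i in recent:
                totals[key][0] += score
                totals[key][1] += out_of
            totals[key][2] += score
            totals[key][3] += out_of
    return dict(totals)

attempts = [("Arithmetic/Addition", 1, 1), ("Arithmetic/Multiplication", 0, 1),
            ("Arithmetic/Addition", 1, 1)]
for cat, (rs, rm, ts, tm) in sorted(category_stats(attempts).items()):
    print(f"{cat}: recently {rs}/{rm}, all time {ts}/{tm}")
```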

At the bottom of the page is an Attempt some questions… button. This takes the student to the …

Start a session form

… where they set up what practice they would like to do. Students can select which categories they want to attempt questions from. They may also be able to choose how many questions they want. For example "Give me 10 questions", "As many as possible in 20 minutes", or "Keep going until I say stop". The teacher will probably be able to constrain the range of options available here.
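The selection logic behind that form could be quite simple. Here is a minimal sketch, with an invented in-memory question bank standing in for Moodle's real one, of drawing a random set of questions from the chosen categories while honouring a "Give me N questions" option:

```python
import random

def choose_questions(question_bank, chosen_categories, number_wanted=None):
    """question_bank: {category: [question_id, ...]}.
    Returns a shuffled list of question ids from the chosen categories."""
    pool = [qid for cat in chosen_categories
            for qid in question_bank.get(cat, [])]
    random.shuffle(pool)
    if number_wanted is None:        # "Keep going until I say stop"
        return pool
    return pool[:number_wanted]      # "Give me 10 questions"

bank = {"Easy addition": ["q1", "q2", "q3"], "Hard addition": ["q4", "q5"]}
print(choose_questions(bank, ["Easy addition", "Hard addition"], number_wanted=3))
```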

Once they are satisfied, they click the "Start session" button. This takes them to the …

Attempt page

… which shows the student the first question, chosen according to the criteria they set. There will probably be a display of running statistics "In this session you have attempted 0 questions so far". The question will contain the controls necessary for attempting the question. There will also probably be a "Please stop, I'm bored" button, so the student can leave at any time.

When they get back to the front page, the statistics will have been updated.

If the student crashes out of a session, then when they go back in, the front page will have a "Continue current session" button.

Overall activity log

One batch of attempting questions will be called a 'practice session'. The system will keep track of all the sessions that the student has done, and what they achieved during each session.

The front page will have a link to a page that lists all of the student's sessions, showing what they achieved in each. This provides more detail than is visible on the front page.

Possible extensions

That is the key idea. Here are some further things that could be added to the basic concept.

Milestones

The system could recognise targets, goals, or achievements (I'm not sure of the best name). That would be something like "Attempt more than 10 questions from the Hard category, and score more than 90%". If the student achieves that target at any time, the system would notice, and the achievement would be recorded on the front page and in the session log in an ego-boosting way (e.g. a medal icon).
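A check like that is cheap to compute from the same attempt data as the front-page statistics. A minimal sketch, again with invented data structures rather than real Moodle code:

```python
def milestone_reached(attempts, category, min_attempts, min_percent):
    """attempts: list of (category, score, max_score) for one student."""
    relevant = [(s, m) for cat, s, m in attempts if cat == category]
    if len(relevant) <= min_attempts:
        return False
    scored = sum(s for s, _ in relevant)
    out_of = sum(m for _, m in relevant)
    return out_of > 0 and 100 * scored / out_of > min_percent

# e.g. award the medal icon when this first becomes True:
# milestone_reached(student_attempts, "Hard", 10, 90)
```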

The whole point of this activity is to be as student-driven as possible, so should students be able to define their own targets or goals? Should students be able to set goals for each other?

Locks / Conditional access

The activity could also have locks, so that the student cannot access the questions in the Multiplication category until after they have scored more than 80% in the Hard addition category. Of course, unlocking a new category could be an achievement. We appear to be flirting with the gamification buzz-word here, so I will stop.

Performance comparison

Should there be any way for students to compare their performance, or achievements, with their peers? We are definitely getting to features that should be left until version 2.0. Let's get a basic system working first, but make sure it is extensible.

How hard would this be to build

I think this would not require too much work because a lot of the necessary building blocks already exist in Moodle. The question bank already handles questions organised into categories, and we would just use that. Similarly, the attempt page and practice sessions are very easy to manage with the new question engine.

The real work is in two places. First, building the start attempt form, and then writing the code that randomly selects questions based on the options chosen. Second, deciding what statistics to compute, and then writing the code to compute them.

Of course, before we can start writing any code, there are still a lot of details of the design to decide. Also one must not forget things like backup and restore, creating the database, and all the usual Moodle plumbing.

Overall, I think it would take a few months work to get a really useful activity built.

Credit where credit is due

I said earlier that I got most of these ideas from other people. To start with, things like this have been mooted in the Moodle quiz forum over the years. The discussions there usually start from Computerised Adaptive Testing, whereas this idea is about student-driven use of questions. I think the latter is more interesting. (As a mathematician, I think CAT is an interesting concept. I just don't think it would make a useful Moodle activity.)

The real inspiration for this came at a meeting in London at the start of 2011. That meeting was at UCL with Tony Gardner-Medwin, who has already built a system something like this, but stand-alone, not in Moodle; and David Emmett from the University of Queensland, Brisbane (who was giving a seminar). David had been hoping to get a grant to build something like this proposal (in Moodle) but that did not pan out. We did, however, have a very interesting discussion, and that is where I got the key idea that this sort of question practice was most interesting if you could give the student control of their own learning as much as possible.

We have also discussed ideas like this on-and-off for a long time at the OU. There has, however, been a lot of other things we needed to deal with first. We had to do a lot of work getting the quiz system working to our satisfaction (a strand of work that eventually led to the new question engine). We had to sort out the reporting of grades, including working with Moodle HQ on the new gradebook in Moodle 1.9, and integrating Moodle with our student information system. We had to make new question types that our users wanted. Only now can we start to think seriously about the last piece of the jigsaw: more activities that use all the question infrastructure we have built. I hope this post is a useful starting point for discussing what one of those activities might be.

Wednesday, February 23, 2011

Etiquette for questions

I have been working hard at converting the new Moodle question engine to work in Moodle 2.0, aiming at a deadline this Friday (25th February). On Friday we should have a first OU version of Moodle 2.0 with all the key features, so that we can start testing, even though students won't get onto the new system before July. I have basically finished the question engine, give or take a few features that are not needed for testing, and this week I am just doing some final tidying up of the code.

Hopefully, next week I can start the process of getting it reviewed for inclusion in Moodle 2.1. As I say, there are some gaps in the functionality that will need to be filled in before it can actually be committed, but there is a lot of code to be reviewed (lucky Eloy!) and so I hope we can kick off the process.

So, my excuse for not blogging about the new question engine recently is that I have been too busy working on it to write about it. In the last few days, however, I encountered a couple of nice ideas that would be easy to implement using the flexibility the new question engine gives, and I want to describe them. First, I need to remind you of one key point about the new system:

Question behaviours

As I explained before, a key idea in the new question engine is that of question behaviours. Whereas a question type lets you have a multiple-choice, a drag-and-drop, or a short-answer question, a behaviour controls how the student interacts with the questions, of whatever type. For example, the student may have to type in answers to each question in the quiz, then submit everything, and only then are the questions marked. This is known as the "Deferred feedback" behaviour. Alternatively, the student may answer one question and have their answer marked immediately. If they are wrong, they get a hint and can then immediately have another go. If they get it right on the second or third try, they get fewer marks. This is called the "Interactive with multiple tries" behaviour.
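To make the difference concrete, here is a toy sketch of the marking side of those two behaviours. It is not the real Moodle question engine API; the function names and the penalty value are illustrative only.

```python
def deferred_feedback_mark(is_correct: bool, max_mark: float) -> float:
    # Everything is submitted at the end, then graded once.
    return max_mark if is_correct else 0.0

def interactive_mark(tries_used: int, got_it_right: bool,
                     max_mark: float, penalty: float = 0.25) -> float:
    # Each wrong try before the right answer costs `penalty` of the marks.
    if not got_it_right:
        return 0.0
    return max(0.0, max_mark * (1 - penalty * (tries_used - 1)))

print(interactive_mark(1, True, 4.0))  # right first time: 4.0
print(interactive_mark(2, True, 4.0))  # right on the second try: 3.0
print(interactive_mark(3, True, 4.0))  # right on the third try: 2.0
```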

When I was first working on this, I did wonder whether it was perhaps overkill to make behaviours fully-fledged Moodle plugins. It seemed to me that I had already implemented all the types of behaviour anyone was likely to want. It turns out I was wrong. Here are three ideas for new behaviours that I have come across since I had that naive thought.

Explain your thinking behaviour

The concept here is that, in addition to presenting the question to the student for them to answer in the usual way, you also give them a text area with the prompt "Explain your answer". When they submit, the question is graded as usual. Moodle does not do anything with the explanation, other than to store it and re-display it later when the student or their teacher reviews their attempt. The point is that the student should reflect upon and articulate their thought processes, and the teacher can then see what they wrote, which might be useful for diagnosing what problems the students are having.

I'm not sure that this would really work. Would the students really bother to write thoughtful comments if there were no marks to be had? However, this would be relatively easy to implement, so we should build it and see what happens in practice. The teacher could always manually adjust the marks based on the quality of the reflection, if that was necessary to incentivise students.

I'm afraid I cannot remember who suggested this idea. It was a post in the Moodle quiz forum some time ago, just after I had implemented the behaviour concept and was thinking that my initial list of behaviours was all anyone could possibly want.

gnikram desab-ytniatreC

This idea I only came across yesterday evening, in a blog post from people in the OU's technology faculty. It is a slightly strange twist on certainty-based marking.

With classic CBM, the student answers the question, and also says how certain they are that they got it right (for example, on a three-point scale). The student will only get full marks if they get the question right and are very certain that they were right. If, however, they express high certainty and are wrong, they are penalised heavily with a big negative mark. To maximise their score, the student must accurately gauge their level of knowledge. This hopefully promotes reflection, and self-awareness by the student of their level of knowledge.
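For concreteness, one widely used mark scheme for a three-point certainty scale is the one from Tony Gardner-Medwin's LAPT system (I believe the Moodle CBM behaviours use the same numbers, but treat the exact values here as illustrative):

```python
CBM_MARKS = {
    # certainty: (mark if correct, mark if wrong)
    1: (1, 0),    # "unsure"
    2: (2, -2),   # "fairly sure"
    3: (3, -6),   # "very sure"
}

def cbm_mark(correct: bool, certainty: int) -> int:
    # High certainty earns more when right, but is penalised heavily when
    # wrong, so honest self-assessment maximises the expected score.
    right_mark, wrong_mark = CBM_MARKS[certainty]
    return right_mark if correct else wrong_mark

print(cbm_mark(True, 3))   # confident and right: 3
print(cbm_mark(False, 3))  # confident and wrong: -6
```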

The idea from the OU technology faculty is to do this backwards, for multiple-choice questions. Rather than getting the student to answer the question and then select a certainty, you first show them just the question stem without the choices, and get them to express a certainty. Only then do you show them the choices and let them choose what they think is the right answer.

Again, I am not sure if this would work, but it is sufficiently easy to do, by creating a new behaviour plug-in (and making a small change to the multiple-choice question type so that you can output just the question, without the choices), that it has to be worth a try.

Free text responses with a chance to act on feedback

This last idea I only heard about this morning. There was a session of the OU's "eLearning community" all about eAssessment, which naturally I attended. This is a monthly gathering with a number of different presentations on some eLearning topic. The first three talks were about specific courses that have recently adopted eAssessment, how students had engaged with that, what the effect had been on retention and pass rates, and so on. That was interesting, but not what I want to talk about here. The final talk was by Denise Whitelock from the OU's Institute of Educational Technology, who has just completed a review of recent research into technology-enhanced assessment for the HEA that should be published soon. Here, I just want to pick up on one specific idea from her talk.

I'm afraid that again, I don't recall who deserves credit for this idea. (Once Denise's review is published, I will have a proper reference, but I did not take notes this morning.) It was another UK university that had done this, in the context of language teaching. The student had to type a sentence in answer to the question, then the computer graded that attempt and gave some feedback. Then, the student was immediately allowed to revise their sentence in light of the feedback, and get it re-marked. The final grade for the question is then a weighted sum of the first mark and the second mark. You need to get the weights right. The weight for the first try has to be big enough that the student tries hard to get the question right on their own before seeing the hints, and the weight for the second try, though smaller, also has to be big enough that the student bothers to revise their response.
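The arithmetic of that final grade is trivial; the design question is choosing the weights. A minimal sketch, with purely illustrative weights:

```python
def two_try_mark(first_mark: float, revised_mark: float,
                 first_weight: float = 0.75, second_weight: float = 0.25) -> float:
    # Weight the unaided first response most heavily, but leave enough
    # weight on the revision to make acting on the feedback worthwhile.
    return first_weight * first_mark + second_weight * revised_mark

# e.g. 0.5 on the unaided first try, 1.0 after acting on the feedback:
print(two_try_mark(0.5, 1.0))  # 0.625
```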

Now, the OU is currently creating a Moodle question type that can automatically grade sentence-length answers using an algorithm whose first version my colleague Phil Butcher implemented in 1978! (When I say we are creating this, what I actually mean is that we have contracted Jamie Pratt, a freelance Moodle developer, to implement it to our specification.) Anyway, once you have that, the idea of allowing two tries, with feedback after the first try, and a final grade that is a weighted sum of the marks for the two tries, is just another behaviour.

So, my initial thought that people would not have many ideas for interesting new behaviours seems to have been wrong. The flexibility I built into the system is worth having.

Wednesday, February 9, 2011

Should you listen to futurologists?

Educause just published their annual survey describing "six areas of emerging technology that will have significant impact on higher education and creative expression over the next one to five years".

This got circulated round the team at work and I rather cynically asked "so, what did they predict last year then?" My colleague Pete Mitton took that question and ran with it to produce the following analysis:
OK, as I have a full set of Horizon Reports on my hard disk, here's a summary of their predictions for the years 2004-11.

I've pushed some titles together where the wording is different but the intent is the same (for example they've used mobile computing/mobiles/mobile phones in the past with the same meaning).

The numbers in the table are the time-to-adoption horizon in years.
2004 2005 2006 2007 2008 2009 2010 2011
User-created content 1 1 1
Social Networking 4-5 1 1
Mobiles 2-3 2-3 1 1 1
Virtual Worlds 2-3
New Scholarship and Emerging forms of publication 4-5
Massively Multiplayer Educational Gaming 4-5
Collaboration Webs 1
Mobile Broadband 2-3
Data Mashups 2-3
Collective Intelligence 4-5
Social Operating Systems 4-5
Cloud Computing 1
The Personal Web 2-3
Semantic-Aware Applications 4-5
Smart Objects 4-5
Open Content 1
Electronic Books 2-3 1
Simple Augmented Reality 4-5 4-5 2-3 2-3
Gesture-based computing 4-5 4-5
Visual Data Analysis 4-5
Game-based learning 2-3 2-3 2-3
Learning analytics 4-5
Learning Objects 1
Scaleable Vector Graphics 1
Rapid Prototyping 2-3
Multimodal Interfaces 2-3
Context Aware Computing aka Geostuff 4-5 4-5 2-3
Knowledge Webs 4-5
Extended Learning 1
Ubiquitous Wireless 1
Intelligent Searching 2-3

Of course, the purpose of a report like this is not to accurately predict the future. The aim is rather to stimulate informed debate about the technologies that are coming up. Within our team, at least, they seem to have succeeded.

I thought, however, that this analysis was interesting enough to share. It provides some context for this year's predictions. More generally, it shows how difficult it is to predict future technology trends.