Friday, April 25, 2014

Load-testing Moodle 2.6.2 at the OU

At the start of June we will upgrade the OU Moodle sites to Moodle 2.6. Before then we need to know that it will still perform well when subjected to our typical peak load of 100,000 page-views per hour. This time, I got 'volunteered' to do the testing.

The testing servers

To do the testing, we have a set of 10 servers that are roughly similar to our live servers. That is six web servers for handling normal requests, one web server that handles 'administrative' requests. That is, any URL starting /admin, /report or /backup. Those pages are often big, long-running processes, rather than quick page views, so it is better to put them on a different server that is tuned differently. There is one 'web' server is just for running the cron batch processes. Finally we have a database server and a file server.

In order to be able to make easy comparisons, we make two copies of our live site onto these servers. That is, we have two different www_root folders, which correspond to different URLs lrn2-perf-cur and lrn2-perf-upg. In due course we will upgrade one of the copies to the new release while leaving the other open running the current version of the code. This make it easy to switch back and forth when comparing the two.

In addition to the servers running Moodle, we have 6 virtual machines to generate the simulated load.

The testing procedure

We test using JMeter. In order to test Moodle, you need to send lots of requests for different pages, many of which include numeric ids in the URLs. Therefore, the JMeter script needs to be written specifically for the site being tested. Fortunately, our former colleague James Brisland made a script that automatically generates the necessary JMeter script. We shared that script with the community, and you can find a copy here. However, we shared it a long time ago, and since then our version has probably diverged from the community version a bit. Oops!

I say this tool automatically generates the necessary JMeter script, but sadly that is an oversimplification. It fails in certain cases like if a forum is set to separate groups mode. So, having generated the JMeter script, you need to run it and check that it actually works. If not, you have to go into the courses and activities being tested and modify the settings. We really ought to automate that, but no one has had the time. Anyway, eventually (and this took ages) you have a working test script.

Tuning the test script

Once the test script works, in that it simulates users performing various actions without error, one at a time, then you have to start running it at high load. That is, simulating lots of users doing lots of things simultaneously. After it has settled down, you let it run for 15 or 20 minutes, and then look at what sort of load you are generating. The goal is to get about the same number of requests per second for each type of page (course view, forum view, post to forum, view resource, ...) in the test run as in real use on the live system. If not, you tweak the time delays, or number of threads, and then run again. It took about four runs to get to a simulated load that was close (actually slightly higher) than the target request rates we had taken from the live server logs.

All that creation and tuning of the tests scripts is done on the lrn2-perf-cur copy of the site. Once that is OK, then you run the same script against lrn2-perf-upg. That should give exactly the same performance, and before proceeding we want to verify that is the case. It turned out at first that it was slightly different. I had to find the few admin settings that were different between the two servers. Once the configuration was the same, the performance was the same, and we were finally in a position to start comparing the old and new systems.

Upgrade to the new version of Moodle

The next step is to upgrade lrn2-perf-upg to the new code. This code is still work-in-progress. Final testing of the code before release happens next month, but we try to keep our code in a releasable state, so it should be OK for load-testing. However, this is the first time we have run the upgrade on a copy of all our data. Unsurprisingly, we found some bugs. Fortunately they were easily fixed, and better to find them now than later.

Also, a new version of Moodle comes with a lot of new configuration options. This is the moment to consider what we should set them to. Luckily, most of the default values were right, so there was not a lot to do. Moodle prompts you for most of the new settings you need to make as part of the upgrade. However, it does not prompt you to configure any new caches, so you have to remember to go and do that.

Compare performance

At long last (about one and a half weeks into the process) you are finally ready to run the test you want. How does 2.6 performance compare to 2.5? Here is a screen-grab of today's testing:

Good news: Moodle 2.6 is mostly a bit faster (5-10%) than Moodle 2.5. Bad news: every 15 minutes, it suddenly goes slow for about 15 seconds. What?!

Problem solving

Actually, there is a logical explanation. We have cron set to run every 15 minutes, so surely the problem is caused by cron, right? No. Wrong! We stopped cron running, and the spikes remained. We tried various things to see what it might be, and could not make any sense of it. One thing we discovered was that the spikes were about as large as the spikes you get by clicking the 'Purge all caches' button. OK, so something is purging caches, but what?

To cut a long story short, you need to remember that our two test sites lrn2-perf-cur and lrn2-perf-upg are sharing the same servers. Therefore they are sharing the same memcache storage. It appears that something in cron in Moodle 2.5 purges at least some of the caches. When we stopped cron on our Moodle 2.5 site the spikes went away on our 2.6 site. I am afraid we did not try to work out why Moodle 2.5 cron was purging caches, but there is probably a bug there. It turns out that purge caches does not cause a measureable slow-down in Moodle 2.5, at least not for us, which is worth knowing.

Why does Purge caches cause a slow-down in 2.6 but not in 2.5? I am pretty sure the reason is MDL-41436. When things slowed down, it was the course page that slowed down the most, and that is the one most dependent on the modinfo cache.

Summary

  • Moodle 2.6 is about 5-10% faster than 2.5, at least on our servers, which are RHEL5 + Postgres + memcache cluster store. (MDL-42071 - why has that not been integrated yet?)
  • In Moodle 2.5, doing Purge caches when your system is running at high load seems to cause remarkably little slow-down.
  • In Moodle 2.6, doing Purge caches does slow things down a lot, but only very briefly. Performance recovered within about 15 seconds in our test, but then the test was only using a few courses.
  • In Moodle 2.6, clicking Clear theme caches (at the top of Admin -> Appearance -> Themes -> Theme selector) causes no noticeable slow-down.

The bit about what happens when you clear the caches is important because sometimes, when you patch the system with a bug fix, you need to purge one or more caches to make the fix take effect. In the past, we did not know what effect that had. We were cautious and had people waiting up until after midnight to click the button at a time of low system load. It turns out now that is probably not necessary. We can clear caches during the working day, when staff are in the office to pick up the pieces if anything does go wrong.

Wednesday, April 23, 2014

The four types of thing a Moodle developer needs to know

In order to write code for Moodle, there is an awful lot you need to know. Quite how much was driven home for me when I taught a Moodle developers' workshop at the German MoodleMaharaMoot in February. When preparing for that workshop, I though you could group that knowledge into three categories, but I have since added a fourth to my thinking.

1. The different types of Moodle plugin

The normal way you add functionality to Moodle is to create a plug-in. There are many different types of plug-in, depending on what you want to add (for example, a report, an activity module or a question type). Therefore, the first thing to learn is what the different types of plug-in are and when you should use them. Then, once you know which type of plug-in to create, you need to know how to make that sort of plug in. For example, what exactly do you need to do to create a new question type?

2. How to make Moodle code do things

Irrespective of what sort of plug-in you are creating, you also need to know how to make your code do certain things. Moodle is written in PHP, so generic PHP skills are a prerequisite, but Moodle has its own libraries for many common tasks, like getting input from the user, loading and saving data from the database, displaying output, and so on. A developer need to know many of these APIs.

3. How to get things done in the Moodle community

If you just want to write Moodle code for your own use, then the two types of know-how above are enough, but if you want to take full advantage of Moodle's open source nature, then you need to learn how to interact with the rest of the Moodle development community. For example how to ask for help, how to report a bug, how to submit the changes for a bug you have fixed or a feature you have implemented, get another developer to review your proposed code change, how to update your customised Moodle site using git, and so on.

4. Something about education

Those three points were what I thought of when trying to work out what I needed to teach during the developer workshop I ran. Since then, while listening to one of the presentations at the UK MoodleMoot as it happens, I realised that there was a fourth category of knowledge required to be a good Moodle developer. It matters that we are making software to help people teach and learn. I am struggling to think of specific concepts here, with URLs for where you can learn about them, as I gave in the previous sections, but there is a whole body of knowledge about what makes for effective learning and teaching and it is useful to have some appreciation of that. You also need some awareness of how educational institutions operate. If you hang around the Moodle community for any length of time you will also discover the educational culture is different in different countries. For example in the southern hemisphere the long summer holiday is also the Christmas holiday, and in America, they expect to award grades of more than 100%.

Summary

Does this subdivision into categories actually help you learn to be a Moodle developer? I am not sure, but it was certainly useful when planning my workshop. The workshop was structured around creating three plugins on the first day, a Filter, a Block and then a Local plug-in. However, those exercises were structured so that while moving through different types of category-one knowledge, we also covered key topics from categories two and three in a sensible order. So it helped me, and I thought it was an interesting enough thought to share.