Wednesday, June 9, 2010

Computational Knowledge

I spent today at the London Computational Knowledge Summit. A rather pretentious name for a meeting organised by Wolfram about Wolfram|Alpha, Mathematica, and what these sorts of tools can do. The way they see it is that if Google is a means to retrieve facts, Wolfram|Alpha aspires to be a way to use information. For example, governments have started dumping a lot of data into the public domain, but how easy is it for a citizen to extract meaningful information from that data?

For someone used to Moodle and open source, I was deeply struck by how Cathedral (as opposed to Bazaar) the whole Wolfram/Mathematica world is. That may be, in part, a natural consequence of when they started. Wolfram has been working on Mathematica for twenty years. That is an impressive achievement, and it means they have been working on it since before Linux was a gleam in Linus Torvalds's eye. They started at a time when the commercial model was how most software was written, and they are clearly still an American commercial software development business at heart.

It is interesting to ask what would have happened if Mathematica had been GPL. Would it have developed more or less than it has as a commercial project? Wolfram would not have been able to pour as much money into it, but would a community have done as much, or more? I am not sure. Clearly the commercial model has been highly successful for driving the development of Mathematica. It is a very cool, and very sophisticated tool. On the other hand, one can dream about a world where the Mathematica engine is free, and as a result shipped on every OLPC XO. What effect would that have on the world over the next twenty years? (Quiz: can you name Mathematica's closest open source rival? I can, but only because it is used in STACK. Speaking of which, it was nice to renew my acquaintance with Chris Sangwin today.)

Another thing that struck me about the Wolfram world is how insular it is. Some of the speakers we heard clearly spend all their time thinking about Mathematica. They have a really amazing tool, and, having what they hope is the ultimate sledge-hammer, they seem to view most problems as nuts to be cracked. It made me worry about whether the Moodle community seems that insular from the outside.

However, the day was not just about Maths and Mathematica, but more about the democratisation of knowledge. How can a concerned citizen reach meaningful conclusions about the world using the data that is now available? Conrad Wolfram set the scene by talking about this mission. He started with two slides, one showing a page from Principia Mathematica, and then a page from a modern mathematical paper. They both comprised text, diagrams and equations. Certainly, modern type-setting is better, and the diagrams are now in colour, but that is not much to show for 350 years' progress. Is this the best way to publish scientific knowledge in the 21st Century? No, it isn't. For instance, it is silly to show a static graph. It should be a graph that the reader can manipulate, and the raw data should be hidden behind it in a form that you can extract to perform your own analyses on if you choose. Is the paper about a model? If so the model should be embedded there, and you should be able to experiment with varying the inputs and the assumptions to see what happens. If you are interested, you should be able to get at the source code to see exactly what was implemented. So, rather than a document being a static, one-way, low-bandwidth form of information transmission, it should be an interactive thing that allows the "reader" to engage in a two-way dialogue.

Of course, Mathematica is the perfect tool for authoring such documents, or so Wolfram would like you to think. To encourage this, they are talking about a CDF - computational document format - which would be an 'open' standard based on a subset of Mathematica's capabilities, with a 'free' player for all common platforms. Think PDF and Adobe. Will it take off? Well, does the rest of the world trust Wolfram enough to adopt their format? Can they offer enough more than HTML5? I am not sure.

The next talk was by Andrew Dilnot from Oxford University. He talked very engagingly about the difficulty with statistics and gave some nice topical examples of the need to interpret numbers in context. Presented with a numerical fact, you need to ask some basic questions: is this big or small? going up or down? where did it come from? For example, the smoking ban has saved the NHS £8 million. Well, that is small. The NHS annual budget is £100 billion. Why are the media making such a big deal of the story? His talk could be summarised as: All numbers are wrong, but using numbers is still much better than not using numbers.
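To make the "is this big or small?" question concrete, here is a rough Python sketch of the arithmetic behind the smoking-ban example. The two figures are the ones quoted above; the function name share_of_total is my own invention for illustration, not anything shown at the summit.

```python
def share_of_total(part: float, total: float) -> float:
    """Return `part` expressed as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Figures as quoted in the talk (both in pounds).
smoking_ban_saving = 8_000_000        # £8 million
nhs_annual_budget = 100_000_000_000   # £100 billion

pct = share_of_total(smoking_ban_saving, nhs_annual_budget)
print(f"The saving is about {pct:.3f}% of the annual budget")
```

In other words, the saving is under a hundredth of one percent of the budget, which is the sort of context the headline number was missing.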

There were a couple of talks that were more pretty than interesting. John Barrow from Cambridge talked about the history of scientific images, and Alan Joyce showed some of the cool things you can do with Mathematica. There were also some glimpses behind the scenes of Wolfram|Alpha. As I say, very Cathedral. There is no web spidering or crowd-sourcing here. At one end it is about Wolfram people going out and finding the most authoritative data sources they can get, and loading them into a format their Mathematica code can use. At the other end they are training the Alpha system to take natural language queries, decide which methods to apply to what data, and select visualisations to display the results, in order to automatically give the best possible answer to that question. They gave an interesting parallel for this. In the past, all actors had to appear on stage to whatever live audience could fit into the theatre. Today, film and television allow most people to see the best actors in the world. Can the expertise of the best thinkers in different fields be made accessible in a similar way? Well, that is their goal. The initial results are impressive, but these are only the very early stages of a highly ambitious project and there is a lot more to do.

In the afternoon, Conrad Wolfram talked about his thoughts on maths education. This has traditionally been focussed on teaching students how to perform computations by hand. But, said Conrad, this is only one part of what you need to know to be a mathematician. The full steps are:
  1. Ask an interesting question about the world.
  2. Translate that real world question into mathematics.
  3. Compute the solution.
  4. Translate that solution back into real-world terms, and find some way to validate whether your answer makes sense.
(OU Maths students, note the similarities to the Modelling cycle as taught in OU courses.) Historically, maths education has focussed on Step 3. However, these days computers are just better than humans at that. At the same time, there has never been a greater demand for people who are good at the other three steps (in business, to guide public policy, and so on). So we need to rethink the maths curriculum. To put it another way, is it cheating if you use Wolfram|Alpha to do your maths homework? Well, if it is, then we are probably assessing the wrong things.

At the end, Stephen Wolfram gave a presentation by video link summarising some of the themes of the day.

Overall, it was a very interesting day. While I don't think it quite lived up to its pretentious title, it was certainly more than just a Wolfram marketing exercise. I would like to thank the Open University for sending me. I can see myself using Wolfram|Alpha in future for certain sorts of queries, but it will not replace Google as my day-to-day search engine. (I don't think it aspires to.)