The problems with Wikipedia

February 18th, 2007

Wikipedia is one of the most incredible developments that I’ve ever seen. An encyclopedia that anyone can edit? Sounds like a fruitbat notion doomed to failure. But it’s worked.


There are a few fault points developing. And these may show the beginning of the end for Wikipedia as we know it. A couple of examples:

Exhibit one – the ongoing webcomics war on Wikipedia.

There has been a strange, ongoing series of fights over the question of notability in Wikipedia. The story in a nutshell (see Websnark for a few more examples) is that some of the Wikipedia editors are systematically deleting the articles for many web comics. Now web comics are probably not a hugely broad phenomenon, but they’re certainly pretty popular with a large number of people (even if the proportion is small).

The argument for deleting is ‘notability’. The guidelines state that there need to be multiple, non-trivial mentions in newspapers, magazines and so forth. Given that there is little coverage of web comics outside of the community (which, of course, doesn’t count for this) then it’s very hard for even a relatively famous web comic to qualify.

(I could, by the way, go on about the strange anti-democratic nature of requiring some media gatekeeper to decide what’s notable, but that’s not really the point.)

The real problem I have with this is that the rules for websites don’t really match up with the rules for fiction.

There is, for instance, a fairly long article on the Kobayashi Maru, a fairly minor plot point from a Star Trek film. There are almost five hundred articles on different Pokemon species. There are many pages of Harry Potter characters, including Sir Patrick Delaney-Podmore.

Personally I don’t object to any of these being included in the Wiki. They are, after all, of interest to some people. But what I find deeply perplexing is the low standard for inclusion set here compared to elsewhere. So long as an article is well written and encyclopedic, does it really matter how many people it’s of interest to? Just hit ‘Random Page’ a dozen times in a row to see what kinds of things do make the cut.

So, problem one: where do you draw the line about what to include, especially when you need lines that make sense for both apples and oranges?

Exhibit two – Microsoft hiring someone to edit articles in their favour.

A brief blog-storm came up recently over Rick Jelliffe’s revelation that Microsoft had offered to pay him for a few days’ work to correct entries on their OOXML file format. Microsoft was more or less asking him to go and improve the articles, without exercising any editorial control over what he wrote. And, by all accounts, the articles were in a pretty bad state of anti-Microsoft hysteria (although Rick does not appear to be hugely impartial in the ODF versus OOXML fight).

There are a bunch of questions about the ethics or otherwise of this. But the bigger issue is that it’s almost certain that a lot of companies are already out there doing exactly this sort of thing, with far less disclosure than we see here. It is pretty trivial, after all, to use an anonymising web surfing service to disguise your point of origin. And with the number of fanboys (and -girls) out there for pretty much anything these days (I’m looking at you, Nintendo and Sony!) it would be very hard to tell for certain that a person was an actual company shill.

This is a very important issue for trust in the Wiki. A very wise person once described Wikipedia as being:

a kind of quantum encyclopedia, where genuine data both exists and doesn’t exist depending on the precise moment I rely upon your discordant (…) mob for my information.

If you’ve got agents in the system actively trying to mislead, this problem can become a lot worse.

The objective of ‘Neutral Point of View’ that Wikipedia espouses can help, but it’s very hard to achieve in the real world. Sometimes even attempting to create balance can itself violate NPOV – for instance, should pure creationist arguments be given equal time with evolutionary arguments? And if not, exactly how much less should they get? ‘NPOV’ is a lovely ideal, but it’s not really practical.

Exhibit three – Wikipedia hates experts.

This is a simple case – there’s an article on a fairly abstruse bit of theoretical philosophy. A genuine expert in the subject (tenured, wrote one of the books the article refers to) comes along and makes some suggestions, but gets (effectively) jeered out of the room.

Under the Wikipedia model all input has equal weight. So if I go and edit articles on economics (with a background of a degree and seven years’ work experience) I get the same weight as someone who can barely remember high school economics. I was actually a bit involved in Wikipedia for a while, but gave up because getting quality edits to stick was just too hard. The normal answer to this is that the expert has an obligation to convince people, which is fine – no one should take my word for it just because I say so. But on the other hand, these are complex areas, which can take years of study to understand fully. If I wanted to teach my subject to people I’d be at university, and I’m not prepared to conduct a tutorial online just to improve an article on Wikipedia – I just don’t care enough.

This problem is particularly acute in areas which appear to be accessible to lay persons – economics is one case in point, philosophy is another. These are both complex subjects with a lot of theoretical background to them, but they are also both subjects where a lot of people think they have all the answers.

The requirement for citations helps a lot here, and things are a lot better than they were 12 months ago. But in cases where there are two conflicting citations, how do you decide which version is right? You need experts to make Wikipedia work, and Wikipedia makes it very hard to be an expert without getting very cranky.

So what does all this mean?

The three examples here are all about the ‘meta’ layer of Wikipedia, the governance mechanisms that keep the whole thing ticking over. I think they are all signs that Wikipedia is not scaling well. When you have an encyclopedia of a few thousand articles it’s easy for everyone to build a common consensus across the whole system about the rules that apply. But when there are (as of the most recent count) over 1.6 million articles it’s no longer possible for a single group of people to look over them all. Instead you end up with a collection of overlapping communities within different sections of the Wiki. But different communities end up with different rules and values, even if they operate within a common set of formal, written rules.

The breadth of Wikipedia makes abuse easier and policing it far more difficult.

Can Wikipedia survive? My guess is that it can’t, not in its current form. It will need more formal governance mechanisms, a greater distinction between the ‘live’ sections of the site and the ones the public see by default, and more accountability for those people who work in the governance structure.

I hope it does survive – I don’t think there’s a site (other than Google) that I use more frequently.