Think you know Wikipedia? You might… or you might just think you do

Up until about two weeks ago, I was a Wikipedia snob. I thought that I knew what it was and how it worked. I had looked at the site, browsed through a few entries, and edited a couple of test pages anonymously to see how easy it was to screw with the entries. I had read a few articles and blog posts (including in ACRLog) that were skeptical about the site. I would say things like, “Sure, Wikipedia has its place. Just leave it at home.” In my opinion, Wikipedia was a project of the unwashed masses who had no idea what real information was.

I thought I could sum up the complex creature called Wikipedia in a few dismissive phrases, but I was wrong. I think differently now.

After sitting in on a workshop with an inspiring colleague — Glenda Phipps from the Miami Dade College Libraries — I find myself actually excited about Wikipedia. Better late than never, thank you Glenda. As she worked through her informative talk about the site, I surfed. I hit the “Random Article” link over and over again just to see what would come up. And after a while, dense as I am, it began to dawn on me: this thing is incredible. The energy and care and passion that have gone, and continue to go, into creating this open, free, public encyclopedia… wow. I mean, where else can you find so many people who are so passionate about knowledge? (A library, perhaps?)

True, it is not an authoritative resource. There will always be a debate about its reliability, and it is my prediction that no one will ever solve that problem with Wikipedia. So don’t think of it that way. Think of it as an ever-evolving massive collection of popular knowledge. And give it a chance.

It might help if I mention here a few things I have recently learned about Wikipedia that helped to change my opinion:

1. Anyone who creates an account can also create a “watch list” of entries they have created or otherwise feel some ownership of. If somebody makes a change to one of those entries, they’ll get an alert.

2. Those of us who have tested the system by purposefully adding misinformation (as Alexander M.C. Halavais and I have) found that our planted errors were corrected quickly.

3. It’s fun! Go ahead, try it. Search for an entry on something you care about. If it already exists, add your knowledge. If it doesn’t, create it. Then see what you think about Wikipedia.

Computing Wikipedia’s Authority

Michael Jensen has predicted:

In the Web 3.0 world, we will also start seeing heavily computed reputation-and-authority metrics, based on many of the kinds of elements now used, as well as on elements that can be computed only in an information-rich, user-engaged environment.

By this he means that computer programs and data mining algorithms will be applied to information to help us decide what to trust and what not to trust, much as prestige of publisher or reputation of journal performed this function in the old (wipe away tear) information world.

It’s happening. Two recent projects apply computed authority to Wikipedia. One, the University of California Santa Cruz Wiki Lab, attempts to compute and then color-code the trustworthiness of a Wikipedia author’s contributions based on the contributor’s previous editing history. Interesting idea, but it needs some work. As it stands, the software doesn’t really measure trustworthiness, and the danger is that people will trust the software to measure something that it does not. Also, all that orange is confusing.
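To make the idea concrete, here is a minimal sketch of the intuition behind edit-history-based reputation: score an author by how many of their past edits survived later revisions. This is my own illustrative simplification, not the Wiki Lab’s actual algorithm, and all names and data are invented.

```python
def edit_survival_reputation(edit_history):
    """Compute a naive reputation score per author.

    edit_history: list of (author, survived) pairs, where `survived`
    is True if the edit was still present in later revisions.
    Returns a dict mapping author -> fraction of their edits that survived.
    """
    totals = {}
    for author, survived in edit_history:
        kept, total = totals.get(author, (0, 0))
        totals[author] = (kept + (1 if survived else 0), total + 1)
    # Reputation = share of an author's edits that later editors kept.
    return {a: kept / total for a, (kept, total) in totals.items()}

# Invented example: alice's edits mostly survive, bob's mostly get reverted.
history = [
    ("alice", True), ("alice", True), ("alice", False),
    ("bob", False), ("bob", False), ("bob", True),
]
rep = edit_survival_reputation(history)
```

A system like this could then color-code each word in an article by the reputation of the author who last touched it, which is roughly what the orange highlighting tries to convey.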

More interestingly, another project, Wikipedia Scanner, uses data mining to uncover the organizations behind the IP addresses of anonymous Wikipedia contributors. As described in Wired, Wikipedia Scanner:

offers users a searchable database that ties millions of anonymous Wikipedia edits to organizations where those edits apparently originated, by cross-referencing the edits with data on who owns the associated block of internet IP addresses. …

The result: A database of 34.4 million edits, performed by 2.6 million organizations or individuals ranging from the CIA to Microsoft to Congressional offices, now linked to the edits they or someone at their organization’s net address has made.

The database uncovers, for example, that the anonymous Wikipedia user who deleted 15 paragraphs critical of electronic voting machines was editing from an IP address at the voting machine company Diebold.
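The cross-referencing Wired describes can be sketched in a few lines: join anonymous edits (which Wikipedia logs by IP address) against a table of who owns each IP block. The blocks, organizations, and edits below are invented for illustration, using documentation-reserved IP ranges rather than any real company’s addresses.

```python
import ipaddress

# Invented ownership table: (IP block, registered organization).
ip_blocks = [
    (ipaddress.ip_network("192.0.2.0/24"), "Example Voting Machines Inc."),
    (ipaddress.ip_network("198.51.100.0/24"), "Example University"),
]

# Invented anonymous edits, each logged with the editor's IP address.
anonymous_edits = [
    {"article": "Electronic voting", "ip": "192.0.2.44"},
    {"article": "Daniel Defoe", "ip": "198.51.100.7"},
]

def attribute_edits(edits, blocks):
    """Tag each edit with the organization owning its source IP block."""
    attributed = []
    for edit in edits:
        addr = ipaddress.ip_address(edit["ip"])
        # Find the first block containing this address, if any.
        org = next((name for net, name in blocks if addr in net), "unknown")
        attributed.append({**edit, "organization": org})
    return attributed

results = attribute_edits(anonymous_edits, ip_blocks)
```

Done at scale, with real WHOIS ownership data and millions of logged edits, this simple join is essentially what makes the scanner’s revelations possible.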

Both of these projects go beyond the “popularity as authority” model that comes from Web 2.0: they reach back to an older notion of authority that tries to gauge “who is the author” and fuse it with the new techniques of data mining and computer programming. (Perhaps librarians who wake up every morning wondering whether they are still relevant need to get a degree in computer science.)

If you prefer the oh-so-old-fashioned-critical-thinking-by-a-human approach, Paul Duguid has shown nicely that one of the unquestioned assumptions behind the accuracy of Wikipedia (that entries get more and more accurate over time and with more edits) is not necessarily so. Duguid documents how the Wikipedia entry for Daniel Defoe actually got less accurate over time as it accumulated more edits. He shows that writing a good encyclopedia article can be quite difficult, and that not all the aphorisms of the open source movement (“given enough eyeballs, all bugs are shallow”) transfer to a project like Wikipedia. Duguid also provides a devastating look at the difficulties Project Gutenberg has with a text like Tristram Shandy.

Evaluating authority in the hybrid world calls for hybrid intelligences. We can and should make use of machine algorithms to uncover information that we wouldn’t be able to on our own. As always, though, we need to keep our human critical thinking skills activated and engaged.