My head’s been buzzing since I first read yesterday on the New York Times Bits Blog that coder and activist Aaron Swartz was indicted under federal hacking laws for illegally downloading millions of articles from JSTOR (the full text of the indictment is embedded at the bottom of the post). Since then I’ve read through lots of articles and tweets, news about the case having all but taken over my Twitter stream, including a more in-depth story in today’s Times. And I’m finding that with every article I read I have more questions than answers.
Why’d he do it? Swartz is well known as an information activist and open access advocate, so this question’s not hard to answer. I’d hazard that it’s also not a stretch for many librarians to sympathize with Swartz at least a little bit. After all, we spend our days helping people find information, and we know all too well the frustrations of not being able to access the information we and our patrons need. I’ve read that Swartz wanted to use the data for research, but as JSTOR points out in the official statement, there are procedures in place for scholars who want to use large parts of JSTOR’s database for research.
What, exactly, did he do? This has been difficult to tease out, and the information in the many articles around the internet is highly varied. The indictment accuses Swartz of installing a laptop in a wiring closet at MIT to download large portions of JSTOR’s content. But it’s interesting to see terms like “hacking” and “stealing” used as synonyms for “illegal downloading” and “violating license terms” in many articles describing the case. As noted in an article in Wired:
Swartz used guest accounts to access the network and is not accused of finding a security hole to slip through or using stolen credentials, as hacking is typically defined.
On the other hand, Demand Progress, the progressive political organization founded by Swartz, has compared Swartz’s actions to “allegedly checking too many books out of the library” (a quote that’s been heavily retweeted). Of course, this analogy doesn’t really hold up, since books and databases operate under very different ownership models.
Why JSTOR? I’d guess that this is a question only a librarian would have, but I can’t help wondering. Why didn’t Swartz pick on one of the giant scholarly journal publishers with well-publicized huge profit margins? Perhaps JSTOR was easiest for him to access? Or maybe, because JSTOR isn’t one of the biggies, he suspected that if he got caught they wouldn’t press charges? It’s been reported that JSTOR secured the return of the downloaded content and did not press charges; the case is being brought by the U.S. Attorney’s Office.
What does this mean for libraries? And for the open access movement? As I was sitting down to finish writing this, my CUNY colleague Stephen Francoeur sent out a link to a post on the Forbes blog that terms Swartz’s actions “reckless and counterproductive.” The post gets at something that’s been nagging at me since yesterday: it points out the possibility that the reputation of the open access movement could be damaged by association. And I’m still not sure how exactly to articulate it, but I worry that there may be fallout from this event that could have a negative effect on academic libraries, too.