Stranger Than Fiction

My head’s been buzzing since I first read yesterday on the New York Times Bits Blog that coder and activist Aaron Swartz was indicted under federal hacking laws for illegally downloading millions of articles from JSTOR (the full text of the indictment is embedded at the bottom of the post). Since then I’ve read through lots of articles and tweets, news about the case having all but taken over my Twitter stream, including a more in-depth story in today’s Times. And I’m finding that with every article I read I have more questions than answers.

Why’d he do it? Swartz is well known as an information activist and open access advocate, so this question’s not hard to answer. I’d hazard that it’s also not a stretch for many librarians to sympathize with Swartz at least a little bit. After all, we spend our days helping people find information, and we know all too well the frustrations of not being able to access the information we and our patrons need. I’ve read that Swartz wanted to use the data for research, but as JSTOR points out in the official statement, there are procedures in place for scholars who want to use large parts of JSTOR’s database for research.

What, exactly, did he do? This has been difficult to tease out, and the information in the many articles around the internet is highly varied. The indictment accuses Swartz of installing a laptop in a wiring closet at MIT to download large portions of JSTOR’s content. But it’s interesting to see terms like “hacking” and “stealing” used as synonyms for “illegal downloading” and “violating license terms” in many articles describing the case. As noted in an article in Wired:

Swartz used guest accounts to access the network and is not accused of finding a security hole to slip through or using stolen credentials, as hacking is typically defined.

On the other hand, Demand Progress, the progressive political organization founded by Swartz, has compared Swartz’s actions to “allegedly checking too many books out of the library” (a quote that’s been heavily retweeted). Of course, this analogy doesn’t really hold up, since books and databases operate under very different ownership models.

Why JSTOR? I’d guess that this is a question only a librarian would have, but I can’t help wondering: why JSTOR? Why didn’t Swartz pick on one of the giant scholarly journal publishers with well-publicized huge profit margins? Perhaps JSTOR was easiest for him to access? Or maybe, because JSTOR isn’t one of the biggies, he suspected that if he got caught they wouldn’t press charges? It’s been reported that JSTOR secured the return of the downloaded content and did not press charges; the case is being brought by the U.S. Attorney’s Office.

What does this mean for libraries? And for the open access movement? As I was sitting down to finish writing this my CUNY colleague Stephen Francoeur sent out a link to this post on the Forbes blog that terms Swartz’s actions “reckless and counterproductive.” The post gets at something that’s been nagging at me since yesterday: it points out the possibility that the reputation of the open access movement could be damaged by association. And I’m still not sure how exactly to articulate it, but I worry that there may be fallout from this event that could have a negative effect on academic libraries, too.

New and Improved – or Not?

One of the lovely surprises awaiting those who have been away from the reference desk for a while is the numerous spanking new database interfaces that have sprouted up. There seem to be more than usual this year, and while some are improvements, others, frankly, need a good spanking. One that has us particularly flummoxed is the new JSTOR interface that defaults to searching material your library doesn’t have and offers new layers of confusion. (“Is this article available at my library in another database?” “Sorry, we can’t tell you that, but we can provide a handy link through our publisher sales service to purchase articles.”)

As an aside, do publishers seriously expect people to purchase articles for $12, $25, or $35 a pop? Really? They have not met my patrons. But I digress.

I was coasting along in blissful ignorance until I got this guest post from our occasional correspondent from Bowling Green State University, Amy Fry. I have a feeling JSTOR will be getting a lot of feedback on their “improvements.” Here are some thoughts to start the conversation.

---

What Were They Thinking?
Amy Fry
Electronic Resources Coordinator
Bowling Green State University

Today is the first day of the new semester at BGSU, and also the first school day of the new JSTOR interface.

What were they thinking?

JSTOR began life as a journal archive, but librarians have long treated it as an all-full-text, all-scholarly database for journal literature. While its search interface lagged, with limited options to weed out unwanted items or zero in on the most relevant results, its content was stellar, and librarians felt confident promoting it to students as a reliable place to find full-text scholarly sources. As a result, JSTOR has a strong brand not only with librarians, but with faculty and students at all kinds of institutions. Those days appear to be over, at least for now.

Last year, JSTOR embarked on a “current scholarship” endeavor, which allows libraries to use JSTOR as a portal for current subscriptions to some titles. The interface upgrade that went into effect this weekend was meant to support that program. But now that the upgraded interface is live, I can see what this means for JSTOR libraries.

JSTOR has added several confusing layers to its formerly reliable content archive that are guaranteed to confound the most experienced JSTOR user. The search screen contains two limiters – “include only content I can access” and “include links to external content.” The first is unchecked by default and the second is checked by default. This guarantees the broadest journal searching in the archive, but it also means that, after doing a search, users at many institutions will see three kinds of results – ones that are full text, ones that give citation and “access options,” and ones indicating there may be full text on an “external site.”

These last are the “current issues,” and have appeared in JSTOR search results (for titles in libraries’ subscribed JSTOR modules) since last year. Clicking on one of these in the results list shows its citation, abstract and references. Since we have enabled openURL on JSTOR, it also shows our openURL button (which will allow users to link to full text or interlibrary loan). Next to our openURL button, however, there is a box that says “you may not have access,” and to “select the ‘article on external site’ link to go to a site with the article’s full text.” Nowhere on this page do I see an “article on external site” link, but at least the openURL button is there.

The real problem is with the other articles – the ones that only offer “citations and access options.” These are articles from the modules of JSTOR to which my institution does not subscribe. Formerly, articles from non-subscribed JSTOR modules did not even appear in my institution’s JSTOR search results. This was certainly preferable to how these are handled now: now when users click on them, they see the first page of the pdf and have the option to show the citation information, but at the top of the screen is a yellow box containing the text, “You do not have access to this item. Login or check our access options.” Clicking on “login” takes users to the MyJSTOR login screen which asks for your MyJSTOR username and password or gives users the option to choose their institution from a list of Athens/Shibboleth libraries. Clicking on “access options” informs the user he or she must be a member of a participating institution, links to a list of participating institutions, then gives the user the option to purchase individual articles or subscriptions. Worse, newer articles display a price and direct link to purchase the article right next to the first page of the pdf.

Nowhere on this screen do users have the option to use openURL to link to full text or interlibrary loan. In effect, JSTOR has pre-empted library subscriptions to current content in favor of links to purchase articles directly from publishers. For example, if I find an article from The Reading Teacher in JSTOR, I will see the option to purchase it, but be offered no other way to access the full text. If the openURL button for my library appeared there, I would know that my library has access to this article in half a dozen other databases and I would never need to purchase it.

Academic librarians at institutions like mine (non-Athens/Shibboleth, non-full-JSTOR-archive subscribers) can expect to get a ton of questions now from students. Expecting JSTOR to be (at least mostly) full text as it has always been, these students will log in upon accessing the database (if they are off campus). When they find one of these “access options” articles in JSTOR, they will try logging in again, then, when that doesn’t work, they will look for our institution in the list of Athens/Shibboleth institutions. Then, if it’s an article they really want, they will call or IM the library and explain that JSTOR is asking them for a login, which will be a troubleshooting struggle as this usually only happens when students try to access JSTOR from Google or Google Scholar. In the worst-case scenario, they will waste their money on content we already purchase elsewhere. In an even worse worst-case scenario, they will abandon JSTOR as another confusing and misleading library website and turn to other sources. Students are not terribly likely to purchase individual articles; they are more likely to move on and try to find something that is full text, even if it is less relevant. This may turn out to be a boon to EBSCO, but it’s going to be frustrating as hell for libraries, and could turn sour for JSTOR.

JSTOR apologists will no doubt point out that individual users can change their limiter options on the initial search screen and search only content that will give them full-text results in JSTOR. But they will only do this if they understand what “include only content I can access” and “include links to external content” mean and, despite the explanatory text linked to the latter, I am not even entirely sure what these mean. Is “content I can access” just my institution’s JSTOR modules, or does it include “current issues” links for titles in my institution’s JSTOR modules, and, if so, are all of these indeed titles I have full-text access to through my institution’s current subscriptions? Good question. Do the “links to external content” mean just current issues and, if so, are they current issues for just titles in my library’s JSTOR modules, or for those in all JSTOR modules? I have made notes to ask JSTOR these questions when they get back to me about why the heck my openURL button doesn’t appear in results with the other “access options” for articles outside our JSTOR modules, but most users don’t even realize JSTOR has modules, and likely will not be able to understand what these two limiters mean, even after they’ve done a search.

So, what is JSTOR thinking? It seems like they are trying to move the archive towards being an expanded content platform in order to become an expanded platform for discovery, but have skipped some vital steps along the way. Let’s not forget, JSTOR has no administrative module, it has certainly not fully implemented openURL (as this platform upgrade shows), and though it does offer COUNTER Journal reports, it still offers no COUNTER-compliant statistics for sessions and searches.

---

I think Amy has nailed it by describing this as a fundamental shift from journal archive to “discovery platform.” I don’t know how your users will respond, but I predict mine will be confused and unhappy – at least until they get the hang of manually selecting “content I can access” every time they search. (There is no option for libraries to set that as a default.) Much as I respect JSTOR, I’m not looking forward to the questions we’ll be getting.

What do you think?

Illustration courtesy of autumn_bliss.