Complex or clickbait?: The problematic Media Bias Chart

This guest post was submitted by Candice Benjes-Small, Head of Research at William & Mary, and Nathan Elwood, Library Administrator at the Missouri Legislative Library.

The Media Bias Chart, commonly referred to simply as “The Chart,” has become ubiquitous in discussion of information literacy and news evaluation. The Chart, for those unaware, attempts to differentiate trustworthy and untrustworthy media sources based on two axes: bias and reliability. 

Despite the popularity of this memetic tool, it raises a whole host of issues that must be addressed as part of our larger information literacy conversations. 

The Chart promotes a false equivalency between left and right, lionizes a political “center” as being without bias, reinforces harmful perceptions about what constitutes “news” in our media ecosystem, and is ignored by anyone who doesn’t already hold a comparable view of the media landscape. 

The Chart is a meme, not an information literacy tool, and as librarians we need to be clear-eyed about these flaws. As Ad Fontes Media released version 7.0 last month, we thought it was a good time to explore our concerns. 

Origins of The Chart

First published in December 2016 by Vanessa Otero, The Chart was originally simple and informal, placing sources on a “liberal” to “conservative” left-right axis, and along a vertical axis of credibility ranging from “complex” to “clickbait.” As with all iterations of The Chart, this resulted in sources arranged in a rough pyramid, with sources ranked the most “mainstream” and “complex” as being of the highest information value. 

Creator Vanessa Otero does not come from an information literacy background. She is currently an intellectual property lawyer; her previous professional experience was in pharmaceutical sales and as a Regional Advisor for Nouveau Riche, a non-accredited vocational school specializing in real estate investing. In 2010, amidst accusations of being a multi-level marketing scam, Nouveau Riche dissolved. In 2011, the founders of the company were fined more than $5 million by the Arizona Corporation Commission for defrauding students. 

Otero says The Chart is a “passion project” and could be useful to consumers and advertisers.

Within weeks of the first iteration’s release, The Chart became a viral phenomenon. It also received pushback from far-right outlets after they saw Infowars, Breitbart, and The Daily Caller all grouped in the bottom-far right, a quadrant labeled as not credible. 

However, criticism of the original meme wasn’t exclusive to the far-right. Left-wingers noticed the conspiracy site “Natural News” grouped at the bottom left of the liberal/conservative axis. 

Natural News, it was quickly pointed out, was a known purveyor of far-right conspiracy theories, such as the claim that the Marjory Stoneman Douglas High School shooting was a false flag. Otero justified the far-left/extremely “liberal” grouping for the site by pointing to its “anti-corporate and popular liberal pseudo-science positions.” Natural News has since fluctuated across the spectrum, before arriving on the far-right in the current iteration. 

On neutrality

In the original iterations of The Chart, all evaluation of sources was conducted by Otero herself. However, after her formation in 2018 of Ad Fontes Media, analysis is conducted by a team of writers, journalists, and other professionals. 

Whenever a new item is evaluated, it is analyzed by a team of at least 3 of these analysts, “with an equal number from left-leaning, center-leaning, and right-leaning perspectives.”

One of the most common points of justification for this project and similar endeavors is that the analysis they conduct is “bipartisan” in this manner. This is something that has been left uninterrogated within the library profession for far too long. It may seem like a strange question, but what is actually “good” about a bipartisan analysis?

When Donald Trump claims that there were “very fine people on both sides” of the Charlottesville riots, we can easily identify what a facile, deceptive framing this is. So why do we allow it within our media analysis?

Say you have, like Ad Fontes Media does, a “bipartisan” group of analysts; evenly mixed between liberals/leftists, conservatives, and centrists. For the purposes of this example, feel free to dismiss that liberals aren’t actually classified as “Left” in most understandings of political science. Instead, consider what the conservative viewpoint genuinely brings to the table.

On January 6th, a majority (68%) of Republican lawmakers, the representative body of the conservative viewpoint in American politics, voted to overturn a free and fair presidential election based on unsubstantiated and proven-false conspiracies. They did this only hours after an attempted coup against our government, based on the same premises, left five people dead.

The consensus view among the American conservative movement is that the attack was justified in its reasoning, if not its method. 

As Eugene Robinson said in his recent Washington Post editorial, “Bipartisanship is nice, but you can’t negotiate with fantasy and lies.” 

The problem with pyramids

Projects like the Media Bias Chart all portray the political center as “unbiased,” feeding into what cultural theorist Mark Fisher labels as “capitalist realism,” in which the status quo power structure is the only system that can feasibly exist, and even the thought of alternative systems is seen as inherently radical.

In the structure of The Chart, the “center” or “status quo” is portrayed as the most preferable, least problematic option. It is, visually, the top of the pyramid. It is “biased” (and therefore less credible) to hold views outside reinforcement of this status quo. 

Within this framing, the Democratic Party represents the left end of the spectrum, and the Republican Party the entirety of the right. However, according to the work of the Manifesto Project, the Democratic Party tracks to the political center, and the Republican Party to the far-right.

Within this framing, right-wing and left-wing views are both held as equally “extreme,” despite the fact that the U.S. Department of Homeland Security singled out right-wing extremists as “the most persistent and lethal threat in the Homeland.”

Mainstream or Utter Garbage?  


Another flaw of the balanced, pyramid structure of The Chart is that it fails to take into account the centralization of the media landscape, as described in the Propaganda Model. The corporate monopolizing that we see in the US media, rather than furnishing us with diverse viewpoints across a variety of sources, has collapsed our media ecosystem into a small set of acceptable views, portrayed by dozens of sources that differ only aesthetically. Our media ecosystem, put bluntly, presents an “illusion of choice,” oriented largely to the benefit of a pro-business status quo.   

What’s the objective? 

Also worth noting is how the “objective, view from nowhere” standard that The Chart reinforces was developed by and for white, cis males, and that enforcing that “neutral” POV can often be fundamentally inequitable.

Consider when a reporter for the City Desk program in Chicago accused Malcolm X of being “personally prejudiced” and incapable of being “academic” in his arguments regarding the Ku Klux Klan, simply because they had burned down his home and murdered his father. Or more recently, when Black journalist Wesley Lowery revealed how he had been “muzzled” by editors at the Washington Post.

In the wake of these events, Lowery has written compellingly on the failures of our current conception of “objectivity” in newsrooms, a conception that The Chart fortifies by design.  

The problems of source as shorthand

While the outlet providing an article is certainly an essential consideration when it comes to evaluation, we reject that it is the most important indicator. A media company is not a monolith, but an organization of people. 

Divergence from editorial direction is common. When the NYT published Senator Tom Cotton’s opinion piece calling for the military to be sent in to control protests, or the Wall Street Journal’s Op-Ed questioned Dr. Jill Biden’s use of the “Doctor” title, journalists at both organizations spoke out against the pieces. 

Sources are also divided into different areas, with different specializations and audiences. This makes it very difficult to generalize a source’s credibility. For example, Buzzfeed and Teen Vogue have published excellent political reporting while also drawing eyeballs through listicles and pop culture pieces. 

The simple layout of The Chart does not allow for this kind of context or nuance. 

What is included

It’s difficult to tell how Ad Fontes selects the media which appear on The Chart. Natural News and others have transitioned on and off The Chart several times. Many sources in Version 7.0’s “green box” are household names, but just beneath them in the “mixed reliability category” The Chart has previously included outlets like Epoch Times, a pro-Trump outlet with ties to the Falun Gong cult and a penchant for spreading Covid-19 conspiracy theories.

Currently occupying the same space, and even outranking established publications like The Nation in terms of credibility, is Quillette, a publication that has promoted racial pseudo-science on multiple occasions.

In her essay Lizard People in the Library, Barbara Fister argues that librarians must educate learners to differentiate between news platforms which serve as watchdogs for society, and outlets which prioritize profits over any kind of social contract. Ad Fontes amplifies outlets like Epoch Times and Quillette through their inclusion, leading the casual observer to assume that, while problematic, these are legitimate news organizations worthy of inclusion in a normal media diet. 

Just as harmful as these impacts is how The Chart also reinforces the concept of “news” being exclusively a national affair. This is to the great detriment of local news outlets, which often provide not only high quality information, but information more directly relevant to people’s lives.


This is a real problem, because the death of news at the local level has allowed for the propagation of far-right propaganda outlets in the vacuums created. 

Tabula Rasa

Some have argued that The Chart is helpful for students who are new to research and are a ‘blank slate’ when it comes to sources; The Chart gives them guidance as they conduct their research online. But this makes little sense; as a visual source, The Chart can only include a tiny fraction of sites. 

Internet searches will bring up stories from thousands of different sources not on The Chart. Local media sources are one example of a source type that is ignored by The Chart’s methodology, but there are even extremely popular information and disinformation sources that don’t show up. 

Given the variable nature of the chart’s inclusion of sources, how are readers supposed to interpret a source’s absence in relation to its credibility? 

Check your bias

In one of the earliest mainstream media articles about the newly formed Ad Fontes Media, MarketWatch asserted in their headline “How biased is your news source? You probably won’t agree with this chart.” 

From the beginning, the biggest flaw in this project has been viewers’ own confirmation bias. Frequent consumers of sources that The Chart claims to be untrustworthy or biased will often dismiss The Chart entirely. Conversely, the centrist consumer who reposts The Chart to their social media page will often ignore the unscientific and haphazard nature of the work.

So what chart should I use instead?

While we have focused our discussion on the Media Bias Chart’s flaws, many of the same critiques apply to other websites that claim to rate media outlets’ biases. Professors and librarians are looking for a ‘silver bullet’ that will help students become more discerning consumers of media. As educators, we must transition away from crutches like these, and instead endorse comprehensive, skill-based evaluation of information sources.

While Nathan does not recommend any methodology in particular, he has found that the Five W’s as framed by Jessica Olin are a helpful tool when training students to read sources critically. The easy recognizability of the framework helps it to stick with students, and promotes a constant and variable interrogation of sources rather than a standardized checklist. He has also regularly talked about the misinformation categories identified by media professor Melissa Zimdars, whose work was popularized around the same time as Otero’s meme. In addition, he feels that information literacy, as a skill designed to create more informed citizens, must be coupled with a comprehensive and rigorous study of the basics of political science and civics. 

Candice advocates people use Mike Caulfield’s SIFT method when evaluating a news article, since it emphasizes lateral reading and the need to recontextualize information. While media bias charts try to provide a heuristic that encourages people to trust or distrust a source in isolation, SIFT recognizes that we must view each story within the greater information ecosystem. This is not something that can be done with a meme – and to suggest information literacy can be so simplistic is insulting. 

“Wait a minute Honey, I’m gonna add it up:” Kanopies, DRM, and the Permanence of the Collection

In my new position at the University of Washington I have a long commute, as one would expect in a large city like Seattle. On this commute I listen to music and read, and on the bus last week I reached for an old Midwestern standby, the Violent Femmes, only to find that their first album, Violent Femmes (1983), had been removed from streaming platforms and, despite my purchase of the album electronically, was no longer available for me to listen to in iTunes. (Reader, don’t worry: many of the songs are available on their greatest hits record, aptly titled Permanent Record.)

The Violent Femmes performing in 2006 (Wikimedia Commons)

In the last few weeks librarians have been confronted in various ways with the difficulties surrounding streaming and licensed materials. Kanopy, one of the largest and most popular streaming services available for library users, was recently and publicly dropped by the New York Public Library (NYPL). How we found this news out, and how it became well known, was a result of Kanopy sending an email to NYPL users who had registered for the service prior to NYPL’s own statement on the issue. In part, their message explained “The New York, Queens, and Brooklyn Public Libraries have decided to discontinue Kanopy’s film streaming service to its patrons…Film as a public resource is a critical part of New York’s culture and communities. We have enjoyed furthering the New York City Libraries’ mission of providing open access to knowledge… [emphasis mine].” 

The Kanopy Letter sent to New York Public Library Patrons

Setting aside for a moment the frankly gross overstep of a vendor directly reaching out to library patrons about library budgetary or mission changes, let’s focus in on the language that Kanopy uses to describe their service: public resource and open access. For those of us who work in academic libraries and have dealt with the ongoing difficulties with providing access to streaming media for our communities, and especially those who are aware of Kanopy’s expensive nature, these kinds of words might make us take a pause.

As a cinema librarian I can say that film is an important part of cultural legacy and should be a public resource, and that access to film should be part of any library’s collection mission. Yet, for many of us the way we consume and purchase media has dramatically changed in the past decade, as streaming and licensing digital files have become the norm for the majority of consumers. Kanopy fits into this very nicely. Its interface looks remarkably similar to any other streaming platform, and invites users to click through its offerings like they would for Netflix, Criterion Channel, and Amazon Prime. Its hidden cost, as we know, only triggers when a user clicks on a film and watches a certain amount of it. For users it seems free, like the offerings from Netflix that, despite the monthly cost, allow users to peruse and sample any film in the catalog. For the most part this is how streaming platforms like Kanopy have advertised themselves to our users.



Over the last year, and especially in the wake of NYPL’s decision, I have seen many articles touting the great new “free” hidden service provided by the library. This article from Entertainment Weekly (https://ew.com/movies/2019/01/18/free-streaming-service-kanopy/) emphasizes accessibility and free cost as the pillars of why Kanopy is amazing for users. And Kanopy, for their part, makes a pretty compelling case for this kind of access. The article quotes CEO Olivia Humphrey: “‘We have such a wide audience,’ says Humphrey. ‘We have people who can’t afford an internet connection that go down to the local public library to watch…. That’s a really important demographic for us, [as much as] cinephiles in L.A. and New York.’ Part of serving that audience is finding what Humphrey calls ‘content gaps’ in other streaming platforms and trying to fill the void.” I think this is a really wonderful part of Kanopy: it allows access to art and indie film through public libraries. But at what cost?

Well…we often don’t know what the cost is. The model is certainly different at academic institutions, but one of the cited figures for public libraries is $2 a watch for each film, and some libraries have limited how often users can watch films each month in order to keep these figures down (https://www.indiewire.com/2019/06/new-york-public-library-drops-kanopy-netflix-alternative-too-expensive-1202153550/). For academic institutions, Kanopy and other services like it are fairly reminiscent of our licensing agreements for ebooks, and costs can be astronomical. In a Film Quarterly article critiquing the “freeness” of Kanopy, Chris Cagle, a film historian at Temple University, writes, “Instead, Kanopy’s platform drives ‘patron-driven acquisition’ in which three viewings (defined as 30 seconds or more of a title) trigger a library license fee per title. (The figures I’ve seen are $150 for a year, $350 for a 3-year license, though the price might vary or change over time.)” (see: https://filmquarterly.org/2019/05/03/kanopy-not-just-like-netflix-and-not-free/). These costs can quickly go out of control for many libraries, and a larger population and more articles about how this “free service” is provided by libraries only complicate the matter. It leads us to the moment where we are forced to cancel subscriptions because our patrons are using them, whereas we usually weed our collections based on lack of use or usefulness in a general sense.



I want to be clear that it is not my intention to pile on to Kanopy, because I truly believe that Kanopy provides a great service for spreading art and indie film to the widest audience. Rather, I want us to think about how we are building collections and gathering materials in this new digital age of instant gratification and expectations, and how we tell that story to our users. Our users will start to feel the loss of licenses when materials start to leave our collections, just as they are starting to see digital materials lost from their personal collections. On the same day that Kanopy and NYPL parted ways, it was reported that ebooks purchased through the Microsoft Store would be deleted this month from the libraries of those who had purchased them (https://gizmodo.com/ebooks-purchased-from-microsoft-will-be-deleted-this-mo-1836005672).

Digital items with DRM (digital rights management) are never fully owned; instead, they are licensed. You can read more about DRM from the grassroots anti-DRM movement Defective By Design. They even wrote an open letter to libraries (https://www.defectivebydesign.org/LetterToLibraries). These objects can be locked to prevent sharing of the material with other users, and they can be taken away, like the ebooks or like my precious Violent Femmes album. My institution is well off enough to encourage our subject liaisons to purchase ebooks without DRM (which increases the costs substantially), but many public libraries or smaller academic libraries cannot afford to pay an extra $150 to make sure digital items are the community’s to keep. But the larger expectation for our library within the community is that we are permanent repositories for information (see the issues we generally see when libraries weed their collections). Digital media is anything but permanent, and we have to reconcile this fact with our user expectations.

In my own life I have begun collecting materials for myself in non-digital form. This means that I have spent money buying twenty-year-old video games, hard-to-find DVDs, and vinyl because I am aware of the tenuous grip that we have on our digital files and media. It is essential that libraries work to make our communities aware of the restrictions and the fugitive nature of digital licensed materials and platforms, and work with our users to ensure their needs are met in this changing time. NYPL, for their part, explained their decision to move away from Kanopy, stating that “The Library made this decision after a careful and thorough examination of its streaming offerings and priorities. We believe the cost of Kanopy makes it unsustainable for the Library, and that our resources are better utilized purchasing more in-demand collections such as books and e-books” (https://www.nypl.org/press/press-release/june-24-2019/statement-about-kanopy).

For a city of 8 million people, Kanopy was perhaps unsustainable, but NYPL is also making a point about how they see their collections growing, and that is in books and ebooks. For libraries, providing for the public good means making these kinds of decisions, and we need helpful partnerships with our vendors to provide this access. While I do not know what was going on in the minds of the directors of NYPL, it sure does not seem like the library system wanted to wage this battle in the open prior to Kanopy’s patron email. Yet, this has become a moment where librarians can have conversations with patrons about the costs and limitations of streaming and digital materials. I, for one, have received several messages from my faculty colleagues about how the NYPL decision impacts us at the University of Washington, and I tell them that while it won’t change the way we interact with Kanopy (that decision was made long before I came here), this is an important teaching moment in our current climate. While the vendor spurred this conversation, I believe that libraries can have an important voice to share in this new media age.

Thoughts on the OWL/Chegg partnership

On the ILI-L (Information Literacy Instruction) listserv, there’s been a discussion of the relatively new partnership between the Purdue University Online Writing Lab (OWL) and Chegg, the for-profit textbook rental company (also the creator of the Citation Machine service). The folks on the listserv caught me up on the implications of this partnership and Chegg’s reputation.

Until this story, I didn’t know that Chegg offered more than textbook rentals. They own Citation Machine, but they’ve also acquired BibMe, EasyBib, and Cite This For Me. Looking at these websites and seeing “a Chegg service” at the top of each page unnerved me. The top 4 Google results for “citation generator” all come from the same for-profit website. How had I never known this before? I was about to learn even more about how students use Chegg.

Educator and blogger John Royce’s post, “Not such a wise OWL,” captures my reaction to this partnership: “Chegg makes me feel uneasy. It advertises “24/7 homework help,” online tutors and other study help and solutions manuals (solutions to problems posed in textbooks).” These tutors and study tools are behind a paywall, so I don’t have personal experience with them, but this makes me feel uneasy, too.

Another librarian shared this presentation about Chegg, which explores Chegg’s reputation for helping students cheat. The researcher links to college student tweets about Chegg’s homework help; “while Chegg claims to help students do their homework, students on Twitter are very clear that they use the site to do their homework for them.”

I wondered what this partnership would actually mean for the reliability of the OWL. Visiting the OWL’s MLA formatting and style guide, there’s now a widget at the top of each page that offers to cite your source automatically with MLA, disclosing underneath the box that it’s powered by Citation Machine. I noticed the OWL does link to a page about using citation machines responsibly, but I doubt many students would click or read that warning.

At my community college library, source documentation is a major instruction focus. Our institution uses NoodleBib and our own handouts, so I wouldn’t recommend Chegg’s Citation Machine either way. But I’ve used the Purdue OWL for answering particular or unfamiliar questions about citation styles; it’s a quick search and has plentiful examples for students to model their citations after. When you’re pressed for time, an online tool is easier than thumbing through a citation manual.

This integration of Chegg services into OWL guides reminds me of native advertising. I imagine many students wouldn’t notice that disclosure under the automatic citation box. They have come to trust the OWL for those late-night writing questions. Librarians (like myself!) have also relied on and trusted the OWL for precise citation information. This is my opinion, but I see Purdue incorporating this for-profit tool as a betrayal of that trust.

Anytime I teach information literacy, I encourage students to ask, “Who published this and why?” We talk about how advertising and sponcon have a clear self-interest that should make a user think twice about the impartiality of that information. So what to do with the OWL? The ILI-L listserv suggested a few OWL-like alternatives, like this one from Excelsior College and this Massey University resource. Other folks say they still link to the Purdue OWL on their research guides, but with a word of caution for the citation generator. I’m very curious about other library workers’ thoughts on this. Is citation education a part of your library’s responsibilities or priorities? What do you think of Chegg and/or this effort to monetize the OWL?

What We Know and What They Know: Scholarly Communication, Usability, and Un-Usability.

Over the past handful of years, a lot of digital ink has been spilled on library responses to #icanhazpdf, SciHub, and, most recently, the #Twitterlibraryloan movement. This hit home in my life because, in a recent discussion with students at my university, students told us outright that they used SciHub because of its ability to “get most things.”

How we talk about piracy with our patrons is an important topic for discussion, and places a tremendous amount of emphasis on the ethics of a for-profit publishing model. But it places librarians in a precarious situation defending publishing practices that build barriers to research.

SciHub Pirates, from the Rijksmuseum in Amsterdam. Schip van de schrijver Jean de Thevenot door zeerovers overmeesterd, Jan Luyken, 1681

Lydia Thorn wrote an excellent piece about teaching professors and students about the importance of legal means of acquisition, pointing to an expectation of immediate access and declining library budgets as culprits in this explosion of piracy. Thorn suggests pointing to the ways in which piracy hurts small presses and not-for-profit publishers and how the library can and should fill these needs. She also suggests that we point to several open models that provide access to materials without the illegality of piracy.

Switching gears slightly, it reminds me of the difficulties I have in working with faculty on online scholarly profiles. Because I administer DigitalCommons@USU, and its profiling system Selected Works, I am often confronted with faculty and students who use the for-profit academic profiling systems (I’m using this difficult phrase to talk about the systems that we all know but I’d rather not name) that are extremely popular across the world and across disciplines.

What brings these two examples and issues together is the way in which we, as librarians, promote ourselves as experts in this realm and how, in a lot of ways, our strategies for promoting our services fall flat. Faculty are not cynical monsters who actively search for ways to be “anti-library”; they make rational choices that fit what they need. They aren’t very often knowledgeable about the inner workings of collection development or the serials crisis, but they are knowledgeable about what they need right now in their academic careers.

I explain to my faculty, much like Thorn suggests, that the for-profit profiling systems are sometimes deceptive, corporate, and, oftentimes, include illegal materials. While the illegality of the for-profit profiles often registers with faculty, who want to avoid any legal entanglements, the prevalence of these systems does not seem to be waning. The library’s 100% legal version pales in popularity in comparison to the others, which are often much more popular in certain fields. Who am I to tell professors not to choose these options in academic areas where for-profit profiles are more valuable than the library’s resources? Despite my feelings to the contrary, sometimes the for-profit profiles fit certain scholars well.

This brings me back to the issues surrounding SciHub and #Icanhazpdf. The important thing to remember about our users is that they spend much less time than we do worrying about these things. For them, the ease of use of a for-profit profile or a pirated pdf warehouse is an issue of access and not a preference towards profits or not-profits. While each choice we make as actors is political, I do not believe that our faculty who use these platforms are willfully ignorant or disloyal to their institutions, libraries, or librarians. They just want what they want, when they want it.

Carolyn Gardner and Gabriel Gardner speak to this in their College and Research Libraries article from earlier this year:

“Poor usability is also hindering our patrons from gaining access to materials. Librarians need to apply user experience thinking to all our online systems. At our respective libraries, we have to click multiple times just to discover if an item is own. Besides complicated discovery methods, software or holdings errors are possible…Librarians need to view these crowdsourced communities as alternatives that fill a gap that we have yet to meet as opposed to purely underground and shadowy communities.” (CRL February 2017 pg 144)

When the film and television industries felt the crunch from piracy, they invested in Netflix and created Hulu, and when the music industry faltered, we got Spotify and other streaming platforms. Each of these systems allowed for quick access to the media that users had been stealing. Libraries should view SciHub and for-profit profiling systems not as a betrayal but as a call to change and action. If SciHub is easier to use than the library, we cannot blame our users if they use it over our complicated systems. If the for-profit profiling systems are superior to the library-administered ones in some ways, perhaps that is what our faculty are looking for.

We as librarians shouldn’t “teach” our patrons to adapt to our obtuse and oftentimes difficult systems; rather, libraries should adapt to the needs of our patrons. I really do not want to be at odds with my colleagues who call for education on these issues, because education is needed on these issues. After all, we are in the business of education. Yet I believe that, in some ways, we should respect our faculty for what they do know. They know that they need resources to do their job. They should know that the library is often the best source for these resources. They also know that there are some platforms that provide easier access to these materials. I do not begrudge faculty who seek easier paths towards the resources they need to do their jobs, any more than I begrudge undergraduates (or librarians) who use Wikipedia as a first source of quick info. It is a symptom of the age of easy access to materials online, and it is something that we as librarians should learn from, because it shows us what our scholars are looking for.

The second part of this is adaptation. We should not only respect our patrons’ decision-making processes; we should also listen when faculty seek easier means towards library services, and adapt to this need. If the for-profit profiles do something that my profiles don’t, I should think about ways to build my system to reflect those needs. If access to materials needs to be quicker than three clicks through our system, we should work to make it easier to gain legal access to materials. We shouldn’t claim that we know more than they do just because we deal with our obtuse systems on the daily; we should adapt to their needs when they arise.

 

No, Fair! Evolving Perspectives on Excessive Use in Research

Midterm brings its share of bustle to the library, with last-minute research questions to ask and copiers and printers to locate. Library staff are also busy negotiating licenses, finalizing renewals, and troubleshooting access to the resources on which faculty and students rely. I’d like to shed some light on a subtler side of the troubleshooting task that, while not a frequent occurrence, is a growing concern for me as a librarian and researcher. The technologies that enable this bustle of research activity can at times inadvertently trigger what publishers call excessive use or excessive downloading. This is considered a breach of contract according to the licenses for these resources. Remedying this breach usually involves working with university IT security to identify, inform, and prevent such use, then assuring publishers that the breach is cured, after which publishers unblock the network IP or IP range to restore access to content.

Recently, I’ve been contemplating researchers’ expectations when working with scholarly content and technology.  What technologies are they using?   Are they compatible across content provider platforms?  How might they trigger excessive use breaches?  What exactly is excessive use or excessive downloading in an online research environment?

What publishers think

Sometimes the publisher’s license language specifies the use of bots, link checkers, crawlers, spiders, automated software, and even indexing as excessive or unauthorized. But more often, breaches associated with this activity are not explicitly defined, nor are they put in the context of excessive use within the license. This leaves the term fairly open to interpretation.

Publishers must consider the perspective of copyright holders, and typically enforce the same limitations on online use that they would on uses of physical print materials. It sounds reasonable, but because in reality we use print and online resources very differently, such license terms may give up fair use and other scholarly exceptions granted by copyright law. Publishers take an even heavier hand when responding to excessive use breaches. Blocking the user’s IP access, or sometimes an entire campus IP range, presumes malicious intent (which is almost never present). This response also exaggerates the stakes involved and misunderstands what is necessary to perform digital research. Strict reinterpretation of print use restrictions in the online environment denies advances in research technology, from basic citation management software to APIs used for text and data mining. It also ignores the very structure of the linked-data world we live in.
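Because licenses rarely spell out what counts as excessive, it can help to picture how an automated threshold might behave. The following is a minimal sketch in Python, not any publisher’s actual method: the class name, the function name, and the limit of 100 downloads per 5 minutes are all hypothetical, chosen only for illustration. It shows how a simple sliding-window count would ignore a person reading steadily while flagging a citation manager fetching a batch of PDFs, with no malicious intent in either case.

```python
from collections import deque


class SlidingWindowMonitor:
    """Hypothetical monitor: flags use above max_requests within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one download (time in seconds); return True if over the limit."""
        self._timestamps.append(timestamp)
        # Discard downloads that have aged out of the window.
        while timestamp - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_requests


def first_flagged(timestamps, max_requests=100, window_seconds=300):
    """Return the index of the first download that trips the limit, or None."""
    monitor = SlidingWindowMonitor(max_requests, window_seconds)
    for index, timestamp in enumerate(timestamps):
        if monitor.record(timestamp):
            return index
    return None


# A person opening one article every two minutes, 50 articles in total.
human_reader = [i * 120 for i in range(50)]

# A citation manager fetching 150 PDFs, one per second (assumed pace).
batch_fetch = [i for i in range(150)]
```

Under these assumed numbers, `first_flagged(human_reader)` never trips the limit, while `first_flagged(batch_fetch)` trips it partway through the batch. A different, equally arbitrary threshold would draw the line somewhere else, which is the problem with leaving the term undefined.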

What most people think

When users learn that their actions violate library license agreements, they are surprised, apologetic, and, most often, confused. While some may be aware of the technologies that make excessive downloading possible, most don’t believe that using them constitutes unethical or unlawful action. Breach of contract itself is kind of a boogeyman phrase that more readily brings to mind data breaches like Equifax. If people are aware of breaches occurring in academia, attention more often goes to those involving individual student records.

According to one IT security expert I asked, the kinds of scholarly content breaches I’m talking about don’t even register on the scale of data sensitivity or security. Unless credentials were stolen in order to download excessively, it is not a security issue; it’s a copyright issue. Publishers who treat copyright infringement as a security issue might be mitigating risk, but they are not serving or educating their customers.

What librarians think

Librarians, naturally, do approach this from the service and education mindset. Increasingly that means serving not just end-users within the academy, but also the general public who pay for the research through their tax dollars. As researchers assert the right to retain copyright of their own content and share it more widely, more diverse collaboration is possible, increasing the potential for innovative research discoveries. Libraries assert copyright exceptions and expose inequities in traditional publishing structures in order to make openness for innovation possible as well.

[Image: Aaron Swartz profile. By Fred Benenson - User: Mecredis, CC BY 2.0 (http://creativecommons.org/licenses/by/2.0), via Wikimedia Commons]

I’ll digress briefly to the story of Aaron Swartz for illustration and comparison. He was an advocate of openness, yet his deliberate action to hack and release scholarly content provides, I suppose, a perfect case for publishers’ insistence on treating copyright as a security issue. In this case, the breach involved 4 million documents. The scope in numbers (less than 3% of the Equifax breach) pales by comparison, especially considering the nature of the data and the consequences (or lack thereof) to those responsible and to those harmed.

Rarely are scholars’ actions as deliberate or the stakes of intellectual property loss as high as in this scholarly breach (or in breaches of individuals’ personal data). In fact, many legitimate uses of scholarly research technologies are being blocked even for those with “rights” to use them. Some examples of technology I’ve seen publishers block include citation management software like EndNote that indexes and stores full text where available. As early as 2006, librarians reported that browser technologies that link to and open an article’s cited references were triggering such blocks. What about mining text and data to discover disciplinary concepts across time and from journal publications that span multiple publishers? Innovative digital researchers are developing their own programming for this, but can they use it? Are there alternatives, and are they open or proprietary?

My role as an acquisitions librarian means I must balance the needs of publishers supplying the content we license with the needs of users who access that content for their research and study. That balance falls somewhere between stoic realism and OAnarchy for me. But I’m still a teacher at heart, so educating all sides remains my goal. In the traditional, profit-based publishing system, where flat library budgets mean buying power decreases each year, I must follow open access developments carefully, just as I must work to negotiate the best deal within these existing structures. There is always room in this to educate publishers, librarians, and users.

Learning more about the tools researchers use, wish they had, or wish they could use without being blocked from access is my next goal. In my troubleshooting experience so far, tools like EndNote, Papers on Mac, Abstrackr, REDCap, and Wget are just a few. So tell me…

What digital research (or reference citation management) technologies are your researchers using?