Generative AI & the Evolution of Academic Librarianship

During my first week as an academic librarian, many faculty discussions on campus centered on generative AI software such as ChatGPT. At a campus panel discussion about AI, a majority of the faculty in attendance expressed concerns about plagiarism, copyright, and academic integrity. Those on the panel, however, commented on how beneficial using AI was. When asked more specifically what faculty should do to combat potential cheating with generative AI, the panel seemed in agreement on an answer: educate your students on how to use AI responsibly.

I will admit that, prior to starting my career as an academic librarian, I had never used generative AI. Of course, I saw generative AI blasted all over the news and saw updates on sites and apps like Snapchat, but I never understood what generative AI was. I did not have any interest in learning about it either. After attending the panel discussion, however, I was reminded of a book called Who Moved My Che968? by Dr. Spencer Johnson. I was assigned to read it by a professor in graduate school and often refer back to it (I highly recommend reading it if you have not already done so). The book explains how change can happen unexpectedly, and when it does, it is better to adapt and move forward than to be left behind. Feeling like I was being left behind while other faculty embraced generative AI, I decided to learn as much as I could about it.

Although I read numerous articles and watched hours of YouTube videos, I was still confused as to how generative AI worked. Near the end of August, my dean notified the library faculty of a course offered through ALA’s eLearning platform. The course was titled Exploring AI with Critical Information Literacy and taught by Sarah Morris. I enrolled in the course and learned about the development and usage of generative AI and machine learning, current discussions around AI, opportunities and challenges for AI usage in higher education, and how to engage AI as an academic librarian. Throughout the course, we examined AI through a critical lens and discussed strategies for AI to be incorporated at our own institutions. I enjoyed the course and found the lesson on prompt engineering to be the most intriguing.

One of the ways in which academic librarians can enter the generative AI realm in higher education is by teaching faculty and students prompt engineering. Prompt engineering is strategizing your generative AI input to obtain your desired output. While one can simply ask ChatGPT a standard question, prompt engineering recommends telling ChatGPT through what lens to answer the question. For example, if I were wondering how to craft a lesson for my class on implicit bias, I could plainly input:

“What lesson on implicit bias could I give my college class?”

Using prompt engineering, a better input would be:

“Act like an Academic Librarian teaching a college course on critical thinking. Design a lesson about implicit bias. Include topics for the class to discuss in small groups.”

While the results appeared similar, the detailed prompt elicited a result more applicable to my course by covering topics such as bias in information sources and media literacy.
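The pattern behind the second prompt (a role, then a task, then extra instructions) can be sketched in a few lines of Python. This is only an illustration: the `build_prompt` helper and its parameter names are hypothetical, and nothing here calls ChatGPT or any other service. It simply assembles the text one would paste into the chat box.

```python
def build_prompt(role, task, extras=()):
    """Assemble a role-based prompt: persona first, then the task, then any extras.

    Hypothetical helper for illustration only; it builds a string and
    does not contact any AI service.
    """
    parts = [f"Act like {role}.", task.strip()]
    parts.extend(extra.strip() for extra in extras)
    return " ".join(parts)


# The plain question from above, kept for comparison.
plain = "What lesson on implicit bias could I give my college class?"

# The engineered version: the same request, framed through a specific lens.
engineered = build_prompt(
    role="an Academic Librarian teaching a college course on critical thinking",
    task="Design a lesson about implicit bias.",
    extras=["Include topics for the class to discuss in small groups."],
)

print(plain)
print(engineered)
```

Keeping the role, task, and extras separate makes it easy to show students how changing just one piece, such as the role, changes the output.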

Academic librarians can also educate faculty and students on the responsible use of generative AI. More specifically, we can create lessons and workshops around copyright, academic integrity, and the reliability of the output. I tried this with my critical thinking class. I first introduced the university’s academic integrity policy, including definitions of cheating and plagiarism. Because the majority of my class was unfamiliar with generative AI, I briefly explained how generative AI worked. Afterwards, I had the students discuss the potential benefits and challenges of using generative AI. Using my personal account (my university does not support the use of ChatGPT), I entered a prompt into ChatGPT and had the students read the output. I stressed that when used responsibly, ChatGPT can be a great resource for brainstorming; however, I cautioned my students against using it for writing assignments due to plagiarism, copyright infringement, and incorrect information. To illustrate this point further, I told my students about the two attorneys in New York who acquired case law through ChatGPT. The attorneys did not fact-check the case law, and the judge discovered that it did not actually exist. The cases ChatGPT cited were made up. Overall, the lesson was a success. Many students chose to explore generative AI in more depth for their final projects.

By embracing generative AI, academic librarians can expand their skill sets and become a useful resource for faculty and students navigating the rapidly evolving world of AI. It will be interesting to learn how various universities respond, if they have not done so already. I imagine we will see new policies implemented on campus, positions established, and roles altered.

“Wait a minute Honey, I’m gonna add it up:” Kanopies, DRM, and the Permanence of the Collection

In my new position at the University of Washington I have a long commute, as one would expect in a large city like Seattle. On this commute I listen to music and read. On the bus last week I reached for an old Midwestern standby, the Violent Femmes, only to find that their first album, Violent Femmes (1983), had been removed from streaming platforms and, despite my having purchased the album electronically, was no longer available for me to listen to in iTunes. (Reader, don’t worry: many of the songs are available on their greatest hits record, aptly titled Permanent Record.)

The Violent Femmes performing in 2006 (Wikimedia Commons)

In the last few weeks librarians have been confronted in various ways with the difficulties surrounding streaming and licensed materials. Kanopy, one of the largest and most popular streaming services available for library users, was recently and publicly dropped by the New York Public Library (NYPL). How we found this news out, and how it became well known, was a result of Kanopy sending an email to NYPL users who had registered for the service prior to NYPL’s own statement on the issue. In part, their message explained “The New York, Queens, and Brooklyn Public Libraries have decided to discontinue Kanopy’s film streaming service to its patrons…Film as a public resource is a critical part of New York’s culture and communities. We have enjoyed furthering the New York City Libraries’ mission of providing open access to knowledge… [emphasis mine].” 

The Kanopy Letter sent to New York Public Library Patrons

Setting aside for a moment the frankly gross overstep of a vendor directly reaching out to library patrons about library budgetary or mission changes, let’s focus in on the language that Kanopy uses to describe their service: public resource and open access. For those of us who work in academic libraries and have dealt with the ongoing difficulties with providing access to streaming media for our communities, and especially those who are aware of Kanopy’s expensive nature, these kinds of words might make us take a pause.

As a cinema librarian I can say that film is an important part of cultural legacy and should be a public resource, and that access to film should be part of any library’s collection mission. Yet for many of us the way we consume and purchase media has dramatically changed in the past decade, as streaming and licensing digital files have become the norm for the majority of consumers. Kanopy fits into this very nicely. Its interface looks remarkably similar to any other streaming platform, and invites users to click through its offerings like they would for Netflix, Criterion Channel, and Amazon Prime. Its hidden cost, as we know, only triggers when a user clicks on a film and watches a certain amount of it. For users it seems free, much like Netflix, which despite its monthly cost allows users to peruse and sample any film in the catalog. For the most part this is how streaming platforms like Kanopy have advertised themselves to our users.


Over the last year, and especially in the wake of NYPL’s decision, I have seen many articles touting and promoting the great new “free” hidden service provided by the library. This article from Entertainment Weekly (https://ew.com/movies/2019/01/18/free-streaming-service-kanopy/) emphasizes accessibility and the lack of cost as the pillars of why Kanopy is amazing for users. And Kanopy, for its part, makes a pretty compelling case for this kind of access. The article quotes CEO Olivia Humphrey: “‘We have such a wide audience,’ says Humphrey. ‘We have people who can’t afford an internet connection that go down to the local public library to watch…. That’s a really important demographic for us, [as much as] cinephiles in L.A. and New York.’ Part of serving that audience is finding what Humphrey calls ‘content gaps’ in other streaming platforms and trying to fill the void.” I think this is a really wonderful part of Kanopy: it allows access to art and indie film through public libraries. But at what cost?

Well…we often don’t know what the cost is. The model is certainly different at academic institutions, but one of the cited figures for public libraries is $2 a watch for each film, and some libraries have limited how many films users can watch each month in order to keep these figures down (https://www.indiewire.com/2019/06/new-york-public-library-drops-kanopy-netflix-alternative-too-expensive-1202153550/). For academic institutions, Kanopy and other services like it are fairly reminiscent of our licensing agreements for ebooks, and costs can be astronomical. In a Film Quarterly article critiquing the “freeness” of Kanopy, Chris Cagle, a film historian at Temple University, writes: “Instead, Kanopy’s platform drives ‘patron-driven acquisition’ in which three viewings (defined as 30 seconds or more of a title) trigger a library license fee per title. (The figures I’ve seen are $150 for a year, $350 for a 3-year license, though the price might vary or change over time.)” (see: https://filmquarterly.org/2019/05/03/kanopy-not-just-like-netflix-and-not-free/). These costs can quickly go out of control for many libraries, and the matter is only complicated by larger populations and by more articles about how this “free service” is provided by libraries. It leads us to the moment where we are forced to cancel subscriptions because our patrons are using them, which is the reverse of how we usually weed our collections: based on lack of use or usefulness in a general sense.
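To make the arithmetic of that patron-driven model concrete, here is a rough Python sketch. It assumes only the figures from the quote above (three viewings trigger a license, $150 for one year), which the quote itself warns may vary. The title names and view counts are invented for illustration, and `estimate_annual_license_cost` is a hypothetical helper, not anything Kanopy provides.

```python
VIEWS_TO_TRIGGER = 3        # "three viewings" per the Film Quarterly quote
ONE_YEAR_LICENSE_USD = 150  # one-year figure from the same quote; may vary


def estimate_annual_license_cost(views_per_title,
                                 threshold=VIEWS_TO_TRIGGER,
                                 fee=ONE_YEAR_LICENSE_USD):
    """Return (total cost, triggered titles) for a dict of title -> qualifying views.

    Hypothetical helper: each title that reaches the threshold costs one
    flat license fee, no matter how many more times it is watched.
    """
    triggered = sorted(t for t, views in views_per_title.items() if views >= threshold)
    return len(triggered) * fee, triggered


# Invented usage data, for illustration only.
sample_views = {"Title A": 5, "Title B": 2, "Title C": 3, "Title D": 40}

cost, titles = estimate_annual_license_cost(sample_views)
print(cost, titles)
```

Under these assumptions, a title watched three times costs the library the same as one watched forty times, so the bill grows with the number of different titles patrons sample rather than with total viewing.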


I want to be clear that it is not my intention to pile on to Kanopy, because I truly believe that Kanopy provides a great service for spreading art and indie film to the widest audience. Rather I want us to think about how we are building collections and gathering materials in this new digital age of instant gratification and expectations and how we tell that story to our users. Our users will start to feel the loss of licenses when materials start to leave our collections, just as they are starting to see their own digital materials lost in their personal collections. On the same day that Kanopy and NYPL parted ways it was reported that ebooks purchased through the Microsoft Store would be deleted this month from those who had purchased them. https://gizmodo.com/ebooks-purchased-from-microsoft-will-be-deleted-this-mo-1836005672

Digital items with DRM (digital rights management) are never fully owned; instead they are licensed. You can read more about DRM from the grassroots anti-DRM movement Defective By Design. They even wrote an open letter to libraries (https://www.defectivebydesign.org/LetterToLibraries). These objects can be locked to prevent sharing of the material with other users, and they can be taken away, like the ebooks or like my precious Violent Femmes album. My institution is well off enough to encourage our subject liaisons to purchase ebooks without DRM (which increases the costs substantially), but many public libraries or smaller academic libraries cannot afford to pay an extra $150 to make sure digital items are the community’s to keep. But the larger expectation for our library within the community is that we are permanent repositories for information (see the issues we generally see when libraries weed their collections). Digital media is anything but permanent, and we have to reconcile this fact with our user expectations.

In my own life I have begun collecting materials for myself in non-digital form. This means that I have spent money buying twenty-year-old video games, hard-to-find DVDs, and vinyl because I am aware of the tenuous grip that we have on our digital files and media. It is essential that libraries work to make our communities aware of the restrictions and the fugitive nature of digital licensed materials and platforms, and work with our users to ensure their needs are met in this changing time. NYPL, for their part, explained their decision to move away from Kanopy, stating that “The Library made this decision after a careful and thorough examination of its streaming offerings and priorities. We believe the cost of Kanopy makes it unsustainable for the Library, and that our resources are better utilized purchasing more in-demand collections such as books and e-books” (https://www.nypl.org/press/press-release/june-24-2019/statement-about-kanopy).

For a city of 8 million people, Kanopy was perhaps unsustainable, but NYPL is also making a point about how they see their collections growing, and that is in books and ebooks. For libraries, providing for the public good means making these kinds of decisions, and we need helpful partnerships with our vendors to provide this access. While I do not know what was going on in the minds of the directors of NYPL, it sure does not seem like the library system wanted to wage this battle in the open prior to Kanopy’s patron email. Yet this has become a moment where librarians can have conversations with patrons about the costs and limitations of streaming and digital materials. I, for one, have received several messages from my faculty colleagues about how the NYPL decision impacts us at the University of Washington, and I tell them that while it won’t change the way we interact with Kanopy (that decision was made long before I came here), this is an important teaching moment in our current climate. While the vendor spurred this conversation, I believe that libraries can have an important voice to share in this new media age.

Open Access and the Benevolence of Multinational Corporations

As it has been for much of its history, the academic library is at a crossroads. The exploding budgets for journal subscriptions, which are necessary to the living and breathing research institution, are slowly strangling libraries. This, of course, is obvious, much maligned, and much talked about. In the minds of our left-leaning colleagues, an increase in open access publishing and learning is a way of getting back to the perceived roots of librarianship and the values of intellectual and learning freedom. The narrative has been pretty simple: open access moves the dissemination of information away from large corporate publishers and into the hands of “radical” faculty members who use their clout and expertise to provide information for the masses.

Gold open access (journals which publish fully open with few or no strings attached) is hardly the norm, and is outpaced in all metrics by green open access (the self-archiving of pre- or post-print versions from non-open access journals). Gargouri, Larivière, Gingras, Carr, and Harnad (2010) found, unsurprisingly, that subscription-based journals dominated STEM fields for publications, and only about 21% of their articles were available by green open access means. At the time of their study, only ~3% of publications were fully open access; evidence suggests this number has grown in many fields, but not by much. Currently OA is dominated by green and the dreaded hybrid journals.

Oftentimes, green OA is only possible with copyright strings that make it difficult for scholars to keep straight the versions, the citations, and the identifiers necessary to comply with authors’ agreements. The burden is on the scholar to provide the necessary versions to libraries or other disciplinary repositories for the green model to work. While this can be seen as an open path set forth by the publishers, the hurdles and the arcane rules behind it make the benevolence more of a blind eye. Some scholars I’ve spoken with do not want work viewed as “unfinished” or “unpolished” out on the internet, which is a fair assumption to make. The “pre-print” especially, because of its lack of peer review and editing, is very unappealing in some disciplines, while others with long-standing histories in open science have embraced it (looking at you, Physics). On a practical side, how do we cite pre-prints and post-prints? I’m a librarian and I’m not actually sure of the best action on that. When a journal owns the copyright on the very page numbers, how can I cite a passage I glean from an IR?

This has often led me to wonder whether green OA operates under the assumption that overworked faculty and librarians will not follow through on the rules, and that the article will therefore stay behind subscription walls.

The present and future of Open relies heavily on the benevolence of corporations to provide avenues for their content to be openly accessible. The success that libraries and scholars have had with green open access is limited by the rules set up by journals as well as the initiative of individual scholars. With many of the larger publishers showing anything from reluctance to open hostility to open access measures, this is a precarious proposition for libraries. Pressure from researchers and the past Presidential administration has made OA an important part of the scholarly communication environment, yet we as researchers and as librarians are at the mercy of the large publishers, and we need their partnership and the continued patience of our patrons to make this happen. Publishers, knowing the field’s love affair with open, have provided for open access in a pay-to-play model known as “hybrid.”

For many librarians, hybrid journals are seen as double dipping. Institutions are asked to provide extra money on top of growing subscription fees to make locked access articles fully open. APCs, the most common way to pay for these articles to be made open, range from a couple hundred dollars to upwards of $3000 depending on the field. For libraries chafing under the threat of rising subscription fees, this is not something many are willing to pay for, no matter how good our intentions are. The elitist and competitive nature of publications and tenure requirements reinforces the need to publish in certain journals published expensively by certain publishers. The best journal in your field will allow you to have an open access version with rules that are complicated and impossible to understand or, for the low price of several thousand dollars, will make it gold open access for you. Wealthier scholars would sooner pay the APC than jump through the hoops of green open access, if they know such a path even really exists.

What we are left with is a system that is built to perpetuate the subscription crisis without any real and easy solution to full open accessibility. We either pay for subscriptions, pay for APCs, or pay for both. International and national boycotts, like the ones striking Western Europe, hurt the bottom line of publishers but harm faculty who need the journals to survive in this current scholarly climate. Pirate websites prey on our login systems to provide “open” access to every published article but put our institutions, as well as researchers, at risk. While green avenues might be appealing, they are only the most common method of providing open access materials because of their inherently difficult nature. A journal wanting you to pay their hybrid fee would be happy to provide you with many hoops to jump through for a post-print. Relying on faculty to provide the correct versions is like relying on faculty to respond to your Friday afternoon emails during the summer; some will be pros at it but most will ignore you.

For now, we wait with bated breath for the benevolence of publishers like the cave children who could be saved by Elon Musk’s submarine.

An instruction librarian, a digital scholarship librarian, and a scientist enter a Twitter chat…

A quick note to preface this post: Thank you, Dylan Burns. After reading your post–What We Know and What They Know: Scholarly Communication, Usability, and Un-Usability–I can’t stop thinking about this weird nebula of article access, entitlement, ignorance, and resistance. Your blog post has done what every good blog post should do: Make me think. If you haven’t read Dylan’s post yet, stop, go back, and read. You’ll be better for it. I promise.

I am an instruction librarian, so everything that I read and learn about within the world of library and information science is filtered through a lens of education and pedagogy. This includes things like Dylan Burns’ latest blog post on access to scholarship, #TwitterLibraryLoan, and other not-so-legal means of obtaining academic works. He argues that faculty who use platforms like #ICanHazPDF or SciHub are not “willfully ignorant or disloyal to their institutions, libraries, or librarians. They just want what they want, when they want it,” and that “We as librarians shouldn’t ‘teach’ our patrons to adapt to our obtuse and oftentimes difficult systems but libraries should adapt to the needs of our patrons.”

My initial reaction was YES, BUT…which means I’m trying to think of a polite way to express dissent. Thankfully, Dylan’s always up for a good Twitter discussion, so here’s what ensued:

My gut reaction to libraries giving people “what they want, when they want it” is always going to be non-committal. I’ve never been one to subscribe to what a colleague a long time ago referred to as “eat your peas librarianship” (credit: Michelle Boulé). I don’t think things should be difficult just for the sake of being difficult because things were hard for me, and you youngin’s should have to face hardships too! But I am also enough of a parent to know that giving people what they want when they want it without telling them how it got there is going to cause a lot of problems (and possibly temper-tantrums) later on. Here’s where the education librarian in me emerges: I don’t want scholars to just be able to get what they want when they need/want it without understanding the deeper problems within the arguably broken scholarly publishing model. In other words, I want to advocate for Lydia Thorne’s model of educating scholars about scholarly publishing problems. To which Dylan responds:

To which I can only respond:

Point: Dylan. Those of us who teach have all had the experience of trying to turn an experience into a teaching moment, only to be met by rolling eyes, blank stares, sighs, huffs, etc. Is the scholarly publishing system so broken that even knowing about the problems with platforms like SciHub, scholars will still engage in the piracy of academic works because, well, it’s all a part of the game they need to play? Is this even an issue of usability then? Creating extremely user-friendly library systems won’t change the fact that some libraries simply can’t afford the resources their community wants/needs, but those scholars still need to engage in the system that produces those resources. Is it always going to be a lose-lose for libraries?

At this point a friend of mine enters the Twitter discussion. Jonathan Jackson is an instructor of neurology and researcher at Massachusetts General Hospital:

Prior to this conversation I’d not thought about #TwitterLibraryLoan and similar efforts at not-so-legal access to scholarship as acts of resistance, but Jonathan’s entrance into the discussion forced me to think about the power of publicly asking for PDFs. I’ll admit that part of me is skeptical that all researchers are as politically conscious as Jonathan and his colleagues. I’m sure there are some folks who just need that article ASAP and don’t care how they get it. But there is power in calling out that one publisher or that one journal again and again on #ICanHazPDF because your library will never be able to afford that subscription.

I’ll admit that the whole Twitter exchange made me second guess motivations all around, which is what a good discussion should do, right?

No, Fair! Evolving Perspectives on Excessive Use in Research

Midterm brings its share of bustle to the library, with last-minute research questions to ask and copiers and printers to locate. Library staff are also busy negotiating licenses, finalizing renewals, and troubleshooting access to the resources on which faculty and students rely. I’d like to shed some light on a subtler side of the troubleshooting task that, while not a frequent occurrence, is a growing concern for me as a librarian and researcher. The technologies that enable this bustle of research activity can at times inadvertently trigger what publishers call excessive use or excessive downloading. This is considered a breach of contract according to the licenses for these resources. Remedying this breach usually involves working with university IT security to identify, inform, and prevent such use, assuring publishers that the breach is cured, and publishers then unblocking the network IP or IP range to restore access to content.

Recently, I’ve been contemplating researchers’ expectations when working with scholarly content and technology.  What technologies are they using?   Are they compatible across content provider platforms?  How might they trigger excessive use breaches?  What exactly is excessive use or excessive downloading in an online research environment?

What publishers think

Sometimes the publisher’s license language specifies the use of bots, link-checkers, crawlers, spiders, automated software, and even indexing as excessive or unauthorized. But more often, breaches associated with this activity are not explicitly defined, nor are they put in the context of excessive use within the license. This leaves the term fairly open to interpretation.

Publishers must consider the perspective of copyright holders, and typically enforce the same limitations on online use that they would on uses of physical print materials. It sounds reasonable, but because in reality we use print and online resources very differently, such license terms may give up fair use and other scholarly exceptions granted by copyright law. Publishers take an even heavier hand when responding to excessive use breaches. Blocking the user’s IP access, or sometimes an entire campus IP range, presumes malicious intent (which is almost never the case). This response also exaggerates the stakes involved and misunderstands what is necessary to perform digital research. Strict reinterpretation of print use restrictions in the online environment denies advances in research technology, from basic citation management software to APIs used for text and data mining. It also ignores the very structure of the linked-data world we live in.

What most people think

When users learn that their actions violate library license agreements, they are surprised, apologetic, and most often confused. While some may be aware of the technologies that make excessive downloading possible, most don’t believe they constitute unethical or unlawful actions. Breach of contract itself is kind of a boogeyman phrase that more readily brings to mind data breaches like Equifax. If people are aware of breaches occurring in academia, attention more often goes to those involving individual student records.

According to one IT security expert I asked, the kinds of scholarly content breaches I’m talking about don’t even register on the scale of data sensitivity or security. Unless credentials were stolen in order to download excessively, it is not a security issue; it’s a copyright issue. Publishers who treat copyright infringement as a security issue might be mitigating risk, but they are not serving or educating their customers.

What librarians think

Librarians, naturally, do approach this from the service and education mindset. Increasingly that means not just serving end-users within the academy, but the general public who pay for the research through their tax dollars. As researchers assert the right to retain copyright of their own content and share it more widely, more diverse collaboration is possible, increasing potential for innovative research discoveries. Libraries assert copyright exceptions and expose inequities in traditional publishing structures in order to make openness for innovation possible as well.

Aaron Swartz profile. By Fred Benenson - User: Mecredis [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

I’ll digress briefly to the story of Aaron Swartz for illustration and comparison. He was an advocate of openness, yet his deliberate action to hack and release scholarly content provides, I suppose, a perfect case for publishers’ insistence on treating copyright as a security issue. In this case, the breach involved 4 million documents. The scope in numbers (less than 3% of the Equifax breach) pales by comparison, especially considering the nature of the data and the consequences (or lack thereof) to those responsible and to those harmed.

Rarely are scholars’ actions as deliberate or the stakes of intellectual property loss as high as this scholarly breach (or breaches of individuals’ personal data). In fact, many legitimate uses of scholarly research technologies are being blocked even for those with “rights” to use them. Some examples of technology uses I’ve seen publishers block include citation management software like EndNote that indexes and stores full text where available. As early as 2006, librarians reported browser technologies that link and open an article’s cited references, triggering such use. What about mining text and data to discover disciplinary concepts across time and from journal publications that span multiple publishers? Innovative digital researchers are developing their own programming for this, but can they use it? Are there alternatives, and are they open or proprietary?

My role as an acquisitions librarian means I must balance the needs of publishers supplying the content we license with needs of users who access that content for their research and study.  That balance falls somewhere between stoic realism and OAnarchy for me.  But I’m still a teacher at heart, so educating all sides remains my goal. In the traditional, profit-based publishing system, where flat library budgets mean buying power decreases each year,  I must follow open access developments carefully, just as I must work to negotiate the best deal within these existing structures.  There is always room in this to educate publishers, librarians, and users.

Learning more about the tools researchers use, wish they had, or wish they could use without being blocked from access is my next goal. In my troubleshooting experience so far, tools like EndNote, Papers on Mac, Abstrackr, REDCap, and Wget are just a few. So tell me…

What digital research (or reference citation management) technologies are your researchers using?