ChatGPT Can’t Envision Anything: It’s Actually BS-ing

 Since my first post on ChatGPT way back at the end of January (which feels like lifetimes ago), I’ve been keeping up with all things AI-related. As much as I can, anyway. My Zotero folder on the subject feels like it doubles in size all the time. One aspect of AI Literacy that I am deeply concerned about is the anthropomorphizing of ChatGPT; I have seen this more generally across the internet, and now I am seeing it happen in library spaces. What I mean by this is calling ChatGPT a “colleague” or “mentor” or referring to its output as ChatGPT’s thoughts.   

I am seriously concerned by “fun” articles that anthropomorphize ChatGPT in this way. We are all librarians with evaluation skills who can think critically about ChatGPT’s answers to our prompts. But our knowledge of large language models varies from person to person, and it feels quite irresponsible to publish something wherein ChatGPT is referred to as a “colleague,” even if ChatGPT is the one that “wrote” it.

Part of this is simply because we don’t have much language to describe what ChatGPT is doing, so we resort to phrases like “what ChatGPT thought.” A large language model does not think. It puts words in order based on how they’ve been put in order in its training data. We can think of it as a giant autocomplete, or, to be a bit crasser, a world-class bullshitter.

Because natural language is used both when engaging with ChatGPT and when it generates answers, we are more inclined to personify the software. During my own tests recently, a colleague pointed out that I said “Oh, sorry” when ChatGPT said it couldn’t do something I asked of it. It is incredibly difficult not to treat ChatGPT like something that thinks or has feelings, even for someone like me who has been immersed in the literature for a while now. Given that, we need to be vigilant about the danger of anthropomorphizing.

I also find myself concerned with articles that are mostly AI-generated, with maybe a paragraph or two from the human author. Granted, the author had to come up with specific prompts and ask ChatGPT to tweak its results, but I don’t think that’s enough. My own post back in January doesn’t even list the ChatGPT results in its body; I link out to them, and all 890 words are my own thoughts and musings (with some citations along the way). Why are we giving a large language model a direct platform? And one as popular as ChatGPT, at that? I’d love to say that I don’t think people are going to continue having ChatGPT write their articles for them, but it just happened with a lawyer submitting an argument with fake sources (Weiser, 2023).

Cox and Tzoc wrote about the implications of ChatGPT for academic libraries back in March, and they do a fairly good job of driving home that ChatGPT is not a “someone”; it is continuously referred to as a tool throughout. I don’t necessarily agree that ChatGPT is the best tool for some of the situations they describe; reference questions are one example. I tried this many times with my own ChatGPT account, using real reference questions we’ve gotten at the desk here at my university. Some answers are just fine. But there obviously isn’t any teaching going on, just ChatGPT spitting out answers. Students will come back to ChatGPT again and again because they aren’t being shown how to do anything, not to mention that ChatGPT can’t guide them through a database’s user interface. To its credit, it will occasionally prompt the user for more information on their question, just as we do as reference librarians, and it suggests that users evaluate their sources more deeply (and consult librarians).

I asked it for journals on substance abuse and social work specifically, and it actually linked out to them and suggested that the patron check with their institution or library. If my prompt asks for “information from a scholarly journal,” ChatGPT will say it doesn’t have access to that. If I ask for research, though, it has no problem spawning a list of (mostly) fake citations. I find it interesting what it will or won’t generate based on the specific words in your prompt. Because of quirks like this, I’m really not worried about ChatGPT replacing librarians; ChatGPT can’t do reference.

We need to talk and think about the challenges and limitations that come with using ChatGPT. Algorithmic bias is one of the biggest challenges. ChatGPT is trained on a vast amount of data from the internet, and we all know how much of a cesspool the internet can be. I was able to get ChatGPT to exhibit this bias by asking it for career ideas as a female high school senior: it suggested Healthcare, Education, Business, Technology, Creative Arts, and Social Services. In the Healthcare category, physician was not listed; nurse was first. I then corrected the model and told it I was male. Its suggestions now included Engineering, Information Technology, Business, Healthcare, Law, and Creative Arts. What was first in the Healthcare category? Physician.

ChatGPT’s bias would be much, much worse if not for the human trainers that made the software safer to use. An article from TIME magazine by Billy Perrigo goes into the details, but just like social media moderation, training these models can be downright traumatic.  

There’s even more we need to think about when it comes to large language models: the environmental impact (Li et al., 2023), financial cost, opportunity cost (Bender et al., 2021), OpenAI’s clear intention to use our interactions with ChatGPT as training data, and copyright concerns. Personally, I don’t feel it’s worth using ChatGPT in any capacity, but I know the students I work with are going to, and we need to be able to talk about it. I liken it to spell-check: useful up to a point, but when it tells me my own last name is spelled wrong, I can move on and ignore the suggestion. I want to have conversations with students about potential use cases, and about when it’s not the best idea to employ ChatGPT.

We as academic librarians are in a perfect position to teach AI Literacy and to help those around us navigate this new technology. We don’t need to be computer experts to do this – I certainly am not. But the first component of AI Literacy is knowing that large language models like ChatGPT cannot and do not think. “Fun” pieces that personify the technology only perpetuate the myth that it does.  

References 

Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922

Cox, C., & Tzoc, E. (2023). ChatGPT: Implications for academic libraries. College & Research Libraries News, 84(3), 99. https://doi.org/10.5860/crln.84.3.99

Li, P., Yang, J., Islam, M. A., & Ren, S. (2023). Making AI Less “Thirsty”: Uncovering and Addressing the Secret Water Footprint of AI Models (arXiv:2304.03271). arXiv. http://arxiv.org/abs/2304.03271 

Perrigo, B. (2023, January 18). Exclusive: The $2 Per Hour Workers Who Made ChatGPT Safer. Time. https://time.com/6247678/openai-chatgpt-kenya-workers/ 

Weiser, B. (2023, May 27). Here’s What Happens When Your Lawyer Uses ChatGPT. New York Times (Online). https://www.proquest.com/nytimes/docview/2819646324/citation/BD819582BDA74BAAPQ/1 

Moving Forward, Not Backward

Editor’s note: We are pleased to welcome Cynthia Mari Orozco to the ACRLog team. Cynthia is the Equity + OER Librarian at East Los Angeles College and a PhD candidate in Information Studies at UCLA. She was a resident librarian at Loyola Marymount University and a tenure-track librarian at California State University Long Beach before moving into community college librarianship. Her interests include OER and open pedagogy, information literacy, and scholarly communications in community colleges.

I graduated from library school in 2011, and I recall library workers bemoaning the overuse of the term “21st century library” in publication and presentation titles. At the time, I mostly agreed with this criticism. However, having now been in the profession for over 10 years, it sometimes feels like some spaces are not operating in the present century.

But I get it. When I started my online MLIS at San José State University, I remember thinking, “Let’s try this out for a semester. If it doesn’t work out, we can apply to an in-person program.” I had previously attempted to take an online course at my local community college and failed miserably. The course wasn’t hard, I don’t think; I just forgot to sign into the learning management system (LMS) and do the work. Entering the SJSU program, I was now confronted with another LMS and many new technologies that I had long resisted or never even heard of: teleconferencing, blogging, wikis, and Second Life, among others. In this first semester, I also started my first and current Twitter account! I lurked for at least two years before actively tweeting, or at least attempting to tweet. These technologies were all new and scary to me, and it took years to develop the comfort and ease that I now am privileged to have. Through SJSU, I learned how to thrive in an online environment by communicating synchronously and asynchronously, working across time zones and geographies, working on projects with various groups of people, and developing a curiosity for utilizing and assessing new technologies to improve my library work.

In early 2020, I was pulled into a Zoom meeting in my library not because I was involved with this group in any way but rather to connect to and facilitate Zoom in one of our classrooms. I logged into my Zoom account as no one else in the room had one at the time, I brought my microphone as the computer had none, and I pushed buttons as requested by the group. While slightly annoyed at the time, again, I’ve had the privilege of having years of experience using teleconferencing for my work. And throughout the last several years, I’ve witnessed all of my colleagues rise to the occasion to work effectively online with even the Zoom resisters now able to fully use Zoom on their own. I’m incredibly proud of my colleagues, and I have been looking forward to seeing how our experience in working remotely and providing remote services will affect our work from this point forward.

At East Los Angeles College, we have a smaller campus 10 miles away in the city of South Gate, where I often work. The South Gate campus library is one large room with one librarian and one library technician who both work at a public-facing desk during all operating hours. Our campus conversations often include advocating for equitable student services at this campus, which is often overshadowed by our Monterey Park campus (known by most as the “Main Campus,” which further perpetuates the relative importance of this campus). Do you know where a lot of this advocacy work happens? In meetings! In conversations with the ELAC community! And through teleconferencing, employees at South Gate have been able to attend meetings that one would otherwise have to miss when tied to a physical location at a specific time. While the pandemic has admittedly been very isolating, remote meetings have raised attendance and participation, an amazing opportunity for advocacy and diversity of thought.

I see our campus starting to revert to our default in-person modalities and the assumption that everything in-person is “better.” Advocacy for our smaller South Gate campus is just one example of how online technology has allowed us to improve our work, thus improving campus conditions for our students. It would be an utter shame to dismiss the progress we’ve made over the last several years. It’s also worth remembering that 12-hour days on Zoom are not normal and probably way too much! But that doesn’t mean we ought to completely discard remote or hybrid options for meetings and conversations.

Herd Immunity

I’ll add to the post-ACRL 2019 conference reflection writing with a nod to the presentation I can’t stop thinking about and sharing with colleagues:

When Research Gets Trolled: Digital Safety for Open Researchers
by Reed Garber-Pearson, Verletta Kern, Madeline Mundt, Elliot Stevens, and Madison Sullivan

This group of librarians from the University of Washington advocates for educating scholars on digital safety and privacy, particularly those who make their work publicly accessible, do research with or about people from marginalized groups, and/or identify as members of a marginalized group. They acknowledge the risk that public intellectuals, or scholars who seek to make their work open, take on in this world of targeted online harassment, doxxing, and offline threats. People of color, and women of color in particular, are most likely to be impacted by these acts of sabotage and harassment; we need only look at Roxane Gay’s Twitter feed at any given moment to see this kind of gross activity.

It is, quite frankly, terrifying.

The presenters make the case that this kind of trolling can have a serious impact on academic and intellectual freedom: If a researcher is brutally bullied online and threatened offline, will they be less likely to continue their line of research and make their work publicly available? For all that we in libraries push for open access to research, we need to be equally concerned about the safety and well-being of the researchers we are asking to share their work. In advocating for their safety and sharing information about protecting themselves online, librarians can help boost what the panelists referred to as “herd immunity.” Researchers who protect themselves online also protect their colleagues, friends, and families, as online harassers often jump between networks to target others.

As a woman of color who does most of her thinking and writing openly online, I will admit that this presentation hit me hard. I have friends and acquaintances who have been horribly bullied on social media and in comments (yes, I always read the comments and know it is the wrong thing to do). I always thought this was to be endured. Trolls gonna troll. I am so appreciative of this collective of librarians who are sharing ways to prevent, or at least mitigate, this harm and harassment. I thought the presenters struck the right tone: not alarmist, but informative and considerate. They had the best interests of researchers, and yes, that includes us as librarians, in mind. Their goal was to embolden us, not frighten us into retreating. This presentation was a good reminder that supporting researchers doesn’t end when the research concludes. If we want to push for open access and a public discourse of scholarship, we have a professional obligation to promote the digital safety that allows this open exchange to flourish.

You can read notes from the panel in a collaborative Google Doc, view their presentation slides online, and begin thinking about how you can create digital herd immunity at your institution.

Digital Musings on the High School to College Transition

This year my kid is a senior in high school, and we’ve spent the past month recuperating from the flurry of college application activity last fall. As should not be a surprise, college admissions have changed a lot since I applied to colleges in the pre-internet era, though I somehow still found parts of the process surprising.

It’s 2019, so of course all colleges use online applications. All of the schools my kid applied to accepted one of the common applications, which allow applicants to use one platform to submit the same application to multiple schools. My kid took the SAT and several subject tests, which required registering and sending scores to colleges via the College Board’s website. We were also required by his high school to use an online platform to manage their part of the application process — sending teacher recommendations and transcripts — by linking up that platform to the common application platform. And don’t even get me started on the FAFSA.

There are about 1,500 students in my kid’s senior class, and four (4!) guidance counselors. He attends one of New York City’s public specialized high schools, and lots of students apply to selective schools, each of which requires additional essays, video uploads, or other materials. Throughout this whole process last fall — which we were fortunate to be able to complete in our apartment, where we have broadband internet access and laptops — I could not stop thinking about all of the kids in his school who don’t have that kind of access. They’re filling out college applications in the school library, the public library, maybe at their parents’ workplaces. They may have questions; they definitely have questions. It’s a complicated process on platforms that are not always intuitive to use, and they might have to make several appointments with counselors to get answers.

Throughout my kid’s high school years I’ve thought about the digital divide. The classes he’s taken have required multiple accounts on multiple online systems, some provided by the NYC Dept of Education, some homework systems offered by other entities, and of course the ever-present Google for his high school email account. From talking with other parents in and outside of NYC, it seems like most K-12 students are required to use multiple digital platforms throughout their schooling. In our experience there has been little guidance or training for students or parents on how to use these systems, and no way to opt out of their use.

While I’m concerned with digital literacy, and the assumptions that the persistent “digital native” trope encourages us to make about how students use these required platforms, I’m also concerned about data privacy. My kid’s high school and all of these various college application systems have so much information about him and created by him. Each college he applied to required him to set up an account on their system to communicate admissions decisions. How many schools — primary through higher ed — have digital information about students who are no longer enrolled or perhaps won’t even be admitted? Yes, educational institutions retained student (or prospective student) data in the past, but file cabinets full of paper applications in an admissions office don’t have the same information security implications as a digital database.

While it’s certainly been cathartic for me to write out my frustrations, how does this connect to libraries? I continue to keep in mind our students’ experiences with technologies, remembering that they’ve likely had varying exposure to training on digital platforms for school use, as well as varying access to the technology needed to use those platforms. Not every student has a computer with broadband internet access at home. It also feels ever more urgent to me for libraries to strengthen our data privacy practices, a huge issue that we don’t have complete control over, with so many of our digital platforms controlled by vendors. I’m cheered that there are librarians and others doing great work on data privacy issues, including the National Web Privacy Forum (which I was fortunate to participate in), focusing on how we might protect patrons from third-party tracking, and the Data Doubles project, which is examining students’ perspectives on data collection by libraries and institutions of higher ed. I’m looking forward to digging into the results of this work as these projects progress. And in the meantime, perhaps I’ll work with my kid to see what data we might delete from all of these systems once he no longer needs them to have it.

What We Know and What They Know: Scholarly Communication, Usability, and Un-Usability

Over the past handful of years, a lot of digital ink has been spilled on library responses to #icanhazpdf, SciHub, and, most recently, the #Twitterlibraryloan movement. This hit home for me recently when, in discussion at my university, students told us outright that they use SciHub because of its ability to “get most things.”

How we talk about piracy with our patrons is an important topic for discussion, and much of that conversation places tremendous emphasis on the ethics of the for-profit publishing model. But it puts librarians in the precarious position of defending publishing practices that build barriers to research.

SciHub Pirates, from the Rijksmuseum in Amsterdam. Schip van de schrijver Jean de Thevenot door zeerovers overmeesterd, Jan Luyken, 1681

Lydia Thorn wrote an excellent piece about teaching professors and students the importance of legal means of acquisition, pointing to an expectation of immediate access and declining library budgets as culprits in this explosion of piracy. Thorn suggests pointing to the ways in which piracy hurts small presses and not-for-profit publishers, and to how the library can and should fill these needs. She also suggests that we point to several open models that provide access to materials without the illegality of piracy.

Switching gears slightly, this reminds me of the difficulties I have in working with faculty on online scholarly profiles. Because I administer DigitalCommons@USU and its profiling system, Selected Works, I am often confronted with faculty and students who use the for-profit academic profiling systems (I’m using this clunky phrase for the systems we all know but I’d rather not name) that are extremely popular across the world and across disciplines.

What brings these two examples together is the way in which we, as librarians, promote ourselves as experts in this realm and how, in a lot of ways, our strategies for promoting our services fall flat. Faculty are not cynical monsters who actively search for ways to be “anti-library”; they make rational choices that fit what they need. They aren’t often knowledgeable about the inner workings of collection development or the serials crisis, but they are knowledgeable about what they need right now in their academic careers.

I explain to my faculty, much as Thorn suggests, that the for-profit profiling systems are sometimes deceptive, corporate, and often include illegal materials. While the illegality of the for-profit profiles often reaches faculty, who want to avoid any legal entanglements, the prevalence of these systems does not seem to be waning. The library’s 100% legal version pales in comparison to the for-profit alternatives, which are often much more popular in certain fields. Who am I to tell professors not to choose these options in academic areas where for-profit profiles are more valuable than the library’s resources? Despite my feelings to the contrary, sometimes the for-profit profiles fit certain scholars well.

This brings me back to the issues surrounding SciHub and #icanhazpdf. The important thing to remember about our users is that they spend much less time than we do worrying about these things. For them, the ease of use of a for-profit profile or a pirated PDF warehouse is a matter of access, not a preference for or against profit. While every choice we make as actors is political, I do not believe that faculty who use these platforms are willfully ignorant or disloyal to their institutions, libraries, or librarians. They just want what they want, when they want it.

Carolyn Gardner and Gabriel Gardner speak to this in their College and Research Libraries article from earlier this year:

“Poor usability is also hindering our patrons from gaining access to materials. Librarians need to apply user experience thinking to all our online systems. At our respective libraries, we have to click multiple times just to discover if an item is owned. Besides complicated discovery methods, software or holdings errors are possible…Librarians need to view these crowdsourced communities as alternatives that fill a gap that we have yet to meet as opposed to purely underground and shadowy communities.” (College & Research Libraries, February 2017, p. 144)

When the film and television industries felt the crunch from piracy, they invested in Netflix and created Hulu; when the music industry faltered, we got Spotify and other streaming platforms. Each of these systems allowed quick access to the media that users had been pirating. Libraries should view SciHub and the for-profit profiling systems not as a betrayal but as a call to change and action. If SciHub is easier to use than the library, we cannot blame our users for choosing it over our complicated systems. If the for-profit profiling systems are in some ways superior to the library-administered one, perhaps they offer something our faculty are looking for.

We as librarians shouldn’t “teach” our patrons to adapt to our obtuse and oftentimes difficult systems; libraries should adapt to the needs of our patrons. I really do not want to be at odds with my colleagues who call for education on these issues, because education is needed. After all, we are in the business of education. Yet I believe that, in some ways, we should respect our faculty for what they do know. They know that they need resources to do their jobs. They should know that the library is often the best source for these resources. They also know that there are some platforms that provide easier access to these materials. I do not begrudge faculty who seek easier paths toward the resources they need to do their jobs, just as I don’t begrudge undergraduates (or librarians) who use Wikipedia as a first source of quick info. It is a symptom of the age of easy online access, and it is something we as librarians can learn from about what our scholars are looking for.

The second part of this is adaptation. We should not only respect our patrons’ decision-making processes but also listen when faculty seek easier means of accessing library services, and adapt to that need. If the for-profit profiles do something that my profiles don’t, I should think about ways to build my system to reflect those needs. If access to materials needs to be quicker than three clicks through our system, we should work to make legal access easier. We shouldn’t claim that we know more than our patrons do just because we deal with our obtuse systems on the daily; we should adapt to their needs when they arise.