Category Archives: Information Literacy

Leaves of Graph

ACRLog welcomes a guest post from Pete Coco, the Humanities Liaison at Wheaton College in Norton, MA, and Managing Editor at Each Moment a Mountain.

Note: This post makes heavy use of web content from Google Search and Knowledge Graph. Because this content can vary by user and is subject to change at any time, this essay uses screenshots instead of linking to live web pages in certain cases. As of the completion of this post, these images continue to match their live counterparts for a user from Providence, RI, who is not logged in to Google services.

This That, Not That That

Early this July, Google unveiled its Knowledge Graph, a semantic reference tool nestled into the top right corner of its search results pages. Google’s video announcing the product runs no risk of understating Knowledge Graph’s potential, but there is a very real innovation behind this tool, and it is twofold. For one, Knowledge Graph can distinguish between homonyms and connect related topics. For a clear illustration of this function, consider the distinction one might make between bear and bears. Though the search results page for either query includes content related to both grizzlies and quarterbacks, Knowledge Graph knows the difference.

Second, Knowledge Graph purports to contain over 500 million articles. This puts it solidly ahead of Wikipedia, which reports having about 400 million, and light-years ahead of professionally produced reference tools like Encyclopaedia Britannica Online, which comprises an apparently piddling 120,000 articles. Combine that almost incomprehensible scope with integration into Google Search, and, without much fanfare, the world suddenly has its broadest and most prominently placed reference tool.

For years, Google’s search algorithm has been making countless, under-examined choices on behalf of its users about the types of results they should be served. But at its essence, Knowledge Graph represents a big symbolic shift: away from (mostly) matching a query to web content — content that, per extrinsic indicators, the search algorithm serves up and ranks for relevance — and toward openly interpreting the meaning of a search query and making decisions based on that interpretation. Google’s past deviations from the relevance model, when made public, have generally been motivated by legal requirements (such as those surrounding hate speech in Europe or dissent in China) and, more recently, by the dictates of profit. Each of these moves has met with controversy.

And yet in the two months since its launch, Knowledge Graph has not been the subject of much commentary at all. This is despite the fact that the shift it represents has big implications that users must account for in their thinking, and that it can be understood as part of larger shifts the information giant has been making to leverage the reputation earned with Search toward other products.

Librarians and others teaching about internet media have a duty to articulate and problematize these developments. Being in many ways a traditional reference tool, Knowledge Graph presents a unique pedagogic opportunity. Just as it is critical to understand the decisions Google makes on our behalf when we use it to search the web, we must be critically aware of the claim to a newly authoritative, editorial role Google is quietly staking with Knowledge Graph — whether it means to be claiming that role or not.

Perhaps especially if it does not mean to. With interpretation comes great responsibility.

Some Questions

The value of the Knowledge Graph is in its ability to authoritatively parse semantics in a way that provides the user with “knowledge.” Users will use it on the assumption that it can do this reliably, or they will not use it at all.

Does Knowledge Graph authoritatively parse semantics?

What is Knowledge Graph’s editorial standard for reliability? What constitutes “knowledge” by this tool’s standard? “Authority”?

What are the consequences for users if the answer to these questions is unclear, unsatisfactory, or both?

What is Google’s responsibility in such a scenario?

He Sings the Body Electric

Consider an example: Walt Whitman. As of this writing, the poet’s entry in Knowledge Graph looks like this:

You might notice a most unlikely claim: that Whitman recorded an album called This is the Day. Follow the link and you are brought to a straight, vanilla Google search for this supposed album’s title. The first link in that result list brings you to a music video on YouTube:

Parsing this mistake might bring one to a second search: “This is the Day Walt Whitman.” The results list generated by that search yields another YouTube video at the top, resolving the confusion: a second, comparably flamboyant Walt Whitman, a choir director from Chicago, has recorded a song by that title.

Note the perfect storm of semantic confusion. The string “Walt Whitman” can refer either to a canonical poet or to a contemporary gospel choir director, while, at the same time, “This is the Day” can refer either to a song by The The or to a song by that second, lesser-known Walt Whitman.

Further, “This is the Day” is in both cases a song, not an album.

Knowledge Graph, designed to clarify exactly this sort of semantic confusion, here manages to create and potentially entrench three such confusions at once about a prominent public figure.

Could there be a better band than one called The The to play a role in this story?

Well Yeah

This particular mistake was first noted in mid-July. More than a month later, it still stands.

At this new scale for reference information, we have no way of knowing how many mistakes like this one are contained within Knowledge Graph. Of course it’s fair to assume this is an unusual case, and to Google’s credit, they address this sort of error in the only feasible way they could: with a feedback mechanism that allows users to suggest corrections. (No doubt bringing this mistake to the attention of ACRLog’s readers means Walt Whitman’s days as a time-traveling new wave act are numbered.)

Is Knowledge Graph’s mechanism for correcting mistakes adequate? Appropriate?

How many mistakes like this need there be before a critical understanding of Knowledge Graph’s gaps and limitations becomes crucial to even casual use?

Interpreting the Gaps

Many Google searches sampled for this piece do not yield a Knowledge Graph result. Consider an instructive example: “Obama birth certificate.” Surely, there would be no intellectually serious challenge to a Knowledge Graph stub reflecting the evidence-based consensus on this matter. Then again, there might be a very loud one.

Similarly not available in Knowledge Graph are stubs on “evolution” or “homosexuality.” In each case, it should be noted that Google’s top-ranked search results are reliably “reality-based.” Each is happy to defer to Wikipedia.

In other instances, the stubs for topics that seem to reach some threshold of complexity and/or controversy defer to “related” stubs rather than make nuanced editorial decisions. Consider the entries for “climate change” and the “Vietnam War,” here presented in their entirety.

In moments such as these, is it unreasonable to assume that Knowledge Graph is shying away from controversy and nuance? More charitably, we might say that this tool is simply unequipped to deal with controversy and nuance. But given the controversial, nuanced nature of “knowledge,” is this second framing really so charitable?

What responsibility does a reference tool have to engage, explicate or resolve political controversy?

What can a user infer when such a tool refuses to engage with controversy?

What of the users who will not think to make such an inference?

To what extent is ethical editorial judgment reconcilable with the interests of a singularly massive, publicly traded corporation with wide-ranging interests cutting across daily life?

One might answer some version of the above questions with the suggestion that Knowledge Graph avoids controversy because it is programmed only to feature information that meets some high standard of machine-readable verification and/or cross-referencing. The limitation is perhaps logistical, baked into the cake of Knowledge Graph’s methodology, and it doesn’t necessarily limit the tool’s usefulness for certain purposes so long as the user is aware of the boundaries of that usefulness. Perhaps in that way this could be framed as a very familiar sort of challenge, not so different from the one we face with other media, whether it’s cable news or pop-science journalism.

This is all true, so far as it goes. Still, consider an example like the stub for HIV:

There are countless reasons to be uncomfortable with a definition of HIV implicitly bounded by Ryan White on one end and Magic Johnson on the other. So many important aspects of the virus are omitted here — the science of it, for one, but even if Knowledge Graph is primarily focused on biography, there are still important female, queer or non-American experiences of HIV that merit inclusion in any presentation of this topic. This is the sort of stub in Knowledge Graph that probably deserves to be controversial.

What portion of useful knowledge cannot — and never will — bend to a machine-readable standard or methodology?

Ironically, it is Wikipedia that, for all the controversy it has generated over the years, provides a rigorous, deeply satisfactory answer to the same problem: a transparent governance structure guided in specific instances by ethical principle and human judgment. This has more or less been the traditional mechanism for reference tools, and it works pretty well (at least up to a certain scale). More fundamentally, length constraints on Wikipedia are forgiving, and articles regularly plumb nuance and controversy. Similarly, a semantic engine like Wolfram Alpha successfully negotiates this problem by focusing on the sorts of quantitative information that aren’t likely to generate much political controversy. The demographics of its user base probably help, too.

Of course, Google’s problem here is that it searches everything for every purpose. People use it every day to arbitrate contested facts. Many users assume that Google is programmatically neutral on questions of content, intervening only to organize results by their relevance to our questions; on this view, Google has no responsibility for the content itself. This assumption is itself complicated and, in many ways, was problematic even before the debut of Knowledge Graph. All the same, it is a “brand” that Knowledge Graph will no doubt leverage in a new direction. Many users will intuitively trust this tool and the boundaries of “knowledge” enforced by its limitations and by the prerogatives of Google and its corporate actors.

So:

Consider the college freshman faced with all these ambiguities. Let’s assume that she knows not to trust everything she reads on the internet. She has perhaps even learned this lesson too well, forfeiting contextual, critical judgment of individual sources in favor of a general avoidance of internet sources. Understandably, she might be stubbornly loyal to the internet sources that she does trust.

Trading on the reputation and cultural primacy of Google search, Knowledge Graph could quickly become a trusted source for this student and others like her. We must use our classrooms to provide this student with the critical engagement of her professors, librarians and peers on tools like this one and the ways in which we can use them to critically examine the gaps so common in conventional wisdom. Of course Knowledge Graph has a tremendous amount of potential value, much of which can only proceed from a critical understanding of its limitations.

How would this student answer any of the above questions?

Without pedagogical intervention, would she even think to ask them?

Summer Projects

Ah, summer! A time when we all get to take a deep breath and work on all those things we put off during the school year. I’ve always thought that summer at an academic library is sort of a strange time. Even though it feels more relaxed in and around campus, we’re still quite busy getting things ready before the students return. Last week when I realized that it was already August, I had to stifle a feeling of panic—the summer feels like it’s slipping away, along with the time to work on all my projects.

Three projects that I’ve been working on over the summer include:

  • Reviewing the collection: Our library is doing a massive and much-needed inventory and collection review project. This has involved the efforts of practically every person in the building. For my part, I’ve been looking at each of our music and theatre arts holdings and determining what could be withdrawn (teaching faculty will get the final say). There have been endless book trucks coming in and out of my office. Nevertheless, it has been a great opportunity for me to see the strengths and weaknesses of the collection.
  • Processing opera scores: A few years ago my institution received a large donation of hundreds of music scores from the wife of a former opera professor. Most of these are opera scores. The collection has sat untouched, awaiting cataloging and processing. Thankfully I was able to hire a music cataloger this summer, and we are almost finished cataloging the entire collection. Some items are incredibly rare 18th-century first-edition opera scores. In the future, I would like to apply for a grant to digitize some of these rare materials. But for now, I’ll just be relieved and satisfied once they officially join our collection.
  • Combining the Olympics and information literacy: While I am not a huge sports fan, whenever the Olympics roll around, I find myself glued to the television practically every night—especially for gymnastics, swimming, and track and field. Lately I’ve been thinking that there must be a way for me to incorporate some sort of Olympic-themed activity or research inquiry into one of my information literacy sessions this fall. So far nothing has come to me, but I have had a lot of fun perusing the official website for the Olympics—including their photo gallery, which contains over a hundred galleries organized by year and sport. The photos go as far back as the 1896 games in Athens.

What huge projects are you working on this summer and will you actually finish them?

Digital Badges for Library Research?

The world of higher education has been abuzz this past year with the idea of digital badges. Many see digital badges as an alternative to higher education’s system of transcripts and post-secondary degrees, which are constantly being critically scrutinized for their value and their ability to demonstrate that students are ready for a competitive workforce. There have been several articles from the Chronicle of Higher Education discussing this educational trend. One such article is Kevin Carey’s “A Future Full of Badges,” published back in April. In it, Carey describes how UC Davis, a national leader in agriculture, is pioneering a digital open badge program.

UC Davis’s badge system was created specifically for undergraduate students majoring in Sustainable Agriculture and Food Systems. Their innovative system was one of the winners of the Digital Media and Learning Competition (sponsored by Mozilla and the MacArthur Foundation). According to Carey,

Instead of being built around major requirements and grades in standard three-credit courses, the Davis badge system is based on the sustainable-agriculture program’s core competencies—”systems thinking,” for example. It is designed to organize evidence of both formal and informal learning, from within traditional higher education and without.

As opposed to a university transcript, digital badges could provide a well-rounded view of a student’s accomplishments because they can take into account things like conferences attended and specific skills learned. Clearly, we’re not talking about Girl Scout badges.

Carey seems confident that digital badges aren’t simply a higher education fad. He believes that, with time, these types of systems will grow and be recognized by employers. But I’m still a bit skeptical about whether this movement will gain enough momentum to last.

But just for a moment, let’s assume that this open badge system proves to be a fixture in the future of higher education. Does this mean someday a student could get a badge in various areas of library research, such as searching Lexis/Nexis, locating a book by its call number, or correctly citing a source within a paper? Many college and university librarians struggle with getting information competency skills inserted into the curriculum in terms of learning outcomes or core competencies. And even if they are in the curriculum, librarians often struggle when it comes to working with teaching faculty and students to ensure that these skills are effectively taught and graded. Perhaps badges could be a way for librarians to play a significant role in the development and assessment of students’ information competency skills.

Would potential employers or graduate school admissions departments be impressed with a set of library research badges on someone’s application? I have no idea. But I do know that as the amount of content available via the Internet continues to grow exponentially, it becomes ever more important that students possess the critical thinking skills necessary to search, find, assess, and use information. If digital badges do indeed flourish within higher education, I hope that library research will be a vital part of the badge sash.

Teaching Workload and New Librarians

The following story is true. However, the names have been changed to protect the innocent.

Meredith, an acquaintance of mine from library school, is an extraordinarily bright person with an amazing attitude. The moment I met her, I knew she would make an amazing librarian. Despite the small number of jobs available to academic librarians in this economy and despite being limited geographically, Meredith was hired fresh out of library school as a full-time adjunct instruction librarian at a medium-sized public university. In her first semester Meredith somehow taught over 40 instruction sessions, which included several two-week intensive information literacy course sequences for introductory general education courses.

On the Friday before spring semester classes began, Meredith was informed by her administrators that no temporary staff were to be hired to fill in for a librarian going on sabbatical. Instead, Meredith was now expected to take on 50% of her colleague’s workload, without any additions to her salary. Previously, Meredith had provided her superiors with a thorough account of her work hours—complete with professional standards from the ACRL Standards of Proficiencies for Instruction Librarians and Coordinators—in order to demonstrate that she had a full workload.  Despite this, they believed that she was under-worked and that this addition to her current duties would bring her up to full-time.

To make a long story short, Meredith decided to fight this by arguing that if she was forced to take on 50% more work, the quality of education that she provides would severely deteriorate. She told me, “I cannot roll over and become part of the cycle that is perpetuating the corporatization of higher education.” In the end, Meredith was able to prevent the increase to her workload.

This situation is the result of an unfortunate combination of massive budget cuts and administrators questioning the value of teaching information competency in higher education. While Meredith’s situation is extreme, I have a feeling that it may not be an isolated incident. In this economic climate of dramatic budget cuts, librarians—particularly new, adjunct, and temporary librarians—are especially vulnerable. And the time available for some of us to provide effective instruction in information competency is being squeezed by additional duties and tasks.

I don’t want to make this a “librarians vs. them” kind of a thing because I realize there are a lot of complicated factors at play. But I would like to know: how do we successfully determine and prove what a feasible teaching workload is and how can new librarians like Meredith effectively share and demonstrate workload concerns with their administrators?

The Trouble With Books

Last week I had the opportunity to participate in a conversation with faculty in the library and in other academic departments about undergraduate research assignments. We discussed some of the stumbling blocks that our students seem to face, especially as they search for sources for their papers. It’s hard for us to put ourselves back into the novice mindset that our students have, particularly in their first and second years at college, when they’re not (yet) familiar with the disciplines. We don’t want them to use Wikipedia or other encyclopedias (which may become increasingly scarce?) as research sources, though for background information they’re great. But many students are just not ready to tackle the scholarly research articles that they’ll find when they search JSTOR or even Academic Search Complete.

More and more often I’m convinced that our beginning undergraduates need to use books for their research assignments. Books can bridge that gap between the very general and the very scholarly that is difficult to find in a journal article. They often cover a broad subject in smaller chunks (i.e., chapters), and can provide a good model for narrowing a topic into one that’s manageable for a short research assignment. Books can also help students exercise the muscles that they need for better internet and database searching as they mine chapter titles and the index for keywords. I’ve begun to push books much more vocally in my instruction sessions for these very reasons.

However, books come with stumbling blocks, too. Ideally students could search our library catalog and find the books they need for their assignments right on our shelves. We have a collection that serves our students’ needs well, I think, especially in the degree programs. But we are a physically small library, and it’s difficult for us to build a book collection to serve the general needs students have in English Composition I courses, for example. While some of those sections focus on New York City or Brooklyn in their reading and research, in other sections students can choose their own topic, or the faculty member picks a topic of interest which may change from semester to semester. It’s difficult to keep up with these changing topics and, though all of those classes come to the library for an instruction session, we often don’t know which topics students select unless they stop by the Reference Desk to ask for help with their research.

My college is part of a university in which all of the libraries circulate books in common, as do many academic and public library systems. Students (and faculty/staff) can have books delivered between the colleges in just a few days, and we encourage students to take advantage of this service when they’re hunting for sources on their research topics. But sometimes students aren’t doing their research far enough in advance to accommodate the time required to have a book delivered, and, while they can also visit the other colleges’ libraries in person, they may not have the time for that, either.

What about ebooks? Ebooks can help bridge the just-in-time gap, though they are not without their own issues: subscriptions to ebook packages that may shift the titles available over time, confusing requirements for reading or downloading on mobile devices, variable rules about what can or cannot be printed, etc. And while all of the ebooks we offer in our library can be read on a desktop computer, of course we can’t always accommodate all students who want to use a computer in the library.

So I’m left wondering: how can we get more (and more relevant) books into the hands of our beginning students? And, barring that, are there other resources that cover that middle ground between the general knowledge of encyclopedic sources and the specific, often too advanced, scholarly research of journal articles?