Searching the Library Website and Beyond: A Graduate Student Perspective

This month’s post in our series of guest academic librarian bloggers is by Julia Skinner, a first year Information Studies doctoral student at Florida State University. She blogs at Julia’s Library Research.

I just finished my MLS, and one of the issues raised frequently both in and out of the classroom was how to get college students and researchers to use the library website. Academic librarians I’ve talked with have spent hefty amounts of time (and money) designing sites that meet the self-described needs of patrons, but still find most of the searches that guide students to library resources to be coming from Google. I decided to take a look at my own search habits to get a sense of how, from the graduate student perspective, these tools might be employed, and hopefully generate some discussion about searching on the library website and beyond.

Like many other people, I usually do a quick Google search on my topic early on in the research process. This isn’t necessarily to track down every resource I would be using, but it does give me a general sense of what’s out there on my topic beyond the realm of scholarly materials. Since my own work relies heavily on journal articles, scholarly monographs, primary sources, and other reliable sources, I feel like seeing what people have said outside the ivory tower can be a good way to give myself some perspective about how my topic is thought of and applied elsewhere. Most of the time, as with my research on Iowa libraries during WWI, there’s not much. But sometimes this search helps me find something useful (for example, in my recent work writing chapters for an encyclopedia on immigration, I was able to find information about nonprofits serving the immigrant community and some news stories).

Obviously, the university library is still my go-to source. Journal articles, ebooks, not to mention circulating and special collections, are all where the meat and potatoes of my bibliography can be found. I love that many libraries are putting these collections online and purchasing more digital subscriptions (especially in the winter when I have a serious sinus infection and am locked in my house trying to work!). Sometimes, I find these resources through Google Scholar, but most of the time, it’s through searches within the library’s resources. This is especially true for journal articles, which I’ve found Google hasn’t really nailed yet when it comes to bringing desired results from a simple keyword search (I know, it’s a lot to ask, which is why I love the library site!).

One tool I use heavily is Google Books. Not everything is on there, and most of the things that are have a limited availability (i.e. a preview where only some pages are available) but I have saved countless hours by doing a keyword search in GBooks to get a sense of what’s out there that mentions or is relevant to my topic, but maybe isn’t something I would have grabbed while browsing the shelves. I can then go track down the physical book for a more thorough read, or if I am able to access all the information I need from the preview I can just use it as a digital resource. Some other useful documents are in full view as well: many public domain items, including some ALA documents, can be found there.

Of course, I don’t just use Google Books and assume that’s all there is. I also track down public domain titles on sites like Open Library and Project Gutenberg, and approach them in the same way. It’s a great way to get that one tidbit that really pulls an article together, and I usually find that some of those works don’t overlap with the offerings I find in the databases the library subscribes to. I will sometimes use different search engines, search a variety of fields, run Boolean searches, and so on, all of which helps me extract more little nuggets of information from the vast world of material related to any given topic. Even though I’m an avid Googler, I use library resources just as frequently. I remember speaking with a student a few years ago who could not find anything on her topic through a keyword search, and assumed there was nothing out there on that topic. I was amazed that she hadn’t even considered the university library’s website or physical collections before throwing in the towel! It makes me wonder how many students feel this way, and how we as LIS professionals and instructors can help effectively remove those blinders.

One thing I think will be interesting in the coming years (and which is a great thing to get input about from academic librarians!) is learning more about search habits among undergraduates. I’ll be TAing for our MLIS program this semester, so I’ll be working with students who are my age, getting the degree I just recently obtained, who are tech savvy and knowledgeable about search. What happens when I TA for an undergraduate course? Is sharing my search strategies helpful for papers that only require a handful of sources, and don’t require you to look at a topic from every imaginable angle? I argue that teaching search as something done in as many outlets as possible has the potential to make students better researchers, BUT only if that goes hand in hand with instruction on critically evaluating resources.

Without that, one runs the risk of putting students in information overload or having students work with sources that are irrelevant/untrustworthy. I’m a big fan of helping students recognize that the knowledge they have and the ideas they create are valuable, and it makes me wonder if building on their current search habits in such a way that encourages them to speak about the value of those sources, the flaws in their arguments, etc. will help promote that. I remember having a few (but not many) undergrad courses that encouraged me to draw upon my own knowledge and experience for papers, and to critically analyze works rather than just write papers filled with other people’s arguments followed by “I agree/disagree.” I feel like teaching is moving more in the direction of critical analysis, and I’m excited to see the role that librarians and library websites play!

Thinking About ‘The Filter Bubble’

This month’s post in our series of guest academic librarian bloggers is by Jessica Hagman, Reference and Instruction Librarian at Ohio University. She blogs at Jess in Ohio.

Last fall, I taught a one-credit learning community seminar. During the week where we discussed research and library resources, I showed the class this video from Google, describing how the search engine works. I suspected that most students had no idea how links come to the top of a Google search results page and no basis on which to begin evaluating the results beyond page rank, a suspicion confirmed by research from the Web Use Project (previously discussed here on ACRLog).

Yet, when I asked whether the video surprised them or if the search engine process was different than they had previously thought, I heard the proverbial crickets. Finally, one student spoke up with a shrug, “I guess I’ve just never thought about it before.” While I probably shouldn’t have been surprised that few students spent time thinking about the mechanics of Google, it was startling to hear it stated so clearly.

I thought about this comment again a few weeks ago when I ran across a link to Eli Pariser’s TED Talk “Beware Online Filter Bubbles.” In the talk, and in his new book elaborating on the subject, Pariser argues that companies like Facebook and Google use the data we share online to build a personalized bubble around each person, in which they only encounter information, news and links that confirm their already established world view and assumptions. And while the bubble is pervasive, it is mostly invisible.

After watching the talk, my thoughts turned to the undergraduate researcher writing about a contentious social issue like gun control or abortion whose browser history limits the scope of the results they see on Google. I’ve discussed Google searching in many library instruction sessions, but it’s usually been to point out the poor quality of some of the search results and to encourage students to look beyond the first link. Starting in the fall, I will mention the personalization of search results as well, so that students are at least aware that their search results reflect more than just the keywords they searched.

The implications of the filter bubble may go beyond the research for a freshman composition paper, however. In the later chapters of his book, Pariser argues that the pervasiveness of filter bubbles may hinder learning, creativity, innovation, political dialogue, and even make us more susceptible to manipulative advertising. It’s difficult to discuss these consequences in a one-shot library instruction session, but to know that the bubble exists is a powerful first step to escaping it when necessary.

I will be teaching the learning community seminar again this fall, and this year I will show them Pariser’s talk. While I think it’s important that they be aware of personalized search and its potential implications, I’m also very curious to hear what students think about personalized search and a world of filtered information. While they may not have spent much time thinking about Google in the past, I hope that seeing the video will encourage them to think about how their own search history and browsing data affect what they see – or do not see – online.

In Google They Trust

An interesting article swam through my Twitterstream recently that’s a perfect complement to the Project Information Literacy report that Barbara mentioned last week. It’s a recent publication of research by the Web Use Project led by Eszter Hargittai, a professor of Communication Studies at Northwestern University. The article, Trust Online: Young Adults’ Evaluation of Web Content, appears in the latest issue of the International Journal of Communication (which is open access, hooray!), and reports on the information-seeking behavior of college freshmen at the University of Illinois at Chicago. Specifically, the researchers examine how students search for, locate, and evaluate information on the web.

Surveys were administered to 1,060 students, then a subset of 102 students were observed and interviewed as they searched for information on the internet. In the survey students were asked to rate criteria they use for evaluating websites and how often they use those criteria when doing research for their coursework. Students rated several criteria as important to consider when searching for information for school assignments, including currency/timeliness, checking additional sources to verify the information, identifying opinion versus fact, and identifying the author of the website.

However, while students surveyed and interviewed know that they should assess the credibility of information sources they find on the web, in practice this didn’t always hold true. When researchers observed students searching for information, the students rarely assessed the credibility of websites using what faculty and librarians would consider appropriate criteria, e.g., examining author credentials, checking references, etc. Instead, they placed much trust in familiar brands: Google, Yahoo!, SparkNotes, MapQuest, and Microsoft, among others.

Students also invested their trust in search engines to provide them with the “best” results for their research needs. While some acknowledged that search engine results are not ranked by credibility or accuracy, they asserted that in their experience the top results returned by search engines were usually the most relevant for them. Adding to the confusion, some students went right to the sponsored links on the search engine results page, which are not organic results at all but paid advertising.

Some of the students interviewed were able to differentiate between the types of information usually found on websites based on domain name, remarking that websites with .edu and .gov addresses are most trustworthy. But students were less clear on the differences between .org and .com. Many regard .org websites as more trustworthy, probably because originally that domain was reserved for non-profit organizations, a restriction which no longer exists.

I highly recommend giving this article a read, as it’s full of additional data and details that I’m sure will resonate with academic librarians. For me, reading this article was like stepping into one of my English Comp instruction sessions. I always devote a portion of the class to discussing doing research on the internet, often ask students these same questions, and (usually) get the same responses. It’s great to see published data on these issues, and I hope the article is widely read throughout higher ed. My one wish is that there were a way to comment directly on the article and remind faculty that librarians can collaborate with them to strengthen their students’ website evaluation skills.

Widespread Ignorance About Google B.S.

According to a story in this morning’s Chronicle, many scholars remain “wary” of the Google Book Search project. This is perhaps to be expected (many librarians are wary of it, too, although I prefer to think of our work more as “due diligence”), but more distressing is the conclusion drawn by Pamela Samuelson (UC Berkeley School of Information and Co-Director of the Berkeley Center for Law and Technology) that there is “widespread ignorance [among our colleagues] about the agreement and its implications for the future of scholarship and research.”

Samuelson and her co-authors note that several provisions of the proposed Google B.S. settlement “seem to run contrary to scholarly norms and open-access policies that we think are widely shared in scholarly communities.” In the Chronicle’s report of their concerns, one can see the potential benefit on campus of a robust scholarly communications education program, i.e., one that engages librarians, faculty members, graduate students, and others (e.g., University Press, Graduate College, Office of Research) in a discussion of issues such as author rights, copyright management, and open access policies and publishing. Such a discussion would also take up the library, the press, and the leaders of scholarly societies and professional associations (who are also often on our campuses) as the pillars supporting a new vision of the university’s role in the dissemination of research and scholarship.

Is Samuelson right? Is there “widespread ignorance” on your campus regarding the implications of the Google Book Search settlement? Is this part of a broader “teachable moment” on your campus on scholarly communication issues and the resources that your library is ready to put in play to help faculty to better understand these issues and to understand both the potential of large-scale digitization programs for enhancing discovery of scholarly materials, and the implications that taking one or another direction on those programs may have for the process of scholarly communication? Will you be taking advantage of that teachable moment?

Quick quiz: when Google Scholar went live, many information literacy instruction programs began to offer workshops on how to use Google Scholar as part of the research process; how many of you with scholarly communication education programs are planning (or have already conducted) workshops on the broader implications of Google Book Search for local understanding of author rights, open access alternatives, use of Creative Commons, etc.? Have you shared resources such as ARL’s Guide for the Perplexed? Who have been your campus partners in developing such programs?

We’re academic librarians. “Widespread ignorance” is something we should be able to help to address!

Libraries on Planet Google

It has been a week since news of the Google settlement with authors and publishers broke. Though rumors had been rife that it was imminent, I was still blown away by the scope of it. Of course the court still has to rule, but the outlines – if they remain intact – are stunning in their implications.

First of all, as Jeffrey Toobin predicted in his 2007 New Yorker article, “Google’s Moon Shot,” the fair use question remains unsettled. Anyone else who tries to follow in Google’s footsteps to digitize in-copyright books had better have many millions of dollars handy to pay lawyers’ fees. This puts Google in an incredibly strong position. They will have a lock on great big digitized book collections. They have overnight become an enormous vendor of licensed content. And a huge product with no competitors can set the agenda. Did the libraries that jumped on this bandwagon foresee this outcome? Are they happy with it?

Paul Courant of UMich sees the positive side.

First, and foremost, the settlement continues to allow the libraries to retain control of digital copies of works that Google has scanned in connection with the digitization projects. We continue to be responsible for our own collections. Moreover, we will be able to make research uses of our own collections. The huge investments that universities have made in their libraries over a century and more will continue to benefit those universities and the academy more broadly.

Second, the settlement provides a mechanism that will make these collections widely available. Many, including me, would have been delighted if the outcome of the lawsuit had been a ringing affirmation of the fair use rights that Google had asserted as a defense. (My inexpert opinion is that Google’s position would and should have prevailed.) But even a win for Google would have left the libraries unable to have full use of their digitized collections of in-copyright materials on behalf of their own campuses or the broader public. . . . The settlement cuts through this morass. As the product develops, academic libraries will be able to license not only their own digitized works but everyone else’s. Michigan’s faculty and students will be able to read Stanford and California’s digitized books, as well as Michigan’s own. I never doubted that we were going to have to pay rightsholders in order to have reading access to digitized copies of works that are in-copyright. Under the settlement, academic libraries will pay, but will do so without having to bear large and repeated transaction costs. (Of course, saving on transaction costs won’t be of much value if the basic price is too high, but I expect that the prices will be reasonable, both because there is helpful language in the settlement and because of my reading of the relevant markets.)

Harvard is not so sanguine, according to a story in the Chron. They didn’t allow Google to digitize in-copyright books, and they will stick with that practice.

Harvard’s concerns center on access to the scanned texts — how widely available access would be and how much it might cost. “As we understand it, the settlement contains too many potential limitations on access to and use of the books by members of the higher-education community and by patrons of public libraries,” Harvard’s university-library director, Robert C. Darnton, wrote in a letter to the library staff.

He noted that “the settlement provides no assurance that the prices charged for access will be reasonable, especially since the subscription services will have no real competitors [and] the scope of access to the digitized books is in various ways both limited and uncertain.” He also expressed concern about the quality of the scanned books, which “in many cases will be missing photographs, illustrations, and other pictorial works, which will reduce their utility for research.”

Lawrence Lessig thinks there’s a lot that’s good about the settlement. We dodged the bullet of a loss on the fair use issue and improved on what was available in Google Books previously without shrinking the definition of fair use:

IMHO, this is a good deal that could be the basis for something really fantastic. The Authors Guild and the American Association of Publishers have settled for terms that will assure greater access to these materials than would have been the case had Google prevailed. Under the agreement, 20% of any work not opting out will be available freely; full access can be purchased for a fee. That secures more access for this class of out-of-print but presumptively-under-copyright works than Google was initially proposing. And as this constitutes up to 75% of the books in the libraries to be scanned, that is hugely important and good. That’s good news for Google, and the AAP/Authors Guild, and the public.

Andrew Keen isn’t so sure – as he writes in The Independent, “Will Life on Planet Google be a Nightmare or a Dream?” (And he is one of a few who consider the privacy issues – once a closely guarded value of libraries. We don’t think anyone should keep an eye on what you read. Unless it’s Uncle Google.)

Is Google good or is it evil? Is the company an all-knowing behemoth that is hubristically “transforming our lives”, Big Brother-style, with its intrusive technology? Or is it a plucky, selfless Silicon Valley start-up that is “audaciously” organising all the world’s information for all of our benefit? Is Google Orwell or is it Disney? . . .

The truth — and even on planet Google there remain truths – is that Google’s greed for knowledge is both thrillingly audacious and terrifyingly threatening. Google is, in fact, an Orwell-Disney co-production. The company wants to know everything about us so that it can help us in every way. Room 101, then, on planet Google, is a brightly lit, cheerful place where we can, at the click of a mouse, know all there is to know about ourselves, our neighbours and the world.

Brewster Kahle, not surprisingly, told the Mercury News this is a bad move. “When Google started out, they pointed people to other people’s content,” Kahle said. “Now they’re breaking the model of the Web. They’re like the bad old days of AOL, trying to build a walled garden of content that you have to pay to see.” Of course our libraries are full of enormously expensive walled gardens. And with this settlement we’ll have one more to tend. A big one. A big one with no serious competitors.

While Lessig is cheered that this settlement may well torpedo the flawed orphan works legislation pending in Congress, Georgia Harper encourages libraries to keep working on alternatives to the Google orphanage.

This isn’t the Congressional approach to problem solving (shove the parties into a room and lock the door until they have reached an agreement — and may the strongest interest obliterate the weaker and we’ll call it a compromise in the public interest). This is the publisher’s and Google’s no nonsense business approach: “Hey, let’s just start selling all the books and if there’s money to be made, the owners will either show up to claim it, or the money will lie there for 5 years while we give everyone time to wake up and smell the coffee. At the end of 5 years, we’ll pretty much know what’s orphan and what’s not. What’s not to like?” . . .

Google clearly understood and accepted that this plan was based on an idea I found repugnant: if orphan works don’t have owners, by definition, then why is it that the Registry should keep the money that comes in for books that ultimately no one claims? The publishers and authors just don’t see orphans as really belonging to everyone in the absence of an owner. They see them as belonging to all the other authors and publishers, but not the public. . . .

I want this process to work. I think it has a much better chance of working than that piece of, uh, than that piece of legislation that nearly passed earlier this fall. It doesn’t give us an answer today and it *only* deals with books, so it’s not a comprehensive solution, but it might serve as an example of what works, assuming it does work. But libraries can still do their own research on individual titles that they think may be orphans while we wait for this deal’s market incentives to do their job, and for it to become clear that transparency is in the owners’ best interests as well as the public’s.

For example, I believe that the OCLC’s Copyright Evidence Registry is just as important today as it was 5 days ago before Google announced this deal. Although the publisher/author Registry has potential to be definitive, there will be need for multiple sources of information about the copyright status of works until the publisher/author Registry earns its keep. No source that wants to be definitive can do so if it can’t be trusted.

James Gibson wraps up his analysis in the Washington Post:

By settling the case, Google has made it much more difficult for others to compete with its Book Search service. Of course, Google was already in a dominant position because few companies have the resources to scan all those millions of books. But even fewer have the additional funds needed to pay fees to all those copyright owners. The licenses are essentially a barrier to entry, and it’s possible that only Google will be able to surmount that barrier.

Sure, Google now has to share its profits with publishers. But when a company has no competitors, there are plenty of profits to share.

For more commentary, see the round-ups provided by Library Journal, Peter Suber, and EFF.

UPDATE: Library Journal on not holding our breaths; Peter Brantley on the stinginess of the public library provision.

photo courtesy of stevecadman