Siva Vaidhyanathan Questions Google Book Search

Friday at the Drexel University Libraries’ Scholarly Communication Symposium, Siva Vaidhyanathan raised some serious questions about the partnership between libraries and Google in a powerful and provocative analysis of Google Book Search.

Siva is unique in that he combines an in-depth knowledge of copyright, a reasoned appreciation for new technology and a clear love and deep respect for libraries and librarians with a strong sense of social justice and the public good. He is a skilled presenter and powerful speaker. Others wear suits, he wears a black leather jacket. He tends to raise difficult questions. Either we should feel very lucky that he has chosen to cast his critical eye on our issues, or we should feel slightly nervous.

He began by dismissing both Kevin Kelly’s overenthusiastic embrace of the universal library and John Updike’s nostalgic defense of the traditional publishing world. He agreed with Kelly that digitization was a worthwhile goal but asked if Google Book Search was the right way to do it, if now was the right time, and if copyright was up to the task. He also disagreed with Kelly by countering that books are linear for a reason. He conceded that people of the book are racked with anxiety about the future, implying that this may have been a motivating factor pushing libraries into too hasty deals with Google.

Vaidhyanathan’s concerns about Google Book Search include some nitty-gritty quality issues: improper ranking and the inadequacy of some search results. He pointed to a search for “copyright” in which the first hit is a book from 1912, and a search for “Copyright Law” that does not pull up the most recent and relevant books. This suggests that despite Kelly’s claims, users would still be better off if they consulted with a librarian. He also searched for some famous literary quotes (“it was the best of times”) that did not turn up their sources, but he did admit that at least one search (“Karl Rove”) turned up two good books at the top of the list and led him to information that he had not previously known.

Vaidhyanathan then reeled off five questions each for Google and the Google Library Partners:

For Google:
1. What will be the guiding principles for inclusion, exclusion, and rank within the index?
2. What will be scanned first? What order?
3. What safeguards are you taking to ensure user confidentiality and privacy?
4. What will be your metadata standards? Why would one book outrank another?
5. Will you omit certain titles from the index if a government demands it? Or will you merely present snippets to indicate the book’s existence?

For Libraries:
1. Did you insist on assurances that Google would protect user confidentiality and privacy?
2. Did you insist that Google’s index include input from your librarians about quality control, order of inclusion, order and metadata standards?
3. Did Google restrict your use of the digital files in any way? (no obvious restrictions in Michigan contract.)
4. Did you consider the harm to potential markets for publishers who have been selling and leasing digital files? What is the copyright justification for receiving an electronic copy as payment for a transaction?
5. What’s the hurry?

At one point, Vaidhyanathan compared Google Book Search to the Human Genome Project. Here, he claimed, a for-profit company named Celera demonstrated it could do the work better and faster, but governments declined, recognizing that this information should not be privatized. Now Vaidhyanathan became animated, stating that it should be the same for knowledge and asking, “since when is expediency one of the core values of librarianship?” (Ouch. I was taken aback when he said this.) He basically accused some of the world’s top research libraries of rushing into deals with Google in which they did not realize the true value of their holdings, failing to insist on quality control, failing to guarantee user privacy, and damaging their relationships with publishers.

These are serious charges, and some of them have surfaced in a previous ACRLog post. As for the Google Library Partners side of the story, their public statements do point to the public good as a motivation for making their collections more widely accessible, and it’s hard to fault them for that. Vaidhyanathan’s comparison to the Human Genome Project seems unfair: as far as I know, no governments were willing to step up to digitize books on anything like the scale of Google Book Search. According to the University of Michigan, it would have taken them more than a thousand years to digitize their collection at their previous pace. New York Public Library strikes a cautious tone in its statement and hardly seems to be rushing into anything. As for quality control and metadata issues, I don’t know, but the FRBR Blog has reported that a Library of Congress working group on bibliographic control has recently met with Google. The point about libraries’ relationships with publishers is an interesting one; however, there is an argument that GBS will help publishers sell more books from their back catalogs.

But what about privacy? What were the discussions between the Library Partners and Google on privacy? Vaidhyanathan reminds us that Google is not just any company, but a humongous company with ambitious aims “to organize the world’s information.” Do we want all our information needs to be met and filtered through a lens that ultimately has profit as its main aim?

In other interesting thoughts that Vaidhyanathan did not fully expand upon, he said he thought the issue of the libraries receiving an electronic copy as payment for the transaction could be the silver bullet that would lose a court case for Google. I think he also said that if the project succeeds, it will ultimately weaken fair use.

I don’t remember one session at ACRL Baltimore on Google Book Search. If Vaidhyanathan tells us one thing, it’s that as a profession, we need to know more.

UPDATE – A writer at the American Historical Association blog confirms some of Siva’s worries.