Corner Office: Google's Dan Clancy
As the Google Book Search settlement deadline looms, the company's point man talks about pricing, orphan works, and the role of libraries
By Andrew Richard Albanese & Norman Oder -- Library Journal, 5/1/2009
If you thought Google had a strong presence on the web, you should see its playful, spacious New York digs. At first glance, you wonder how anyone gets any work done there. But the coffee bars, café tables with board games, Legos, and artful displays of old technology all help keep the Google engine running.
Google director of content partnerships Tom Turvey introduces us (via a video link from California) to Dan Clancy, engineering director for Google Book Search. Turvey first points out that Google Book Search involves two spheres, libraries and more than 20,000 publishing partners, a separate track that has already contributed more than one million scanned book titles.
Then the talk turns to the pending Google Book Search settlement, involving millions of volumes digitized from libraries, which drew a lawsuit from the Association of American Publishers and the Authors Guild. With a court hearing to resolve the settlement looming in June (and amicus briefs filed by library groups and others), LJ's wide-ranging discussion with Clancy focuses on the decision and its implications for libraries, users, and the future of books on the web.
(See LibraryJournal.com/GoogleSettlement for comprehensive links to documents and news coverage, including increasing concerns over pricing and orphan works.)
LJ: To start, a simple question. Why settle?
DC: The settlement was driven by what we felt was, in the end, better for everyone and better for users in particular. We strongly believe in our fair use position, but we didn't start this project to win a court case on fair use. We started it to provide discovery tools. This settlement is an opportunity to do what, I think, from a user perspective is far better. The snippets we've been showing are a far cry from what the user wants, and really the only solution was a partnership. We assume we would have gone through the courts and won. But once we won, we still would've had snippets.
Some say the settlement favors expediency over a more optimal solution.
[Harvard University librarian] Bob Darnton wrote his piece, and he talked about an ideal solution. We just never saw that future coming to bear. Certainly we feel there will be progress in terms of cleaning up the public domain. And we would like to see progress on the orphan works issue. We just didn't see that this government, or any government, would be doing something that would waive copyright for books out of print.
Can this plan quickly go into practice?
I think many of the biggest challenges are behind us. Obviously, overseeing this whole activity takes a lot of commitment on Google's part. Assuming the court approves the settlement, we don't think it will be long before some of the products and services will be available to users. We were already thinking through some of the issues and products and decisions.
What kinds of products and services?
A couple of years ago we publicly discussed what's called consumer purchase [as part of] our partner program. Plans slowed down because we were talking to publishers, and others, about specific aspects of product development, like engineering. And part of this is because we would be selling content. We view our core mission as search and discovery.
Pricing is certainly a big concern for libraries.
Pricing needs to be driven by two mutually compatible goals: broad access and revenue on behalf of the rights holder. This type of database isn't something that's going to flourish by exclusivity. If this is not priced accordingly, universities can choose not to subscribe. [A library patron] can always preview a book and then get it through [interlibrary loan]. This is where the existence of a robust consumer model really helps. What we see in consumer markets, in general, is that it drives price to a reasonable level.
How will the Registry work? Will it have a librarian on it?
Sometimes there's been confusion that it's Google's registry. It isn't. The Registry is designed as an organization to represent rights holders. The idea of Google having an advisory board of people we're partnering with is interesting. I think, as with any product, there's lots of ways to get input. Right now, there's the class action settlement. Once that's authorized, we'd be very interested in creative ideas about an advisory board.
One free terminal for public libraries sounds archaic.
There's been a lot of misunderstanding about that. If we took out the stuff about the one free public access terminal, everyone would just be looking at the value proposition. The settlement agreement says at least one public access terminal, but the Registry has the ability to authorize additional access points. The settlement currently does not give us authorization for remote access for public libraries. We talked about that with authors and publishers, but they were not necessarily comfortable with the interplay between public library remote access and the consumer model. What's the right model for the public library? People say you haven't necessarily gotten it right, but it's not like we knew. This is a very different resource.
Will you administer subscriptions yourselves or use vendors?
That hasn't been determined yet. It's our job to sell the content, whether we use third-party vendors or not, and, in this market, there are a number of entities that do that. But we're still evaluating that space and have a lot to learn.
Critics, like Brewster Kahle of the Internet Archive, say you place unnecessary restrictions on public domain books.
There are many people involved in digitizing public domain content, then selling it back to libraries. We've talked with Brewster, and sometimes people believe we are more different than we are similar. The reality is we are very close. We think the public domain should be free. We do have what are effectively two “netiquette” requests—that users don't rehost or redistribute. The reason for that is because we are investing a large amount of money in digitizing public domain books and giving them back to libraries and users. If it was such a great business, I think a lot more people would be digitizing public domain books and giving them away for free.
Books you scan don't surface in other search engines. Would you lift that restriction for public domain books?
In all of these things, we're looking at what's the right balance. The idea of crawl is something we might do in the future.
Harvard's Darnton warned of a potential monopoly on book content. What might be the competition?
I can break the book world down into four sectors: one, books where publishers have a digital copy, which is almost every new book and many published in the last ten to 15 years. Two is the backlist and out-of-print books, where publishers and rights holders come forward. There are lots of out-of-print books where publishers still have rights, and those books will be claimed, and the Registry will be able to do deals with whomever it wants. Third, we have the public domain, to which we offer full, free access. And four, the area the settlement doesn't solve more broadly: truly orphaned works—works where no one comes forward. That was a limitation of the class action lawsuit, and I think that's an important area.
This agreement doesn't do anything for books published after January 5, 2009. In my view, the broader marketplace is still going to be dominated by new books. There are many questions there, like, what's the right model for libraries in an electronic age when publishers sell electronic content? I don't think this settlement agreement gives an answer to that for new books. It just opens up a database for books that many libraries have decided they would not buy.
How might the settlement help ease the orphan works problem?
This settlement obviously is not a solution to the orphan works problem. What it does is address one of the questions in that area, like what does it mean to do an “adequate search.” In this agreement, the information about what books have been claimed will be public. The Registry has the job of trying to find rights holders—and Google has put money toward trying to get rights holders to come forward through the notice of settlement. With respect to others who wish to use orphan works, that isn't addressed in the settlement, but we support broader orphan works legislation.
Why not make orphan works you've scanned open to all?
We aren't authorized to do that. A class action lawsuit has certain limitations. Legislation could potentially address that, but, right now, the settlement has the limitations of class action law. Under this agreement, it's the libraries' and Google's job to determine what we consider is in the public domain, and there's a provision where we can get a safe harbor, if we happen to screw up.
Librarians are concerned they have no formal voice in all this.
There are two aspects to library interest. One is from a partner perspective—they are providing these books, so what will they get out of it. The other is their interest as consumers of the content. This market is challenging, because libraries have a lot of options. We feel libraries are going to have a very strong voice in terms of the product. Some of their concerns are really about the product, and the settlement agreement isn't where we define that. This legal document talks about everything we can't do. It doesn't say anything about what we will do.
Author Information
Andrew Richard Albanese is Features Editor, Publishers Weekly, and formerly Editor, LJ Academic Newswire. Norman Oder is Editor, News, LJ.























