Library Journal Mobile

ProQuest and Google Strike Newspaper Digitization Deal

Josh Hadro -- Library Journal, 9/12/2008

Hoping to do for newspapers what Google Book Search has done for monographs, ProQuest and search giant Google have reached an agreement to digitize millions of pages of content from ProQuest’s vast newspaper microfilm archives. While ProQuest has vowed to continue improving and expanding its Historical Newspapers collection independently, the Google deal aims to create searchable electronic versions of smaller newspapers otherwise unlikely to be digitized, making them available on the open web via Google’s News archive search. “The problem is that, until now, finding a workable economic model for libraries and publishers has been challenging,” said Rod Gauvin, ProQuest senior VP of publishing. “This model overcomes that hurdle, unlocking a wealth of content for libraries and Internet users with unique research needs.”
Google is underwriting the digitization costs, which have not been detailed, in return for revenue from ads displayed alongside the newspaper page images (one example page was scanned from the St. Petersburg Times). Digitization has begun with content that ProQuest already has the rights to digitize and make available online, including mostly orphaned publications and those in the public domain. For newspapers in the ProQuest archives still bound by copyright, Google and ProQuest execs say they hope to work with copyright owners to reach further agreements, allowing publishers to choose whether to keep articles behind a pay-per-view wall or simply to enter into a royalty-sharing agreement based on ad revenues generated by views of their digitized content.
Scanning toward different ends
Google hopes to digitize hundreds of millions of pages over the next few years, ProQuest VP of publishing Chris Cowan told LJ, a feat made possible less by any advance in scanning technology than by Google’s capacity to work on a large scale and its emphasis on quantity over quality. “[Google is] very creative at throwing technology at problems to build solutions,” Cowan said of the company’s large-scale approach to scanning, adding that nearly the entire process has been automated, from page imaging to optical character recognition (OCR).
The deal leaves significant room for ProQuest to differentiate its Historical Newspapers offering, which contains such publications as the New York Times and Chicago Tribune, as a premium product in terms of added editorial effort and the human intervention required to make its selectively scanned materials more discoverable and useful to expert researchers. In contrast to Google’s automated scanning, editors hired by ProQuest check headlines, first paragraphs, captions, and more to achieve what the company claims is “99.95 percent accuracy.” In addition, metadata is added along with tags describing whether the scanned content is an article, opinion piece, editorial cartoon, etc. Finally, ProQuest stresses that the agreement does not affect long-term preservation plans for the microfilm collection. “Microfilm will always be the preservation medium,” Cowan said, noting that, while digital formats are constantly changing, “film that’s handled appropriately can last several hundred years.”

©2009 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.