Library Journal Mobile

ProQuest and Google Strike Newspaper Digitization Deal

Josh Hadro -- Library Journal, 9/12/2008

Hoping to do for newspapers what Google Book Search has done for monographs, ProQuest and search giant Google have reached an agreement to digitize millions of pages of content from ProQuest’s vast newspaper microfilm archives. While ProQuest has vowed to continue improving and expanding its Historical Newspapers collection independently, the Google deal aims to create searchable electronic versions of smaller newspapers otherwise unlikely to be digitized, making them available on the open web via Google’s News archive search. “The problem is that, until now, finding a workable economic model for libraries and publishers has been challenging,” said Rod Gauvin, ProQuest senior VP of publishing. “This model overcomes that hurdle, unlocking a wealth of content for libraries and Internet users with unique research needs.”
Google is underwriting digitization costs, which have not been detailed, in return for revenue based on ads displayed alongside the newspaper page images (see an example scanned from the St. Petersburg Times). Digitization has begun with content that ProQuest already has the rights to digitize and make available online, mostly orphaned publications and those in the public domain. For newspapers in the ProQuest archives still bound by copyright, Google and ProQuest execs say they hope to work with copyright owners to reach further agreements, allowing publishers to choose whether to keep articles behind a pay-per-view wall or simply to enter into a royalty-sharing agreement based on ad revenues generated by views of their digitized content.
Scanning toward different ends
Google hopes to digitize hundreds of millions of pages over the next few years, ProQuest VP of publishing Chris Cowan told LJ, a feat made possible less through any advance in scanning technology than through Google’s capacity to work on a large scale and its emphasis on quantity over quality. “[Google is] very creative at throwing technology at problems to build solutions,” Cowan said of the company’s large-scale approach to scanning, adding that nearly the entire process has been automated, from page imaging to optical character recognition (OCR).
The deal leaves significant room for ProQuest to differentiate its Historical Newspapers offering, which contains such publications as the New York Times and Chicago Tribune, as a premium product in terms of added editorial effort and the human intervention required to make its selectively scanned materials more discoverable and useful to expert researchers. In contrast to scanning by Google, editors hired by ProQuest check headlines, first paragraphs, captions, and more to support the company’s claim of “99.95 percent accuracy.” In addition, metadata is added along with tags describing whether the scanned content is an article, opinion piece, editorial cartoon, etc.
Finally, ProQuest stresses that the agreement does not affect long-term preservation plans for the microfilm collection. “Microfilm will always be the preservation medium,” Cowan said, noting that, while digital formats are constantly changing, “film that’s handled appropriately can last several hundred years.”

©2009 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.