ProQuest and Google Strike Newspaper Digitization Deal

Josh Hadro -- Library Journal, 9/12/2008

Hoping to do for newspapers what Google Book Search has done for monographs, ProQuest and search giant Google have reached an agreement to digitize millions of pages of content from ProQuest’s vast newspaper microfilm archives. While ProQuest has vowed to continue improving and expanding its Historical Newspapers collection independently, the Google deal aims to create searchable electronic versions of smaller newspapers otherwise unlikely to be digitized, making them available on the open web via Google’s News archive search. “The problem is that, until now, finding a workable economic model for libraries and publishers has been challenging,” said Rod Gauvin, ProQuest senior VP of publishing. “This model overcomes that hurdle, unlocking a wealth of content for libraries and Internet users with unique research needs.”

Google is underwriting digitization costs—which have not been detailed—in return for revenue based on ads displayed alongside the newspaper page images (see an example scanned from the St. Petersburg Times). Digitization has begun with the content to which ProQuest already has rights to digitize and make available online, including mostly orphaned publications and those in the public domain. For newspapers in the ProQuest archives still bound by copyright, Google and ProQuest execs say they hope to work with copyright owners to reach further agreements, allowing publishers to choose whether to keep articles behind a pay-per-view wall, or whether simply to enter into a royalty-sharing agreement based on ad revenues generated by views of their digitized content.

Scanning toward different ends

Google hopes to digitize hundreds of millions of pages over the next few years, ProQuest VP of publishing Chris Cowan told LJ, a feat made possible less through any advance in scanning technology than through Google’s capacity to work on a large scale and its emphasis on quantity over quality. “[Google is] very creative at throwing technology at problems to build solutions,” Cowan said of the company’s large-scale approach to scanning, adding that nearly the entire process has been automated, from page imaging to optical character recognition (OCR).

The deal leaves significant room for ProQuest to differentiate its Historical Newspapers offering, which contains such publications as the New York Times and Chicago Tribune, as a premium product in terms of added editorial effort and the human intervention required to make its selectively scanned materials more discoverable and useful to expert researchers. In contrast to Google's automated scanning, editors hired by ProQuest check headlines, first paragraphs, captions, and more to reach what the company claims is “99.95 percent accuracy.” In addition, metadata is added along with tags describing whether the scanned content is an article, opinion piece, editorial cartoon, etc. Finally, ProQuest stresses that the agreement does not affect long-term preservation plans for the microfilm collection. “Microfilm will always be the preservation medium,” Cowan said, noting that, while digital formats are constantly changing, “film that’s handled appropriately can last several hundred years.”

©2008 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.