
The Birth of the Universal Library

The Open Content Alliance is our opportunity to shape the future of access to information, writes Jonathan B. Bengtson

By Jonathan B. Bengtson -- netConnect, 4/15/2006

We live in revolutionary times. While digitization projects in libraries have been around for a number of years, in the past 18 months the possibilities of digitization and the cultural impact of such scanning projects have leapt dramatically beyond the confines of academia. Making the full text of libraries’ holdings available digitally is without question a natural next step in widening access to our collections—and massive digitization projects, such as that of the Open Content Alliance (OCA), have brought us into a new era.

While Google has grabbed more headlines with its controversial plan to scan millions of books, in copyright or not, from university libraries, including the entire book collection of the University of Michigan, the OCA has quietly launched an alternative plan. The OCA is a collaborative venture that seeks to develop a process for rapidly scanning books and information for the public domain. As the media debate continues over whether and for how long the printed book will survive as the central medium for the dissemination of knowledge, and over which digital forms will emerge to supersede print, ventures like the OCA are blazing a path of significant change. For libraries, participating in this process is not only natural, it is essential. It is the only way we can build on our wisdom and knowledge to benefit most from the age of the Internet.

The OCA

The OCA is driven by the vision of Brewster Kahle, the founder of the Internet Archive. The Internet Archive is a California-based nonprofit founded to create an “Internet library,” offering permanent, free access to historical collections in digital, audio, visual, and printed formats. It is probably best known for its “Wayback Machine,” which has been archiving web sites since the mid-1990s and now contains over 40 billion pages. Kahle’s vision remains the key to understanding the OCA’s goals: first and foremost, to provide free access to the world’s written heritage via the web, or at least that part of the heritage in the public domain.

In autumn 2004, shortly before the announcement of the Google Book Search project, the Internet Archive began collaborating with the University of Toronto library to test methods of digitizing books at the rate of many hundreds of pages per hour. In 2005, as the tests in Toronto were taking place, Kahle was instrumental in bringing together the corporate and nonprofit organizations that would become the original members of the OCA. Unlike Google Book Search, which will scan books still in copyright and provide online previews of these works, the OCA’s approach focuses first on out-of-copyright books and, second, on working with rights holders to provide access to their works if they so choose.

To date, the commercial publisher O’Reilly Media has agreed to make certain in-copyright content available to the OCA, as has University of Toronto Press. But the OCA is not just about books. It also provides free and open access to all formats of the digital files of scanned books, from raw camera images to processed PDFs and JPEGs. One of the OCA’s primary goals in 2006 is to scan a large amount of material that reflects the history, people, culture, and ecology of North America.

This includes projects by the California Digital Library (CDL) to scan thousands of volumes of American literature and a project by a consortium of Canadian libraries in partnership with the University of Toronto to scan material related to Canadian literature, history, and culture. Some items from the California project have been made accessible via a site called “Open Library,” which uses software that mimics turning the pages of a book. Under the aegis of the OCA, the British Library will also be scanning approximately 100,000 works as part of its National Digital Library plan.

The allies

Collaboration is a key component of the OCA. Kahle often compares the goals of the OCA with those of the Library of Alexandria. That is, the ultimate aim is to create a virtual library that contains all of human knowledge, just as the Library of Alexandria was the universal library for antiquity. Such an ambitious undertaking, which Kahle has also compared to putting a man on the moon, can only succeed through an alliance between the nonprofit sector, notably the Internet Archive and the various contributor libraries, and the commercial sector.

To date, the list of commercial partners is impressive. Yahoo! is supplying the search engine for the OCA and indexing the scanned books. Microsoft’s MSN Search has committed $5 million toward the scanning of 50 million pages of text. Other companies such as Hewlett-Packard, LibriVox, Octavo, Lulu.com, and Adobe have also pledged expertise to the OCA. The Research Libraries Group (RLG) will be supplying bibliographic records from its database of nearly 50 million titles, and it is currently working with the CDL in its project to digitize works of American literature.

The collaborative nature of the OCA has also resulted in the formation of a number of working groups to examine other issues involved with this massive digital shift, such as the creation of metadata, collaborative collection development, digital file preservation, and data transfer protocols. Sensitive to the criticism that Google Book Search has received from rights holders, the OCA has also been in discussion with major publishers and organizations as to how legal and sustainable business models might be developed to make more copyrighted content available.

A useful comparison to the projects that are developing under the umbrella of the OCA is the open source movement in the computing sector. For example, a number of OCA contributors, including the Smithsonian Institution Libraries, are working on a cooperative project called the Biodiversity Heritage Library. As the OCA and its partners make progress and share their experience and expertise, such new and innovative projects should continue to emerge.

New tools

While most believe that the printed book will survive in some form (a point that hardly needs making when writing from Marshall McLuhan’s former stomping ground at St. Michael’s College, Toronto), an understanding of the uses and, particularly, the limitations of digital technology is crucial if such technology is to be exploited to its fullest. Unfortunately, ebook displays have yet to provide suitable alternatives to printed books. [See Product Pipeline, p. 18.]

Within the past two decades, however, a rapid new process of change has begun. Early in our history, the transition from an oral to a written culture developed over many centuries. During this slow evolution, our way of thinking fundamentally changed, from repetitive, oral, memory-based knowledge to visual and spatial memory, based on the physical object of the book. For centuries books were simply the most efficient and usable technology for the transmission of culture and ideas. We need only reflect on the past few years to sense how quickly and radically the ways that we write and communicate have been and will be altered.

New technologies such as e-ink are evolving rapidly, even if not as rapidly as we may desire. Sony’s soon-to-be-launched “portable reader system” is the latest gadget to enter the marketplace. Admittedly, at first glance, it does not seem to be a device that will convert large numbers of people to electronic texts. It does, however, represent a significant improvement over existing devices.

Further developments in ebook technology will undoubtedly follow in the coming years. University libraries will likely play a significant role in these technologies as well. Arguably, libraries are where the impacts of change, be they political, educational, or organizational, are most acutely concentrated and where information technology is the most potent agent for that change.

Enduring value

No one can reliably predict how the Internet will change our lives in the coming years. Google Book Search and OCA initiatives, in conjunction with continued improvements in e-ink display devices, may well accelerate the acceptance of ebooks as viable alternatives to traditional printed books. But for all the excitement and possibilities, this is also an uncertain time—certainly for libraries.

Thanks largely to the advent of the computer, libraries are now more successful at providing access to their holdings than at any other time in human history. Through online catalogs and consortia such as OCLC and RLG, the physical card catalog has been consigned to library history. And that’s just the beginning. The philosophy of Open Content Alliance member libraries is to engage creatively with uncertainty. In so doing, we will help redefine library services while, at the same time, defending those traditional library roles that technology cannot and will never supplant.


Link List
ASE EDGE INC.
www.aseedge.com/home.asp
INTERNET ARCHIVE
www.archive.org
JOHN M. KELLY LIBRARY, ST. MICHAEL’S COLLEGE, TORONTO
www.utoronto.ca/stmikes/library/index.htm
NATIONAL INSTITUTE OF NEWMAN STUDIES
www.newmanstudiesinstitute.org
OPEN CONTENT ALLIANCE
www.opencontentalliance.org
OPEN LIBRARY
www.openlibrary.org
SONY E-READER
products.sel.sony.com/pa/prs/index.html?DCMP=reader&HQS=showcase_reader
UNIVERSITY OF TORONTO LIBRARY
main.library.utoronto.ca
 

 

The Newman Project

The OCA scanning center in the John M. Kelly Library at St. Michael’s College, University of Toronto, has two Scribe machines. One of them is dedicated full time to an ambitious project to build a comprehensive digital collection of the writings by and about John Henry Cardinal Newman (1801–90). The Kelly Library houses the largest collection in North America of early editions of this English theologian. As an Anglican priest in Oxford, Newman was the driving force behind the Oxford Movement, which sought to reform the Church of England. In 1845, he shocked Victorian England by converting to Catholicism and was eventually made Cardinal in 1879 by Pope Leo XIII. Comfortable in many different genres, including fiction, poetry, history, and theology, Newman was one of the most influential English Catholic apologists and wrote on a wide range of subjects, from doctrinal theology to the nature of university education to early church history.

In 2004, the library became the center of an international effort to digitize its collection, along with a number of related Newman collections from other institutions, and make them available to a worldwide audience via the Internet. In partnership with the National Institute of Newman Studies in Pittsburgh and the Open Content Alliance, the library is scanning the various collections of Newman’s works in order to create a comprehensive “virtual” collection of every one of his lectures, newspaper articles, sermons, variant editions, and so forth. The scanned text will be analyzed by sophisticated data-mining software, developed and provided by corporate partner ASE Edge Inc., to explore subtle changes in Newman’s thought over time. Legal services are also being provided to identify works for scanning that were printed after 1923 but are no longer under U.S. copyright. When the project is completed, it will be the first time this technology has been used to capture the complete corpus of one of the world’s key intellectual figures. The project will serve as a model for the future application of new 21st-century digital scanning technologies to library book collections.


OCA Members

  • Adobe Systems Incorporated
  • Biodiversity Heritage Library, a cooperative project of the American Museum of Natural History; Harvard University Botany Libraries; Harvard University, Ernst Mayr Library of the Museum of Comparative Zoology; Missouri Botanical Garden; Natural History Museum, London; New York Botanical Garden; Royal Botanic Gardens, Kew; Smithsonian Institution Libraries
  • Boston Public Library
  • Columbia University
  • Emory University
  • European Archive
  • Simon Fraser University (Canada)
  • William and Flora Hewlett Foundation
  • HP Labs
  • Internet Archive
  • Johns Hopkins University Libraries
  • McMaster University (Canada)
  • Memorial University of Newfoundland
  • Missouri Botanical Garden
  • MSN
  • National Archives (UK)
  • National Library of Australia
  • O’Reilly Media
  • Prelinger Archives
  • Research Libraries Group (RLG)
  • Rice University
  • Smithsonian Institution Libraries
  • University of British Columbia
  • University of California
  • University of North Carolina–Chapel Hill
  • University of North Carolina–Chapel Hill, School of Information and Library Science
  • University of Ottawa
  • University of Pittsburgh
  • University of Texas
  • University of Toronto
  • University of Virginia
  • Washington University
  • Xerox Corporation
  • Yahoo!
  • York University (Canada)
