Library Journal Mobile

Lots of Librarians Can Keep Stuff Safe

Libraries are able to safeguard content with LOCKSS, open source digital preservation software

By Karen G. Schneider -- Library Journal, 8/15/2007

The image of lost or destroyed library collections conjures up floodwater pouring into libraries in New Orleans or firefighters in Fahrenheit 451 dousing books with kerosene. But for digital collections, the reality, though subtler, is equally disturbing: over the next several decades libraries face the potential loss of all the e-journals, ebooks, electronic theses, local digital collections, and other “e-stuff” curated for the public good.

One answer to this problem is LOCKSS (“Lots of Copies Keep Stuff Safe”), free, open source digital-preservation software from Stanford University that preserves digital content in a library-to-library network, just as multiple libraries keep copies of the same book. LOCKSS gets its name from its core preservation strategy. LOCKSS boxes—computers running LOCKSS software—share content with one another, ensuring that digital content is not simply backed up to one or two locations but is replicated across a network. “Having multiple copies of our content on a geographically dispersed network gives us more confidence that it will still be around in five years, ten years, or—given weather conditions down here—next week,” says Aaron Trehub, director of library technology at Auburn University Libraries, AL.

LOCKSS was introduced to preserve e-journals in academic libraries but has been deployed for a variety of digital content. While the software is free, membership in the LOCKSS Alliance is necessary for access to e-journals and certain features. Annual fees for academic institutions range from $1,000 to $11,000.

Installation and maintenance

Librarians say LOCKSS is easy to install (instructions are on the LOCKSS site). In a nutshell, you need an old PC you can dedicate to the task (LOCKSS actually runs better on older hardware) and a CD or write-protectable flash drive. “Get computer. Insert CD. Answer bunches o' questions regarding system administration. Go,” reports Eric Lease Morgan, head of the digital access and information architecture department, University Libraries of Notre Dame, South Bend, IN. Maintenance of a LOCKSS box is “almost trivial,” according to Morgan.

When content preserved within the LOCKSS network changes, the LOCKSS Alliance sends out updates by email. Dave Bretthauer, enterprise team leader at the University of Connecticut Libraries, Storrs, says these updates “couldn't be easier. Receive an email, cache the content, save a new backup configuration file in a couple of places, and you're done.”

Format migration

Integrity checking is another component of the LOCKSS belt-and-suspenders approach to digital preservation. The software continuously and sequentially audits all known LOCKSS boxes to ensure content has not been corrupted, deleted, or otherwise changed without the knowledge of the e-journal publishers.
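To make the auditing idea concrete, here is a minimal Python sketch of comparing copies across boxes. It is an illustration only, not the LOCKSS protocol itself: the box names, the use of SHA-256, and the simple majority rule are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one box's copy."""
    return hashlib.sha256(content).hexdigest()


def audit(copies: dict[str, bytes]) -> tuple[str, list[str]]:
    """Find the boxes whose copy disagrees with the majority.

    `copies` maps a hypothetical box name to the bytes that box holds.
    Returns the majority digest and the sorted names of disagreeing boxes.
    """
    if not copies:
        raise ValueError("no copies to audit")
    digests = {box: digest(data) for box, data in copies.items()}
    majority, votes = Counter(digests.values()).most_common(1)[0]
    # Assumption for this sketch: a strict majority decides which copy is intact.
    if votes <= len(copies) / 2:
        raise ValueError("no majority: cannot tell which copy is intact")
    damaged = sorted(box for box, d in digests.items() if d != majority)
    return majority, damaged


def repair(copies: dict[str, bytes], damaged: list[str]) -> None:
    """Overwrite each damaged copy with a copy from an agreeing box."""
    good = next(data for box, data in copies.items() if box not in damaged)
    for box in damaged:
        copies[box] = good
```

With six boxes holding the same article and one copy altered, `audit` would name the altered box, and `repair` would restore it from a box that agrees with the majority. The more independent copies there are, the harder it is for a single failure to go unnoticed.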

Content formats change rapidly in the digital environment—even current versions of Word cannot open some older Microsoft files. LOCKSS provides format migration, another crucial preservation technology. LOCKSS project cofounder David Rosenthal describes format obsolescence as “the prostate cancer of digital preservation. It is a serious and ultimately fatal problem.... But it is highly likely that something else will kill you first, so 'watchful waiting'...is normally the best course of action.”

For its “watchful waiting” strategy, the LOCKSS project uses the “migration on access” method of format migration, which converts the document to a new, readable format when a user requests it while also preserving the original format as much as possible. Migration on access minimizes demand on the system, since the 80/20 rule in librarianship means that most library content gets accessed rarely if at all.
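The logic of migration on access can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not LOCKSS code: the `Archive` class, the format labels, and the converter registry are hypothetical names invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StoredDocument:
    fmt: str     # format label, e.g. "html" (labels are illustrative)
    data: bytes  # the document's bytes


class Archive:
    """Hypothetical archive that converts documents only when requested."""

    def __init__(self, readable_formats: set[str]) -> None:
        self.readable_formats = readable_formats
        self.conversions = 0  # counts how often a conversion actually ran
        self._store: dict[str, StoredDocument] = {}
        self._converters: dict[str, tuple[str, Callable[[bytes], bytes]]] = {}

    def preserve(self, doc_id: str, fmt: str, data: bytes) -> None:
        """Store the document in its original format, untouched."""
        self._store[doc_id] = StoredDocument(fmt, data)

    def register_converter(
        self, old_fmt: str, new_fmt: str, convert: Callable[[bytes], bytes]
    ) -> None:
        """Record how to turn an obsolete format into a readable one."""
        self._converters[old_fmt] = (new_fmt, convert)

    def access(self, doc_id: str) -> StoredDocument:
        """Return a readable rendition, converting only if needed."""
        original = self._store[doc_id]
        if original.fmt in self.readable_formats:
            return original
        if original.fmt not in self._converters:
            raise LookupError(f"no converter registered for {original.fmt!r}")
        new_fmt, convert = self._converters[original.fmt]
        self.conversions += 1
        # The stored original is never overwritten; only the served copy changes.
        return StoredDocument(new_fmt, convert(original.data))
```

Two properties of the approach show up here: no conversion work happens until someone calls `access`, so rarely used content costs nothing to keep, and the original bytes stay in the store, so a better converter written later can still start from the source.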

Fidelity to the original format means future generations should see content as we meant to present it—down to fonts, page widths, and relationships to other articles on the page—not filtered through several digital reinterpretations of what we originally intended.

LOCKSS vs. Portico

The preservation of e-journals, the impetus for LOCKSS, remains its most common use. Many publishers have agreed to allow LOCKSS servers to store and share content with authorized subscribers; the publishers also provide the journal content the LOCKSS network preserves. The library's license agreements determine what journals it can access.

For e-journal preservation, LOCKSS competes directly with Portico, a project initially funded by the Andrew W. Mellon Foundation, Ithaka, the Library of Congress, and JSTOR. (LOCKSS won't divulge numbers on publishers or members.)

LOCKSS is attractive to libraries already comfortably maintaining servers and open source software; for these institutions, Portico's proprietary software and annual licensing fees are less appealing. Librarians using Portico counter that LOCKSS has fewer publishers participating (one librarian at an institution with a large e-journal collection reported that LOCKSS had 12 percent of its titles and Portico 33 percent) and stress Portico's ease of use, as Portico maintains the content on its own servers. LOCKSS proponents claim that the LOCKSS model of storing content locally, on library-owned servers, is inherently more secure, even if a third party like Portico is a nonprofit established by librarians.

LOCKSS and Portico handle format obsolescence differently. LOCKSS uses migration on access and strives to maintain the fidelity of original formats; Portico accepts prepublishing files from publishers and, as the Portico web site notes, “converts them from their original proprietary formats to an archival format” based on the National Library of Medicine archival standard. Which approach will eventually prove correct—assuming either does—remains open to debate.

Donald Waters, program officer for scholarly communications at the Mellon Foundation, which has funded both LOCKSS and Portico, notes that Mellon's funding was intended “to give the marketplace of scholarly institutions an opportunity to vote with their own investments.”

Some libraries are opting for LOCKSS, some Portico, and a few, like the University of Connecticut Libraries, are trying both. “E-journal preservation is at far too early a stage for us to put all our eggs in one basket,” says Bretthauer.

Beyond e-journals

Preservation concerns stretch far beyond e-journals. Librarians are using LOCKSS to establish regional or special-topic preservation networks for their own digital content, from electronic theses and dissertations to digital collections for local history.

The Alabama Digital Preservation Network (ADPNet), a LOCKSS network involving six academic libraries and the Alabama Department of Archives and History, started in late 2006. Richard Pearce-Moses, deputy director for technology and information resources at the Arizona State Library, Archives and Public Records, plans to use LOCKSS in the library's Persistent Digital Archives and Library System. MetaArchive, a project among six universities in the Southeast, established a LOCKSS network dedicated to regionally appropriate content. (The LOCKSS project recommends six as the minimum number of LOCKSS boxes in a network.)

What's next?

Libraries are in an odd place now. They have never before owned so little of the content they manage. LOCKSS offers one solution, even if it's unclear how the e-journal preservation wars will pan out. Meanwhile, the use of LOCKSS for preserving local born-digital content—with a free download, plus one morning's worth of time—is certainly worth a spin around the block.


LINK LIST
Collection-Based Persistent Digital Archives
www.dlib.org/dlib/march00/moore/03moore-pt1.html
Format Obsolescence
blog.dshr.org/2007_05_01_archive.html
LOCKSS
www.lockss.org
LOCKSS Alliance
www.lockss.org/lockss/LOCKSS_Alliance
LOCKSS Publishers
www.lockss.org/lockss/Publishers_and_Titles
MetaArchive
www.metaarchive.org
National Library of Medicine DTD
dtd.nlm.nih.gov
Portico
www.portico.org
Video on LOCKSS
www.youtube.com/watch?v=TOE_Jw23cVg&eurl=


Author Information
Karen G. Schneider (freerangelibrarian.com) is a freelance writer and librarian in Tallahassee, FL.

©2009 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.