Digital Libraries: Building a New Bibliographic Infrastructure
By Roy Tennant -- Library Journal, 1/15/2004
More than a year ago I called for the death of MARC (see LJ 10/15/02, p. 26ff.). That column sparked a lively discussion among librarians—especially catalogers. As I thought about it and discussed the issue with others, I decided I had convicted the wrong suspect. Let MARC die of old age rather than homicide.
I thought that MARC (the MARC record syntax, MARC elements, and AACR2) was too limiting for modern library needs and opportunities. I now realize that with a robust bibliographic infrastructure we could profitably use any bibliographic metadata standard that we could imagine, including MARC.
The point is we need to craft standards, software tools, and systems that can accept, manipulate, store, output, search, and display metadata from a wide variety of bibliographic or related standards. Our systems should be able to accept an ONIX record from a publisher, which contains basic bibliographic fields and elements like cover art, and use it as a prototype cataloging record or to enrich an existing record. Our systems should be able to output a record easily in Dublin Core for harvesting through the Open Archives Initiative's Protocol for Metadata Harvesting. In other words, we need a new bibliographic infrastructure that allows for the easy and effective sharing of various types of records.
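As a rough illustration of the Dublin Core output step, here is a minimal Python sketch that serializes a record as unqualified Dublin Core in the oai_dc container used by the OAI Protocol for Metadata Harvesting. The function name `to_oai_dc` and the dictionary-shaped internal record are assumptions made for this example, not part of any library system; only the two namespace URIs come from the published specifications.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by OAI-PMH (oai_dc container) and Dublin Core.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc(record):
    """Serialize {dc_element_name: [values]} as an oai_dc XML string.

    The dict-of-lists record shape is a hypothetical internal format
    chosen for this sketch.
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for element, values in record.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample record, using this column's own title and date as data.
record = {
    "title": ["Building a New Bibliographic Infrastructure"],
    "creator": ["Tennant, Roy"],
    "date": ["2004-01-15"],
    "type": ["Text"],
}

xml_text = to_oai_dc(record)
print(xml_text)
```

A real harvesting endpoint would wrap this payload in the protocol's response envelope; the sketch covers only the record body.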
This requires a transfer format or schema that can encapsulate everything you need to associate with a particular intellectual object. Luckily, such a schema is being developed: the Metadata Encoding and Transfer Syntax (METS). Some libraries are already using this schema to embed multiple bibliographic records from different schemata into one package.
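To make the packaging idea concrete, the following Python sketch places two descriptive records from different schemata into one METS wrapper, each in its own descriptive metadata section (`dmdSec`, `mdWrap`, `xmlData`). It is a fragment under stated assumptions: the helper `wrap_records`, the section IDs, the object identifier, and the tiny ONIX-style payload are all invented for illustration, and a schema-valid METS document needs more than this (notably a structural map).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)


def wrap_records(object_id, records):
    """Build a METS element holding one dmdSec per supplied record.

    records is a list of (section_id, mdwrap_attributes, payload_element).
    This helper is hypothetical, written only for this sketch.
    """
    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})
    for section_id, wrap_attrs, payload in records:
        dmd = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", {"ID": section_id})
        wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", wrap_attrs)
        xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
        xml_data.append(payload)
    return mets


# A one-element Dublin Core description.
dc_title = ET.Element(f"{{{DC_NS}}}title")
dc_title.text = "An Example Title"

# A heavily simplified stand-in for a publisher's ONIX product record.
onix_product = ET.Element("Product")
ET.SubElement(onix_product, "RecordReference").text = "example-0001"

package = wrap_records(
    "example-object-1",
    [
        ("dmd-dc", {"MDTYPE": "DC"}, dc_title),
        # ONIX is declared through the OTHER / OTHERMDTYPE pair.
        ("dmd-onix", {"MDTYPE": "OTHER", "OTHERMDTYPE": "ONIX"}, onix_product),
    ],
)

print(ET.tostring(package, encoding="unicode"))
```

Because each record keeps its own section and type label, a receiving system can pick out the schema it understands and ignore the rest.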
Supporting all records
This is our future; we can no longer rely on only one record structure. We must be able to accept many different kinds of bibliographic record structures, from ONIX to Dublin Core to whatever else comes along that contains useful information. To use these record formats we will need rules and guidelines to follow in their application. We need both general rules and schema-specific rules, similar to the way we have used AACR2 to define what information we capture in MARC.
Beyond rules and guidelines, best practices as developed by libraries working with this new infrastructure will help guide other libraries. "Crosswalks" or standard translations from one bibliographic schema to another will also be required, although in many cases it will be better to retain records in their original format and map elements into common fields upon indexing.
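The alternative to crosswalking, keeping each record in its original format and mapping elements into common fields only at indexing time, might look something like the Python sketch below. Everything here is an assumption for illustration: the flat `FIELD_MAPS` table, the `index_fields` function, the three common field names, and the simplified record dictionaries keyed by MARC tag plus subfield, Dublin Core element, or ONIX element name. A production mapping would be far more detailed.

```python
# Hypothetical per-schema maps from source elements to common index fields.
FIELD_MAPS = {
    "marc": {"245a": "title", "100a": "creator", "260c": "date"},
    "dc": {"title": "title", "creator": "creator", "date": "date"},
    "onix": {
        "TitleText": "title",
        "PersonNameInverted": "creator",
        "PublicationDate": "date",
    },
}


def index_fields(schema, record):
    """Return the common index fields for one record.

    The stored record is not modified; elements with no mapping are
    skipped here and remain available only in the original record.
    """
    mapping = FIELD_MAPS[schema]
    indexed = {}
    for element, values in record.items():
        common = mapping.get(element)
        if common is None:
            continue
        indexed.setdefault(common, []).extend(values)
    return indexed


# Two illustrative records describing the same work in different schemata.
marc_record = {
    "245a": ["An Example Title"],
    "100a": ["Author, Example"],
    "300a": ["xv, 200 p."],  # unmapped: physical description
}
dc_record = {
    "title": ["An Example Title"],
    "creator": ["Author, Example"],
}

marc_index = index_fields("marc", marc_record)
dc_index = index_fields("dc", dc_record)
print(marc_index)
print(dc_index)
```

Both records end up searchable under the same title and creator fields, while the MARC physical description survives untouched in the stored original rather than being lost in translation.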
No more homogenization
An undertaking of this type has many challenges. One of the first and most significant will be moving from a relatively homogeneous bibliographic environment (the MARC21/AACR2 hegemony) to a diverse one. We will need to find useful ways of ingesting records in a variety of formats, both by crosswalking and by indexing the original formats and virtually merging the diverse records.
Moving to a new infrastructure would, at minimum, require upgrades to our existing integrated library systems. At worst, it could require migrating to entirely new systems. Neither of these solutions is without problems, as anyone who has switched systems will confirm. But migrating is unlikely to be as difficult as it will be to change us. Those of us who have only known a MARC world may find it difficult to learn how to build and use a diverse bibliographic environment effectively.
Despite these challenges, such changes are both necessary and achievable. They are necessary to exploit new metadata opportunities and technologies like XML and the Internet. Our choice is to remake our bibliographic infrastructure and achieve new levels of service, or to maintain the status quo and risk becoming increasingly (and deservedly) marginalized.
The challenge to systems, utilities
Will our integrated library systems be up to the task? Although a number of vendors clearly see a future based on XML, it will be some time before their systems can easily accommodate records from a variety of formats. Our major bibliographic utilities, RLIN and WorldCat, are moving in the right direction. OCLC has remade WorldCat from the bottom up, employing XML and an in-house XML schema dubbed "XWC" that accommodates Dublin Core, MARC, and other formats. This is just the beginning of a rich bibliographic infrastructure that could employ metadata in just about any XML-encoded form.
If you are intrigued by this vision for a bibliographic infrastructure, watch for my article "A Bibliographic Metadata Infrastructure for the 21st Century" in an upcoming issue of Library Hi Tech. Meanwhile, consider where you think our bibliographic systems should be and let me know your thoughts.
Link List
METS: www.loc.gov/standards/mets
MODS: www.loc.gov/standards/mods
ONIX: www.editeur.org/onix.html
Author Information
Roy Tennant (roy.tennant@ucop.edu) is Manager, eScholarship Web & Services Design, California Digital Library. He is founder and manager of the electronic discussion lists Web4Lib and Current Cites.