Digital Libraries: The New Cataloger

By Roy Tennant -- Library Journal, 4/15/2006

I've often said librarians should like any metadata they see. We are entering an age in which MARC no longer rules, because the 21st-century library will handle increasing amounts of born-digital material. Even now, librarians are using formats such as Dublin Core (DC), Metadata Object Description Schema (MODS), and Metadata Encoding and Transmission Standard (METS), among others, to capture and manipulate important data about various information resources. A single metadata standard is inadequate for the job.

Our job requires much more than facility with new formats. We will need new kinds of tools, which are only now beginning to be imagined and created, for a growing amount of born-digital material as well as for books. Publishers are increasingly supplying machine-readable metadata about their publications, largely so that Amazon and other online booksellers can sell their books. These records could substantially enrich our existing MARC data if the infrastructure were in place to normalize them. Publishers often provide cover art, pull quotes from reviews, descriptive text, author biographies, and other useful material that MARC records typically lack and that vendors like Syndetic Solutions supply to libraries for on-the-fly display.
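As a rough illustration of that kind of enrichment, here is a minimal Python sketch that matches publisher-supplied records to catalog records by ISBN and copies over fields the catalog record lacks. The records are plain dictionaries, not parsed MARC or ONIX, and every field name (isbn, description, author_bio, cover_url) is an assumption made for the example.

```python
def isbn_key(isbn):
    """Reduce an ISBN to a comparable key by dropping hyphens and spaces."""
    return "".join(ch for ch in isbn if ch.isalnum()).upper()


def enrich(catalog_records, publisher_records,
           extra_fields=("description", "author_bio", "cover_url")):
    """Copy publisher-supplied fields into catalog records that lack them.

    Field names are hypothetical; real MARC and ONIX records would need
    their own parsers and a mapping between the two formats.
    """
    by_isbn = {isbn_key(p["isbn"]): p for p in publisher_records if p.get("isbn")}
    enriched = []
    for rec in catalog_records:
        merged = dict(rec)  # leave the original record untouched
        key = isbn_key(rec.get("isbn", ""))
        supplied = by_isbn.get(key) if key else None
        if supplied:
            for field in extra_fields:
                # Only fill gaps; never overwrite what the cataloger recorded.
                if field in supplied and field not in merged:
                    merged[field] = supplied[field]
        enriched.append(merged)
    return enriched


catalog = [{"isbn": "0-306-40615-2", "title": "An Example Title"}]
publisher = [{"isbn": "0306406152", "description": "Publisher blurb."}]
result = enrich(catalog, publisher)
```

Matching on a stripped ISBN is the simplest possible join; a production service would also have to reconcile ISBN-10 and ISBN-13 forms and deal with records that carry several ISBNs.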

The inside scoop

How do I know this? I walk around with over 10,000 ONIX metadata records on my laptop that I downloaded from willing publishers. If we had a service to collect these records from publishers and make them available to catalogers, we would have access to many valuable facts about library materials. The real news, though, is the entirely new kinds of tasks catalogers will be expected to perform.

In an online world, where there are many amazing free resources, librarians must get better at selecting and providing access to the right slice of this material. Part of this will entail harvesting (automated gathering) of metadata that describes freely available resources. OAIster.org, the mega-harvester site at the University of Michigan, has gathered records for over seven million freely available resources.
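Harvesters such as OAIster gather records over the OAI-PMH protocol, commonly as unqualified Dublin Core. The sketch below shows one way the harvested XML might be turned into simple records in Python. It parses an embedded sample response rather than contacting a server, and the sample record, the identifier oai:example.org:1, and the function name parse_records are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A made-up ListRecords response, trimmed to the parts used here.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Views of the Sierra Nevada</dc:title>
          <dc:date>1880s</dc:date>
          <dc:subject>Photography</dc:subject>
          <dc:subject>Mountains</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per harvested record, with repeatable DC fields as lists."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier",
                                  default="", namespaces=NS)
        fields = {}
        for el in rec.iterfind("oai:metadata/oai_dc:dc/*", NS):
            name = el.tag.split("}", 1)[-1]  # strip the "{namespace}" prefix
            fields.setdefault(name, []).append((el.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


harvested = parse_records(SAMPLE_RESPONSE)
```

Keeping every field as a list reflects the fact that any Dublin Core element may repeat, which matters once records from many sources are pooled.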

Gathering is just a start

As work at the California Digital Library, Cornell University, University of Illinois at Urbana-Champaign, and other places demonstrates, the 21st-century librarian must be good at normalizing and enriching selected piles of metadata. Metadata created for one purpose or system may not be optimized for another purpose or system. Also, when you aggregate a wide variety of metadata, you find a surprising number of variances in encoding practices as well as simple errors (see “Bitter Harvest”).

In response, we are investigating ways to normalize and enrich metadata for greater versatility. Our first success is a utility for normalizing and enriching dates. For example, when given the date “1880s,” the function will create four new date fields, from a normalized “1880-1889” to a set of date tokens to enable searching (e.g., 1880, 1881).
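To make the idea concrete, here is a minimal Python sketch of date normalization. It is not the CDL utility: the function name, the output field names, and the three input patterns it recognizes (a decade, a single year, a year range) are assumptions chosen for illustration.

```python
import re


def normalize_date(raw):
    """Turn a loose date string into a normalized range plus year tokens.

    Returns None when the string matches none of the handled patterns,
    so the record can be routed to a person for review.
    """
    text = raw.strip()
    decade = re.fullmatch(r"(\d{3})0s", text)            # e.g. "1880s"
    single = re.fullmatch(r"\d{4}", text)                 # e.g. "1901"
    span = re.fullmatch(r"(\d{4})\s*-\s*(\d{4})", text)   # e.g. "1880-1889"

    if decade:
        start = int(decade.group(1)) * 10
        end = start + 9
    elif single:
        start = end = int(text)
    elif span:
        start, end = int(span.group(1)), int(span.group(2))
        if start > end:
            return None
    else:
        return None

    return {
        "original": raw,
        "normalized": str(start) if start == end else f"{start}-{end}",
        "start_year": start,
        "end_year": end,
        # One token per year so a search on any year in the range matches.
        "tokens": [str(year) for year in range(start, end + 1)],
    }


decade_result = normalize_date("1880s")
```

Returning None for anything unrecognized, rather than guessing, keeps questionable dates visible instead of silently rewriting them.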

This type of operation can be executed as a record is captured and placed into a database, but other types of metadata transformation, such as assigning subjects, cannot be performed by software alone. Experiments with topical clustering software have been encouraging but not flawless. The optimum solution may be to let a cataloger review the subjects the software assigns automatically and remove or add topic assignments.

A new toolbox

We also see a need for tools that enable a group of records to be selected based on virtually any criteria and then transformed in a particular way (e.g., change all occurrences of X to Y). In short, the modern cataloger will one day be a software-enabled specialist who can gather, subset, normalize, and enrich piles of records for a specific audience or purpose.
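A select-then-transform tool of this kind could be sketched in Python as follows. The record shape (dictionaries whose fields hold lists of values), the helper names, and the sample subject headings are all hypothetical.

```python
def transform_matching(records, predicate, transform):
    """Apply transform to records that satisfy predicate; pass others through."""
    return [transform(rec) if predicate(rec) else rec for rec in records]


def has_value(field, value):
    """Build a selection test: does the record's field contain this value?"""
    return lambda rec: value in rec.get(field, [])


def replace_value(field, old, new):
    """Build a transformation: change every occurrence of old to new in a field."""
    def transform(rec):
        updated = dict(rec)  # copy so the source record is not modified
        updated[field] = [new if v == old else v for v in rec.get(field, [])]
        return updated
    return transform


records = [
    {"id": "a1", "subject": ["Moving pictures", "California"]},
    {"id": "a2", "subject": ["Botany"]},
]

cleaned = transform_matching(
    records,
    has_value("subject", "Moving pictures"),
    replace_value("subject", "Moving pictures", "Motion pictures"),
)
```

Separating the selection test from the transformation means either can be swapped out, so the same small function covers "change all X to Y" as well as more selective edits.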

The real challenge is the retooling and reeducation of those already in the field. A number of LIS programs have adjusted their curricula. For working catalogers, a good place to start is Karen Coyle's “Metadata: Data with a Purpose.” The need for catalogers will not go away soon, but what they will be asked to do will be very different.

For more on the wired library, see the netConnect supplement mailed with this issue and with the January, July, and October 15 issues of LJ.


Link List
Bitter Harvest
www.cdlib.org/inside/projects/harvesting/bitter_harvest.html
Date Normalization Utility
www.cdlib.org/inside/diglib/datenorm
Metadata: Data with a Purpose
www.kcoyle.net/meta_purpose.html
METS & MODS
www.loc.gov/standards
OAIster
oaister.org
ONIX for Libraries
dali.cdlib.org:8080/onix


Author Information
Roy Tennant (roy.tennant@ucop.edu) is User Services Architect, California Digital Library. He is the author of Managing the Digital Library (Reed Business Pr., dist. by Neal-Schuman).
