Digital Libraries: The New Cataloger

By Roy Tennant -- Library Journal, 4/15/2006

I've often said librarians should like any metadata they see. This is because we are entering an age in which MARC no longer rules, since the 21st-century library will be handling increasing amounts of born-digital material. Even now, librarians are using formats such as Dublin Core (DC), Metadata Object Description Schema (MODS), and Metadata Encoding and Transmission Standard (METS), among others, to capture and manipulate important data about various information resources. A single metadata standard is no longer adequate for the job.

Our job requires much more than facility with new formats. We will need new kinds of tools, only now beginning to be imagined and created, for a growing amount of born-digital material as well as books. Publishers are increasingly supplying machine-readable metadata about the publications they put out, largely to enable their books to be sold by Amazon and other online booksellers. These records could substantially enrich our existing MARC data if the infrastructure were in place to normalize them. Publishers often provide cover art, pull quotes from reviews, descriptive text, author biographies, and other useful material that MARC records typically lack, which vendors like Syndetic Solutions supply to libraries for on-the-fly display.

The inside scoop

How do I know this? I walk around with over 10,000 ONIX metadata records on my laptop that I downloaded from willing publishers. If we had a service to collect these records from publishers and make them available to catalogers, we could have access to many valuable facts about library materials. The real news, though, is the entirely new kinds of tasks catalogers will be expected to perform.

In an online world, where there are many amazing free resources, librarians must get better at selecting and providing access to the right slice of this material. Part of this will entail harvesting (automated gathering) of metadata that describes freely available resources. OAIster.org, the mega-harvester site at the University of Michigan, has gathered records for over seven million freely available resources.

Gathering is just a start

As work at the California Digital Library, Cornell University, University of Illinois at Urbana-Champaign, and other places demonstrates, the 21st-century librarian must be good at normalizing and enriching selected piles of metadata. Metadata created for one purpose or system may not be optimized for another purpose or system. Also, when you aggregate a wide variety of metadata, you find a surprising number of variances in encoding practices as well as simple errors (see “Bitter Harvest”).

In response, we are investigating ways to normalize and enrich metadata for greater versatility. Our first success is a utility for normalizing and enriching dates. For example, when given a date as “1880s,” the function will create four new date fields, from a normalized “1880-1889” to a set of date tokens for enabling searching (e.g., 1880, 1881).
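To make the idea concrete, here is a minimal Python sketch of decade and year normalization. It is not the CDL utility itself: the function name `normalize_date`, the four output field names, and the decision to handle only strings like "1880s" or "1885" are all assumptions made for illustration.

```python
import re


def normalize_date(raw: str) -> dict:
    """Derive normalized date fields from a year ("1885") or decade ("1880s").

    The field names below are hypothetical; the source only says that
    four new date fields are created.
    """
    text = raw.strip()
    decade = re.fullmatch(r"(\d{3})0s", text)
    year = re.fullmatch(r"\d{4}", text)

    if decade:
        start = int(decade.group(1)) * 10
        end = start + 9
    elif year:
        start = end = int(text)
    else:
        # Unrecognized form: leave it for a person to review.
        return {"normalized": None, "start": None, "end": None, "tokens": []}

    normalized = str(start) if start == end else f"{start}-{end}"
    return {
        "normalized": normalized,  # display form of the range
        "start": start,            # first year covered
        "end": end,                # last year covered
        # one token per year, so a search on any year in the range matches
        "tokens": [str(y) for y in range(start, end + 1)],
    }


fields = normalize_date("1880s")
print(fields["normalized"], fields["tokens"][:2])
```

Returning empty fields for unrecognized input, rather than guessing, keeps the odd cases visible to a cataloger.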

This type of operation can be executed as a record is captured and placed into a database, but other types of metadata transformation, such as assigning subjects, cannot be performed by software alone. Experiments with topical clustering software have been encouraging but not flawless. The optimum solution may be to enable a cataloger to view the subjects assigned automatically by the software and then remove or add topic assignments.

A new toolbox

We also see a need for tools that enable a group of records to be selected based on virtually any criteria and then transformed in a particular way (e.g., change all occurrences of X to Y). As such, the modern cataloger will one day be a software-enabled specialist who can gather, subset, normalize, and enrich piles of records for a specific audience or purpose.
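A rough sketch of such a select-and-transform tool, in Python, might look like the following. Everything here is assumed for illustration: records are modeled as plain dictionaries, and the function name `transform_matching`, its parameters, and the sample field names are hypothetical rather than drawn from any existing cataloging system.

```python
from typing import Callable, Iterable

Record = dict[str, str]


def transform_matching(
    records: Iterable[Record],
    criterion: Callable[[Record], bool],
    field: str,
    old: str,
    new: str,
) -> list[Record]:
    """Replace `old` with `new` in `field` for every record the criterion selects.

    Returns new records and leaves the originals untouched, so the
    change can be reviewed before it is committed.
    """
    result = []
    for record in records:
        updated = dict(record)  # copy; never edit the source record in place
        if criterion(record) and field in updated:
            updated[field] = updated[field].replace(old, new)
        result.append(updated)
    return result


# Hypothetical sample data: change "bk" to "book" for one publisher only.
sample = [
    {"publisher": "Reed Business Pr.", "format": "bk"},
    {"publisher": "Another Press", "format": "bk"},
]
changed = transform_matching(
    sample,
    criterion=lambda r: r["publisher"].startswith("Reed"),
    field="format",
    old="bk",
    new="book",
)
print(changed)
```

Passing the selection rule in as a function is what allows "virtually any criteria": the same routine works whether the subset is defined by publisher, date range, or source collection.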

The real challenge is the retooling and reeducation of those already in the field. A number of LIS programs have adjusted their curricula. A good place to start is Karen Coyle's “Metadata: Data with a Purpose.” The need for catalogers will not go away soon, but what they will be asked to do will be very, very different.

For more on the wired library, see the netConnect supplement mailed with this issue and with the January, July, and October 15 issues of LJ.


Link List
Bitter Harvest
www.cdlib.org/inside/projects/harvesting/bitter_harvest.html
Date Normalization Utility
www.cdlib.org/inside/diglib/datenorm
Metadata: Data with a Purpose
www.kcoyle.net/meta_purpose.html
METS & MODS
www.loc.gov/standards
OAIster
oaister.org
ONIX for Libraries
dali.cdlib.org:8080/onix


Author Information
Roy Tennant (roy.tennant@ucop.edu) is User Services Architect, California Digital Library. He is author of Managing the Digital Library (Reed Business Pr., dist. by Neal-Schuman)
