Shoestring Digital Library

If existing digital library software doesn't suit your needs, create your own

By Jonathan Weber -- Library Journal, 7/15/2006

Creating a digital library might seem like a task best left to a large research collection with a vast staff and generous budget. However, tools for successfully creating digital libraries are getting easier to use all the time. If your library has someone with experience building web pages, you might be well on your way to a digital library of your own with a little help from the right software.

Recently, I worked with fellow library school students on an exploratory digital library of theater productions, which included scripts, reviews, and production details. It was possible to make rich interconnections among plays, theaters, actors, and playwrights, but these sorts of connections don't fit easily into digital library software. The “bibliographic record + item” paradigm that works so well for traditional materials, and that many digital library software packages carry over to the digital realm, might not work for digital materials.

One could customize digital library software to handle this information, perhaps by adapting the metadata to encode the linkages among the materials. This is possible, in theory, since the common packages are open source. But those packages are large, complicated systems written in languages such as C++ and Java, and attempting customizations that delve this deeply into their functionality can be daunting.

The solution? Don't use digital library software.

The explosion of people creating content for the web has led to the availability of many high-quality applications and frameworks for managing content. These aren't explicitly identified as “digital library” software but may nevertheless be useful. They need to be customized, but in many cases these applications were designed to be adapted, and their features and documentation make it easier for the nonprogrammer.

Manipulating your CMS

WordPress was designed for blogging, but it is flexible enough to create all kinds of web sites. It's written in PHP, a web programming language that mixes with HTML to produce dynamic web pages, and it has a host of template functions and plugins that make it easy to customize. The Western Springs History Project uses WordPress to present historical photographs of buildings in Western Springs, IL. Visitors can leave comments about the buildings; these often come from current or former residents with stories about the houses. It's a great example of a public library, in this case Thomas Ford Memorial Library, using a tool from outside the usual digital library software to build an engaging digital library.

A web content management system (CMS) can also be used to organize and automate a web site. Drupal, Mambo, PostNuke, and Plone are all popular content management systems that offer a world of possibilities. Increased flexibility comes with increased complexity, so expect to spend more time customizing to achieve your objectives.

Do it yourself

Ultimate do-it-yourselfers can try a web application development framework like Ruby on Rails, Django, or TurboGears. As “frameworks,” they provide the skeletal structure for an application, including ways of managing content in a database and viewing it using templates. They require using a programming language (Ruby, in the case of Ruby on Rails; Python for Django or TurboGears) to control the functions of the application and HTML to build templates.
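To make the "database plus templates" idea concrete, here is a minimal sketch in plain Python rather than in any of the frameworks named above. The table, its columns, and the sample photographs are invented for illustration; a real framework would generate much of this plumbing for you.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical one-table collection held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO photos (title, year) VALUES (?, ?)",
    [("Main Street looking north", 1912), ("Town hall & fire station", 1925)],
)

# The template holds the HTML; placeholders are filled in for each record.
ITEM_TEMPLATE = Template("<li>$title ($year)</li>")


def render_photo_list(connection):
    """Return an HTML list of all photos, oldest first."""
    rows = connection.execute("SELECT title, year FROM photos ORDER BY year")
    items = [
        # escape() keeps characters such as & from breaking the HTML.
        ITEM_TEMPLATE.substitute(title=escape(title), year=year)
        for title, year in rows
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


page = render_photo_list(conn)
print(page)
```

The point of the split is that someone comfortable with HTML can change the template without touching the database code, and vice versa.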

We took this route with the aforementioned theater project using Ruby on Rails. It was much easier to organize the data in a database than it would have been to force the information into an existing paradigm. It does take more than knowing HTML to write a functional Ruby on Rails application, but by no means does one need to be a brilliant programmer.
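The kind of interconnected data described here might be organized along the following lines. This is only a sketch using Python's built-in sqlite3 module, not the schema of the actual theater project; every table name, column, and sample record is an assumption made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plays    (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE theaters (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE people   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- A production is one play staged at one theater.
CREATE TABLE productions (
    id INTEGER PRIMARY KEY,
    play_id INTEGER NOT NULL REFERENCES plays(id),
    theater_id INTEGER NOT NULL REFERENCES theaters(id),
    opening_year INTEGER
);
-- A credit links a person to a production in some role.
CREATE TABLE credits (
    production_id INTEGER NOT NULL REFERENCES productions(id),
    person_id INTEGER NOT NULL REFERENCES people(id),
    role TEXT NOT NULL
);
""")

# Invented sample data.
conn.executemany("INSERT INTO plays VALUES (?, ?)",
                 [(1, "Example Play A"), (2, "Example Play B")])
conn.executemany("INSERT INTO theaters VALUES (?, ?)",
                 [(1, "Example Theater East"), (2, "Example Theater West")])
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [(1, "Actor One"), (2, "Playwright Two")])
conn.executemany("INSERT INTO productions VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 2005), (2, 2, 2, 2006)])
conn.executemany("INSERT INTO credits VALUES (?, ?, ?)",
                 [(1, 1, "actor"), (2, 1, "actor"), (1, 2, "playwright")])


def theaters_for_person(connection, person_name):
    """List every theater where the named person has a credit."""
    rows = connection.execute(
        """
        SELECT DISTINCT theaters.name
        FROM people
        JOIN credits     ON credits.person_id = people.id
        JOIN productions ON productions.id = credits.production_id
        JOIN theaters    ON theaters.id = productions.theater_id
        WHERE people.name = ?
        ORDER BY theaters.name
        """,
        (person_name,),
    )
    return [name for (name,) in rows]


print(theaters_for_person(conn, "Actor One"))
```

Because the links live in their own tables, the same data can answer questions in any direction (people at a theater, plays by a playwright) without forcing everything into a single "record + item" shape.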

The applications and software discussed above all require some expertise in writing and styling HTML, a general notion of how a database works, and in some cases a basic understanding of programming (variables, if/then statements, and so on). Most of all, they require a willingness to experiment to see what works and what doesn't.

Although we were writing an application from scratch, the Rails framework provides so much support, it's really more like customizing an ultrageneric application. Starting from a blank slate like this can be advantageous for best fulfilling your needs, and it allows great flexibility. For example, we were able to hook up data on theater locations to Google Maps to provide a nice browsing interface.

Mind the gap

Straying outside “digital library” software does have some disadvantages. Because they weren't designed specifically with digital libraries in mind, applications like WordPress and frameworks like Ruby on Rails don't automatically support library standards like MARC, Dublin Core, Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), and others. You can certainly build in solutions for these, but they're not there by default.
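As one illustration of building in such a solution, the sketch below turns a homegrown record into simple Dublin Core XML with Python's standard library. The record itself is invented, and the function name is hypothetical; the two namespace URIs are the ones used for unqualified Dublin Core in OAI-PMH responses. Treat it as a starting point rather than a complete exporter.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc(record):
    """Serialize {element name: [values]} as an oai_dc:dc XML string."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for element, values in record.items():
        # Dublin Core elements are repeatable, so each value gets its own tag.
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record for one scanned photograph.
photo = {
    "title": ["Main Street, looking north"],
    "date": ["1912"],
    "type": ["StillImage"],
    "subject": ["Streets", "Storefronts"],
}

xml_text = to_oai_dc(photo)
print(xml_text)
```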

The underlying software that makes everything work, such as database servers and web servers, has also become much easier to install and configure. If your library has a web-hosting service, check with the hosting company; you may already have what you need. If you need to install such software on your own server, you won't have to rely on command-line tools (unless you want to). For example, MySQL, the most popular open source relational database, now has easy-to-use installation wizards and graphical front ends for creating databases.

Most important, all of the open source applications discussed here (from Greenstone and DSpace to WordPress and Ruby on Rails) have loads of online documentation and active user groups, forums, and other resources. Other users are often willing to give you a hand. For additional help, look to your community. You might be surprised at the skills of some of your patrons, and a summer internship for a library science or computer science student could help cover a lot of ground.

Start with content

Every library is likely to have something unique and interesting. A small public library probably has a complete run of the town newspaper, historical photographs of downtown, or other historical documents.

A digital library can really open up access to historical materials. The physical items may be fragile and require limited access, but digital versions can be perused much more freely online.

However, a word of caution about preservation: preserving digital documents is at least as difficult as preserving the originals. Don't think the existence of a digital copy makes an item impervious to loss, and make sure to back up your digital assets regularly.
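Backups are more trustworthy if you can tell when a file has silently changed. One common approach, sketched below, is to record a checksum for every file and compare the list later. The function names and the sample file are made up for the example, and this complements a backup routine rather than replacing one.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute a file's SHA-256 checksum without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        str(path.relative_to(folder)): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """Names from the old manifest that are now missing or have a new checksum."""
    return sorted(
        name
        for name, checksum in old_manifest.items()
        if new_manifest.get(name) != checksum
    )


# Demonstration with a throwaway folder and an invented file name.
with tempfile.TemporaryDirectory() as folder:
    scan = Path(folder) / "photo-001.tif"
    scan.write_bytes(b"original scan data")
    baseline = build_manifest(folder)
    scan.write_bytes(b"corrupted scan data")  # simulate damage
    damaged = changed_files(baseline, build_manifest(folder))

print(damaged)
```

In practice you would save the baseline manifest alongside the backup and rerun the comparison on a schedule.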

Once content is selected, it needs to be digitized (unless it's already in digital form). For images and text, that means scanning. There are a number of important factors to consider when scanning, including resolution and image file format. In general, the minimum requirements are an inexpensive desktop scanner and an image-editing program, along with some basic operating knowledge.
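Resolution has a direct effect on storage needs, and the arithmetic is simple enough to sketch. The sizes and resolutions below are illustrative only, not recommendations, and the figures are for uncompressed images; compressed formats will be smaller.

```python
def scan_dimensions(width_in, height_in, dpi):
    """Pixel dimensions of a scan: inches multiplied by dots per inch."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Approximate uncompressed size; 24 bits per pixel is typical full color."""
    width_px, height_px = scan_dimensions(width_in, height_in, dpi)
    return width_px * height_px * bits_per_pixel // 8


# An 8 x 10 inch photograph at two example resolutions.
for dpi in (300, 600):
    width_px, height_px = scan_dimensions(8, 10, dpi)
    megabytes = uncompressed_size_bytes(8, 10, dpi) / 1_000_000
    print(f"{dpi} dpi: {width_px} x {height_px} pixels, about {megabytes:.1f} MB")
```

Doubling the resolution quadruples the number of pixels, so an 8 x 10 inch scan grows from about 21.6 MB at 300 dpi to about 86.4 MB at 600 dpi.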

You'll also need to describe your content to help people use it. If you're lucky, maybe you already have MARC records. If not, you'll need to develop metadata using MARC, Dublin Core, or some other scheme of your choosing—preferably one based on standards.

Requirements before tools

Before selecting software, know what the software needs to accomplish. In most cases, the goal is to provide access to the digital materials over the World Wide Web. Think about the content and how users will want to search, browse, link, and comment. Develop a list of features and sort them into necessities and mere niceties. Careful consideration of the requirements will help you better evaluate the software needs of your particular digital library project.
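A prioritized feature list can be as simple as a spreadsheet, but the same idea fits in a few lines of code. In the sketch below, the features, the two candidate packages, and what each supports are all invented; the only point is that a missing necessity rules a candidate out, while niceties merely break ties.

```python
# Hypothetical requirements, each marked as a necessity or a nicety.
REQUIREMENTS = {
    "search": "necessity",
    "browse by date": "necessity",
    "visitor comments": "nicety",
    "map view": "nicety",
}

# Hypothetical candidates and the features each one supports out of the box.
CANDIDATES = {
    "Package A": {"search", "browse by date"},
    "Package B": {"search", "visitor comments", "map view"},
}


def evaluate(requirements, supported):
    """Return (necessities still missing, niceties already covered)."""
    missing = sorted(
        feature
        for feature, priority in requirements.items()
        if priority == "necessity" and feature not in supported
    )
    extras = sorted(
        feature
        for feature, priority in requirements.items()
        if priority == "nicety" and feature in supported
    )
    return missing, extras


for name, supported in CANDIDATES.items():
    missing, extras = evaluate(REQUIREMENTS, supported)
    verdict = "meets all necessities" if not missing else f"missing: {', '.join(missing)}"
    print(f"{name}: {verdict}; niceties covered: {len(extras)}")
```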

Among the requirements should be the things we've come to expect out of any web application: aesthetics, usability, and accessibility for people with disabilities. There are also several features you might want to consider that are specific to digital libraries, such as importing and delivering metadata in standardized forms (e.g., MARC, Dublin Core) or enabling harvesting of records by search engines and other services (e.g., OAI-PMH). The best way to get an idea of the requirements for a digital library project is to evaluate other similar projects for strengths and weaknesses. Remember that digital libraries aren't all called “digital libraries”; you may find any number of web sites, databases, or other applications to be inspiring.
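To give a sense of what the harvesting requirement involves: an OAI-PMH harvester asks a repository for records with an ordinary web request. The sketch below only builds such a request URL. The base URL is a placeholder, not a real repository, and the function name is hypothetical; the verb and argument names come from the OAI-PMH protocol.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # restrict the harvest to one collection
    if from_date:
        params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


# https://example.org/oai is a placeholder address.
print(list_records_url("https://example.org/oai"))
print(list_records_url("https://example.org/oai", from_date="2006-07-15"))
```

A digital library that supports harvesting answers such requests with XML records, which is why checking for OAI-PMH support (or planning to add it) belongs on the requirements list.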

Go forth

Developing requirements is an important phase, but don't get so hung up on getting the “perfect” requirements that you never get around to starting the project. You can make adjustments later.

You don't need the collection of the Library of Congress and a team of rocket scientists to make a digital library, and you don't need a big pile of grant money, either. All it takes to get started is web access and a fearless staff member with a little knowledge and a lot of curiosity. When you do get your digital library project up and running, don't forget to publicize it to let your community know how great it is. And expect to revisit it after it's been running a while to see what's working well and what could use improvement. Good luck with your digital library.

 

Digital Library Software

Once you have an idea of how your digital library should work, it's time to select some software. If a particular piece of software can meet your requirements, great. If it can't, there are three options: customize the software to meet your needs, find a different piece of software, or adjust your requirements. Some level of customization is to be expected in any digital library project, at least on a cosmetic level—colors and styles. Someone with reasonable experience with HTML and style sheets should be able to achieve this. More complex customization requires more time, effort, and skill. Weigh those against the value added and the resources available.

With that in mind, two software applications that should definitely be up for consideration are Greenstone and DSpace. Both are reasonably easy to install and configure in a basic way without a dedicated staff or programmer, and both are open source (so no licensing fees).

Greenstone is probably the most popular open source package for digital libraries, especially for small digital libraries, because of its ease of use. Developed in New Zealand in cooperation with UNESCO, it was expressly designed to be a free and simple way to create digital library collections. It's very easy to install and has a friendly “librarian interface” for creating collections. You can import metadata in Dublin Core, MARC, or a number of other formats, and it supports OAI-PMH for serving your documents or for harvesting others' metadata. The software supports a number of text, image, and multimedia formats. Greenstone was developed to be useful worldwide, and its interface is available in about 35 languages. Look and feel can be customized largely through configuration files and style sheets, but changes in functionality can be quite complicated, since Greenstone is primarily written in C++. However, Greenstone v.3, currently in development, has been refactored in Java and XML and promises to be less cumbersome to customize.

DSpace is another popular open source application, developed at MIT and designed for use in institutional repositories (archives of the works of people at institutions, such as universities). As such, it accepts a wide variety of content types and has a flexible permissions system that allows contributions. DSpace uses a specialized version of Dublin Core for metadata and supports OAI-PMH. DSpace is written in Java.

There are several other open source packages available, such as ePrints (another institutional repository tool) and other tools for more specialized types of content. And, if you have some money available, there are commercial digital library tools as well, among them OCLC's CONTENTdm and Ex Libris's DigiTool.


Link List
Django
www.djangoproject.com
Drupal
www.drupal.org
DSpace
www.dspace.org
Greenstone
www.greenstone.org
Mambo
www.mamboserver.com
Plone
www.plone.org
PostNuke
www.postnuke.com
Ruby on Rails
www.rubyonrails.org
TurboGears
www.turbogears.org
WordPress
wordpress.org
   

Unusual digital libraries
NYC Play Openings (Ruby on Rails)
plays.dystmesis.com
Western Springs History Project (WordPress)
www.westernspringshistory.org
 


Author Information
Jonathan Weber is an information architect, technical writer, and library science student at the University of Pittsburgh.
©2008 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.