Shoestring Digital Library
If existing digital library software doesn't suit your needs, create your own
By Jonathan Weber -- Library Journal, 7/15/2006
Creating a digital library might seem like a task best left to a large research collection with a vast staff and generous budget. However, tools for successfully creating digital libraries are getting easier to use all the time. If your library has someone with experience building web pages, you might be well on your way to a digital library of your own with a little help from the right software.
Recently, I worked with fellow library school students on an exploratory digital library of theater productions, which included scripts, reviews, and production details. The material invited rich interconnections among plays, theaters, actors, and playwrights, but these sorts of connections don't fit easily into digital library software. The “bibliographic record + item” paradigm that works so well for traditional materials, and that many digital library software packages carry over to the digital realm, might not work for digital materials like these.
One could customize digital library software to handle this information, perhaps by adapting the metadata to encode the linkages among the materials. This is possible, in theory, since much of this software is open source. But the common open source digital library applications, such as Greenstone and DSpace, are large, complicated systems written in languages such as C++ and Java, and attempting customizations that delve this deeply into their functionality can be daunting.
The solution? Don't use digital library software.
The explosion of people creating content for the web has led to the availability of many high-quality applications and frameworks for managing content. These aren't explicitly identified as “digital library” software but may nevertheless be useful. They need to be customized, but in many cases these applications were designed to be adapted, and their features and documentation make that customization easier for the nonprogrammer.
Manipulating your CMS
WordPress was designed for blogging, but it is flexible enough to create all kinds of web sites. It's written in PHP, a web programming language that mixes with HTML to produce dynamic web pages, and it has a host of template functions and plugins that make it easy to customize. The Western Springs History Project uses WordPress to present historical photographs of buildings in Western Springs, IL. Visitors can leave comments about the buildings; these often come from current or former residents with stories about the houses. It's a great example of a public library, in this case Thomas Ford Memorial Library, using a tool from outside the usual digital library software to build an engaging digital library.
A web content management system (CMS) can also be used to organize and automate a web site. Drupal, Mambo, PostNuke, and Plone are all popular content management systems that offer a world of possibilities. Increased flexibility comes with increased complexity, so expect to spend more time customizing to achieve your objectives.
Do it yourself
Ultimate do-it-yourselfers can try a web application development framework like Ruby on Rails, Django, or TurboGears. As “frameworks,” they provide the skeletal structure for an application, including ways of managing content in a database and viewing it using templates. They require using a programming language (Ruby, in the case of Ruby on Rails; Python for Django or TurboGears) to control the functions of the application and HTML to build templates.
We took this route with the aforementioned theater project using Ruby on Rails. It was much easier to organize the data in a database than it would have been to force the information into an existing paradigm. It does take more than knowing HTML to write a functional Ruby on Rails application, but by no means does one need to be a brilliant programmer.
The applications and software discussed above all require some expertise in writing and styling HTML, a general notion of how a database works, and in some cases a basic understanding of programming (variables, if/then statements, and so on). Most of all, they require a willingness to experiment to see what works and what doesn't.
Although we were writing an application from scratch, the Rails framework provides so much support that it's really more like customizing an ultrageneric application. Starting from a blank slate like this lets you build exactly what you need, and it allows great flexibility. For example, we were able to hook up data on theater locations to Google Maps to provide a nice browsing interface.
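To make the idea of organizing interconnected material in a database more concrete, here is a minimal sketch in Python using the built-in sqlite3 module. It is not the theater project's actual Rails code; every table name, column name, and sample row is a hypothetical placeholder chosen only to show how plays, theaters, and people might be linked.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative,
# not taken from the theater project itself.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theaters (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE plays    (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE people   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- A production ties one play to one theater.
CREATE TABLE productions (
    id INTEGER PRIMARY KEY,
    play_id INTEGER NOT NULL REFERENCES plays(id),
    theater_id INTEGER NOT NULL REFERENCES theaters(id),
    opened TEXT
);

-- A credit ties a person to a production in some role.
CREATE TABLE credits (
    production_id INTEGER NOT NULL REFERENCES productions(id),
    person_id INTEGER NOT NULL REFERENCES people(id),
    role TEXT NOT NULL
);
""")

# Placeholder rows, invented purely for demonstration.
conn.execute("INSERT INTO theaters (id, name) VALUES (1, 'Example Theater')")
conn.execute("INSERT INTO plays (id, title) VALUES (1, 'Example Play')")
conn.executemany(
    "INSERT INTO people (id, name) VALUES (?, ?)",
    [(1, "A. Playwright"), (2, "B. Actor")],
)
conn.execute(
    "INSERT INTO productions (id, play_id, theater_id, opened) "
    "VALUES (1, 1, 1, '2006-01-01')"
)
conn.executemany(
    "INSERT INTO credits (production_id, person_id, role) VALUES (?, ?, ?)",
    [(1, 1, "playwright"), (1, 2, "actor")],
)


def people_at_theater(conn, theater_name):
    """Return (name, role) pairs for everyone credited at a theater."""
    rows = conn.execute(
        """
        SELECT people.name, credits.role
        FROM theaters
        JOIN productions ON productions.theater_id = theaters.id
        JOIN credits ON credits.production_id = productions.id
        JOIN people ON people.id = credits.person_id
        WHERE theaters.name = ?
        ORDER BY people.name
        """,
        (theater_name,),
    )
    return rows.fetchall()


print(people_at_theater(conn, "Example Theater"))
```

A question like "who has worked at this theater?" becomes a single query across the linking tables, which is the kind of connection that is awkward to express in a flat record-plus-item model.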
Mind the gap
Straying outside “digital library” software does have some disadvantages. Because they weren't designed specifically with digital libraries in mind, applications like WordPress and frameworks like Ruby on Rails don't automatically support library standards like MARC, Dublin Core, Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), and others. You can certainly build in solutions for these, but they're not there by default.
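For a sense of what building in OAI-PMH support involves, it helps to see the kind of request a harvester sends. The sketch below, in Python, only constructs a ListRecords request URL; the base address is a made-up placeholder and the function name is hypothetical. A homegrown application would need to answer requests of this shape with XML records.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper).

    base_url is the repository's OAI-PMH address; the one used below is
    a placeholder, not a real service.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional: only ask for records changed on or after this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


print(list_records_url("http://example.org/oai"))
print(list_records_url("http://example.org/oai", from_date="2006-07-15"))
```

The oai_dc prefix asks for records in unqualified Dublin Core, the one metadata format every OAI-PMH repository is required to offer.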
The underlying software that makes everything work, such as database servers and web servers, has also become much easier to install and configure. If your library has a web-hosting service, check with the hosting company; you may already have what you need. If you need to install such software on your own server, you won't have to rely on command-line tools (unless you want to). For example, MySQL, the most popular open source relational database, now has easy-to-use installation wizards and graphical front ends for creating databases.
Most important, all of the open source applications discussed here (from Greenstone and DSpace to WordPress and Ruby on Rails) have loads of online documentation and active user groups, forums, and other resources. Other users are often willing to give you a hand. For additional help, look to your community. You might be surprised at the skills of some of your patrons, or a summer internship for a library science or computer science student could help cover a lot of ground.
Start with content
Every library is likely to have something unique and interesting. A small public library probably has a complete run of the town newspaper, historical photographs of downtown, or other historical documents.
A digital library can really open up access to historical materials. The physical items may be fragile and require limited access, but digital versions can be perused much more freely online.
However, a word of caution about preservation: preserving digital documents is at least as difficult as preserving the originals. Don't think the existence of a digital copy makes an item impervious to loss, and make sure to back up your digital assets regularly.
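Backups are only useful if the copies are intact. One common complement to backing up, not something prescribed here, is a checksum manifest: record a fingerprint of each file, then recheck later to spot silent damage. The Python sketch below assumes a folder of scanned files; the function names and the sample file name are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_problems(root, manifest):
    """List files that are missing or whose contents have changed."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )


# Demonstration with a throwaway folder and an invented file name.
with tempfile.TemporaryDirectory() as tmp:
    scan = Path(tmp) / "photo-001.tif"
    scan.write_bytes(b"original scan data")
    manifest = build_manifest(tmp)
    clean = find_problems(tmp, manifest)      # nothing has changed yet
    scan.write_bytes(b"corrupted")            # simulate damage
    damaged = find_problems(tmp, manifest)    # the altered file is reported

print(clean, damaged)
```

Storing the manifest alongside each backup makes it possible to confirm, months later, that a restored file is the same one that was scanned.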
Once content is selected, it needs to be digitized (unless it's already in digital form). For images and text, that means scanning. There are a number of important factors to consider when scanning, including resolution and image file format. In general, the minimum requirements are an inexpensive desktop scanner and an image-editing program, along with some basic operating knowledge.
You'll also need to describe your content to help people use it. If you're lucky, maybe you already have MARC records. If not, you'll need to develop metadata using MARC, Dublin Core, or some other scheme of your choosing—preferably one based on standards.
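As a rough illustration of what standards-based metadata can look like, the following Python sketch writes a simple Dublin Core description as XML using the standard library. The wrapper element name, the function name, and the sample values are all made up for the example; only the fifteen element names and the namespace come from the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen elements of the simple Dublin Core element set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_record(fields):
    """Serialize (element, value) pairs as Dublin Core XML.

    The <record> wrapper is a hypothetical container chosen for this
    sketch; a real system would use whatever its export format requires.
    """
    root = ET.Element("record")
    for name, value in fields:
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are invented for demonstration only.
xml_text = dublin_core_record([
    ("title", "Main Street, looking north"),
    ("type", "StillImage"),
    ("format", "image/tiff"),
    ("subject", "Streets"),
    ("subject", "Downtown"),
])
print(xml_text)
```

Passing pairs rather than a dictionary allows an element such as subject to repeat, which Dublin Core permits.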
Requirements before tools
Before selecting software, know what the software needs to accomplish. In most cases, the goal is to provide access to the digital materials over the World Wide Web. Think about the content and how users will want to search, browse, link, and comment. Develop a list of features and prioritize them into necessities and mere niceties. Careful consideration of the requirements will help you better evaluate the software needs of your particular digital library project.
Among the requirements should be the things we've come to expect of any web application: aesthetics, usability, and accessibility for people with disabilities. There are also several features specific to digital libraries that you might want to consider, such as importing and delivering metadata in standardized forms (e.g., MARC, Dublin Core) or enabling harvesting of records by search engines (e.g., via OAI-PMH). The best way to get an idea of the requirements for a digital library project is to evaluate other similar projects for strengths and weaknesses. Remember that digital libraries aren't all called “digital libraries”; you may find any number of web sites, databases, or other applications to be inspiring.
Go forth
Developing requirements is an important phase, but don't get so hung up on getting the “perfect” requirements that you never get around to starting the project. You can make adjustments later.
You don't need the collection of the Library of Congress and a team of rocket scientists to make a digital library, and you don't need a big pile of grant money, either. All it takes to get started is web access and a fearless staff member with a little knowledge and a lot of curiosity. When you do get your digital library project up and running, don't forget to publicize it to let your community know how great it is. And expect to revisit it after it's been running a while to see what's working well and what could use improvement. Good luck with your digital library.
Link List
Django: www.djangoproject.com
Drupal: www.drupal.org
DSpace: www.dspace.org
Greenstone: www.greenstone.org
Mambo: www.mamboserver.com
Plone: www.plone.org
PostNuke: www.postnuke.com
Ruby on Rails: www.rubyonrails.org
TurboGears: www.turbogears.org
WordPress: wordpress.org
Unusual digital libraries
NYC Play Openings (Ruby on Rails): plays.dystmesis.com
Western Springs History Project (WordPress): www.westernspringshistory.org
Author Information
Jonathan Weber is an information architect, technical writer, and library science student at the University of Pittsburgh.




















