BagIt -- Just BagIt
July 6, 2009

Perhaps I can be forgiven for seeing Michael Jackson in the digital preservation efforts of libraries, but when I ran across the "BagIt" initiative by the Library of Congress, the California Digital Library and Stanford, I couldn't help thinking about Michael Jackson's song "Beat It". So sue me.

But it still might be better than the allusion the project itself uses, which hangs off the phrase "bag it and tag it." For the appropriate video for that tagline, I must leave it to your imagination, but as I recall there was something from the "Thriller" album that would work quite well. But I digress.

BagIt is an intriguingly simple specification that aims to do one thing and do it well -- identify a set of files and transfer them reliably. There are some wrinkles, but overall it is a very straightforward way to accomplish a simple and yet all-important goal -- to transfer a set of files as a related unit. Some additional description from the Library of Congress web site describes this in greater detail:

This is all well and good, but I have to tell you that the "holey bag" just really makes my day. I mean, I couldn't come up with something this brilliant, and yet simple, on my best day:

"A bag filled with content is considered complete. A variation, called a holey bag, is gaining wider acceptance because of its flexibility. A holey bag has the standard bag structure but its "data" directory is empty. The holey bag contains an additional text file titled "fetch.txt" at the root level that lists the URLs of the files to be fetched (so-called "holes" in the digital collection to be filled in). A script consults the "fetch.txt" file, follows the URLs, downloads the files and aggregates them into the local "data" directory within the bag. The sender’s source files do not need to reside in the same directory or on the same server; they can be retrieved from many different sources. A holey bag becomes complete after the digital collection is entirely downloaded and its manifest file is verified."

I love the absolute simplicity of this, and in this I see the guiding hand of John Kunze of the California Digital Library, who has always seen the utility of simplicity to enable longevity. His identifier scheme, Archival Resource Key (ARK), is but one example of this. Kudos to John and the rest of the team for coming up with something so simple and yet so useful and effective.

By the way, if you're at all interested in this, you simply must see the video that is designed to introduce BagIt. Perhaps it isn't quite up to Michael Jackson's level, but given what resources the Library of Congress has to work with, it totally rocks. It has humor, awesome file footage, well-done segues, and overall good production values. Congrats to the crew who produced it.

I'm telling you, just BagIt. Oh yeah, and tag it -- yeah, that's right, tag it too. And moonwalk into the bright light of the new day, knowing that your important content is safely bagged and tagged.
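
For the curious, the holey bag workflow described in the quoted passage (consult "fetch.txt", follow the URLs, fill the "data" directory, then verify the manifest) might be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not the project's own script: it assumes each "fetch.txt" line reads "URL LENGTH PATH", that the manifest is a file named "manifest-sha256.txt" with "CHECKSUM PATH" lines, and the function names (parse_fetch, parse_manifest, fill_holes, is_complete) are hypothetical names chosen for illustration.

```python
import hashlib
import urllib.request
from pathlib import Path


def parse_fetch(fetch_text):
    """Return (url, relative_path) pairs from the text of a fetch.txt file.

    Assumed layout: one entry per line, "URL LENGTH PATH", split on whitespace.
    The LENGTH field is read but not used in this sketch.
    """
    entries = []
    for line in fetch_text.splitlines():
        if line.strip():
            url, _length, rel_path = line.strip().split(None, 2)
            entries.append((url, rel_path))
    return entries


def parse_manifest(manifest_text):
    """Return {relative_path: checksum} from "CHECKSUM PATH" lines (assumed layout)."""
    checksums = {}
    for line in manifest_text.splitlines():
        if line.strip():
            checksum, rel_path = line.strip().split(None, 1)
            checksums[rel_path.strip()] = checksum.lower()
    return checksums


def fill_holes(bag_dir, fetch=None):
    """Download every file listed in fetch.txt into its place inside the bag."""
    root = Path(bag_dir).resolve()
    if fetch is None:
        # Default fetcher: plain URL retrieval with the standard library.
        fetch = lambda url: urllib.request.urlopen(url).read()
    for url, rel_path in parse_fetch((root / "fetch.txt").read_text()):
        target = (root / rel_path).resolve()
        # Refuse paths that would land outside the bag directory.
        if root not in target.parents:
            raise ValueError(f"unsafe path in fetch.txt: {rel_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(url))


def is_complete(bag_dir, algorithm="sha256"):
    """True when every file named in the manifest exists and its checksum matches."""
    root = Path(bag_dir)
    manifest = root / f"manifest-{algorithm}.txt"
    for rel_path, expected in parse_manifest(manifest.read_text()).items():
        target = root / rel_path
        if not target.is_file():
            return False
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected:
            return False
    return True
```

Calling fill_holes("mybag") and then is_complete("mybag") would mirror the last sentence of the quote: the bag only counts as complete once everything has been downloaded and the checksums agree. The fetch argument is there so the download step can be swapped out, for example to try the sketch without touching the network.
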
Posted by Roy Tennant on July 6, 2009 | Comments (2)
July 7, 2009
In response to: BagIt -- Just BagIt
DrWeb commented:
The whole SheBagIt.. sorry, couldn't resist. Roy, it's a fascinating structure and functional use will probably follow the form. I was wondering if you know anything about the security aspect. The declaration file (authenticity), for example. Is there any encryption scheme or algorithm envisioned for these "bags"?
July 16, 2009
In response to: BagIt -- Just BagIt
Peter Binkley commented:
It's a great little spec. But I wonder - why was the "bag" metaphor for a collection of stuff still available? We've used folders, packages, buckets, bins, and more abstract terms like container or archive - how did bag escape? And what container metaphors remain unexploited? Will some future spec have to define a portmanteau?