Digital Libraries: The Year of the Open

By Roy Tennant -- Library Journal, 09/15/2007

Two events this year are ushering in a new era of openness, in both the source code and the file formats of commercial software. Adobe and Microsoft have announced technologies that are open and transparent (see “IDPF Hosts Digital Book 2007,” LJ 6/1/07, p. 27ff.). It is hard to gauge the full impact of these developments, since much of what they will enable is yet to be seen. Still, they represent enormous potential for anyone interested in libraries, information technology, and coming digital services.

PDF evolves

Many people may not be aware that the Adobe Acrobat file format has long been an openly published specification, or that the full version of Adobe Acrobat can save an XML (extensible markup language) version of a PDF (portable document format). This openness is only intensifying as Adobe works with the Open Publication Structure (OPS) specification managed by the International Digital Publishing Forum (IDPF).

The beta release of the Adobe Integrated Runtime (AIR)™ environment “allows developers to use HTML/CSS, Ajax, Adobe Flash®, and Adobe Flex™ to extend rich Internet applications (RIAs) to the desktop,” according to Adobe. Users can dynamically flow the text and repaginate as font size changes, thereby providing a much richer and more natural screen reading experience, not achievable with a standard Adobe Acrobat PDF file.

Adobe Digital Editions, which uses the OPS file format, is an AIR application. If you go to the Adobe Digital Editions site and download an ebook, you'll see the potential of this publishing platform (see “Digital Books Redux” in the link list).

The extensible office

An even larger development is the news that Microsoft is introducing a completely new (and open) file format with Office 2007. When you save a document in Word, PowerPoint, or Excel, the file will have the character “x” added to the typical filename extension, so “.doc” will be “.docx” in Word 2007. This signifies that the document is in XML, specifically Office Open XML (often shortened to Open XML), an emerging standard.

But there's more. If you add “.zip” to the end of the filename, turning “my.docx” into “my.docx.zip,” and then unzip it (by double-clicking on it), the “file” becomes a directory that reveals a package. The package consists of the document itself in XML and, potentially, a number of other components, such as higher-resolution versions of the images in the document and the metadata describing it.
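For readers who would rather look inside the package with a script than by double-clicking, here is a minimal sketch in Python. It assumes only the standard zipfile module; the function name list_package_parts is my own invention for illustration. The renaming step is a convenience for desktop unzip tools, so a ZIP library can open the .docx file directly.

```python
import zipfile


def list_package_parts(path):
    """Return the names of the parts stored inside an Office 2007 package.

    `path` may be a filename such as "my.docx" or an open binary file object.
    """
    with zipfile.ZipFile(path) as package:
        return package.namelist()
```

Calling list_package_parts("my.docx") on a typical Word 2007 file should return entries such as "[Content_Types].xml" and "word/document.xml", though the exact list depends on what the document contains.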

The true beauty of this design, however, is that it is extensible. Anyone can add components to this package. I could create a Dublin Core record describing my document, put it in the package, and zip it back up. When I give this document to someone else, it will have my contribution as well as the original files.
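As a rough sketch of that idea, the Python below copies a document and appends one extra part to the copy. Several things here are assumptions of mine rather than anything the format prescribes: the part name "library/dublin-core.xml", the wrapper element named metadata, and the function name add_part are all hypothetical. The sketch also skips the content-type and relationship entries that a strictly conforming package would declare for a new part, so Office may ignore the part or discard it when the file is next saved.

```python
import shutil
import zipfile

# A small Dublin Core record; the <metadata> wrapper element is illustrative.
DUBLIN_CORE_RECORD = """<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>The Year of the Open</dc:title>
  <dc:creator>Roy Tennant</dc:creator>
  <dc:date>2007-09-15</dc:date>
</metadata>
"""


def add_part(source, target, part_name, content):
    """Copy the package at `source` to `target` and append one extra part."""
    # Work on a copy so the original document is left untouched.
    shutil.copyfile(source, target)
    # Mode "a" appends a new entry to the existing ZIP archive.
    with zipfile.ZipFile(target, "a", zipfile.ZIP_DEFLATED) as package:
        package.writestr(part_name, content)
```

For example, add_part("my.docx", "my-described.docx", "library/dublin-core.xml", DUBLIN_CORE_RECORD) leaves my.docx alone and writes a second file that carries the record alongside the original parts.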

Implications for libraries

Documents will increasingly be open to other applications to manipulate, index, and transform. Librarians (and others) will find it much easier to capture files in their native format and do interesting things with them: indexing them for access, for example, or transforming them into canonical, standard formats such as TEI (Text Encoding Initiative) for preservation.
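To give a sense of how little code the indexing case might take, here is a hedged Python sketch that pulls paragraph text out of a Word 2007 file. It assumes the main document lives at word/document.xml inside the package and uses the WordprocessingML namespace shown in the code; the function name extract_paragraphs is hypothetical. It ignores headers, footnotes, and other parts, so treat it as a starting point rather than a complete text extractor.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the WordprocessingML elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_paragraphs(path):
    """Return the non-empty paragraphs of a .docx file as plain strings."""
    with zipfile.ZipFile(path) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W + "p"):
        # A paragraph's text is split across runs; join the <w:t> pieces.
        text = "".join(node.text or "" for node in paragraph.iter(W + "t"))
        if text:
            paragraphs.append(text)
    return paragraphs
```

The resulting list of strings could then be handed to whatever indexing or transformation tool a library already uses.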

Also, the open “package” format of Microsoft files offers interesting opportunities for libraries to create metadata packages that can be inserted into the original document's ZIP package and transported transparently as one file. Only those who need the library metadata have to open it.

With open software and file formats, the opportunities to enrich, expand, and embellish are unlimited. From such fertile fields innovation can flower. If we need a single word to describe 2007, I nominate open.


LINK LIST
Adobe AIR labs.adobe.com/technologies/air
Adobe Digital Editions www.adobe.com/products/digitaleditions
Adobe Flex labs.adobe.com/technologies/flex
Adobe Mars Project labs.adobe.com/technologies/mars
Digital Books Redux libraryjournal.com/blog/1090000309/post/1840011784.html
Microsoft Open XML Format msdn2.microsoft.com/en-us/library/aa338205.aspx
Office Open XML Formats www.ecma-international.org/memento/TC45.htm
Open Publication Structure (OPS) www.idpf.org/2007/ops




 