Tennant: Digital Libraries   




Hathi Trust Follow-Up
September 3, 2008

Just over a week ago I blogged about some fun I was having with metadata downloaded from the Hathi Trust. The Hathi Trust is being established by CIC institutions to be the repository for the content they are mass digitizing. The University of Michigan's MBooks platform had been serving that role, and as of today MBooks has transformed itself into the Hathi Trust with the release of its new web site.

The new web site is chock-full of information, including how to download data about the books, either through OAI-PMH (in MARC21 or unqualified Dublin Core) or as abbreviated records in a tab-delimited file. I used the latter format to throw together a rough search of the data on my prototype site.
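For readers who have not used OAI-PMH, here is a minimal sketch of what a harvesting request looks like. The endpoint below is a made-up placeholder, not the Hathi Trust's actual address, and the helper name `list_records_url` is my own invention. The `ListRecords` verb, the `metadataPrefix` argument, and the `oai_dc` prefix for unqualified Dublin Core come from the OAI-PMH 2.0 specification; the prefix a given repository uses for MARC21 is something you would need to look up in its documentation.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the base URL the repository documents.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc is the unqualified Dublin Core format that every OAI-PMH
    repository must support.
    """
    if resumption_token:
        # When continuing a partial list, resumptionToken is the only
        # argument allowed besides the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
```

Fetching that URL returns an XML response. If the response carries a resumption token, passing it back through `resumption_token` requests the next batch.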

I think it's worth noting that from the August 1 data dump to the September 1 version, over 100,000 items were added (from roughly 1.45 million records to well over 1.5 million, and the total today is over 1.6 million). With a so-far fairly steady 18 percent of these books being fully available in the United States, we're talking about almost 600 open access books being added per day. Now call me old-fashioned, but that's not chopped liver.
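The per-day figure is simple back-of-the-envelope arithmetic, sketched here using only the numbers quoted in this post. Taking 100,000 as the floor for "over 100,000" is my assumption, so the result is a lower bound, and the function name is just for illustration.

```python
from datetime import date


def open_books_per_day(items_added, open_share, start, end):
    """Average number of fully available books added per day between two dumps."""
    days = (end - start).days
    return items_added * open_share / days


# Figures from the post: over 100,000 items added between the August 1
# and September 1 dumps, with about 18 percent fully available in the U.S.
per_day = open_books_per_day(
    items_added=100_000,  # lower bound, since the post says "over 100,000"
    open_share=0.18,
    start=date(2008, 8, 1),
    end=date(2008, 9, 1),
)
print(round(per_day))
```

With the floor of 100,000 items this comes to roughly 580 a day, and since the real count was higher, "almost 600" is a fair summary.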

So kudos to the Hathi Trust, the CIC institutions that comprise it, and the good work they are doing to expose this information for the rest of us. They remain a shining example of how to do mass digitization right.

Posted by Roy Tennant on September 3, 2008


September 4, 2008
In response to: Hathi Trust Follow-Up
Jeffrey Beall commented:

Thanks for posting this information. At my library, we've loaded over 100,000 MBooks records into our online library catalog. The response has been affirming. Thank you also to the University of Michigan for generously sharing its data.




September 6, 2008
In response to: Hathi Trust Follow-Up
Jonathan Rochkind commented:

Can you remind us what CIC stands for and is?




September 6, 2008
In response to: Hathi Trust Follow-Up
Roy Tennant commented:

Jonathan, so sorry, CIC stands for Committee on Institutional Cooperation, which is fairly impenetrable. Their web site is at www.cic.uiuc.edu. Their "about" page says "The CIC is a consortium of 12 research universities, including the 11 members of the Big Ten Conference and the University of Chicago."





©2008 Reed Business Information, a division of Reed Elsevier Inc. All rights reserved.