
HathiTrust Full-Text Search Bolsters Research Appeal


Library-led Google Book Search alternative improves technology

Josh Hadro -- Library Journal, 12/03/2009

  • 4.6 million volumes now searchable in full-text
  • Paving the way for computational research tools
  • In contrast to Google project, an emphasis on preservation

Launched a year ago, the HathiTrust, representing 25 research universities, has built up a reputation as something of a library-led alternative to a universal digital library branded by Google. With the recent addition of full-text search across its growing corpus, the HathiTrust is expanding the technology necessary to deliver much of what librarians have come to expect from such an undertaking.

The project debuted a beta catalog search in May that covered only the basic bibliographic records describing the nearly three million items then in its collection. Now users can run full-text searches across the corpus, retrieving the number of matches and the page numbers where they occur, even within works whose full texts are not visible.

HathiTrust currently boasts more than 4.6 million volumes, equaling some 1.6 billion indexed pages. The combined full-text and bibliographic metadata searching is performed via the open source Apache Solr/Lucene project, much the same technology that drives the VuFind catalog frontend software.

According to the HathiTrust, the new full-text search paves the way for future advanced search options, as well as "tools that can be used in computational research."

Corpus appeal
Both the beefed-up search functionality and the promise of a massive research corpus once again invite direct comparison to the eventual products of the Google Book Search settlement.

The planned computational corpus is highly anticipated by a number of researchers and could open up new lines of quantitative research into the humanities.

Value of preservation
An often overlooked difference between the two is that HathiTrust places as much emphasis on preservation as on access.

Highlighting this distinction between Google's proposed commercial product and the HathiTrust, John Wilkin, associate university librarian at the University of Michigan, Ann Arbor, and executive director of HathiTrust, wrote in a January Library Journal essay, "[D]espite Google's ability to provide rich access, there are important things Google cannot do. Most critically, Google cannot be that trust for the future."

Contact the author: josh.hadro@reedbusiness.com

