HathiTrust's Copyright Detectives

By David Rapp Oct 21, 2010

An ongoing, grant-funded initiative involving the HathiTrust digital repository has been tracking down previously unknown public-domain materials in the repository's vast collection of scanned works from research libraries across the country.

HathiTrust has gathered several new partners in the last few months, most recently Cornell University, along with Dartmouth College and the Triangle Research Libraries Network. Thirty-five research libraries have signed on so far, and most of them have donated digital scans of their materials to the project (nearly 7 million volumes so far), many of which had been digitized via the Google Books project or similar initiatives. Cornell alone has pledged to add some 300,000 works by March 2011.

Many of the scans are of public-domain materials, such as works published before 1923, which are available for anyone to read via the HathiTrust Digital Library. Many others are copyrighted materials, for which HathiTrust must restrict access.

But, due to the vagaries of U.S. copyright law, some scans' status can be more of a mystery. If a work was published between 1923 and 1963, but the copyright holder didn't renew the copyright after its first 28-year term, it, too, is public domain. Those works should legally be accessible via the HathiTrust as well, but determining a work's copyright status requires research. That's where the Copyright Review Project comes in.
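The rule described above can be sketched in a few lines of Python. This is only an illustration of the article's simplified description, not the project's actual procedure: the `Work` class, the `renewal_found` flag, and the status labels are hypothetical, and real determinations depend on factors not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Work:
    """Hypothetical record for a scanned volume."""
    title: str
    pub_year: int
    renewal_found: bool  # assumed result of a renewal-record search


def review_status(work: Work) -> str:
    """Apply the simplified rule from the article (an assumption-laden sketch)."""
    if work.pub_year < 1923:
        # Published before 1923: treated as public domain.
        return "public domain"
    if work.pub_year <= 1963:
        # 1923-1963: status hinges on whether the copyright was renewed.
        return "in copyright" if work.renewal_found else "public domain"
    # Later works fall outside the rule described in the article.
    return "outside review scope"


if __name__ == "__main__":
    for w in (
        Work("Example novel", 1952, renewal_found=False),
        Work("Example textbook", 1947, renewal_found=True),
    ):
        print(w.title, "->", review_status(w))
```

The hard part in practice is establishing `renewal_found`, which is the research step the project exists to perform.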

Tracking down statuses
The University of Michigan Library—which alone has deposited more than four million scans to the HathiTrust project—was awarded a $578,955 Institute of Museum and Library Services grant (match: $655,898) in 2008 for a three-year project. Its aim: to go through HathiTrust scans of works published between 1923 and 1963 and determine their copyright status.

The project has since expanded to include staff from other institutions, including the University of Minnesota, Indiana University, and the University of Wisconsin-Madison—currently about 20 staffers in all.

According to Anne Karle-Zenith, the Copyright Review Project Librarian at University of Michigan, the project has checked the status of about 95,000 books so far; of those, more than 52,000—greater than half—have been found to be public domain. The project looks at the books most recently deposited into the HathiTrust database, and that's a lot of books: there's a backlog of about 175,000 books currently, Karle-Zenith said.

The books are checked first against Stanford University's online Copyright Renewal Database, which Karle-Zenith said is a much less cumbersome process than other methods of research. (Periodicals are not part of the project, as the Stanford database only contains renewals for monographic books and pamphlets.)

If no copyright renewal is found, it's very likely none exists, though research results are sent to the U.S. Copyright Office to make sure nothing was missed. [See Karle-Zenith's clarification on this process below.] If no renewal is found, then the work is deemed public domain, and its full text is made available through HathiTrust. (The project's full guidelines are available here [PDF].)

Karle-Zenith said that literature and textbooks appear to be renewed more often than other types of works, but there have been some surprising finds. For example, author Gore Vidal's early novel The Judgment of Paris (1952) was found to be public domain, as well as the 1947 textbook Atomics for the Millions, which contains some of the first published artwork by Maurice Sendak, who would later write and illustrate the children's book Where the Wild Things Are.




Reader Comments (11)


I wish we could get a list (or a record set) of these works; we would like to add bib records to our catalog for the post-1923, open access works. jeffrey.beall@ucdenver.edu

Posted by jeffrey.beall@ucdenver.edu on October 21, 2010 01:19:29PM

David, thanks for this piece - it's great! I just want to clarify one thing. We do not send all our results to the U.S. Copyright Office. As part of our efforts to evaluate our work, the IMLS grant stipulates that we will engage the Copyright Office to undertake comparison searches of works we have determined to be in the public domain in order to evaluate the accuracy and effectiveness of the Copyright Review process. We sent the first set of volumes for evaluation in early 2010, and recently received results from the Copyright Office. Based on the budget allotted, the Copyright Office was able to evaluate 96 volumes for renewal status. Only four of our determinations were incorrect (i.e., a work we determined was not under copyright was actually renewed/still protected). Further analysis revealed only one of these volumes to be a true miss on the part of our reviewers. While the sample size is small, it helped to confirm that our process, with its reliance on the Stanford Renewal Database, is producing reasonably reliable results.

Posted by Anne Karle-Zenith on October 21, 2010 02:51:14PM

Jeffrey, we do make our rights determinations for all works in HathiTrust publicly available via tab delimited files. Among other data about the works in HathiTrust, these files include rights information for each volume, including the rights attribute (e.g., public domain, in-copyright, etc.) as well as the rights determination reason code (e.g., copyright not renewed, no copyright notice on the piece, etc.). More information about HathiTrust Data Distribution & APIs and the HathiTrust Metadata is available here: http://www.hathitrust.org/data and http://www.hathitrust.org/hathifiles_metadata

Posted by Anne Karle-Zenith on October 21, 2010 02:56:44PM
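For readers who want to build the kind of list Jeffrey asks about, here is a rough sketch of filtering a tab-delimited rights file in Python. The column names (`volume_id`, `rights`, `reason`), the sample rows, and the spelled-out values are all invented for illustration; the real files' layout and codes are documented at the URLs above and may differ.

```python
import csv
import io

# Hypothetical sample data; the real files may have no header row
# and use different columns and coded values.
SAMPLE = (
    "volume_id\trights\treason\n"
    "vol-001\tpublic domain\tcopyright not renewed\n"
    "vol-002\tin-copyright\tcopyright renewed\n"
    "vol-003\tpublic domain\tcopyright not renewed\n"
    "vol-004\tpublic domain\tno copyright notice\n"
)


def public_domain_not_renewed(tsv_text: str) -> list[str]:
    """Return IDs of volumes marked public domain because of non-renewal."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["volume_id"]
        for row in reader
        if row["rights"] == "public domain"
        and row["reason"] == "copyright not renewed"
    ]


if __name__ == "__main__":
    print(public_domain_not_renewed(SAMPLE))
```

A catalog load would read the downloaded file from disk instead of the in-memory sample and map the real column positions accordingly.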

This is tremendously important work. Kudos to the University of Michigan and its partner institutions for putting in the hard work to enable the world to freely access "orphan works".

Posted by Roy Tennant on October 21, 2010 05:27:21PM



 

Welcome to the LJ Archives.

This archive site is the home to all LJ articles published prior to January 2012;
©2011 Media Source, Inc., All rights reserved.