The Library of Congress Posts Update and Releases Report About What’s Going On With Their Twitter Archive

by Gary Price
Jan 04, 2013 | Filed in Programs+


Update: Digital preservation expert and founder of LOCKSS, Dr. David Rosenthal, offers some analysis of the amount of data the archive contains. Hat Tip: @lorcand

—

The Library of Congress is out with a blog post and white paper (embedded below) that provide info about the complete archive of tweets that Twitter donated to the Library of Congress.

The donation was first announced on April 15, 2010 in blog posts by LC and Twitter.

Since then, LC has remained very quiet about how the Twitter archive might be used and whether it would be available to the public, either online or in person at LC.

While LC officials did make comments from time to time, almost no new details emerged, although we asked…a lot. We never understood (and still don’t) why LC has been so tight-lipped about this project.

One thing we did learn was that a Boulder, CO company named Gnip was working with LC to build the archive. By the way, Gnip also provides exclusive (fee-based) access to every publicly available tweet back to 2006.

Today’s Update

Today, almost 1,000 days after the donation was first announced, LC’s Director of Communications, Gayle Osterberg, has written a blog post with an update about LC’s Twitter archive.

Key Points from the Blog Post

Archive of tweets from 2006-2010 now complete.
Contains 170 billion tweets.
“The volume of tweets the Library receives each day has grown from 140 million beginning in February 2011 to nearly half a billion tweets each day as of October 2012.”
LC’s focus is now, “addressing the significant technology challenges to making the archive accessible to researchers in a comprehensive, useful way.”
Getting this done is a priority for LC
LC has received more than 400 requests from researchers to use archive

It’s good to learn some new details about how the project is going.

However, the post and report lack specifics about:

Access to the archive (Who will be able to access? How will the process work?)
A preliminary/tentative timeline about when this access might become available. Later this year? Next year?
Details about the technology that will be used to search, organize tweets? We did learn when the project launched that the Computational Approaches to Digital Stewardship partnership between Stanford and LC might be involved. Are they? Were they?
Why LC has been so quiet about how the project was developing.

The Washington Post has a story about the Twitter archive that includes several interesting details (not included in the LC document) that help answer some of the questions listed above. The article includes several quotes from Deputy Librarian of Congress Robert Dizard that make it sound like access for researchers will not be available anytime soon.

See: “Library of Congress has archive of tweets, but no plan for its public display.”

On the Data LC Has Now Archived

“It’s pretty raw,” [Deputy Librarian of Congress Robert] Dizard said. “You often hear a reference to Twitter as a fire hose, that constant stream of tweets going around the world. What we have here is a large and growing lake. What we need is the technology that allows us to both understand and make useful that lake of information.”

On Access

For now, giving researchers access to the archive remains cost-prohibitive for the cash-strapped library, which has spent tens of thousands of dollars on the project so far, Dizard says.

“We know from the testing we’ve done with even small parts of the data that we are not going to be able to, on our own, provide really useful access at a cost that is reasonable for us,” Dizard said. “For even just the 2006 to 2010 [portion of the] archive, which is about 21 billion tweets, just to do one search could take 24 hours using our existing servers.”

Future Plans

The eventual plan is to make the collection available only within the Library of Congress reading rooms. Requiring an in-person visit to search a database of material that originated online may seem incongruous, but Dizard says it’s a condition of the deal with Twitter, which gifted the archive, so that the library won’t be “competing with the commercial sector.”

Finally, here’s the complete white paper that LC made available today. The section titled “The Library of Congress Agreement with Twitter” includes details that had not been made public until now, although we asked LC for them several times back when the archive project was first announced.

Update on the Twitter Archive At The Library of Congress
