Confidentiality concerns
The appraisal module allows donors and archivists to review collections, including attached documents or photos, prior to transferring files to an archival repository. This includes tools that enable users to search for potentially sensitive content, such as credit card or Social Security information. If an institution or a donor has used specific formats for other types of sensitive information, such as a string of characters in a faculty identification card or any unique regular expressions that signal confidential communication within the collection, archivists can add those to ePADD’s automated processing forms. Email messages can then be flagged, annotated, or restricted individually or in bulk prior to processing. These features will help archivists make email collections as accessible as possible, while ensuring privacy and confidentiality for donors and third parties discussed in correspondence.
“Depending on the email archive…there may be issues with [the Family Educational Rights and Privacy Act] FERPA or [the Health Insurance Portability and Accountability Act’s privacy rule] HIPAA, or state or local statutes around privacy and confidentiality,” said Josh Schneider, ePADD community manager and assistant university archivist for Stanford University’s special collections and archives. “In archives, you need to work a lot with donor privacy as well as third-party privacy. So, we wanted to make a tool that had functionalities in it that let people easily search for potentially private or confidential information and take actions on messages containing that information, and do it in a way that really lends itself to bulk actions to deal with the volume [of email collections].”
During ingestion, the ePADD Processing module conducts several automated processes, Schneider said. For example, “it takes the various names and email addresses associated with a particular individual and it concatenates those, [resolving] the name of an individual,” he said.
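The name-and-address consolidation Schneider describes can be illustrated with a short Python sketch. This is not ePADD's actual code: the function names, the matching rule (merge records that share an email address or a normalized display name), and the sample correspondents are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase, turn 'Last, First' into 'first last', collapse whitespace."""
    name = name.strip().strip('"')
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    return " ".join(name.lower().split())


def resolve_correspondents(pairs):
    """Group (display name, email address) pairs into merged correspondents.

    Hypothetical rule: two pairs belong to the same person when they share
    an email address or a normalized display name, directly or through a
    chain of other pairs. Implemented with a small union-find structure.
    """
    parent = {}

    def find(key):
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    def union(a, b):
        parent[find(a)] = find(b)

    for name, email in pairs:
        email_key = ("email", email.strip().lower())
        find(email_key)  # register the address even if the name is empty
        name_norm = normalize_name(name)
        if name_norm:
            union(("name", name_norm), email_key)

    groups = defaultdict(lambda: {"names": set(), "emails": set()})
    for kind, value in list(parent):
        root = find((kind, value))
        groups[root]["names" if kind == "name" else "emails"].add(value)
    return list(groups.values())


# Made-up sample data: three records for one person, one for another.
sample = [
    ("Pat Example", "pat@example.edu"),
    ("Example, Pat", "pexample@example.org"),
    ("P. Example", "pat@example.edu"),
    ("Jane Doe", "jane@example.com"),
]
for person in resolve_correspondents(sample):
    print(sorted(person["names"]), sorted(person["emails"]))
```

Merging on identical names is a deliberately crude rule: it would also conflate two different people who happen to share a name, which is one reason a real tool would leave room for an archivist to review and correct the merged entries.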
"It also identifies and extracts named entities in the archive,” using OCLC FAST to search Library of Congress subject headings, as well as the LC Name Authority File, DBPedia, and the Virtual International Authority File. “Persons, organizations, or locations that are mentioned in the subject line or body of the mail message, ePADD identifies those and extracts those. And a lot of the advanced browsing functionality and search functionalities that ePADD does depends on that early activity of extracting those named entities."Search and discovery
The discovery module runs on a web server, enabling remote users to search an archived email collection using a browser, with full-text access limited based on the donor’s wishes or an institution’s policies. Remote users must contact the host institution to request access to specific full-text messages or attachments. For example, Stanford’s own ePADD discovery module for the library’s Robert Creeley email archive enables browsing, searching, and graphing by named entities, but redacts all other text from each message. “Because of policies at Stanford, we are only delivering the extracted entities: the persons, places, locations, and organizations,” Edwards said. “Within the body of the email message, you can see the extracted entities, but you won’t see the full text of the archive, nor do you see the domain for the correspondent’s [email].”
Searching can be limited to incoming or outgoing messages, and a bulk search query box enables users to search a block of text to match against a collection’s entity index. Graphing tools enable users to visualize how often specific people, organizations, and locations were mentioned within the archive, and when those entities were mentioned. “It gives you really clean data, in which to see the top correspondence over time, or the top topics that have been discussed over time in the account,” Schneider said.
By contrast, most email programs only facilitate discovery by searching. “You can’t go into, say, Gmail, and identify the top 10 people you corresponded with between 2005 and 2010, which locations were most discussed…those aren’t questions that most email programs can handle. But, ePADD, because it’s doing some indexing of the messages at ingest, is able to answer some questions like that.”
On-site access
In contrast to the discovery module, the delivery module enables archivists to provide moderated full-text access to a processed email collection, typically in an on-site reading room. In addition to the searching and graphing functions of the discovery module, on-site users can generate complex tiered searches using a customizable lexicon, and explore images and other email attachments within the collection. Users can also request copies of messages or attachments using a “checkout cart”-type feature.
Users can download ePADD and a detailed user guide from the project website, library.stanford.edu/projects/epadd. The site also features a community resources page where new users can seek help, share expertise, or contribute a use case. While installing the discovery module on a web server will likely require the help of an institution’s IT department, Edwards said that the software is otherwise flexible and scalable enough that interested users can download it to their personal computers to explore its features using their own email accounts.
Schneider encouraged archivists at other institutions to check out the free software, noting that “we’re doing our best to try to promote [ePADD] as a community resource. It’s open source, and we’re interested in getting use cases and really developing a community of practitioners around the software.”