Serials Solutions Debuts Vivisimo
Federated searching gets supercharged with results clustering technology
By Michael Rogers -- Library Journal, 11/1/2006
Federated searching has been both a blessing and a curse. It is wonderful for users to be able to search across all your holdings with one query, but often they have to hack through the results with a machete to find the item they really want. The process can be frustrating, time-consuming, and a headache for librarians and the vendors that supply the search mechanisms. For inexperienced searchers weaned on Google, it can be daunting and puzzling to use a search tool that pulls them through 100 databases.
Serials Solutions is attempting to “supercharge” federated searching by incorporating Vivisimo Results Clustering capabilities into its Central Search product at no charge to customers. JR Jenkins, Serials Solutions’ group product manager, argues that most of the work to date in the relatively young field of federated searching has been at the foundation and plumbing layer, “where you’re spending a lot of time and energy trying to build connections to the resources because the requirements for each one of them varies dramatically from content provider to content provider.”
“The challenge of working with the results,” he said, “is you end up with a big ol’ list of citations and you lose the subject categorization, the taxonomies, etc., all of that stuff that ideally can help a patron figure out what they’re looking for.” So one of the hot topics for search suppliers is the notion of relevance, and the pressure is on because of Google. Federated search tools often don’t have an index of their own, and vendors lack the capability to build a PageRank-style ranking system like Google’s, so what can be done to improve relevance?
Engaging with the citations
For Serials Solutions, the answer is results clustering. Clustering is a classification system that aggregates citations based on common terms, so users find answers faster. “This thing is supercharging the user experience and has the patron engaging with the citations rather than with the software interface,” said Jenkins. A recent study comparing navigation on Google with navigation in a federated search tool showed that in federated searching, 50 percent of the time was spent navigating. On Google, the figure was 11 percent. “So the user is engaged with the software rather than the citations, and we need[ed] to change that,” Jenkins said.
Vivisimo embraces existing web use paradigms, “so there’s no notion of having to navigate with boxes and squares,” as visualization tools require, or of having to make decisions about resources. One of the greatest challenges to federated searching is the time it takes to search the various databases, retrieve the citations, and process the results.
No magic bullet
The clustering algorithm runs in less than a second, going through all the citations, finding common terms, and bringing them together. “The results are pulled out in a way so that instead of having to go through the results a page at a time to find what they’re interested in, users can now look at them in aggregated clusters,” Jenkins said.
All the concepts and terms that surround the user’s query are illuminated. Instead of requiring that the “magic bullet” appear in the first two or three citations, clustering brings to the fore items that may have come from page nine.
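To make the idea of grouping citations by common terms concrete, here is a minimal sketch in Python. It is not Vivisimo’s algorithm, which the article does not describe; it is a toy illustration that assumes citations are plain title strings and that a “cluster” is simply the set of titles sharing a word. The function name, the stopword list, and the sample titles are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; a real system would use a far larger one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def cluster_by_common_terms(titles, min_size=2):
    """Group citation titles under each term shared by at least min_size titles."""
    term_to_titles = defaultdict(list)
    for title in titles:
        # Use a set so a term repeated within one title is counted once.
        terms = set(re.findall(r"[a-z]+", title.lower())) - STOPWORDS
        for term in terms:
            term_to_titles[term].append(title)
    # Keep only terms that actually bring several citations together.
    clusters = {t: ts for t, ts in term_to_titles.items() if len(ts) >= min_size}
    # Largest clusters first; ties are broken alphabetically by term.
    return dict(sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0])))


# Made-up sample titles, in the order a results list might return them.
titles = [
    "Solar energy policy in Europe",
    "Wind energy storage",
    "Solar panels for homes",
    "Energy markets and storage",
]
clusters = cluster_by_common_terms(titles)
for term, members in clusters.items():
    print(term, len(members))
```

Because every citation is scanned regardless of its position in the list, a title from deep in the results lands in the same cluster as one from the first page, which is the effect Jenkins describes.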
Search tabs
Jenkins asserts that the keys to good clustering are accuracy, uniqueness, and, to some extent, natural language. To eliminate confusion, clusters are run against a thesaurus so that unique terms are identified. Central Search clustering lets users break down the results by date, journal title, author name, and more.
For undergrads who don’t know journal titles outside of TIME or Newsweek, Jenkins said, clustering illuminates all the journal titles in a library’s collection and can drill down specifically to certain resources. “Same with authors; with a click and a glance you can see the scope of what they’ve done.”
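Breaking results down by date, journal title, or author can be pictured as counting and filtering on a field. The sketch below is an illustration only, not Central Search’s implementation; the record layout, field names, sample data, and both helper functions are assumptions made for the example.

```python
from collections import Counter

# Hypothetical citation records; the field names are assumptions.
citations = [
    {"title": "Climate and coasts", "journal": "Nature", "author": "Lee", "year": 2005},
    {"title": "Storm surge models", "journal": "Science", "author": "Lee", "year": 2006},
    {"title": "Warming oceans", "journal": "Nature", "author": "Ortiz", "year": 2006},
    {"title": "A hotter planet", "journal": "TIME", "author": "Ortiz", "year": 2006},
]


def facet_counts(records, field):
    """Return (value, count) pairs for one field, most frequent first."""
    return Counter(record[field] for record in records).most_common()


def drill_down(records, field, value):
    """Keep only the records whose field equals the chosen value."""
    return [record for record in records if record[field] == value]


# List every journal title in the result set with its citation count.
print(facet_counts(citations, "journal"))

# One click on an author narrows the list to that author's citations.
for record in drill_down(citations, "author", "Lee"):
    print(record["year"], record["journal"], record["title"])
```

A searcher who knows no journal titles still sees all of them listed with counts, and a single filter step shows the scope of one author’s work.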
The beauty of clustering is that when a general searcher runs a very simple phrase query, clustering takes that general term and exposes all the specifics attached to it, “so users can come to the search engine having to know a lot less than an advanced searcher and still benefit,” said Jenkins.



















