Visualize This
By Judy Luther, Maureen Kelly, & Donald Beagle -- Library Journal, 3/1/2005
Three students hunch over their computers, each doing research for a paper on nanotechnology. They all began their assignment with a simple Google search, only to learn that there are "2,500,000 English pages for nanotechnology." Their problem is not simply the number of pages available on the topic but also their very different interests: one is working on an assignment for biology class, one is studying engineering, and the third is studying business.
Google's success--and its challenge--is that it is an extremely efficient word finder; it builds indexes of the words on web pages. But words can be tricky things when it comes to conveying meaning. And a simple list of web pages containing those words doesn't provide us with much insight.
Visualization software is designed to help users get a "picture" of the meaning behind the words. Its underlying premise is that information retrieval benefits from smart organization of content. With visualization software, our three students could see their search results clustered to represent the different aspects of nanotechnology. They could then drill down iteratively into the appropriate cluster, all the while narrowing their search and learning more about their specific topic. As a result, those 2.5 million Google results would become a gateway to discovery as well as to meaningful answers.
But will the new crop of visual interfaces displace the text-based tools to which we are so accustomed? Or will they be integrated with text-based search as just another option? Only time will tell if they provide enough of an advantage to sway users and don't end up just another new technology--interesting only to the media, scholars, and a few early adopters.
Awash in information
Information is now so accessible that we are often thwarted by our own success in searching. When confronted with those thousands of search results on Google, we rely on its sophisticated ranking algorithms to bring the most important items to the top of the list. But if the answer is not on a "popular" web site, it may be buried on the fifth or 500th page of our results, where we will never see it.
Good visual displays can compress information, convey context and relationships, and allow an array of options to be explored along alternate paths. It's the difference between reading a book to find the answer--with text organized according to perceived importance, of course--and looking at a table of contents and index. True, a well-ordered list can be very efficient. But the challenge of crafting an ordering schema that is right for all users in all instances has become increasingly difficult.
New visualization technologies may allow users to have greater personal control over the retrieval process. Just as someone in a physical library can view and draw clues from the arrangement and proximity of the books, visualization software can help users navigate through a virtual information collection.
Same search, differing results
Information retrieval typically involves 1) selecting an appropriate resource, 2) searching it, and 3) examining the results to find suitable answers--or, more often, documents or sites that contain those answers. At any point in the process, a user may decide to return to an earlier step to refine the results or try a different approach. Different types of searches place differing demands on the steps in this process. Some searches are focused; others are more exploratory. Sometimes the searcher wants everything on a topic; often he or she will settle for a "good enough" answer.
Visualization tools provide a compact, browsable overview of the search results in the form of topical clusters, graphs, maps, or other devices that convey themes by how they group the results. Instead of scanning a list of results sequentially based on their importance as ranked by the search engine, we can now see what topics are represented in our results and select one or more topical subsets to further explore.
"The human mind grasps visual representations more quickly than an equivalent amount of text," says Susan Feldman, IDC's research vice president of content technologies, in the company's Report on Interactive Data Visualization Tools. "The eye seeks to compare similar things, to examine them from several angles, to shift perspective in order to view how the parts of a whole fit together."
Imagine you are searching for information on a type of wine, Shiraz, for example. This type of query is difficult to accomplish directly using a search statement because you may want to explore a range of countries--all of South America--or vintages or brands rather than a single one. Open-ended queries in search engines often produce a large results set, and the order may have little to do with the searcher's interests.
Beyond navigation of search results, some visualization tools provide a high-level topical overview of an entire information resource, such as HighWire Press's Topic Map, which displays more than 54,000 subjects and their hierarchical subheadings. It offers a way to bridge the search/browse dichotomy by supporting "browse-initiated querying" as well as "query-directed browsing." This type of guided browsing provides opportunities for the serendipity that is often lacking in search.
How they work
Visualization tools use two basic approaches to clustering information: they use metadata (such as cataloging information) that is associated with the information resource, and they use statistical and/or linguistic algorithms to create topical clusters on the fly. These approaches can be used alone or in combination to provide different views of the content.
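To make the two approaches concrete, here is a minimal sketch in Python. It is not the method of any product named in this article: the sample records are invented, and the "statistical" approach is reduced to a simple word-overlap (cosine similarity) grouping with an arbitrary threshold, chosen only because it fits in a few lines.

```python
from collections import Counter, defaultdict
import math
import re

# Hypothetical search results; titles, subjects, and text are invented for illustration.
RESULTS = [
    {"title": "Nanotube transistors", "subject": "Engineering",
     "text": "carbon nanotube transistor circuit design"},
    {"title": "Nanoscale circuit fabrication", "subject": "Engineering",
     "text": "nanotube circuit fabrication transistor"},
    {"title": "Nanoparticles in drug delivery", "subject": "Biology",
     "text": "nanoparticle drug delivery cell membrane"},
    {"title": "Targeting tumor cells", "subject": "Biology",
     "text": "nanoparticle tumor cell drug targeting"},
    {"title": "Nanotech venture funding", "subject": "Business",
     "text": "venture capital investment nanotech startup market"},
]


def cluster_by_metadata(results, field="subject"):
    """Approach 1: group results by a metadata field that already exists."""
    clusters = defaultdict(list)
    for record in results:
        clusters[record[field]].append(record["title"])
    return dict(clusters)


def term_vector(text):
    """Count the words in a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_by_terms(results, threshold=0.3):
    """Approach 2: build clusters on the fly from word overlap (greedy, single pass)."""
    clusters = []  # each cluster holds a running word count and its member titles
    for record in results:
        vec = term_vector(record["text"])
        best = max(clusters, key=lambda c: cosine(vec, c["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["centroid"].update(vec)
            best["titles"].append(record["title"])
        else:
            clusters.append({"centroid": Counter(vec), "titles": [record["title"]]})
    return [c["titles"] for c in clusters]


if __name__ == "__main__":
    print(cluster_by_metadata(RESULTS))
    print(cluster_by_terms(RESULTS))
```

On this toy data the two approaches happen to agree, sorting the results into engineering, biology, and business groups. On real collections they often diverge, which is one reason a tool might offer both views.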
Visualization tools also differ in the type of visual metaphors they employ. These can include minimal graphics based on nested lists or file folders; hyperbolic browsers; relationship diagrams using abstract shapes (circles, squares, lines) and connectors; geospatial maps, either abstract or actual; tables and graphs; time lines; or representations of real/concrete objects. For more on the different tools/companies, see "The Visualizers," p. 36-37.
Clever use of visual devices such as color, shape, size, position, and connection creates a multidimensional navigation space in which a lot of information can be conveyed in a compact display, including topics, relationships among topics, frequency of occurrence, importance, etc. While the elegance of the display can be appealing (or distracting), the real value comes from the users' ability to manipulate the display: to travel down a path, leaving breadcrumbs, collecting samples, and reversing and redirecting their steps. The excitement of a clever display will wear off quickly if users can't control the exploration process. Success calls for flexibility in supporting different user styles and in extracting top value from different information collections.
One limitation of visualization software is that users may not have screens large enough to view the new, multicolumn displays properly. A recent trend is the shift by the young, mobile, Wi-Fi-connected generation away from desktop computers (and large screens) to laptops. John Sack, associate publisher and director of HighWire Press, observes that "it is important to take this reduced screen real estate into account when designing the interface for visualization tools."
Projects in development
The corporate sector has used visualization technologies as part of enterprisewide systems since the late 1990s, incorporating products from KartOO, TheBrain, and ThinkMap. But in libraries, the projects are just getting under way.
Groxis is partnering with both an academic and a corporate library to extend the applications of Grokker. SunLibrary Grokker enables employees to execute one search across IEEE, netLibrary, and the web. Cindy Hill, SunLibrary manager, reports that engineers offered testimonials such as "simply amazing" and suggested enhancements--a true sign that they are invested in its development.
Stanford University is at work with Groxis on the development of Grokker E.D.U., which faculty, staff, and students can download and use to set up saved preferences. Michael Keller, university librarian, says that "the Grokker map is easy to navigate and allows users to quickly get to relevant results." It also "allows a form of federated searching" that "offers a way to bring together information from both the public and the proprietary environment in a single, comprehensive search with results displayed in a way that helps readers make sense of it all."
OCLC is implementing a data visualization pilot project in conjunction with Antarctica Systems Inc. to evaluate library users' experiences with search and display using a visual interface that offers the option to "try an alternative view of ebooks" in the Electronic Books database on OCLC FirstSearch. This will take users to a visual representation of the Electronic Books database, a static database of about 211,000 ebook titles. Included in this pilot are Mekko Maps, which depict all the subcategories of a taxonomy map within horizontal and vertical bars that show the user the subject categories they could choose.
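As a rough illustration of how a display of this general kind can be laid out, the sketch below computes rectangles for a two-level taxonomy in the style of a Marimekko chart: each top-level category gets a vertical bar whose width reflects its share of all titles, and each subcategory gets a slice of that bar whose height reflects its share of the category. This is an assumption about the general chart type, not a description of Antarctica's software, and the category names and counts are invented.

```python
def mekko_layout(taxonomy, width=100.0, height=100.0):
    """Return (category, subcategory, x, y, w, h) rectangles for a two-level taxonomy."""
    total = sum(sum(subs.values()) for subs in taxonomy.values())
    rects = []
    x = 0.0
    for category, subs in taxonomy.items():
        cat_total = sum(subs.values())
        w = width * cat_total / total          # bar width: category share of all titles
        y = 0.0
        for sub, count in subs.items():
            h = height * count / cat_total     # slice height: subcategory share of category
            rects.append((category, sub, x, y, w, h))
            y += h
        x += w
    return rects


# Hypothetical title counts, for illustration only.
SAMPLE = {
    "Science": {"Biology": 30, "Physics": 10},
    "Business": {"Finance": 40, "Marketing": 20},
}

if __name__ == "__main__":
    for rect in mekko_layout(SAMPLE):
        print(rect)
```

Because both dimensions carry meaning, the area of each rectangle is proportional to the number of titles in that subcategory, so large and small subject areas can be compared at a glance.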
Impact on libraries
Publishers such as the Institute of Physics and platform hosts such as HighWire have already incorporated visualization software to display their search results. Library users who find this approach useful may come to expect it as a standard part of search interfaces. It's also assumed that the generation known as "Millennials," now entering college, prefers visual over text-based systems. If popular consumer software, such as Google, adopts a visual interface, other large systems will quickly seem out-of-date.
In 2005, academic and public libraries in the United States will see several integrated systems offer visualization. Both TLC and VTLS are adopting AquaBrowser; Dynix is in discussions with them as well. Visualization tools may well raise the bar for all library systems vendors.
Visualization solutions may also increase the adoption of metasearch engines in libraries by providing an appealing way to present search results from multiple databases. Michael Gorrell, CIO at EBSCO Publishing, confirmed that EBSCO is talking to Groxis to allow Grokker to work with EBSCOhost databases.
Kate Noerr at MuseGlobal notes that the growing presence of metasearch means that users are viewing lower-precision, higher-recall information, and the problem then becomes how to find the needle in the even bigger haystack. "Visualization tools, such as clustering, analysis, winnowing, refinement, etc., will aid significantly in reducing what a metasearch engine retrieves for users," says Noerr.
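Precision and recall are standard retrieval measures: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The short Python sketch below shows the tradeoff Noerr describes; the document identifiers and result sets are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents actually answer the user's question.
RELEVANT = {"d1", "d2", "d3", "d4"}

# A single database returns a short list, most of it on target.
SINGLE_DATABASE = {"d1", "d2", "x1"}

# A metasearch across many databases finds every relevant document,
# but surrounds them with off-topic results.
METASEARCH = {"d1", "d2", "d3", "d4", "x1", "x2", "x3", "x4", "x5", "x6"}

if __name__ == "__main__":
    print(precision_recall(SINGLE_DATABASE, RELEVANT))
    print(precision_recall(METASEARCH, RELEVANT))
```

In this made-up case the metasearch raises recall from one half to all four relevant documents while precision falls from two in three to four in ten: a bigger haystack, with the needles all in it.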
The path ahead
Visualization tools have received lots of positive attention from the popular press. When the new version of Grokker was released, David Kirkpatrick in Fortune wrote, "It makes me wonder if Google really does have search as sewed up as we often assume. When you use Grokker you realize just how brain-dead even the best search tools are today." And this January, CNN noted that these "intriguing technologies are getting better at bringing order to all that chaos and could revolutionize how people mine the Internet for information."
Academic librarians at many of the current test sites are cautiously optimistic. They report that users who try these tools generally find them useful for exploring search results, especially on topics at the periphery of their expertise. Users also value the opportunities for serendipitous discovery. However, these tools are not the first thing users turn to. "The key to success in implementing visualization tools is to understand your user," says HighWire's Sack. "Users are familiar with the Google interface. They will start with a keyword search no matter what you tell them to do. If the visualization tool is not a conspicuous, well-integrated part of the interface, it won't be used."
Today, it is ordinary users who determine the success of new technologies. The information market has become a mass market with revenues--both direct and indirect--that are dependent on a critical mass of users. The next several years will determine whether users will want to see it as much as--if not more than--just read it.
Link List
The Associated Press. "Better Search Results Than Google?" CNN.Com, Jan. 5, 2005; www.cnn.com/2004/TECH/internet/01/05/seeing.search1.ap
Beagle, Don. "Visualization of Metadata," Information Technology & Libraries, December 1999.
Beagle, Don. "Visualizing Keyword Distribution Across Multidisciplinary C-Space," D-Lib, June 2003; www.dlib.org/dlib/june03/06contents.html
Kirkpatrick, David. "Going Deeper Than Google," CNN, December 17, 2003; www.cnn.com/2003/TECH/ptech/12/17/fortune.ff.deeper.google/index.html
OCLC's E-books Pilot with Antarctica; http://ebooks.antarctica.net
Author Information
Judy Luther is President of Informed Strategies, a consulting firm; Maureen Kelly is a consultant and formerly Executive Director of Strategic Development for Nstein Technologies and VP for Strategic Development at BIOSIS; Donald (Don) Beagle is Library Director, Belmont Abbey College, NC.