Many Vendors, One Face--Acquisitions' Next Wave
Rick Lugg and Ruth Fischer talk to the Cornell team that created a singular interface for different data
By Rick Lugg and Ruth Fischer (netConnect) -- netConnect, 7/15/2005
Academic librarians use multiple book vendor web systems for electronic selection and acquisitions, exporting the bibliographic records and transaction data to their integrated library systems (ILS). This has enabled selectors to make the transition from paper slips to e-selection—with major efficiencies for both collection development and acquisitions. But every materials vendor's system looks and acts differently. Because of this, it has been considered impractical to implement electronic selection for all material—until last year, when Cornell University Library (CUL) implemented ITSO CUL (Integrated Tool for Selection and Ordering).
ITSO, software developed by Cornell, begins to address this problem by consolidating multiple streams of new title data into a single interface. Within the ITSO interface, selectors review presorted and prescreened new titles in their respective subjects and decide to select, reject, forward, or defer. Selected titles can be appended with fund and location codes, as well as free-text notes to acquisitions. Error checking happens when the bulk import is run.
Each evening, selected titles are harvested, automatically assigned a vendor, and fed into the bulk import utility of Endeavor's Voyager system. Potential duplicates are segregated for follow-up by acquisitions. The remaining titles automatically generate pending purchase orders in Voyager. Typically, these are approved the next morning, and electronic orders are sent to the appropriate vendors. Cornell estimates that ITSO now accounts for 40 percent of its firm orders and that it has saved more than $100,000 per year in staff costs.
Other libraries have shown interest, and Cornell is now exploring whether there's sufficient demand to offer ITSO as a hosted application, available by subscription. Here, Scott Wicks, Adam Chandler, and Peter Hoyt—collectively identified as CUL—describe ITSO's origins, functionality, and possible future.
Rick Lugg: Let's start with the elevator conversation: In three or four sentences, what does ITSO do and why is it unique?
CUL: It aggregates new title information into a single interface. It replaces separate and often duplicative streams of paper. It serves as a source for a precatalog record.
What problem were you trying to solve when you first conceived it?
The initial problem had to do with the LC (Library of Congress) Alert Service moving away from paper. Previously, most selectors would get printed cards each week. We needed to replace that system with a web-based version that would allow selectors to know what new titles were coming out each week.
Why did you decide to bring in vendor records as well?
When we started looking at what we could get from the electronic version of LC Alerts, we realized that not only could selectors use it for a selection decision, but we could also "repurpose" that MARC record and import it into our OPAC. It could serve as the basis for a purchase order and for a cataloging record at receipt. We wanted to make as much use of the metadata as possible. So once we had the concept down, we realized we could do the same with any other source of MARC records.
So are you subscribing to LC's Books English file?
Actually, we're buying the entire file, including all languages. We had a history of buying the entire file on cards, so we wanted the same file electronically. We load files weekly, but we deduplicate against previous loads.
Cornell has done business for years with several vendors that provide similar web-based selection and acquisitions tools: Yankee Book Peddler, Harrassowitz, Casalini, Aux Amateurs, Blackwell's. Had you not implemented any of their systems in this way?
No, not for selectors. Most selectors liked paper. In a sense, because LC moved away from paper, that opened the door to an online system. Another reason selectors didn't use them was that each one had a different interface.
ITSO relies on regular streams of data from these participating vendors, as well as LC. All of them have invested heavily in their own interfaces, and it must have puzzled them that CUL intended to build functionality that already existed. How did vendors react when you approached them to obtain their data?
The initial reaction was a little bit of shock—and strong disappointment. We told them that [CUL] had never asked for those systems, because we knew the selectors wanted to stay with paper and didn't want to learn all these interfaces.
On the other hand, we did want to use the data the vendors had so painstakingly assembled—their own proprietary codes and descriptors that give a different perspective on a title. Once they understood that, they seemed to feel much better about it. We've talked about the possibility of linking back into their systems from our system. We haven't done it yet, but it's under consideration.
Also, we weren't asking them to build anything new. We were only asking to take some of what they had already done.
And you weren't asking for their entire databases but rather weekly streams of new titles?
Actually, we were only asking for an electronic version of the paper slips each vendor was already sending. In each case, there was a subject profile in place, and we only wanted them to send those prefiltered new records each week.
We don't think we're in conflict with the vendors, or that ITSO will reduce their business. Plus, if libraries can link out to their vendor of choice and see more value-added information, that could result in an order. People might use ITSO as a single selection interface but order in the vendor databases or the ILS.
Can you describe how ITSO works?
There are really three parts: the "ingest" part, the "display" part, and "batch-loading" into the catalog.
OK, let's start with ingestion. The incoming records are in MARC format?
Library of Congress, Aux Amateurs, Casalini, and Harrassowitz are MARC. Yankee Book Peddler and Blackwell's send delimited files so their proprietary descriptors can also be included. We store those in 5xx or 9xx fields in the ITSO record, depending on the nature of the data. Again, the only records we receive each week from the vendors are electronic versions of those that would previously have come as slips.
Are the incoming files deduplicated against each other?
Although Cornell has segmented coverage somewhat (e.g., Yankee Book Peddler for the United States, Blackwell's for the United Kingdom), there is some overlap in the various streams. The first record in is the one that gets loaded.
As we move ITSO out from Cornell, we would dedupe them further. But because of the nature of a Yankee record—which, since these replicate notification slips, indicates that a book will not be shipped on approval—that information is useful to a selector, so we purposely do not deduplicate the Yankee Book Peddler records. For all other vendors, we look for existing ITSO records based on ISBN and LCCN. If either of those matches, the record is not loaded.
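The load rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not ITSO's actual code: the record fields, the source labels such as "YBP", and the class name IngestIndex are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncomingRecord:
    source: str                 # hypothetical stream label, e.g. "LC" or "YBP"
    title: str
    isbn: Optional[str] = None
    lccn: Optional[str] = None


class IngestIndex:
    """Tracks identifiers already loaded so later copies can be skipped."""

    def __init__(self, no_dedupe_sources=("YBP",)):
        self.isbns = set()
        self.lccns = set()
        self.loaded = []
        # Streams that are always loaded, even when a match exists (assumed label).
        self.no_dedupe_sources = set(no_dedupe_sources)

    def is_duplicate(self, rec):
        # A match on either ISBN or LCCN counts as a duplicate.
        isbn_hit = rec.isbn is not None and rec.isbn in self.isbns
        lccn_hit = rec.lccn is not None and rec.lccn in self.lccns
        return isbn_hit or lccn_hit

    def ingest(self, rec):
        """Load the record unless it duplicates an earlier one; return True if loaded."""
        if rec.source not in self.no_dedupe_sources and self.is_duplicate(rec):
            return False  # first record in wins; later copies are dropped
        self.loaded.append(rec)
        if rec.isbn:
            self.isbns.add(rec.isbn)
        if rec.lccn:
            self.lccns.add(rec.lccn)
        return True
```

Feeding the same ISBN first from one stream and then from another loads only the first copy, while a record from the exempt stream is loaded regardless.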
Even if it has other unique fields?
Correct. The idea is to have a selector look at a record one time and not have to see the same record many times.
I'm not sure what the vendors would think of this, but do you have in mind eventually to create a sort of "superrecord," one that aggregates all the unique information from various versions of the record?
Yes, that is one possibility we are considering. It would be sort of an "uber-record." But again, if you've already selected before additional data is available, you'd have no reason to go back to it—unless you had deferred it.
Are items matched against the OPAC at initial load?
Yes, based on LCCN and ISBN. We're trying to be conservative in what we consider a duplicate. We felt these two fields allowed us to be conservative enough without reporting false positives. Subsequently, a selector can click on the record in ITSO, and the software will automatically search the OPAC—by author, title, or subject. Finally, at export from ITSO, Voyager's bulk import program enables another duplication check against the OPAC.
How often do files arrive? Who manages them?
Mostly weekly. We have a couple of front-line staff who receive emails notifying CUL that a new file is available. They key in a file name, and a script executes FTP retrieval. It's not a big deal to get the files. The vendors are doing a good job.
How are individual selectors' "buckets" created from these files?
There is an admin tool to add or remove "buckets" or selectors. It also manages the call number table—a table that Scott maintains that says "for this call number range, send it to this selector." In addition, selectors have the ability to tailor their profiles for what to include or exclude from all or individual record streams.
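A call number table of this kind might look roughly like the sketch below. The ranges, the selector names, and the simplified parsing (class letters plus the first number only) are assumptions made for illustration; the real table and its matching rules are not described in the interview.

```python
import re

# Hypothetical routing table: (low, high, selector), bounds inclusive.
ROUTING_TABLE = [
    (("QA", 0.0), ("QA", 9999.0), "selector_a"),
    (("PQ", 4001.0), ("PQ", 5999.0), "selector_b"),
    (("Q", 0.0), ("QZ", 9999.0), "selector_c"),
]


def parse_call_number(call_number):
    """Reduce a call number to (class letters, first number) for comparison."""
    m = re.match(r"\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)?", call_number.upper())
    if not m:
        raise ValueError(f"unrecognized call number: {call_number!r}")
    return (m.group(1), float(m.group(2) or 0))


def route(call_number, table=ROUTING_TABLE):
    """Return every selector whose range contains the call number."""
    key = parse_call_number(call_number)
    return [selector for low, high, selector in table if low <= key <= high]
```

Because ranges may overlap, one record can land in several selectors' buckets, which matches the sharing behavior described in the next answer.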
How long are records retained?
Record views are customized to individual selectors, so a single record can be distributed to multiple selectors. If they order a title, the status of the record changes and is loaded to the catalog, and that selector does not see the record again. If another selector does not act on the same record, it remains in his/her file indefinitely. But when they log in again, they will see that a different selector has ordered it. If a selector rejects a title, that record stays in that selector's ITSO reject folder for 30 days, then goes away. Any selector sharing the record will see the record status set to "rejected."
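The 30-day life of a rejected record could be expressed as a simple date filter. This is a sketch only: the data shape (title, rejection date) and the choice to drop a record on exactly the 30th day are assumptions, not details given in the interview.

```python
from datetime import date, timedelta

REJECT_RETENTION = timedelta(days=30)


def visible_rejects(rejects, today):
    """Keep (title, rejected_on) pairs that are still inside the retention window."""
    return [(title, rejected_on)
            for title, rejected_on in rejects
            if today - rejected_on < REJECT_RETENTION]
```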
How many records are in ITSO at any one time?
We have about 100,000 unique titles in the system right now.
Do you expect the database to grow over time?
Somewhat, although we'll have some control over that. We haven't decided yet how long to retain data—three years, five years? We're addressing it to some degree at the front end, by putting energy into filtering, so selectors get a finer selection of records sent to them. We assume that if fewer out-of-scope records are presented to them, they'll spend less time rejecting items and can use that time to keep up. But ITSO is really designed to stream relevant new titles to selectors, not to build as a database. It doesn't even have a search box at this point.
What do the selectors see? How can they access their titles?
They see an index screen of all new titles they have yet to act upon. That includes titles that have just been loaded and back to February 13, 2004 if they have not acted on them. Any title on which no action has been taken retains a status of "new." The index screen shows call numbers, place of publication and publisher, source of record, and a date of last update—so they can look at oldest records first.
Once a record has been selected, what happens to it?
Once a title has been selected, there's a status change. It stays in a "selected" screen view until those records are harvested overnight. We have a cron [automatically scheduled] job that looks for all records with a status of "selected." The harvested records are added to the post-ITSO process, which includes the bulk load. And those records are no longer in the ordering selector's box (though a selector who shares the record will see that status for the ordering selector's record change to "ordered"). Voyager becomes the database of record.
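The nightly harvest amounts to a status scan followed by a status change. A minimal sketch, assuming records are plain dictionaries with a "status" key (a hypothetical structure, since ITSO's storage is not described):

```python
def harvest(records):
    """Collect records marked 'selected' and flip them to 'ordered'."""
    batch = [rec for rec in records if rec["status"] == "selected"]
    for rec in batch:
        rec["status"] = "ordered"  # what a sharing selector would then see
    return batch
```

A scheduler would call a function like this once each evening and hand the returned batch to the downstream load step.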
Does other processing occur between harvest and bulk import?
Vendor mapping occurs based on country of origin in the record, publisher, and type of record. Government document and serials records go to a separate stream and are not loaded automatically. Vendor mapping is based on the ability to predict what should happen with single-volume monographs. For instance, if a title has a UK place of publication, the vendor assignment would be Blackwell's, and the record loaded at bulk import would have the Blackwell's vendor code automatically included.
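As a rough illustration, the mapping could be a lookup keyed on country with an exclusion for record types that are handled by hand. Only the UK rule comes from the answer above; the US entry is inferred from the coverage split mentioned earlier, and the vendor codes, field names, and type labels are hypothetical. Publisher-based rules are omitted because the interview gives no examples.

```python
# Hypothetical vendor codes; only the UK mapping is stated in the interview.
COUNTRY_TO_VENDOR = {
    "UK": "BLACKWELL",
    "US": "YBP",  # assumption based on the US/UK coverage split
}

# Record types that are routed to a separate stream instead of auto-loading.
MANUAL_TYPES = {"serial", "government_document"}


def assign_vendor(record):
    """Return a vendor code, or None when the record needs manual handling."""
    if record.get("type") in MANUAL_TYPES:
        return None
    return COUNTRY_TO_VENDOR.get(record.get("country"))
```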
It sounds as if acquisitions staff have no need for a view into ITSO—that they interact with records only after they're imported into Voyager?
Either after they're loaded to Voyager or pushed to a separate folder. During bulk import, duplicate detection occurs based on ISBN, LCCN, system number, and title. If one of those matches an incoming record, that title is pushed off to the side, where staff can look at it and make a final determination.
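The split between loaded records and set-aside duplicates can be pictured with the sketch below. It is not Voyager's bulk import logic, whose matching details are not given here; the field names and the whitespace and case normalization are assumptions.

```python
MATCH_FIELDS = ("isbn", "lccn", "system_number", "title")


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not hide a match."""
    return " ".join(str(value).lower().split())


def split_batch(incoming, catalog):
    """Return (to_load, to_review); a match on any one field sends a record to review."""
    seen = {
        field: {normalize(rec[field]) for rec in catalog if rec.get(field)}
        for field in MATCH_FIELDS
    }
    to_load, to_review = [], []
    for rec in incoming:
        hit = any(
            rec.get(field) and normalize(rec[field]) in seen[field]
            for field in MATCH_FIELDS
        )
        (to_review if hit else to_load).append(rec)
    return to_load, to_review
```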
What happens with those exceptions?
Acquisitions staff try to load the record that the machine has rejected. They hit a "load" button, and the system will bring both records up side by side for comparison. The percentage is pretty low—under ten percent.
Are additional vendor streams planned, especially for AV?
We'll look for opportunities with any format-specific vendor—audio, video, e-resource, legal materials. If we could get a stream of new titles from one of the law vendors, that would certainly increase the local use of ITSO. We're looking for opportunities to expand.
We've heard you say that 40 percent of Cornell's firm orders are processed through ITSO. What's the ultimate goal?
Well, we'd like to get to 100 percent, but, realistically, we'll end up at more like 60 percent—at least here at Cornell. For one thing, we don't buy only new materials, and we're not trying to force selectors to use the system when it makes no sense. So if they're trying to buy something from two years ago, they shouldn't have to find the record in ITSO.
What might ITSO 2.0 look like?
At Cornell, we're focused on this refined filtering ability. Beyond Cornell, there's a whole array of other issues, because every library needs slightly different features. We are confident that some other libraries will find this model very attractive. So what we want to do is take what we have and put it out there basically as is. But we'll make enhancements where we have to.
What would you need to take ITSO beyond Cornell?
We need some capital. Right now, we are all doing this out of our back pockets. To do any major reworking of the software, where we could share it with other people, especially in a hosted system, we would need an influx of capital. I don't know how much that would be…. It will likely depend on how many other libraries are interested in being involved.
We're looking at what level of demand there might be in the market, looking for partners, subscribers, perhaps potential sources of capital. So we're still in the investigative process. This is an entrepreneurial effort within the library. In the same way that a small business would need to show that it has potential, we have to do that with our library administration. We need some people to stand with us and say, "We like this. We want to work with you."
Author Information
Rick Lugg and Ruth Fischer are partners in R2 Consulting, Contoocook, NH. Scott Wicks is Head, Acquisitions, Bibliographic Control, and Documents; Adam Chandler is Information Technology Librarian; and Peter Hoyt is Digital Library Programmer/Analyst, Cornell University Library, Ithaca, NY.
























