Wednesday, November 9, 2011

Deselection and 'Remedial' Discovery

Last Wednesday in Charleston, my colleagues Bob Kieft (Occidental College) and Sam Demas ('freelance librarian') and I facilitated a day-long preconference on 'Shared Print Archiving: Building the Collective Collection.' We benefited from a full roster of speakers experienced in management of shared print collections: Lizanne Payne (CRL, WEST, and CIC); Emily Stambaugh (California Digital Library); John MacDonald (Claremont Colleges); Kathryn Harnish (OCLC); Michael Levine-Clark (Univ of Denver); Judith Russell (Univ of Florida, ASERL); Rachel Frick (CLIR/DLF); and Doug Way (Grand Valley State University).

The hordes gather to discuss shared print archiving

 Because we employed lightning talks and approached the topic broadly, discussions were more exploratory than definitive. As befits a discipline in the throes of formation, a certain element of confusion and chaos attended the day. But my overall sense is that a group of smart, experienced people worked hard to share information about disparate efforts, and to integrate them into a regional and even national conversation. More work, more focus, and still broader participation are needed, but we built on good work already underway.

One strand of conversation very much surprised me. The discussion around deselection and drawdown of duplicative print collections repeatedly turned toward discovery and digital content. This seems ironic in some respects, as shared print initiatives tend to focus first on titles that have never circulated. Why would we be concerned about the discoverability of content that has remained untouched for 10-20 years? On the surface, deselection and discovery seem like mutually exclusive categories. Is there really a need to enhance discovery of something that is being withdrawn or moved to storage for lack of use?

 Well, perhaps there is. As a group, we surfaced several arguments in favor of what I'll call 'remedial discovery.' 
  • Self-fulfilling prophecy: One reason that titles don't circulate may be that they are not found. Users are not always skilled searchers, and even the best cataloging records have a limited number of access points. Cataloging errors may also play a role here (e.g., a misspelling in a primary access field).
  •  Shared print collections limit physical browsing: Paradoxically, the decision to rely on copies that are not held locally in open stacks increases the desire for some form of virtual browsing or enhanced discovery. A user may want to know more about a book before requesting it from another library or from a remote storage facility. The further away the books are, the more desirable virtual browsing appears.
  • Record enhancements have not been universally applied: Many OPACs have taken a page from Amazon to include cover scans, flap copy, and tables of contents. But these enhancements have not been adopted for all libraries, and may not even be available for older titles--those most likely to surface as withdrawal or storage candidates. In some cases, older titles may not have had the same level of exposure as newer titles.
  • Discovery layers are just coming into their own: Discovery tools have improved dramatically in the past few years. The more content that is indexed in those tools, the better the chances a user will find resources that may have been overlooked in the past. Here again, older materials have not benefited from these newer techniques. Perhaps they need another chance, with better tools.
Search: 'you are a faithless mad son of clocks and buzzers'
  • Full-text indexing could stimulate use of older titles: Already the Google Books, Internet Archive, and Hathi Trust interfaces have radically improved the chances of finding older titles, perhaps including some that have previously been little used.

It's an interesting take, and perhaps worth some experimentation. We've come at this topic from other angles previously, in posts on 'patron-driven re-acquisition' and 'curating a discovery environment.' All of this needs to be thought through more carefully, but maybe we ought to consider two simultaneous courses of action once unused titles have been identified.
  1. Continue to draw down highly-redundant print collections in the context of shared print archiving and secure digital collections.
  2. Enhance the remaining records for optimum discoverability. Give them a second chance to benefit from newer discovery tools.
The second of these is somewhat counterintuitive, since it involves additional investment in a resource that has already cost far more than it has yielded. Some titles may not benefit from the additional work. But it may be worth testing on a small scale. Not only would it level the playing field for older titles, it would provide additional convenience to users examining content remotely. Specific enhancements might include:
  • Add the Hathi Trust public domain URLs (where available) to catalog records for low-use titles
  • Add Tables of Contents to catalog records for all eligible withdrawal candidates
  • Add cover scans, flap copy, and links to reviews for these older, low-use titles
  • Add eBook PDA records for withdrawal or storage candidates
  • Devise a virtual browse function, similar to the Hathi page turner or Amazon 'Look Inside the Book'
No doubt there are other ideas. Some of them will require a good deal of work and investment. There are definitely some trade-offs here, and perhaps the approach must be selective to be affordable. But it's intriguing to think about creating better forms of discovery and access for material that is going offsite or will be held by another library. Lack of browsability is one of faculty's main objections to removing print from central campus stacks. Connecting deselection and enhanced discovery may be one way to answer that.

Tuesday, November 1, 2011

The 'Disapproval Plan' Revisited



In the December 2008 issue of Against the Grain, I introduced a new concept: "The Disapproval Plan: Rules-Based Weeding and Storage Decisions" [pdf]. The article's title was only somewhat tongue-in-cheek. As I tried to demonstrate, selection and deselection represent the same function, performed at different points in a book's lifecycle. At both points, titles are accepted or rejected. Approval plan profiles assure consistent and customizable treatment of newly published titles. Disapproval plan profiles assure consistent and customizable treatment of older titles that have not circulated much. The goal, for most libraries, is to create--and maintain--an active collection, relevant to the current and future needs of its users.

Approval plans support content acquisition decisions; disapproval plans support storage, weeding, and shared print decisions. Both approval plans and disapproval plans safeguard collection integrity while providing an efficient, reliable alternative to title-by-title scrutiny. Both approval plans and disapproval plans have limitations; both are most appropriately deployed to handle mainstream titles. Subject experts, whether librarians or faculty, remain essential for specialized materials and judgment calls, but rules-based or profile-driven approaches can relieve them of the need to make many obvious and repetitious decisions.

With the benefit of three more years of thinking about collection use, deselection, and shared management of print monograph collections, it's become clear that my initial sketch of the disapproval plan concept can be drastically simplified and refined. The original concept focused on commercial alternatives to locally owned print, such as Google Books, eBook aggregators, print-on-demand, and the used book market. In retrospect, this approach over-emphasized 're-obtainability' and understated the fundamental importance of archival commitments and of operating in the context of the 'collective collection.'

Since then, the emergence of the Hathi Trust digital archive (now containing 5.1 million full-text book titles) and of shared print archiving initiatives such as those developed by WEST, ASERL, ReCAP, CRL, and the CIC has changed the picture, creating new safeguards and expanding deselection options. The work of Constance Malpas, Lizanne Payne, Paul Courant, OhioLINK/OCLC, and others has provided new data and insight on print monograph overlap, storage capacity, and costs.

One key fact has not changed, though. The need remains for an automated tool that assembles relevant deselection metadata, and develops rules to operate against that metadata: a deselection decision-support tool. Now is the time to adapt the disapproval plan to new realities, and to incorporate both archival values and service values into the model. The November 2011 release looks like this:

A 'disapproval plan' is a set of library-defined rules that must accomplish four tasks:

Define the deselection universe. What pool of titles is eligible for deselection consideration? Data elements and distinctions might include:
  • Low-use or no-use titles: These can be identified from circulation, direct borrowing, in-house uses (if captured), and ILL data.
  • Titles owned more than x years: Titles should be given a chance to circulate. Most libraries won't consider withdrawing a title owned for less than 5-10 years. Publication date provides a rough approximation, but leaves older imprints that were recently purchased in the pool. Acquisition date or accession date is much more reliable.
  • Titles widely held elsewhere (see below).
  • Titles that will be kept regardless of use: Works by faculty authors, notable alumni, or Nobel Prize winners, and works cited in authoritative bibliographies, may need to be exempted.
  • Specific editions or translations: A conservative approach would suggest that matching be conducted with FRBR work families turned off. This retrieves only those holdings that reflect the edition in hand.
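
To make the first task concrete, here is a minimal sketch of how a library might express the 'deselection universe' as a rule. It is an illustration only, not how any particular tool works: the `Title` fields, the function name, and the default thresholds (10 years owned, zero uses) are all assumptions a library would replace with its own definitions. Holdings elsewhere are deliberately left out here, since they belong to the security question.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Title:
    # Hypothetical record shape; real ILS extracts will differ.
    title: str
    acquired: date           # acquisition or accession date, not publication date
    total_uses: int          # circulation + in-house + ILL, however the library counts them
    protected: bool = False  # e.g. faculty author, cited in an authoritative bibliography


def in_deselection_universe(t: Title, today: date,
                            min_years_owned: float = 10,
                            max_uses: int = 0) -> bool:
    """Return True if a title is eligible for deselection *consideration*."""
    years_owned = (today - t.acquired).days / 365.25
    return (
        not t.protected                      # kept regardless of use
        and years_owned >= min_years_owned   # has had a chance to circulate
        and t.total_uses <= max_uses         # low-use or no-use
    )


today = date(2011, 11, 1)
old_unused = Title("Old and unused", date(1995, 1, 1), total_uses=0)
recent = Title("Recently acquired", date(2008, 1, 1), total_uses=0)
faculty = Title("Faculty author", date(1990, 1, 1), total_uses=0, protected=True)

candidates = [t.title for t in (old_unused, recent, faculty)
              if in_deselection_universe(t, today)]
```

Eligibility at this stage says nothing about what should happen to a title; it only defines the pool that the remaining rules operate against.
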
Assure that withdrawal candidates remain secure.  Once the eligible low-use universe has been defined, the archival security of this content must be gauged. An individual library operating within the academic community must help assure that nothing is lost. While it may not be necessary that a title be held locally, we must satisfy ourselves that it has been secured somewhere. Deselection metadata and rules in support of collection integrity might include:
  • Presence of a print copy safely archived in a trusted repository
  • Presence in Hathi Trust digital archive 
  • Presence in Hathi Trust print archive [under development]
  • Explicit retention commitment for 4-6 copies nationally [MARC 583]
  • Number and distribution of other print holdings nationally, globally, or regionally (shared print archiving)
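
As a rough sketch of how these security rules might be combined, the example below treats a title as secure only when both a print safeguard and a digital safeguard are in place. The field names, the function, and the thresholds (4 retention commitments, 50 other print holdings) are assumptions made for illustration; each library or consortium would set its own, and might well weigh the factors differently.

```python
from dataclasses import dataclass


@dataclass
class ArchivalStatus:
    # Hypothetical fields, assembled from external sources in practice.
    in_trusted_print_repository: bool
    in_hathitrust_digital: bool
    retention_commitments: int   # copies with an explicit retention commitment (MARC 583)
    other_print_holdings: int    # print holdings reported by other libraries


def is_secure(s: ArchivalStatus,
              min_commitments: int = 4,
              min_holdings: int = 50) -> bool:
    """Return True if the content appears safely archived elsewhere."""
    # Any one of these is treated as an adequate print safeguard (an assumption).
    print_secured = (
        s.in_trusted_print_repository
        or s.retention_commitments >= min_commitments
        or s.other_print_holdings >= min_holdings
    )
    # Require a digital copy as well as a print safeguard.
    return print_secured and s.in_hathitrust_digital


well_protected = ArchivalStatus(True, True, retention_commitments=0, other_print_holdings=12)
print_only = ArchivalStatus(True, False, retention_commitments=6, other_print_holdings=200)
scarce = ArchivalStatus(False, True, retention_commitments=1, other_print_holdings=3)
```

Here `well_protected` passes, while `print_only` fails for lack of a digital copy and `scarce` fails for lack of any print safeguard.
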
Assure that withdrawal candidates remain accessible.  Once archiving has been assured in both print and digital form, accessibility comes to the fore. In general, archival copies should not leave the facility where they are secured. Instead, 'service copies' are needed. Deselection profiles need to incorporate this factor, which identifies where usable copies of the content exist and in what form. Here the salient data points are:
  • Presence of a service copy in a regional service center
  • Availability of alternate editions; i.e., same content, different vehicle
  • Availability of commercial eBook versions/PDA records
  • Availability of a print on demand edition
  • Re-purchasable on the used book market 
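
The accessibility data points can be sketched the same way. The example below simply lists which service channels are available for a title, in the order given above. The channel names and the 'at least one option' default are assumptions for illustration, not a prescribed policy.

```python
# Hypothetical channel names, in the order the data points are listed above.
ACCESS_CHANNELS = [
    "regional_service_copy",
    "alternate_edition",
    "commercial_ebook",
    "print_on_demand",
    "used_market",
]


def access_options(availability: dict) -> list:
    """Return the available channels for a title, in preference order."""
    return [c for c in ACCESS_CHANNELS if availability.get(c, False)]


def is_accessible(availability: dict, min_options: int = 1) -> bool:
    """Return True if enough usable copies of the content exist elsewhere."""
    return len(access_options(availability)) >= min_options


example = {"commercial_ebook": True, "used_market": True, "print_on_demand": False}
options = access_options(example)
```

A library that is wary of relying on the used book market alone could raise `min_options`, or drop that channel from the list.
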

Enable data-driven decisions to store, withdraw, or retain/curate. As approval profiles generate books, notifications, or exclusions when the profile is applied, the disapproval plan can support storage, withdrawal or retention decisions, depending on local needs. Low-use titles that are scarcely held elsewhere might become candidates for preservation or digitization, enabling any library to contribute to the collective collection.
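
One way to picture the fourth task is as a small decision function that takes the answers to the first three. The mapping below is only one plausible profile, assumed for illustration: the outcome names, the `scarce_below` threshold, and the choice to store (rather than withdraw) a title that is secure but has no service copy are all local policy decisions, not fixed rules.

```python
from enum import Enum


class Decision(Enum):
    RETAIN = "retain"
    STORE = "store"
    WITHDRAW = "withdraw"
    PRESERVE = "preserve or digitize"


def decide(eligible: bool, secure: bool, accessible: bool,
           other_holdings: int, scarce_below: int = 5) -> Decision:
    """Apply a hypothetical disapproval profile to one title."""
    if not eligible:
        return Decision.RETAIN        # outside the deselection universe
    if other_holdings < scarce_below:
        return Decision.PRESERVE      # scarcely held: a candidate to contribute
    if secure and accessible:
        return Decision.WITHDRAW      # archived elsewhere and still obtainable
    if secure:
        return Decision.STORE         # archived, but no service copy available
    return Decision.RETAIN            # not yet secured anywhere


result = decide(eligible=True, secure=True, accessible=True, other_holdings=120)
```

Changing the thresholds or the order of the checks is the equivalent of adjusting an approval profile: the same data yields a different list.
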

    As with selection, deselection work can be done by the library without outside assistance. Reports generated from the ILS or open-source collection evaluation tools such as the GIST Gifts & Deselection Manager can be employed. But we suggest there is also room for a vendor-assisted model for deselection and disapproval plans, which is why we founded Sustainable Collection Services (SCS). SCS offers a full-service model for deselection, much as approval vendors do for selection. SCS advises on data extracts; normalizes and validates library-supplied bibliographic, circulation, item, and holdings data; and compares low-use titles to WorldCat, Hathi Trust, and other external sources.

    We now offer in a mediated form the ability to interact with preliminary results and model different combinations of factors: use, time in collection, consortial partner print holdings, existence of Hathi Trust digital version, number of WorldCat holdings. FRBR on, FRBR off. Our tools support full-library analysis or focus on specific subjects or locations. Early in 2012, SCS plans to release a Web version of our service that will enable unmediated interaction with deselection metadata. The disapproval plan may yet live and thrive!

    A final note: 'disapproval' has negative connotations, but selectivity always implies a mix of acceptance and rejection. Approval profiles always drive as much rejection as acceptance. For English-language books, a large approval plan might supply 20,000 new titles out of 60,000 candidates. The remaining 40,000 new books have in effect been 'disapproved.'

    Lack of use over time constitutes disapproval of (or at least indifference toward) that title by patrons. They have chosen other books (or more likely, other [electronic] resources) to use instead. Disapproval in the form of withdrawal or removal to storage by the library simply reflects the preferences of users, and the security and accessibility of the content elsewhere.

    Can we 'neutralize' this term by basing disapproval strictly on data? Probably not. Too bad, because it's a very useful term. These are not bad books. Deselection does not connote disapproval of their content. They are simply not relevant or no longer relevant to a specific user community. Selectors face the same decision when first choosing which titles will enter the collection--and with far less data at their disposal. At deselection, there is a track record.