Processing Through Digitization: University Photographs at Loyola University New Orleans
Elizabeth Kelly, Digital Initiatives Librarian, Loyola University New Orleans
Archival Practice, volume 1, no. 1 (2014)
Abstract:
More Product, Less Process (MPLP) is a method for dealing with backlogs of unprocessed materials. In 2012, Loyola University New Orleans applied the techniques of MPLP to the digitization of its unprocessed University Photographs collection. This article includes a brief review of digitization and minimal processing literature, an analysis of the project thus far, and plans for the future.
Introduction
Loyola University New Orleans' University Archives are a largely unprocessed collection of institutional records. A portion of the collection is made up of photographs received from various persons and offices at the university over the course of the school's history. Without substantial description, organization, or personnel to fully process the collection, Special Collections & Archives (SC&A) staff looked for a method to quickly make this resource available to users. When an institution has a backlog of hidden collections, a technique for quickly processing materials can be found in the More Product, Less Process system devised by Mark A. Greene and Dennis Meissner in 2005.1 Since its inception, More Product, Less Process (MPLP) has become a possible method for dealing with an increasing number of archival responsibilities beyond just processing collections. In 2012, Loyola University New Orleans began to address its backlog of unprocessed materials using MPLP to digitize the University Photographs collection. The goal of this project was to make a potentially high-use portion of the University Archives accessible both on and off-site in a timely and cost-effective manner without substantially inhibiting the user's ability to research effectively. This article will focus on why this collection was selected for MPLP, how and why the selected workflow was developed, and how the project was assessed.
The University Archives
The University Archives at Loyola present a number of challenges and have caused the staff to look for ways to make significant materials accessible to researchers until the entire collection can be processed.
Part of the collection, the President's Files, was indexed in the 1970s by part-time staff, and some materials collected since then include brief accession inventories. But approximately 40% of the collection has little to no inventory or arrangement, and even the materials that have been inventoried contain personally identifiable information and records including social security numbers, students' grades and disciplinary matters, financial documents, faculty matters, and more.
The index and inventories help staff find material by subject matter, but the materials themselves do not meet the definition of a processed collection, or one that has undergone "arrangement, preservation, description, and screening activities."2 There is no university archivist at the present time to arrange the backlog of university records or to create records schedules and accessions policies with any offices on campus to ensure the timely and continual transfer of university records to the archives.
Despite a lack of processed collections, the University Archives are still accessible to researchers. SC&A is the first stop in the Loyola University New Orleans community for materials related to institutional knowledge and history. The Association of Research Libraries' (ARL) Special Collections Task Force Final Status Report charges Special Collections and Archives to "enhance access to collections and backlogs, [and] surface 'hidden collections.'"3 SC&A has done its best to accommodate researchers with what is readily accessible, using the index and inventories to find appropriate subject material and then reviewing it for potentially sensitive information that should be restricted. Staff also works on processing the most frequently used unprocessed portions of the University Archives, but it will be quite some time before the entire collection is fully described. In addition, access to the President's Files, a large section of the University Archives, requires permission from the current President's office. This combination of unprocessed materials, privacy concerns, and use restrictions has created a challenging situation. It has been imperative for SC&A to find ways to make institutional history more available to users until it is possible to completely process the University Archives in the near future.
Prior to the university's centennial celebration in 2012, SC&A outsourced the digitization of the school newspaper, yearbook, and bulletins (course catalogs). The materials in these digital collections have had very high item views compared to other digital collections and have also proven popular with researchers who visit SC&A, as well as those who contact staff by phone or email. These users are frequently interested in images, so the University Photographs seemed like an appropriate next step for digitization.
Unlike the University Publications and the President's Files, the University Photographs have largely been inaccessible to users. The University Photographs are an artificial collection comprising four series of images collected or received from various departments and persons at the university. The first series consists of:
- 15 linear ft. of document boxes of photos
- 50 three-inch vinyl binders of negatives
- 8 linear ft. of negatives in a metal filing cabinet
These materials were donated by the University Photographers, with the majority coming from the 1950s and 1960s. The photos are largely of campus life, students, and events. A second series consists of 16 linear ft. of document boxes from the office of Public Affairs dated from 1990-2005; many of these are personnel photos used in university directories. An additional 13 linear ft. of boxes are simply labeled "University Photos" and have undetermined provenance, and there are approximately 10 linear ft. of miscellaneous slides, negatives, and film. About a third of the photographs are sorted by genre but the rest are unorganized. Image types in the collection include acetate negatives, prints, slides, and film. Some of the photos have identifying information but many are missing descriptions.
In 2008, SC&A received a Preservation Assistance Grant for Smaller Institutions from the National Endowment for the Humanities. This funding allowed a preservation services librarian from SOLINET (now Lyrasis) to perform an onsite preservation needs assessment of SC&A. The final report specifically targeted the first general series from the University Photographers as needing to be rehoused, and in some instances reformatted, because of emulsion flaking and decay (shrinkage and vinegar syndrome) aggravated by improper housing. The report also recommended staff "start small" and outsource digitizing a worthy portion of the collection. This would have required combing through the collection and attempting to find the "best" or most useful images. Instead, SC&A staff chose to digitize the first series of photographs from the University Photographers in-house (50 negatives binders, 15 linear ft. of document boxes, and 8 linear ft. of negatives) without adding much, if any, additional description to the photos, so as to create a large database of searchable and browsable photos with the hope of then soliciting additional description. Ideally most of this work could be done by student workers. Based on staffing and resources, MPLP seemed like a possible tool for working with the collection.
MPLP and Digitization
MPLP is by now a well-known and widely accepted process for managing archival backlogs. At its simplest, MPLP uses techniques to quickly process collections and make them available to researchers, as the authors found that having accessible collections far outweighed the archivist's priority of cleaning, tidying, and imposing intellectual control.4 There are certainly lessons SC&A takes from MPLP towards processing paper-based materials, such as the University Archives, including minimal rehousing and series or collection-level description over item-level. However, as MPLP has grown in popularity, many institutions have also begun to apply it towards other areas of archival work, including but not limited to appraisal, preservation of both physical and electronic records, reference, and digitization.5
Following the release of the original article, Greene and Meissner faced criticism as some felt MPLP inhibited digitization by limiting file-level description and, as a result, information about individual items worth digitizing. In his follow-up article, "MPLP: It's Not Just for Processing Anymore," Greene debunks this and other claims regarding MPLP and digitization.6 If one of the tenets of MPLP is to reduce and/or eliminate item-level description, the same could be true of digitization.7 While some items are worthy of detailed, item-level metadata, others may benefit from an MPLP-inspired file-level, series-level or even collection-level description.8 This shifts the digitization focus from preservation to timely access. A 2007 report by Ricky Erway and Jennifer Schaffner promotes quantity over quality:
RLG [Research Libraries Group] and others have stressed capture and description at the highest quality-level possible. Funders have been drawn to support boutique curation of compelling collections. None of these approaches allow primary sources to have the significant exposure to users that they so richly merit.
Vast quantities of digitized primary materials will trump a few superbly crafted special collections. Minimal description will not restrict use as much as limiting access to those who can show up in person. We must stop our slavish devotion to detail; the perfect has become the enemy of the possible.9
This philosophy can extend from metadata to digitization standards as well. The University of Wisconsin, Oshkosh conducted a case study about the amount of time saved by using minimal processing for digitization as well as users' experiences working with minimal description. As an experiment UW Oshkosh chose to try both a budget, experimental digitization model, scanning from photocopies and eschewing item-level metadata, and a control model, creating full-color scans and item-level metadata using the Ada James Incoming Correspondence collection.10 The results were 0.86 minutes for digitization per page for the experimental model vs. 5 minutes per page for the control, and 0.6 minutes for creating metadata per page for the experimental model vs. 3.12 minutes per page for the control.11 The study also measured time spent performing administrative work (such as inventorying materials, creating metadata templates for students, quality control and interface work) with the control model vs. the experimental model. The "bottom line" results were that the experimental model averaged 1.8 minutes per page, or $0.43, while the control model averaged 8.68 minutes per page or $1.53.12
While traditional digitization is shown by this comparison to be extremely time-consuming and in some cases expensive (especially when involving extensive transcription), there is also the question of whether the experimental model would still result in sufficiently accessible collections for researchers. To determine user satisfaction, seven undergraduate history majors and seven library science graduate students were asked to complete six tasks using both versions of the collection and then report their experiences and preferences.13 The study found that, while users would prefer minimally described digital materials to no digital materials at all, bare-bones metadata made it extremely difficult for students who were used to Google-style searching to find materials.14 When the cost differences between the traditional digitization process and the "experimental model" (roughly four to five times more items digitized for the same amount of money) were explained, "ALL respondents stated that the Experimental Model WAS TO SOME DEGREE acceptable."15 In addition, the users had a great amount of difficulty with the tasks using the experimental model. Many gave up, and users generally assumed the entire collection was digitized when in fact the finding aid needed to be consulted to show that only a few boxes were digitized for this project.16
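The per-page figures reported for the UW Oshkosh study can be turned into a rough side-by-side comparison. The following is a minimal sketch rather than anything taken from the study itself: the per-page values are the ones quoted above, while the $1,000 budget and the helper name `pages_for_budget` are hypothetical and used only for illustration.

```python
# Per-page "bottom line" figures as quoted from the UW Oshkosh case study.
EXPERIMENTAL = {"minutes": 1.8, "cost": 0.43}
CONTROL = {"minutes": 8.68, "cost": 1.53}


def pages_for_budget(budget, cost_per_page):
    """Return the number of whole pages that fit within a budget."""
    return int(budget // cost_per_page)


# How many times more staff time and money the control model needs per page.
time_ratio = CONTROL["minutes"] / EXPERIMENTAL["minutes"]
cost_ratio = CONTROL["cost"] / EXPERIMENTAL["cost"]

budget = 1000.00  # hypothetical budget in dollars, not from the study

print(f"Time ratio (control / experimental): {time_ratio:.2f}")
print(f"Cost ratio (control / experimental): {cost_ratio:.2f}")
print(f"Pages for ${budget:,.0f}, experimental: "
      f"{pages_for_budget(budget, EXPERIMENTAL['cost'])}")
print(f"Pages for ${budget:,.0f}, control: "
      f"{pages_for_budget(budget, CONTROL['cost'])}")
```

On these figures the control model takes close to five times as long per page and costs about three and a half times as much, so the size of the advantage depends on whether staff time or dollars is the measure.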
The savings incurred by using the experimental model are obvious, but the detriment to the researchers' experience is clearly at issue with this type of approach. The UW Oshkosh project led their staff to determine that improved browsing and searching were necessary for this collection to be usable, and that perhaps a "variable" approach would work.17 The users who participated in assessing this project were informed of the time and cost differences in the experimental vs. the control model. However, archivists cannot reasonably expect to explain to all users that while the results for minimally processed digital collections may not be perfect, they're better than nothing. Archivists should therefore aim for a middle-ground between Google-like large-scale, ultra-mass digitization and what we've traditionally done, small-scale boutique-level digitization.18
In "MPLP: It's Not Just for Processing Anymore," Greene promotes creating folders or series-level metadata and linking to digitized material in online finding aids to best preserve the individual context of the material.19 The act of creating metadata from file, series, or collection level descriptions assumes, however, that those descriptions exist. In some cases, so little arrangement and description may have been imposed on a collection (as with the University Photographs) that describing at the collection level may be inconsequential. Is it beneficial to users to digitize items that may have some item-level description, like captions on a photo, but minimal series or collection level description?
Larisa K. Miller of the Hoover Institution at Stanford University recently proposed a method of digitizing unprocessed text-based resources which can then have Optical Character Recognition (OCR) applied for full-text transcripts of the materials, with the intention of making them full-text discoverable in a way that is more familiar to most users than searching finding aids for folder titles may be.20 While Miller's method still presumes that original order exists and should be preserved through this process, the crux of the method is that additional description may be bypassed if users can search and browse the materials themselves online. While image-based materials are unable to benefit from full-text searching, the idea of arranging, describing, and rehousing materials during the digitization process rather than before is an intriguing proposition.
Digitizing the University Photographs
SC&A decided to make its first attempt at minimal processing through large-scale digitization with part of the University Photographs collection. Scanning standards and workflow for this project were largely taken from theoretical standards and workflow proposed by Sarah Dorpinghaus in a presentation at the 2012 Society of American Archivists Conference, San Diego, CA. Dorpinghaus's presentation addressed her work on a digitization project involving the Rosenthall Judaica Collection at the College of Charleston. Portions of the then-unprocessed collection were scanned to make the valuable collection accessible to the public, respond to concerns by the donor about accessibility, and increase visibility for purposes of fundraising.21 The project followed the digitization standards of 400-800 ppi scans, creating TIFF preservation and JPG access copies, and robust metadata.22 Since the collection was not yet processed, the project attempted processing through digitization as a method of making the collection accessible until it was fully processed and a finding aid created.23
In her presentation, Dorpinghaus made recommendations for minimizing the time spent on digitization and metadata creation. These included digitizing at a lower dpi, creating only access copies of images rather than also creating preservation copies, and removing some metadata fields.24 For the University Photographs digitization project, SC&A staff decided to follow this example. Images are scanned as 300 ppi JPGs as opposed to the more time-consuming and space-consuming norm of 600+ ppi TIFFs.
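The storage side of this trade-off follows from simple arithmetic: doubling the resolution quadruples the number of pixels. The sketch below illustrates this under stated assumptions that are not from the project itself: an 8 x 10 inch print, 24-bit color (3 bytes per pixel), and uncompressed pixel data. Actual JPG files will be smaller, since their size depends on compression settings and image content.

```python
def scan_dimensions(width_in, height_in, ppi):
    """Pixel dimensions of a scan for a given physical size and resolution."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_in, height_in, ppi, bytes_per_pixel=3):
    """Size of the raw pixel data, assuming 24-bit color by default."""
    width_px, height_px = scan_dimensions(width_in, height_in, ppi)
    return width_px * height_px * bytes_per_pixel


# Assumed example: an 8 x 10 inch print (hypothetical, for illustration).
for ppi in (300, 600):
    width_px, height_px = scan_dimensions(8, 10, ppi)
    size_mb = uncompressed_bytes(8, 10, ppi) / 1_000_000
    print(f"{ppi} ppi: {width_px} x {height_px} px, "
          f"about {size_mb:.1f} MB uncompressed")
```

Under these assumptions a 600 ppi scan holds four times the pixel data of a 300 ppi scan before any compression is applied.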
Rather than attempting to impose an organization on the tens of thousands of images in the University Photographs Collection, the photos are rehoused in original order after scanning and assigned an identifying number. Because of the hands-off nature of using MPLP for digitization, all of the scanning and rehousing of the documents is done by a student worker. SC&A staff then completes Dublin Core metadata and uploads the images to the LOUISiana Digital Library, the statewide digital library, using CONTENTdm.
SC&A staff compiled a master list of broad Library of Congress Subject Headings to draw from in order to minimize the time necessary to create metadata. Initially, all photos were credited to "Loyola University (New Orleans, La.)." When the photographer was known, their name was included as a Contributor (see "Outreach and Assessment" for update). Where additional description is included with the photos, that information is also added to the metadata. Item-level description for the photos may appear more robust than minimal processing usually entails, but that is because there is no finding aid to consult for additional information.
Fig. 1
Descriptive metadata in CONTENTdm collection
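To make the shape of such a record concrete, the following is a minimal sketch of a Dublin Core style record built according to the initial crediting convention described above (institution as Creator, known photographer as Contributor), which the article notes was later revised. The function name `build_record`, the identifier format, the sample values, and the semicolon used to join multiple subjects are all assumptions made for illustration; they are not taken from SC&A's actual records or from CONTENTdm documentation.

```python
INSTITUTION = "Loyola University (New Orleans, La.)"


def build_record(identifier, title, subjects, photographer=None,
                 description=None, date=None):
    """Build a minimal Dublin Core style record as a plain dictionary.

    Optional fields are left out entirely when no value is known,
    rather than being stored as empty strings.
    """
    record = {
        "Identifier": identifier,
        "Title": title,
        "Creator": INSTITUTION,           # initial convention in the text
        "Subject": "; ".join(subjects),   # separator is an assumption
    }
    optional = {
        "Contributor": photographer,
        "Description": description,
        "Date": date,
    }
    for field, value in optional.items():
        if value:
            record[field] = value
    return record


# Hypothetical example values, for illustration only.
example = build_record(
    identifier="UP-0001",
    title="Basketball team",
    subjects=["Basketball", "College students"],
    photographer="Jane Doe",
)
for field, value in example.items():
    print(f"{field}: {value}")
```

Drawing the subjects from a short, fixed list keeps each record quick to complete, which is the point of the master list of broad headings.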
The collection has not been reorganized based on subject or year; instead users must search the digital collection using keywords and dates to find what they're interested in. Original photos are available upon request by using the identifying number assigned during the digitization process. The photographs are put in folders interleaved with acid-free paper and housed in archival boxes. Although this method combats the preservation issues identified in the 2008 Preservation Needs Assessment final report, it remains within SC&A's supplies budget because it does not require ordering different sized enclosures.
Outreach and Assessment
Because the images sometimes lacked adequate descriptions, SC&A attempted to crowd-source information for the photos to enrich item metadata and engage the Loyola user community. The first batch of University Photographs was uploaded to CONTENTdm in November 2012. We attempted to use CONTENTdm's "user generated content" fields to allow users to add comments and tags that display right below the images' original metadata. Depending on the volume of comments and tags, SC&A staff would then try to add appropriate description to item metadata. SC&A staff immediately began working with both the library's and the university's outreach staffs to publicize the project through online and print publications. Unfortunately, while SC&A received positive feedback regarding the digitization and the idea of crowd-sourcing of the photographs, we were not successful in gathering user comments. To date, only three users have provided comments despite the fact that almost 3000 images are now available in the collection and promotion by both SC&A and university staff has been aggressive.
Following the thus-far unsuccessful attempt at crowdsourcing, SC&A staff conducted usability and user satisfaction testing with three University staff members and four students to find out whether the description for the photographs was sufficient for users to find what they wanted in the collection, as well as whether the collection itself was findable. While this may seem like too few subjects to get a clear idea of the usability of the collection, usability literature shows that between 5 and 12 subjects is not only sufficient to identify most issues but is also the most economical use of time and resources.25 Records of previous in-person and phone/e-mail users of SC&A materials from 2006-2013 were also analyzed to determine what types of researchers access the University Archives the most. Students made up 42% of users, followed by outside researchers unaffiliated with Loyola (31%), University staff members (16%), Loyola faculty members (6%), and alumni (5%). Initially, a sample of unaffiliated researchers who had previously used SC&A was contacted to see if they would participate in remote usability testing, but none responded. Instead, a convenience sample of staff and students was selected to participate in the testing. The SC&A staff member conducting the usability test applied for and received Institutional Review Board approval per University policy before testing began. Testing was conducted on a computer equipped with Camtasia screen-capture software and a microphone, and users were encouraged to "think aloud" as they worked through the test. The SC&A staff member present also took notes during the test. Users answered background questions regarding their previous experiences with digital collections online, completed nine tasks including finding the University Photographs Collection online and searching for photos within the collection, and filled out a follow-up questionnaire (see Appendices A-C).
The questions and tasks were developed based on usability studies done by Judy Jeng at Rutgers26 and Maggie Dickson at University of North Carolina at Chapel Hill.27
Overall the users successfully completed the tasks. Ninety-four percent found the collection to be easy to search and potentially useful. Additional comments and observations during the testing led to several changes to the collection description fields, including changing the photographer's name (if known) to "Creator" and "Loyola University (New Orleans, La.)" to "Contributor," as well as changes to the library's website to make the collection easier to find. Some other issues the users had with the description were specific to CONTENTdm, such as no "Date" field displaying if no date is input.
The study also confirmed that while users were able to find specific photos by searching subjects and dates, the majority wanted the photos to have more description. Eighty-six percent wanted to know the names of people in the photos, and 71% wanted to know what year the photos were taken (if not already available). As such, we continue to believe enhanced description of the photos is beneficial to our users, and we are currently researching possibilities for crowdsourcing through other platforms like Flickr, as well as targeting specific user groups (such as alumni) for help. That only 5% of our users are alumni shows that we are not reaching an obvious and potentially valuable user group for these materials.
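With only seven participants, each percentage corresponds to a small whole number of people. As a quick check, and assuming the figures above are shares of the seven participants (the raw counts are not given, so this is an inference), the sketch below lists the percentage produced by each possible count. The helper name `share` is hypothetical.

```python
PARTICIPANTS = 7  # three staff members and four students


def share(count, total=PARTICIPANTS):
    """Percentage of participants, rounded to the nearest whole number."""
    return round(100 * count / total)


# Show the percentage that each possible number of participants produces.
for count in range(PARTICIPANTS + 1):
    print(f"{count} of {PARTICIPANTS}: {share(count)}%")
```

On that assumption, 86% corresponds to six of the seven participants and 71% to five of the seven.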
Conclusion
To date, about 16% of the first series from the University Photographs collection has been digitized and rehoused, or about 10% of the total image collection. SC&A has had significant turnover of student workers since the project began, and three different work-study students have been digitizing the collection since November 2012. Luckily the simple training and speedy workflow processes for this project have made continued progress possible, much more so than traditional processing. Still, as we have moved forward with the project we are finding many duplicates and redundant material in the series, particularly in the negatives binders, meaning weeding and/or random sampling before digitizing would be more useful than we originally anticipated. We plan on continuing to digitize without prior processing the current series of photographs in document boxes and possibly the negatives in the filing cabinet; however, this approach is not practical for the remainder of the negatives binders or the other three series in the collection (Public Affairs, miscellaneous "University Photos," and miscellaneous slides, negatives and film), as they require some level of processing to determine suitability for selection to the permanent collection and digitization. In addition, some of the more damaged images targeted in the Preservation Needs Assessment final report should be scanned at a higher standard to create preservation and access copies of the digitized images. Applying varying degrees of processing to different parts of a collection is entirely in keeping with MPLP, as the original article states, "Not all series and all files in a collection need to be arranged at the same level of intensity."28
While progress was made, MPLP is obviously not a wholesale solution for all collection backlogs and must be evaluated on a case-by-case basis, as noted by Meissner and Greene and reinforced by this project. Processing by digitization could still work for the remaining portions of the photographs collection if they were at least weeded for duplicates. Most likely additional processing will also be done in order to weed images that do not fit with the mission and scope of the University Archives.
Still, since searching the unprocessed University Archives is so time-consuming and often difficult for SC&A staff, the ability to point researchers (whether they contact us or find our resources externally) towards digital collections has had a significant impact on staff time and the satisfaction of users. Additionally, having users conduct pre-research using the digital collections provides for more meaningful reference transactions if additional use of University Archives is needed. Using MPLP to digitize the University Photographs collection enabled SC&A to begin making an unprocessed collection not only accessible but freely available worldwide in a timely manner and without compromising the integrity of the collection. While there are challenges to digitizing minimally processed materials, the benefits of making these materials available can outweigh the negatives. The suite of techniques provided by MPLP will continue to be used by SC&A for future projects involving digitization and more.
Notes
1. Mark A. Greene and Dennis Meissner, "More Product, Less Process: Revamping Traditional Archival Processing," American Archivist 68, no. 2 (2005): 234-35.
2. Megan Floyd Desnoyers, "When Is a Collection Processed?" The Midwestern Archivist 7, no. 1 (1982): 23.
3. "Final Status Report" (ARL Special Collections Task Force, 2006), accessed June 17, 2013, http://www.arl.org/storage/documents/publications/special-collections-task-force-final-status-report-july2006.pdf.
4. Greene and Meissner, "More Product," 234-35.
5. Mark A. Greene, "MPLP: It's Not Just for Processing Anymore," American Archivist 73, no. 1 (2010): 175-203.
6. Greene, "MPLP," 193.
7. Greene, "MPLP," 193.
8. Greene, "MPLP," 194.
9. Ricky Erway and Jennifer Schaffner, Shifting Gears: Gearing Up to Get Into the Flow (OCLC 2007), 6, accessed December 17, 2013. http://www.oclc.org/content/dam/research/publications/library/2007/2007-02.pdf?urlm=162902.
10. Joshua Ranger, "More Bytes, Less Bite: Cutting Corners in Digitization," 5-6, (presentation, Society of American Archivists Conference, San Francisco, CA, August 24-31, 2008).
11. Ranger, "More Bytes," 7-8.
12. Ranger, "More Bytes," 12.
13. Ranger, "More Bytes," 14.
14. Ranger, "More Bytes," 19.
15. Ranger, "More Bytes," 21.
16. Ranger, "More Bytes," 15-16.
17. Ranger, "More Bytes," 24.
18. Ranger, "More Bytes," 8, 194.
19. Greene, "MPLP," 194.
20. Larisa K. Miller, "All Text Considered: A Perspective on Mass Digitizing and Archival Processing," American Archivist 76, no. 2 (2013): 536-537.
21. Sarah Glover, e-mail message to author, December 4, 2013.
22. Sarah Dorpinghaus, "The Rosenthall Experiment: Shooting from the Hip Digitization" (paper presented at the fall 2012 Society of American Archivists Conference, San Diego, CA.), 17, accessed June 17, 2013. http://files.archivists.org/conference/sandiego2012/103-Dorpinghaus.pdf.
23. Glover, e-mail.
24. Dorpinghaus, "Rosenthall Experiment," 19.
25. Judy Jeng, "Usability Assessment of Academic Digital Libraries: Effectiveness, Efficiency, Satisfaction, and Learnability," Libri: International Journal of Libraries and Information Services 55, no. 2/3 (2005): 110.
26. Jeng, "Usability," 116-121.
27. Maggie Dickson, "CONTENTdm Digital Collection Management Software and End-User Efficacy," Journal of Web Librarianship 2, no. 2 (2008): 376-378.
28. Greene and Meissner, "More Product," 22.
Appendix A. Usability Background
Background Questions:
1. I am a:
Staff
Student
Outside Researcher
2. Please indicate your gender:
Male
Female
Prefer not to answer
3. Please indicate your level of familiarity with the Internet:
Not familiar
Somewhat familiar
Very familiar
4. Have you looked at digital collections or digital images on the Internet before?
No
Yes
Don't know
5. Have you visited Loyola's Digital Collections before?
No
Yes
Don't know
Appendix B. Usability Tasks
1. The Loyola New Orleans Archives have a collection of university photographs online. See if you can find it using Google.com. Please rank the ease of finding the answer to this question from 1 to 5, 1 being easiest and 5 being most difficult.
Easy 1 2 3 4 5 Hard
2. Now see if you can find the university photographs from the Monroe Library homepage (library.loyno.edu). Please rank the ease of finding the answer to this question from 1 to 5, 1 being easiest and 5 being most difficult.
Easy 1 2 3 4 5 Hard
3. How many photographs are in the collection? Please rank the ease of finding the answer to this question from 1 to 5, 1 being easiest and 5 being most difficult.
Easy 1 2 3 4 5 Hard
4. Find a photo of someone from the Loyola basketball team. Please rank the ease of finding the answer to this question from 1 to 5, 1 being easiest and 5 being most difficult.
Easy 1 2 3 4 5 Hard
5. What year is this photo from? Please rank the ease of finding the answer to this question from 1 to 5, 1 being easiest and 5 being most difficult.
Easy 1 2 3 4 5 Hard
6. Do you require more information to understand who or what this photo is of?
7. Now look for a photo from the 1930s. Please rank the ease of finding the answer to this question from 1 to 5, 1 being easiest and 5 being most difficult.
Easy 1 2 3 4 5 Hard
8. Who or what is pictured in this photo? Use both your own observations and any description available.
9. Do you require more information to understand who or what this photo is of?
Appendix C. Usability Follow-up
1. How would you rate your overall experience using the Loyola University Photographs Collection?
Excellent 1 2 3 4 5 Poor
Comments:
2. How useful does this collection seem to you?
Very useful 1 2 3 4 5 Not at all useful
Comments:
3. Would you use this collection again?
Definitely 1 2 3 4 5 Definitely NOT
Comments:
4. Which tasks did you find the most difficult to complete?
5. Is there anything you would like to do using this collection but can't?
6. What other materials would you like to see in the Loyola Digital Collections?