A Call to Crowdsourcing

Recently, the National Archives uploaded a new collection of records to its Citizen Archivist project. The relatively small collection features photographs of life on Native American reservations in the early to mid twentieth century, all in need of descriptive tagging. The photos have been digitized and given titles, which often include short descriptions drawn from available information, but the records are still waiting for the descriptive tags that would make them easier to search. Crowdsourced projects like this one rely on volunteers to analyze images and documents that have already been sorted into broader subheadings. Volunteers transcribe document text for easier readability and create descriptive metadata by tagging images, making them more accessible to researchers. For example, one photo I worked on was titled “Mrs. Dorothy Yellow Cloud Does Barbering For Her Family.” After spending a few minutes looking at the photograph, I imagined the words I would type if I were attempting to find this photo. I chose: children, domestic life, exterior, haircut, mother, wife, bench, cabin, and home.
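To make the idea concrete, here is a minimal sketch of what volunteer tagging accomplishes: tags become searchable metadata attached to a record. The record structure and functions below are hypothetical illustrations, not the Citizen Archivist project's actual schema.

```python
# Hypothetical sketch: a record carries its title plus volunteer-supplied
# descriptive tags, and a keyword search matches against those tags.

record = {
    "title": "Mrs. Dorothy Yellow Cloud Does Barbering For Her Family",
    "tags": set(),
}

def add_tags(rec, new_tags):
    """Attach volunteer-supplied descriptive tags, normalized to lowercase."""
    rec["tags"].update(t.strip().lower() for t in new_tags)

def matches(rec, query):
    """A record is findable once a query term appears among its tags."""
    return query.strip().lower() in rec["tags"]

add_tags(record, ["children", "domestic life", "exterior", "haircut",
                  "mother", "wife", "bench", "cabin", "home"])

print(matches(record, "haircut"))   # a researcher searching "haircut" now finds the photo
print(matches(record, "airplane"))  # terms no volunteer applied still miss
```

Before tagging, only the title text could surface this photo; after tagging, any of the nine descriptive terms will.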

The idea of transcription and tagging via crowdsourcing in libraries and archives is nothing new. Projects like Citizen Archivist started around 2011, and most articles on the subject date from the prehistoric time of 2008. Many other institutions run their own crowdsourced transcription projects. The New York Public Library started “What’s on the Menu?” to help transcribe and geotag its collection of historical restaurant menus from across the years and the city. Its Community Oral History Project also uses crowdsourcing to edit and check the computer-generated transcripts of its interviews, which often fumble homonyms or fail to note pauses. The Smithsonian opened the Transcription Center in July 2013 with un-annotated and un-transcribed documents from just eight Smithsonian museums, libraries, and archives. Now 18 different groups and institutions rely on the Transcription Center and its active volunteer community to help tag and transcribe thousands of documents ranging from botanical field notebooks to refugee letter ledgers from 1868. One of my favorite projects was the McMaster Postcard Project, which asked users to help decipher handwritten captions and messages from hundreds of postcards in the collection. The project was completed with over 18,000 descriptions received and now awaits final processing and its debut on the McMaster University Digital Archives page. The article “Cultural Institutions Embrace Crowdsourcing” by Mike Ashenfelder also provides a great list of current and past crowdsourcing efforts by institutions across the globe and across disciplines.

While “downsides” exist, the most common complaints against crowdsourced projects are the same as those against starting any large new project: expense, time, and staff availability. Crowdsourced projects may take some time, money, and effort to set up, but once they get going they are capable of achieving goals the institution might never have had the time, staff, or finances to see through in the first place. Digitization efforts are daunting, expensive, and time consuming. Reluctant institutions and researchers also cite concerns about accuracy: if a document or photograph is incorrectly described by an overzealous volunteer, will it remain lost to researchers forever? To combat this very real concern, most projects build in a system of quality control. A record must pass through three levels of assessment: first it is tagged or transcribed by one group of volunteers, then it is reviewed by a second set of users, and finally it is approved by an institutional authority (either a staff member or a trained volunteer). “Practical usability over scholarly perfection” (Zastrow) is the end goal of crowdsourced transcription projects. If a document can be found through appropriate search tags and read by the user, the mission is accomplished.
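The three-level review process described above is essentially a small pipeline, and it can be sketched as one. The stage names and `Record` class here are illustrative assumptions, not any particular project's implementation.

```python
# Sketch of the three-stage quality-control pipeline: a record is
# (1) tagged or transcribed by volunteers, (2) checked by a second set
# of users, then (3) approved by an institutional authority.
# (Illustrative states only, not a specific project's actual workflow.)

STAGES = ["submitted", "peer_reviewed", "approved"]

class Record:
    def __init__(self, title):
        self.title = title
        self.stage = None  # not yet touched by any volunteer

    def advance(self):
        """Move the record to the next review stage; stop at approval."""
        if self.stage is None:
            self.stage = STAGES[0]
        elif self.stage != STAGES[-1]:
            self.stage = STAGES[STAGES.index(self.stage) + 1]
        return self.stage

doc = Record("Botanical field notebook, p. 12")
doc.advance()  # volunteer transcription submitted
doc.advance()  # second group of users reviews it
doc.advance()  # staff member or trained volunteer signs off
print(doc.stage)  # approved
```

The design choice worth noting is that no single contribution goes live unreviewed: each record must clear every stage, which is how projects trade a little speed for the “practical usability” standard.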

The largest “upside” is, of course, engagement with the records. Crowdsourcing efforts bring people together, creating new kinds of virtual communities and opening a flow of previously untapped knowledge. “Encouraging a sense of public ownership and responsibility towards cultural heritage collections, through user’s contributions and collaborations” may be listed last among Rose Holley’s reasons why libraries should participate in crowdsourcing projects, but it is the most significant one. Archives have long been seen as places for white gloves and research credentials. With the advent of digitization and web-accessible records, those stereotypes are slowly fading. You can flip through historic photographs from the comfort of your couch without ever having to put on pants, let alone white gloves. Volunteer tagging makes couch-bound research possible, allowing anyone to interact with a collection.

In her paper, Holley notes the important distinction between social engagement and crowdsourcing. Social engagement happens sporadically, whereas crowdsourcing “relies on sustained input from a group of people working towards a common goal.” The couch-bound researcher flipping through Coney Island souvenir postcards of the 1900s is interacting at the social engagement level, whereas those who transcribe the slanted script sentiments into plain text and attach tags to each postcard, such as “Coney Island,” “1934,” “Full Color,” and “Beach Scene,” are actively participating in crowdsourcing efforts.

Crowdsourcing volunteer projects have many different applications and have proven to be a powerful tool within libraries and archives. Since these projects started in the early 2000s, they have expanded, evolved, and inspired other institutions, both large and small, to tap into the potential of the public. But like many things from the “aughts,” without constant re-branding and media attention projects like these see activity peter out. There is never a shortage of records in need of transcription, but recruiting new volunteers takes some effort. So here I am, doing my part! Signing up takes less than a minute, and users will value your work for years to come. Flex your metadata description skills and test your taxonomy from the comfort of your couch during all that free time you have in between studying, pulling your hair out, and writing your own blog posts!


Works Referenced

Ashenfelder, M. (2015, September 16). Cultural Institutions Embrace Crowdsourcing. Retrieved September 28, 2017, from https://blogs.loc.gov/thesignal/2015/09/cultural-institutions-embrace-crowdsourcing/

Holley, R. (2010). Crowdsourcing: How and Why Should Libraries Do It? D-Lib Magazine, 16(3/4). Retrieved October 1, 2017, from http://dlib.org/dlib/march10/holley/03holley.html

Wilson, A. (n.d.). Citizen Archivist Dashboard: Improving Access to Historical Records Through Crowdsourcing. Retrieved September 26, 2017, from https://crowdsourcing-toolkit.sites.usa.gov/citizen-archivist/

Zastrow, J. (2014, October). Crowdsourcing Cultural Heritage: ‘Citizen Archivists’ for the Future. Retrieved October 4, 2017, from http://www.infotoday.com/cilmag/oct14/Zastrow–Crowdsourcing-Cultural-Heritage.shtml


By Emma Karin Eriksson

