Digital Preservation Axioms & Development Platforms

In the reading from The Theory and Craft of Digital Preservation, author Trevor Owens outlines Sixteen Guiding Digital Preservation Axioms as a point of reference for the rest of the book and for his philosophy on digital preservation. Axiom seven immediately stood out to me because it relates directly to a recent switch I made in my website development platform. Axiom seven states: “The boundaries of digital objects are fuzzy,” referring specifically to the dependencies and environment that various objects need in order to run as intended (Owens, 2019).

In software and website development, these hidden, often unintended, dependencies are a significant problem, commonly called the “works on my machine” problem. When running a website on a local server, the developer might take for granted certain packages or libraries already installed on their computer. When the same site runs in a different environment, some of the files the site quietly relied on (those outside its own folder) may be missing, and the website will no longer work as intended.
Container platforms have become a popular option for development because they largely remove this problem (among other benefits). Container tools include Docker and rkt, while Kubernetes is commonly used to manage containers across many machines.
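
To make the “works on my machine” problem a little more concrete, here is a minimal sketch in Python (not the language of my own site, just a convenient one for illustration). It checks whether the modules a project quietly assumes are actually present in the current environment. The function name `missing_dependencies` and the list `ASSUMED_SITE_DEPENDENCIES` are hypothetical and made up for this example.

```python
import importlib.util


def missing_dependencies(module_names):
    """Return the names from module_names that cannot be found here."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Hypothetical list of modules a small site might rely on without saying so.
ASSUMED_SITE_DEPENDENCIES = ["json", "sqlite3", "markdown"]

if __name__ == "__main__":
    absent = missing_dependencies(ASSUMED_SITE_DEPENDENCIES)
    if absent:
        print("Missing on this machine:", ", ".join(absent))
    else:
        print("All assumed dependencies are present.")
```

Run on two different computers, a check like this can give two different answers, which is exactly the fuzziness a container is meant to remove.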

This image describing containers is from the Docker website (the image is hyperlinked). More information about containers, and about containers versus virtual machines, is available on the Docker website.

A container is a software package that contains not only the code for a particular application, but also all of its dependencies. This allows the software to run consistently from one environment to another, eliminating a lot of the “fuzziness.” Container-based platforms abstract at the application layer, and containers on the same machine share the host operating system's kernel. This means that they take up much less space and start much more quickly than virtual machines, which give software applications similar independence but require each VM to carry its own copy of an operating system.

I recently switched from using a local server running through the open-source solution stack MAMP to the container-based Docker. As a hobbyist web developer, I often ask my friend who is a professional developer for help when I run into an unexpected error. Through Docker, my friend can easily help me troubleshoot problems remotely from his own computer. Rather than trying to talk through computer setups or having him sit at my computer, I can send him a small file and he can see exactly what is going on. He can troubleshoot when and where it is convenient for him.

Containers allow everything a website needs to be self-contained rather than entangled with other information. Without a container, even if the same packages and libraries were available in every environment, the site would still be entangled with the rest of the computer's files, even though its own HTML files might all sit together in one folder.

This isolation of software applications means that the boundaries of these digital objects are more distinct. This makes digital preservation easier, as the internal dependencies remain consistent even as external technologies continue to update, which removes some of the inaccessibility caused by incompatibility. A container helps to distill the application into only the necessary information. It also makes dependencies more obvious, showing how files and libraries are grouped together to make an application. This ties into Owens’s eighth axiom: “One person’s digital collection is another’s digital object is another’s data set” (Owens, 2019). The container is both a digital object in itself and, for lack of a better term, a container for digital objects (which have their own intentional organization).
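
As a loose analogy for how dependencies become “more obvious” once they are written down, here is a small Python sketch that records an environment as a plain, readable manifest. A container image captures far more than this, and the function name `describe_environment` is my own invention for illustration, not something Docker provides.

```python
import json
import platform
from importlib import metadata


def describe_environment():
    """Collect the interpreter version, operating system, and installed packages."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no name
            packages[name] = dist.version
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.system(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the manifest as JSON so it can be saved alongside the project.
    print(json.dumps(describe_environment(), indent=2))
```

A manifest like this only describes the environment, whereas a container actually carries it along, but in both cases the assumptions that used to live silently on one computer are made explicit.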

Although I doubt digital preservation is a motivator for the shift towards container platforms in software development, preservation definitely benefits from it. The shift also showcases how clarity in the creation of information leads to clarity in its preservation. The more information can “speak for itself” when it is created, so that it can be understood and used without much presupposed knowledge, the more embedded knowledge it will carry into the future.

Required reading referenced:
Owens, T. (2019). Introduction: Beyond Digital Hype and Digital Anxiety. In The Theory and Craft of Digital Preservation.

By R. Sumi Matsumoto, 653-02

Posted in Born Digital, Knowledge Structures

Pratt Institute School of Information