In the first post in this digital preservation series, I shared some of the unique challenges digital material brings to the preservation game. In this one we will look at some of the technologies and tools digital stewards employ to protect our digital assets.
How can you tell when a computer file has been corrupted? If you try to open it, funny, glitchy things might happen. How can you test whether a digital file is uncorrupted? This requires a bit more thought. Digital files are, at their base level, a long string of 1’s and 0’s. This is called the file’s bitstream. Preservationists could compare one bitstream to an earlier copy of it, but this requires a lot of processing power for large files, with no guarantee that the comparison copy isn’t also corrupted.
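This is where checksums can help us out. Checksums are character strings generated by a class of algorithms called hash functions; the ones designed with security in mind are called cryptographic hash functions. You can try one out here: http://md5checksum.com/. Hash functions are used to protect lots of things, although hashing is not the same as encryption: a hash is one-way and cannot be reversed to recover the original. Websites, for example, typically store a hash of your password rather than the password itself.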
Hash functions can also be applied to the bitstream of a file. Due to the nature of the various algorithms used, changing even a single one or zero will produce a drastically different checksum. If a digital steward produces a checksum for the bitstream at the beginning of the preservation process, she can later test for data integrity by rerunning the hash and comparing that output to the original checksum.
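For the curious, here is a minimal sketch of that check in Python, using the standard library's hashlib module. The choice of SHA-256, the file name, and the helper names (file_checksum, flip_one_bit, demo) are assumptions made for illustration, not a description of any particular preservation tool.

```python
import hashlib
import os
import tempfile


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex checksum of a file's bitstream, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use low for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def flip_one_bit(path, byte_offset=0):
    """Simulate corruption by flipping the lowest bit of a single byte."""
    with open(path, "r+b") as handle:
        handle.seek(byte_offset)
        original = handle.read(1)
        handle.seek(byte_offset)
        handle.write(bytes([original[0] ^ 0b00000001]))


def demo():
    """Record a checksum, then recheck the file before and after damage."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "letter.txt")  # hypothetical file
        with open(path, "wb") as handle:
            handle.write(b"Dear archive, please keep me safe.")

        stored = file_checksum(path)             # recorded at ingest
        intact = file_checksum(path) == stored   # later check, no changes
        flip_one_bit(path)                       # one bit changes
        damaged = file_checksum(path) == stored  # later check, after damage
        return intact, damaged
```

Calling demo() returns one result per check: the first comparison matches the stored checksum, and the second, made after a single bit has been flipped, does not.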
Now that we can test for unwanted changes in computer files, how can we ensure we always have valid copies of them? A system called LOCKSS can help with this. LOCKSS stands for Lots Of Copies Keep Stuff Safe. Similar to the idea of backing up personal files, LOCKSS will duplicate the files given to it and then distribute copies of the files across several servers. The idea is to spread the system out over many servers in diverse geographic areas to minimize the risk of a single disaster (natural or otherwise) compromising the entire system. These distributed copies are then regularly hashed, and the checksums compared to test the validity of the files. If a checksum comparison fails, that server can delete its failing copy of the file and ask the other servers for a new one.
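The audit-and-repair idea can be sketched in a few lines of Python. To be clear, this is a simplified illustration and not the actual LOCKSS software or its protocol: the "servers" are just entries in a dictionary, the server names are made up, and treating the checksum held by a majority of copies as the valid one is an assumption made for this example.

```python
import hashlib
from collections import Counter


def checksum(data):
    """Return the SHA-256 hex checksum of a copy's bytes."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas):
    """Compare checksums across copies and replace any that disagree.

    replicas maps a (hypothetical) server name to the bytes it holds.
    Returns the names of the servers whose copies were replaced.
    """
    sums = {name: checksum(data) for name, data in replicas.items()}

    # Assumption: the checksum shared by a strict majority is the valid one.
    winner, votes = Counter(sums.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        raise ValueError("no majority; cannot tell which copy is valid")

    good_source = next(name for name, s in sums.items() if s == winner)
    repaired = []
    for name, s in sums.items():
        if s != winner:
            # Discard the failing copy and fetch a fresh one from a peer.
            replicas[name] = replicas[good_source]
            repaired.append(name)
    return repaired
```

For example, given three copies where the one held by "seattle" differs from the other two, audit_and_repair replaces that copy and returns ["seattle"]; running it again on the repaired set returns an empty list.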
Digital preservation is a rapidly developing field. New challenges requiring new solutions arise every day. In the third and final post in this digital preservation series, I’ll discuss activities you can undertake to protect your personal digital heritage.
In this digital world we are increasingly creating, storing, and publishing material entirely in electronic forms. While this is great for the trees and other resources used in making paper, it introduces new challenges in the process of collecting and preserving materials.
The preservation needs of paper are pretty well understood. Guidelines for ideal environments (heat and humidity) and practices (handling and storage) have been in constant refinement for hundreds of years. The library, archives, and information science communities only began thinking about preservation for digital material comparatively recently. This first post of a three-part series on digital preservation will take a look at the challenges unique to preserving digital materials, and why we must approach digital preservation differently than physical preservation.
What might be surprising to many is the relative fragility of digital assets. Estimates put the average operational life of conventional digital storage media at five years. Failures occur in more than just the mechanical components: magnetic media are sensitive to anything generating a magnetic field, from batteries to the sun! Optical discs can suffer from manufacturing errors or material degradation, making them unreadable. Additionally, once damaged, a digital resource is often completely lost. Physical material might be salvaged through conservation. Recovering digital assets after damage is much more difficult.
Complicating the practice of digital preservation is the fact that digital materials are meaningless without the correct hardware and software environments to render them. Consider a printed book. The information conveyed by a book is encoded with ink marks made on paper. So long as the rules of the encoding language (that is, the language it is written in) and the marks on the paper persist, the information in the book can be recalled. The ink won’t independently leave the paper and reorganize into different patterns and structures.
This is exactly what happens to digital information. The long strings of characters encoding digital assets are only intelligible to a narrow band of both software and specific hardware configurations. Many of us have likely encountered the situation of being unable to open an old file in a newer version of software. Software developers are constantly adding and removing features in their products, often with little attention to backwards compatibility. Merely storing the digital encoding (or bitstream) is meaningless without also storing instructions on how to rebuild it into an understandable, rendered product.
These extra considerations compound when you consider the speed of technological advances, and the new behaviors and interactive experiences we’re building and sharing with our machines and networks. Even identifying what behaviors and functions of digital assets are important to intellectual understanding of the resource is a quagmire. Those of us thinking about digital preservation have ceded a pretty large head-start, and the race is constantly accelerating.
In the next posting of this blog series, I’ll cover some strategies currently being used by the digital preservation community. I’ll finish this series with a post on what you can do yourself to safeguard your digital works and memories.
What do archivists do all day, anyway? Look at old photos? Dust yearbooks? Take papers from one file folder and put them in another?
Those are all true to some extent, but university archivists play more roles in their community than one might think. Take a look at some of the extraordinary events during an average week in FSU Special Collections and Archives:
Thursday, October 15:
Students from the ART5928 workshop “Creating Experiences” visit the Claude Pepper Museum. Their project this semester involves creating a public event that could be held in a museum space. The students have designed a Claude Pepper Pajama Party event and social media campaign, and today they’re walking through their ideas with Pepper Library Manager Rob Rubero.
FSU Special Collections has always considered local history one of its collecting strengths. In an effort to deepen community connections and learn more about the Tallahassee music industry, Rory Grennan and Katie McCormick attend a public appearance by influential songwriter and producer George Clinton. Aside from smiles and photo opportunities, our archivists enjoy many conversations with Clinton’s family and associates about his work and his legacy.
Friday, October 16:
Today, the Special Collections Research Center reading room has the privilege of hosting the members of the Florida State University History Club. A dozen history undergraduates attend an informational presentation by Manuscript Archivist Rory Grennan and Rare Books Librarian Kat Hoarn. Presentations and instructional sessions for students, faculty, and the public are a core part of the Special Collections mission, and occur frequently at the beginning of the school year. History Club members are excited to see 4000 years of human history laid out in documents from our collections including cuneiform tablets, a page from a Bible printed by Gutenberg, and artist books from the 21st century.
Monday, October 19:
Monday morning, archivists Sandra Varry and Krystal Thomas visit the University Registrar’s office to consult on the preservation of student transcripts on microfilm. The filmed student records see heavy use, and unfortunately enough of the film has been worn down that some records are losing information. The group discusses modern strategies such as digitization to preserve these essential historical records that document a century of higher education.
Later, Sandra Varry and division staff prepare for a new exhibit opening today in the Special Collections Exhibit Room on the first floor of Strozier Library. “Mittan: A Retrospective” celebrates the work of photographer Barry Mittan, and documents student life at FSU in the 1960s and 1970s. The exhibit was curated by graduate assistant Britt Boler and runs through January 2016.
In the afternoon, Krystal Thomas carefully reviews and uploads recently-digitized cookbooks and herbals to the FSU Digital Library. The Digital Library features digitized versions of the highlights of our collections, as chosen by Special Collections staff and our users, and new content is added regularly by archives staff.
Tuesday, October 20:
Things They Don’t Teach You In Grad School #47: Water and vinegar make an effective, non-abrasive cleaner for a headstone.
Former FSU faculty member Paul Dirac was a giant in the fields of mathematics and quantum mechanics, and his papers are a frequently-consulted resource by researchers at FSU Libraries. Since no members of the Dirac family remain in Tallahassee, it has become the unofficial duty of our library and archives staff to visit Dirac’s grave once a year and see that it is kept clean. October 20th is the anniversary of Dirac’s death, and seems an appropriate time to visit the site. Archivists Katie McCormick, Rory Grennan, and Krystal Thomas, accompanied by library Director of Development Susan Contente and a handful of Physics Department students, scrub the headstone and plant fresh flowers this afternoon.
Wednesday, October 21:
Early this morning, archives staff notice an uncharacteristic rise in temperature in the stacks. After confirming initial impressions with a few temperature readings, contact is quickly made with library facilities staff to take steps to correct an issue with the building’s HVAC systems. Constant environmental monitoring is an important part of preserving our collections, as paper, film, and other substrates are vulnerable to fluctuations in temperature and humidity. There’s no point to collecting items that can’t be made to last! You never know what someone might need next week…
As Special Collections staff, next Wednesday, May 1st, is our opportunity to truly become aware of our role in preserving our unique collections and protecting the environment in which they’re stored.
Named by the Society of American Archivists after Hurricanes Katrina, Rita, and Wilma struck the Gulf Coast, “MayDay” – this year and every year – is a nationwide effort whose goal is to save our archival materials, no matter which type of cultural institution we work in.
Here are a few things we can do that day that will make a difference when and if an emergency occurs, tasks that we can accomplish in a short period of time:
Quickly survey collections areas to ensure that nothing is stored directly on the floor, where it would be vulnerable to water damage.
Note the location of fire exits and fire extinguishers.
Review basic emergency procedures – currently being updated – in our Reading Room behind the service desk.
Familiarize ourselves with the evacuation plan and where emergency supplies are stored – a good chance to check that flashlights are working!
Update the contact information in our department staff list.
These are just a few suggestions; there are probably more we can think of. And it’s important that we sustain this effort, not just on MayDay.