Vivek Haldar

How will you read your files in a hundred years?

Jeff Rothenberg might very well have written everything there was to write about the topic of digital longevity and preservation. And it is downright depressing. Consider this:

The year is 2045, and my grandchildren (as yet unborn) are exploring the attic of my house (as yet unbought). They find a letter dated 1995 and a CD-ROM (compact disk). The letter claims that the disk contains a document that provides the key to obtaining my fortune (as yet unearned). My grandchildren are understandably excited, but they have never seen a CD before—except in old movies—and even if they can somehow find a suitable disk drive, how will they run the software necessary to interpret the information on the disk? How can they read my obsolete digital document?

From there the paper goes down a twisty maze of reasoning to lay out exactly how difficult it is to preserve digital documents for the long term, i.e. hundreds or even thousands of years. We have stone and paper documents dating back thousands of years, but it is extremely unlikely that today’s digital documents will make it that far into the future.

There are problems at every level of abstraction. The physical media on which the bits are stored will decay. The documents have funky encodings and metadata that can only be parsed and displayed by the programs that were used to author them. Those programs will only run on certain OSs. Those OSs will only run on certain hardware. Only if every one of those hurdles is overcome do we have some hope of recovering the old document.

That’s like raising an already small probability to the fifth power.
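To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the five layers above (media, format, program, OS, hardware) each survive independently and with the same probability, which is a simplification for illustration, not something Rothenberg's paper claims. The function name and the sample probabilities are made up.

```python
def chance_of_recovery(per_layer: float, layers: int = 5) -> float:
    """Chance that every layer survives, assuming independent layers
    that all share the same (hypothetical) survival probability."""
    return per_layer ** layers


# Illustrative per-layer odds, not measured values.
for p in (0.9, 0.5, 0.1):
    print(f"{p:.1f} per layer -> {chance_of_recovery(p):.5f} overall")
```

Even with an optimistic 90% chance per layer, the odds of getting through all five come out to roughly 59%.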

Like I said, it’s depressing.

And the biggest issue is that digital preservation is an active, ongoing task. I could print something on archival paper, store it in a safe deposit box, and be reasonably sure it will be readable in a century. Trying to do the same for a digital document would require fiddling (copying to newer media, maybe changing formats and encodings) every 2-5 years.
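One round of that fiddling might look something like the sketch below: copy a file to the new medium, then check that the copy is bit-for-bit identical by comparing SHA-256 digests. The function names and paths are hypothetical, and the checksum step is an added assumption about how one might verify a copy; it says nothing about whether the format will still be readable.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source: Path, destination: Path) -> str:
    """Copy source to destination and confirm the bytes match.

    Returns the digest so it can be recorded and rechecked
    the next time the file is moved.
    """
    expected = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copies contents and timestamps
    if sha256_of(destination) != expected:
        raise IOError(f"copy of {source} does not match the original")
    return expected
```

A call such as `migrate(Path("old_disk/letter.txt"), Path("new_disk/letter.txt"))` would have to be repeated for every file, on every migration, for as long as the archive is meant to last.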

Maybe the Library of Congress will go to all this trouble to preserve digital documents it deems significant to the record of our civilization and times. What hope do I as an individual have to carry forward my digital life to the point where I could hand it over to a grown child, or even grandchildren?

I learned a small lesson early in my digital life and have been putting as much as possible in plain text files. But “plain text” can be in any one of many encodings, and nothing in the file itself tells you which. You have to know the encoding independently of the file.
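A small illustration of the encoding problem, using a made-up five-byte "file": the same bytes decode without error under two different encodings, but give different text. Nothing in the bytes says which reading was intended.

```python
# The bytes of a tiny hypothetical text file, originally saved as UTF-8.
raw = b"caf\xc3\xa9"

as_utf8 = raw.decode("utf-8")      # the encoding the author used
as_latin1 = raw.decode("latin-1")  # a wrong but equally "valid" guess

print(as_utf8)    # café
print(as_latin1)  # cafÃ©
```

Both calls succeed, so a future reader who guesses wrong gets no error, only quietly garbled text.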

On the table where I write this, I have a black and white photograph of my mother when she was about twelve years old, with my then-young grandparents. Will I be able to give my twenty-year-old child photographs of the vacation we went on when he was a toddler?