The persistence of memory

Permanent data loss in the Information Age

It used to bother me that we as a species are not preserving our collective knowledge. One might think that the proliferation of floppy disks, CD-ROMs, hard drives, and now cloud storage would all but obviate that worry. But the counterexample I always cite is the first in that list: the humble floppy disk.

Front and back of a 5 1/4 inch floppy disk
By Jud McCranie, CC BY-SA 4.0

The 5 1/4" floppy began its existence in 1976. It wasn’t even the first floppy; 8" floppies predate it. A 5 1/4" floppy had a storage capacity of 360 KB, with later floppy formats eventually reaching a capacity of 2.88 MB by the late 80s. That doesn’t sound like a lot of data, but consider this: a book on my shelf printed in 1909 has a volume of 1450 cm³ but holds only about 132 KB of text.[1] In terms of storage, a low-capacity floppy could hold two similar books in the space of a mere 50 cm³, and a 2.88 MB disk roughly twenty.

Today, however, only one of those is readable. It would be extremely challenging to find a computer in 2023 that could read a floppy disk. It is trivial to find a human who can read printed text from the previous century.

So what does this mean for our data? I have hard drives collecting dust in the garage that are difficult to pull data off because they’re IDE. My gaming computer’s solid-state drive connects via SATA, but my main board also supports a newer standard called M.2. My guess is that in a decade my saved games of Civilization VI will be unrecoverable.

That’s not a big loss to human knowledge and frankly I don’t want future generations to know I played the game on easy mode. But what about important stuff? Laws, histories, public records? Artifacts of culture like novels and movies and music? We kind of want to hang on to that.

As a Classics nerd, I feel this in my bones. Because papyrus and vellum decay, we’ve lost countless unknowable treasures to the vagaries of time. Speaking just of European culture, if some medieval monk didn’t find frequent value in practicing his penmanship in Latin using a moldy manuscript, fungus and fire corrupted the data beyond recovery.

I started off saying that this used to bother me. The lost works of Tacitus still do, but much less so “Yellow Submarine” or Discworld. The rise of cloud computing has put a temporary pause on my fears of data loss. The ubiquity of everyone saving everything to the cloud is preserving records across the globe, across cultures, across languages. The reprieve is, I fear, fleeting.

There is nothing to guarantee that OneDrive or iCloud or any other hosted cloud service will have its data intact centuries from now, much less in some readable archival format. I recently got an email from Google saying they are going to start deleting data. That’s fine, but what if everything that’s online now is deleted due to disuse by the end of the century? It’s permanently gone. An online Apple Time Machine backup is not going to be much use to future scholars if it doesn’t exist, and if it does, it needs to be in a data format that makes sense in the future. That last point is also important: unless it’s ASCII-encoded text (it’s not), the odds of it being readable by future programs written by future programmers to run on future computers are pretty much nil.

This is all rather bleak. My solace is that the biological limits on human lifespans mean that I don’t have to worry about it, at least not as a function of whether I can get up in the morning and face existential oblivion. My teeth will still need to be brushed tomorrow. My novel—and all its bytes—will be safe on my laptop and in my cloud backups.

I have—or had—high hopes for the Internet Archive as they have been preserving in archival formats not just the history of the Internet (which frankly isn’t all that interesting) but also books and other media. Because late-stage capitalism is actually feudalism, the IA is being sued into uselessness by soulless rent seekers. I hope it survives. It’s been an invaluable research tool for me. But as time has shown, a great fire can wipe out knowledge rather quickly, and copyright law is nothing if not an all-consuming conflagration of greed.

What lasts? Well, print books of course. My oldest book is about 100 years old. I’ve seen much older in museums. Even older than books are inscriptions. But their bit rate is even lower than that of bound paper. Properly cared-for microfilm has a shelf life of 500 years and only requires a magnifying glass and a light source to decode. Beyond that, the only things that outlast us are the things we choose.

In the end, we as a culture are going to decide unconsciously what to preserve. Future archaeologists will marvel at the era of the ubiquitous phone book, found in layer after layer of human trash only to disappear mysteriously from the historical record around the turn of the twenty-first century. Was it a vast sweeping fear of the telephone (another strange artifact of plastic with as-yet-undetermined religious significance) that doomed the phone book? The next layer above is full of crushed silicon encased in even more plastic. Perhaps the death knell of the phone book was the onslaught of the bionics revolution, which is usually dated to the twenty-second century.

If we don’t preserve our data (as books, inscriptions, or even usable digital media), future humans will be left to guess, just as we do, what, if anything, a neolithic pregnant human totem signifies. Is it art? Or religion? Or a tchotchke?

Late 20th century. Unidentified media device. May have been a vocoder or memorizer. A minority theory is that it was a toy though most scholars dismiss this based on the color, location of the find, and artistic significance. - Museum Rotterdam, CC BY-SA 3.0, via Wikimedia Commons

  1. At an average of 4.5 letters per word, there are 135,000 characters in the book. It takes 8 bits to encode a letter, so this book holds about 1.08 million bits of information, or roughly 132 KB.
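
For anyone who wants to redo this footnote's arithmetic with a different book or disk, here is a minimal Python sketch. It takes the character count and the 8 bits per character from the footnote and the 360 KB floppy from the text above. Treating 1 KB as 1024 bytes is an assumption, and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate of how much text a book holds
# and how many such books fit on a floppy disk.

BITS_PER_CHAR = 8        # from the footnote: 8 bits to encode a letter
CHARS_IN_BOOK = 135_000  # from the footnote: characters in the 1909 book
FLOPPY_KB = 360          # from the text: a low-capacity floppy
BYTES_PER_KB = 1024      # assumption: binary kilobytes


def book_size_bits(chars: int = CHARS_IN_BOOK) -> int:
    """Total bits needed to store the book as plain text."""
    return chars * BITS_PER_CHAR


def book_size_kb(chars: int = CHARS_IN_BOOK) -> float:
    """Book size in kilobytes (bits -> bytes -> KB)."""
    return book_size_bits(chars) / 8 / BYTES_PER_KB


def books_per_floppy(floppy_kb: int = FLOPPY_KB, chars: int = CHARS_IN_BOOK) -> int:
    """Number of whole books that fit on one floppy."""
    return int(floppy_kb // book_size_kb(chars))


if __name__ == "__main__":
    print(f"{book_size_bits():,} bits")
    print(f"{book_size_kb():.1f} KB")
    print(f"{books_per_floppy()} whole books per {FLOPPY_KB} KB floppy")
```

Changing `CHARS_IN_BOOK` or `FLOPPY_KB` shows how sensitive the comparison is to the size of the book and the disk.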