Bit rot is a real and present danger to digital information. It refers to the irrevocable degradation or loss of digital information when the digital documents and infrastructure – both hardware and software – needed to access, use, store, view, and understand this information are no longer available, current, or executable.

In other words, technological obsolescence is resulting in serious losses of information, directly contradicting the popular belief, indeed myth, that digital information and documents are permanent.

Bit rot is neither a newly emergent nor unexpected phenomenon. In a 1995 Scientific American article discussing the longevity of digital information, Jeff Rothenberg warned about bit rot’s growing impact.

“We are in imminent danger of losing (digital information and documents) even as we create them,” he argued. “We must invest careful thought and significant effort if we are to preserve these documents for the future. If we are unwilling to make this investment, we risk substantial practical loss, as well as the condemnation of our progeny for thoughtlessly consigning to oblivion a unique historical legacy.”

Since this warning, bit rot’s dangers have only grown. If left unaddressed, bit rot could erase most of today’s digital information in a relatively short period of time. Because of bit rot’s ravages, it’s unlikely that most of today’s digital information will be accessible, usable, viewable, or understandable within a decade. As more information is digitised – by migrating print resources and physical artifacts to digital formats, or, increasingly, by creating digital documents instead of print ones – most of it may be lost to future generations due to bit rot.

Vint Cerf, a Google vice president, also warns about bit rot’s effects, claiming that it could lead to a “forgotten generation, or even a forgotten century” because our digital information is being lost due to technological evolution and its accompanying obsolescence. He observes that, “When you think about the quantity of documentation from our daily lives that is captured in digital form, like our interactions by e-mail, people’s tweets, and all of the worldwide web, it’s clear that we stand to lose an awful lot of our history” because of bit rot. Cerf notes, for instance, how the applications needed to access, use, view, and understand digital information quickly become outdated: once they can no longer run on newer computers, devices, or systems, the information that depends on them goes with them. Thus, digital information is rendered inaccessible, unusable, or unintelligible, and consequently lost to future generations.

It’s often believed that digitisation preserves information. Information is thus digitised for preservation purposes without much consideration that it will, in most cases, become obsolete or lost amid relentless technological development. Cerf observes that, “We digitise things because we think we will preserve them, but what we don’t understand is that unless we take other steps, those digital versions may not be any better, and may even be worse, than the artifacts that we digitised.”

Indeed, what happens to our digital information when technology continuously changes and progresses? How will we access, say, our digital family pictures in the next five or 10 years? What will happen to digital medical records, court rulings, and financial data? What will become of our institutional, business, cultural, personal, and other digital documents? Cerf argues that, “We don’t want our digital lives to fade away. If we want to preserve them, we need to make sure that the digital objects we create today can still be rendered far into the future.”

There is a similar belief about saving information online: if it’s posted online, it remains there in virtual perpetuity. But the internet is as vulnerable to bit rot as other kinds of digital information. For instance, the lifecycle of most webpages is usually only months. Links decay even faster. A recent study of links in digital resources – the majority of which had no print counterpart – discovered that within one year nearly 10 per cent had ceased working, and by three years, 30 per cent of links were dead. Most of the webpages from the 1990s, in fact, are almost completely gone.
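
To put those link-decay figures in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that links die at a constant rate of 10 per cent a year; the study quoted above does not claim this, and the function name `dead_fraction` is invented for the example. Under that assumption about 27 per cent of links would be dead after three years, close to the 30 per cent the study reports, and roughly 65 per cent after a decade. The five- and ten-year figures are extrapolations, not findings.

```python
def dead_fraction(annual_loss: float, years: int) -> float:
    """Fraction of links dead after `years`, assuming a constant annual loss rate.

    The constant rate is an illustrative assumption, not a result from the study.
    """
    surviving = (1 - annual_loss) ** years
    return 1 - surviving


if __name__ == "__main__":
    # 10 per cent a year is the study's one-year figure, extended by assumption.
    for years in (1, 3, 5, 10):
        print(f"after {years:>2} years: {dead_fraction(0.10, years):.0%} of links dead")
```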

The internet is not a stable or reliable virtual place. It’s not a repository, library, or archive. It’s a constantly changing patchwork of different browsers, features, software, hardware, devices, and content. Posting or saving information online does not guarantee it will remain as it is for long, as this patchwork is perpetually updating, upgrading, or changing. When digital information on the internet decays, there is nothing much left of it – it’s gone. It’s almost as if ephemerality is intentionally built into the internet and digital information.

Ironically, it would have been easier to archive and save digital information on the internet of the 1990s than on the contemporary internet. To begin with, the sheer number of webpages today, which multiplies daily, would be nearly impossible to archive. In 1994, there were fewer than 3,000 websites – in 2014, there were more than one billion. Furthermore, most 1990s webpages were mainly text-based digital documents with basic, if any, interactive or multimedia features, containing minimal amounts of data. If such a webpage had indeed been saved, there would have been a better chance of accessing, viewing, and using it than of doing the same with a contemporary webpage. Today’s internet involves substantially more complex and interactive multimedia webpages containing huge volumes of different kinds of data. There is no best process or practice presently available to address the challenge of archiving and saving these webpages.

In The Discipline of Organising (MIT Press, 2013), Robert J. Glushko presents three major challenges confronting digital information and its preservation: technological obsolescence, expected useful lifetimes of physical storage media, and the (un)availability of software and its associated computing environment. He argues that, “Digitisation creates preservation challenges because technological obsolescence of computer software and hardware require ongoing efforts to ensure the digitised resources can be accessed.”

The first challenge, technological obsolescence, “is a result of the relentless evolution of the physical media and environments used to store digital information in both institutional or business and personal organising systems.” Glushko explains that, “As the capacity of storage technologies grows from kilobytes to megabytes to gigabytes to terabytes to petabytes, economic and efficiency considerations often make the case to adopt new technology to store newly acquired digital resources and raise questions about what to do with the existing ones.”

The second challenge, the expected useful lifetimes of storage media, is paradoxical because “even as the capacities of digital storage technologies increase at a staggering pace, the expected useful lifetimes of the physical storage media are measured in years or at best in decades.” Glushko notes that, “The contrast between printed and digital resources is striking; books on library shelves don’t disappear if no one uses them, but digital data can be lost just because no one wants access to it within a year or two after its creation.”

The third challenge is the (un)availability of software: the software and associated computing environment required to use a resource at the time of preservation might no longer be available when the resource needs to be accessed. Glushko writes that, “Software and services that convert documents from old formats to new ones are widely available, but they are only useful if the old file can be read from its legacy storage medium.” Sometimes neither backwards nor forwards compatibility is possible or viable.

The question, therefore, is how to prevent or at least mitigate bit rot. Cerf recommends the use of digital vellum to preserve old hardware and software, to help prevent their obsolescence and ensure the digital information dependent upon them can be recovered and used. He acknowledges that both the concept and the practicality of digital vellum are works in progress. He states that, “It’s not without its rough edges but the major concept has been shown to work.”

He explains that digital vellum involves taking “an X-ray snapshot of the content and the application and the operating system together, with a description of the machine that it runs on, and preserve that for long periods of time. And that digital snapshot will recreate the past, in the future.”

This X-ray snapshot “should be transportable from one place to another. So I should be able to move it from the Google cloud to some other cloud, or move it into a machine I have.” Further, Cerf asks, “No matter what the medium is in which digital bits are recorded, how long will we be able to read them, and how long will we make sense out of them? So the issue here is not just the physical bits, but what do they mean. If you use a program, for example, to create a spreadsheet, you have a complex file. You store the file away and you hold onto it for 20 or 30 years. And even pretending you can read the disc again, do you have the software that knows what the bits mean? So the digital vellum idea is not just physical medium, but an ecosystem which is able to remember what bits mean over long periods of time.”

Another major component of digital vellum would be standardised descriptions of these snapshots, to help ensure they remain accessible, usable, viewable, and understandable. Cerf explains that when you move digital information from one place to another, “You still (must) know how to unpack them to correctly interpret the different parts. That is all achievable if we standardise the descriptions. And that’s the key issue here: how do I ensure in the distant future that the standards are still known, and I can still interpret this carefully constructed X-ray snapshot?”
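
What might such a standardised description look like in practice? The sketch below is one hypothetical illustration, not Cerf’s design or any existing digital vellum format. It records the content together with the application, operating system, and machine it depends on, plus a checksum so a future reader can confirm the bits have not changed. The field names, the `manifest_version` number, and the example product names are all invented for the illustration.

```python
import hashlib
import json


def describe_snapshot(content: bytes, media_type: str, application: str,
                      operating_system: str, machine: str) -> str:
    """Return a JSON description meant to travel alongside the preserved bits."""
    manifest = {
        "manifest_version": 1,  # hypothetical schema version, not a real standard
        "content": {
            "media_type": media_type,
            "size_bytes": len(content),
            "sha256": hashlib.sha256(content).hexdigest(),
        },
        # What a future reader would need to recreate in order to open the content.
        "environment": {
            "application": application,
            "operating_system": operating_system,
            "machine": machine,
        },
    }
    return json.dumps(manifest, indent=2, sort_keys=True)


def verify_snapshot(content: bytes, manifest_json: str) -> bool:
    """Check that the bits still match the size and checksum in the description."""
    recorded = json.loads(manifest_json)["content"]
    return (len(content) == recorded["size_bytes"]
            and hashlib.sha256(content).hexdigest() == recorded["sha256"])


if __name__ == "__main__":
    spreadsheet = b"item,amount\nrent,900\n"
    description = describe_snapshot(
        spreadsheet, "text/csv",
        application="ExampleSheet 1.0",   # invented product name
        operating_system="ExampleOS 3",   # invented product name
        machine="generic x86-64 PC",
    )
    print(description)
    print("intact:", verify_snapshot(spreadsheet, description))
```

A checksum only shows that the bits have survived unchanged. Whether they can still be understood depends, as Cerf stresses, on the description format itself remaining known, which is why a plain, widely documented text format is used here.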

There are, however, various economic and legal issues surrounding digital vellum that can contribute to bit rot. It’s not necessarily a viable commercial project for many institutions or businesses because its development requires committed, long-term, and sizeable financial, knowledge and technological investments. Moreover, most digital information depends upon proprietary products. In many cases, proprietary codes, algorithms, copyright, patents, licensing, and other legalities are actually included or incorporated into the information itself. Digital information consequently remains mainly dependent upon the owners’ commitment to keep the products intact, updated, or in existence. It’s therefore often prohibitive to buy these legal rights from owners that may go out of business, sell their products, discontinue parts of or all of their products, or stop making upgrades or supporting updates to them. Cerf therefore proposes that, “The rights of preservation might need to be incorporated into our thinking about things like copyright and patents and licensing.”

Bit rot is a serious threat to digital information. Cerf’s concept of digital vellum is one possible solution that shows some practical promise. Numerous other research avenues need further exploration to help illuminate other possibilities for tackling this phenomenon. Further, the professional practices of records and information management, as well as librarianship and archival work, offer many established and robust frameworks, standards, strategies, and practical steps for addressing many of bit rot’s challenges to digital information. By further developing digital vellum, conducting more research, and adopting some of the practical steps offered by records and information management, librarianship, and archival work, we can help tomorrow remember today. By committing ourselves to fully addressing bit rot now, we can help ensure that we will be a remembered generation.

• Marc Kosciejew is head of department and lecturer in the Department of Library Information and Archive Sciences in the Faculty of Media and Knowledge Sciences, University of Malta.
