A profile of Mark Graham and the team at the Internet Archive.
From smart dust and spimes, through to online journaling and social media, to machine learning, big data and digital preservation…
Is the archive where information goes to live forever, or where data goes to die?
I share many of these concerns.
The web is huge. Even bigger than Google. I love that the web preserves all the work. I don’t think anyone has the right to change the web so they no longer work.
A terrific piece by Maria Bustillos on digital preservation and the power of archives, backed up with frightening real-world examples.
Because history is a fight we’re having every day. We’re battling to make the truth first by living it, and then by recording and sharing it, and finally, crucially, by preserving it. Without an archive, there is no history.
That’s the web I want; a place with spare corners where un-monetisable enthusiasms can be preserved, even if they’ve not been updated for seven years.
Off-site backups of humanity’s knowledge and culture, stored in different media (including pyramidal crystals) placed in near-Earth orbit, the moon, and Mars.
We are developing specialized next-generation devices that we call Archs™ (pronounced “Arks”), which are designed to hold and transmit large amounts of data over long periods of time in extreme environments, including outer space and on the surfaces of other planetary bodies.
Our goal is to collect and curate important data sets and to install them on Archs™ that will be delivered to as many locations as possible for safekeeping.
To increase the chances that Archs™ will be found in the future, we aim for durability and massive redundancy across a broad diversity of locations and materials – a strategy that nature itself has successfully employed.
You can’t log into the same Facebook twice.
The world as we experience it seems to be growing more opaque. More of life now takes place on digital platforms that are different for everyone, closed to inspection, and massively technically complex. What we don’t know now about our current experience will resound through time in historians of the future knowing less, too. Maybe this era will be a new dark age, as resistant to analysis then as it has become now.
A conference in my old stomping grounds of Freiburg on archives, preservation, and long-term thinking:
It will present the state of art in long-term archiving as well as the present problems in preservation of information and scientific data in archives and libraries. Perhaps the most interesting aspect is that, since all conceivable systems are finite but can be quite large, a choice on the contents has to be made. This requires thinking of the human condition: Who we are, what we are and what do we find worth to preserve.
There are three parts to digital preservation: format, medium, and licensing. Film and television archives are struggling with all three.
Codecs—the software used to compress and decompress digital video files—keep changing, as do the hardware and software for playback.
As each new generation of LTO comes to market, an older generation of LTO becomes obsolete. LTO manufacturers guarantee at most two generations of backward compatibility. What that means for film archivists with perhaps tens of thousands of LTO tapes on hand is that every few years they must invest millions of dollars in the latest format of tapes and drives and then migrate all the data on their older tapes—or risk losing access to the information altogether.
Studios didn’t see any revenue potential in their past work. They made money by selling movie tickets; absent the kind of follow-on markets that exist today, long-term archiving didn’t make sense economically.
It adds up to a potential cultural disaster:
If technology companies don’t come through with a long-term solution, it’s possible that humanity could lose a generation’s worth of filmmaking, or more.
Cancelling the future.
The future lives and dies by the state of the archives. To look hard at this world and honestly, diligently articulate what happened and what it was like in the present is a sort of promise to the future, a new layer to the palimpsest of history that can become someone else’s foundation.
The Encrypted Media Extensions (EME) addition to HTML is effectively DRM with the blessing of the W3C. It’s bad for accessibility, bad for usability, bad for security, and, as the Internet Archive rightly points out, it’s bad for digital preservation.
The Digital Transition: How the Presidential Transition Works in the Social Media Age | whitehouse.gov
Kori Schulman describes the archiving of social media and other online artefacts of the outgoing US president. It’s a shame that a lot of URLs will break, but I’m glad there’s going to be a public backup available.
Best of all, you can get involved:
In the interim, we’re inviting the American public – from students and data engineers, to artists and researchers – to come up with creative ways to archive this content and make it both useful and available for years to come. From Twitter bots and art projects to printed books and query tools, we’re open to it all.
Prompted by the way Craig is handling the shutdown of hi.co, Glenn Fleishman takes a look at other digital preservation efforts and talks to Laura Welcher at the Long Now Foundation.
A time capsule is bottled optimism. It makes material the belief that human beings will survive long enough to retrieve and decode artifacts of the distant past.
A profile of the Internet Archive, but this time focusing on its physical space.
The Archive is a third place unlike any other.
History, as the future will know it, is happening today on the web. And so it is the web that we must capture, package, and preserve for future generations to see who we are today.
Digital archivists run up against mismatched expectations:
But did you know that a large majority of web users think that when sharing their thoughts, images, and videos online they are going to be preserved in perpetuity? No matter how many licenses the general population clicks “Agree” to, or however many governing policies are developed that state the contrary, the millions of people sharing their content on websites still believe that there is an implicit accountability that should be upheld by the site owners.
360 terabytes of data stored for over 13 billion years:
Coined as the ‘Superman memory crystal’, as the glass memory has been compared to the “memory crystals” used in the Superman films, the data is recorded via self-assembled nanostructures created in fused quartz. The information encoding is realised in five dimensions: the size and orientation in addition to the three dimensional position of these nanostructures.
A note of optimism for digital preservation:
Where a lack of action may have been more of the case in the 01990s, it is certainly less so today. In the early days, there were just a handful of pioneers talking about and working on digital preservation, but today there are hundreds of tremendously intelligent and skilled people focused on preserving access to the yottabytes of digital cultural heritage and science data we have and will create.
Such a vividly nostalgic project. Choose an obsolete browser. Enter a URL. Select which slice of the past you want to see.
Digital archives in action. Access drives preservation.