There is much hand-wringing in the media about the impending death of journalism, usually blamed on the rise of the web or more specifically bloggers. I’m sympathetic to their plight, but sometimes journalists are their own worst enemy, especially when they publish badly-researched articles that fuel moral panic with little regard for facts (if you’ve ever been in a newspaper article yourself, you’ll know that you’re lucky if they manage to spell your name right).
Exhibit A: an article published in The Guardian called “How I became a Foursquare cyberstalker”. Actually, the article isn’t nearly as bad as the comments, which take ignorance and narrow-mindedness to a new level.
Fortunately Ben is on hand to set the record straight. He wrote “Concerning Foursquare and communicating privacy”. Far from being a lesser form of writing, this blog post is more accurate than the article it is referencing, helping to balance the situation with a different perspective …and a nice big dollop of facts and research. Ben is actually quite kind to The Guardian article but, in my opinion, his own piece is more interesting and thoughtful.
Exhibit B: an article by Jeffrey Rosen in The New York Times called “The Web Means the End of Forgetting”. That’s a bold title. It’s also completely unsupported by the contents of the article. The article contains anecdotes about people getting into trouble over something they put on the web and, even though the consequences for that action played out in the present, he talks about “the permanent memory bank of the Web” and writes:
The fact that the Internet never seems to forget is threatening, at an almost existential level, our ability to control our identities.
Bollocks. Or, to use the terminology of Wikipedia,
Rosen presents his premise — that information once posted to the Web is permanent and indelible — as a given. But it’s highly debatable. In the near future, we are, I’d argue, far more likely to find ourselves trying to cope with the opposite problem: the Web “forgets” far too easily.
Exactly! I get irate whenever I hear “the web never forgets” presented without any supporting data. It’s right up there with “Eskimos have fifty words for snow” and “people in the Middle Ages thought that the world was flat”. These falsehoods are irritating at best. At worst, as is the case with the myth of the never-forgetting web, the lie is downright dangerous. As Rosenberg puts it:
I’m a lot less worried about the Web that never forgets than I am about the Web that can’t remember.
That’s a real problem. And yet there’s no moral panic about the very real threat that, once digitised, our culture could be in more danger of being destroyed. I guess that story doesn’t sell papers.
This problem has a number of thorns. At the most basic level, there’s the issue of link rot. I love the fact that the web makes it so easy for people to publish anything they want. I love that anybody else can easily link to what has been published. I hope that the people doing the publishing consider the commitment they are making by putting a linkable resource on the web.
As I’ve said before, a big part of this problem lies with the DNS system:
Domain names aren’t bought, they are rented. Nobody owns domain names, except ICANN.
I’m not saying that we should ditch domain names. But there’s something fundamentally flawed about a system that thinks about domain names in time periods as short as a year or two.
Then there’s the fact that so much of our data is entrusted to third-party sites. There’s no guarantee that those third-party sites give a rat’s ass about the long-term future of our data. Quite the opposite. The callous destruction of Geocities by Yahoo is a testament to how little our hopes and dreams mean to a company concerned with the bottom line.
We can host our own data, but that isn’t quite as easy as it should be. And even with the best of intentions, it’s possible to have the canonical copies wiped from the web by accident. I’m very happy to see services like VaultPress come on the scene:
Your WordPress site or blog is your connection to the world. But hosting issues, server errors, and hackers can wipe out in seconds what took years to build. VaultPress is here to protect what’s most important to you.
We need one or more institutions that can manage electronic trusts over very long periods of time.
The institutions need to be long-lived and have the technical know-how to manage static archives. The organizations should need the service themselves, so they would be likely to advance the art over time. And the cost should be minimized, so that the most people could do it.
It’s what my technology friends call a non-trivial task, for all kinds of technical, social and legal reasons. But it’s about as important for our future as anything I can imagine. We are creating vast amounts of information, and a lot of it is not just worth preserving but downright essential to save.
There’s an even longer-term problem with digital preservation. The very formats that we use to store our most treasured memories can become obsolete over time. This goes to the very heart of why standards such as HTML—the format I’m betting on—are so important.
Mark Pilgrim wrote about the problem of format obsolescence back in 2006. I found his experiences echoed more recently by Paul Gilster, author of the superb Centauri Dreams, one of my favourite websites. He usually concerns himself with challenges on an even longer timescale, like the construction of a feasible means of interstellar travel, but he gives a welcome long-zoom perspective on digital preservation in “Burying the Digital Genome”, pointing to a project called PLANETS: Preservation and Long-term Access Through Networked Services.
Their plan involves the storage, not just of data, but of data formats such as JPEG and PDF: the equivalent of a Rosetta stone for our current age. A box containing format-decoding documentation has been buried in a bunker under the Swiss Alps. That’s a good start.
David Eagleman recently gave a talk for The Long Now Foundation entitled “Six Easy Steps to Avert the Collapse of Civilization”. Step two is “Don’t lose things”:
As proved by the destruction of the Alexandria Library and of the literature of Mayans and Minoans, “knowledge is hard won but easily lost.”
I’m worried that we’re spending less and less time thinking about the long-term future of our data, our culture, and ultimately, our civilisation. Currently we are preoccupied with the present. The Long Now Foundation and Tau Zero Foundation offer a much-needed sense of perspective.
As with that other great challenge of our time—the alteration of our biosphere through climate change—the first step to confronting the destruction of our collective digital knowledge must be to think in terms greater than the local and the present.