Tags: preservation



Web! What is it good for?


I have a blind spot. It’s the web.

I just can’t get excited about the prospect of building something for any particular operating system, be it desktop or mobile. I think about the potential lifespan of what would be built and end up asking myself “why bother?” If something isn’t on the web—and of the web—I find it hard to get excited about it. I’m somewhat jealous of people who can get equally excited about the web, native, hardware, print …in my mind, if it hasn’t got a URL, it’s missing some vital spark.

I know that this is a problem, but I can’t help it. At the very least, I have enough presence of mind to recognise it as being my problem.

Given these unreasonable feelings of attachment towards the web, you might expect me to wish it to become the one technology to rule them all. But I’ve never felt that any such victory condition would make sense. If anything, I’ve always been grateful for alternative avenues of experimentation and expression.

When Flash was a thriving ecosystem for artists to push the boundaries of what was possible to deliver to a web browser, I never felt threatened by it. I never wished for web technologies to emulate those creations. Don’t get me wrong: I’m happy that we’ve got nice smooth animations in CSS, but I never thought the lack of animation was crippling the web’s potential.

Now we have native technologies that can do more than the web can do. iOS and Android apps can access device APIs that web browsers can’t (yet). And, once again, while I look forward to the day that websites will be able to do all the things that native apps can do today, I don’t think that the lack of those capabilities is dooming the web to irrelevance.

There will always be some alternative that is technologically more advanced than the web. First there were CD-ROMs. Then we had Flash. Now we have native apps. Each one of those platforms offered more power and functionality than you could get from a web browser. And yet the web persists. That’s because none of the individual creations made with those technologies could compete with the collective power of all of the web, hyperlinked together. A single native app will “beat” a single website every time …but an app store pales when compared to the incredible reach and scope of the entire World Wide Web.

The web will always be lagging behind some other technology. I’m okay with that. If anything, I see these other technologies as the research and development arm of the web. CD-ROMs, Flash, and now native apps show us what authors want to be able to do on the web. Slowly but surely, those abilities start becoming available in web browsers.

The pace of this standardisation can seem infuriatingly slow. Sometimes it is too slow. But it’s important that we get it right—the web should hold itself to a higher standard. And so the web plays the tortoise while other technologies race ahead as the hare.

Like I said, I’m okay with that. I’m okay with the web not being as advanced as some other technology stack at any particular moment. I can wait.

In fact, as PPK points out, we could do real damage to the web by attempting to make it mimic some platform that’s currently in the ascendent. I disagree with his framing of it as a battle—rather than conceding defeat, I see it more like waiting out a siege—but I agree completely with this assessment:

The web cannot emulate native perfectly, and it never will.

If we accept that, then we can play to the web’s strengths (while at the same time, playing a slow game of catch-up behind the scenes). The danger comes when we try to emulate the capabilities of something that isn’t the web:

Emulating native leads to bad UX (or, at least, a UX that’s clearly a sub-optimal copy of native UX).

Whenever a website tries to emulate something from an operating system—be it desktop or mobile—the result is invariably something that gets really, really close …but falls just a little bit short. It feels like entering an uncanny valley of interaction design.

Frank sums this up nicely:

I think you make what I call “bicycle bear websites.” Why? Because my response to both is the same.

“Listen bub,” I say, “it is very impressive that you can teach a bear to ride a bicycle, and it is fascinating and novel. But perhaps it’s cruel? Because that’s not what bears are supposed to do. And look, pal, that bear will never actually be good at riding a bicycle.”

This is how I feel about so many of the fancy websites I see. “It is fascinating that you can do that, but it’s really not what a website is supposed to do.”

Enough is enough, says PPK:

It’s time to recognise that this is the wrong approach. We shouldn’t try to compete with native apps in terms set by the native apps. Instead, we should concentrate on the unique web selling points: its reach, which, more or less by definition, encompasses all native platforms, URLs, which are fantastically useful and don’t work in a native environment, and its hassle-free quality.

This is something that Cennydd talked about recently on an episode of the Design Details podcast. The web, he argues, is great for the sharing of information, but not so great for applications.

I think PPK, Cennydd, and I are all in broad agreement, but we almost certainly differ in the details. PPK, for example, argues that maybe news sites should be native apps instead, but for me, those are exactly the kind of sites that benefit from belonging to no particular platform. And when Cennydd talks about applications on the web, it raises the whole issue of what constitutes a web app anyway. If we’re talking about having access to device APIs—cameras, microphones, accelerometers—then yes, native is the way to go. But if we’re talking about interface elements and motion design, then I think the web can hold its own …sometimes.

Of course not every web browser can match the capabilities of a native app—that’s why it’s so important to approach web development through the lens of progressive enhancement rather than treating it like software development no different than that of native platforms. The web is not a platform—that’s the whole point of the web; it’s cross-platform. As Baldur put it:

Treating the web like another app platform makes sense if app platforms are all you’re used to. But doing so means losing the reach, adaptability, and flexibility that makes the web peerless in both the modern media and software industries.

The price we pay for that incredible cross-platform reach is that features on the web will always be lagging behind, and even when they do arrive, they won’t be available in all web browsers.

To paraphrase William Gibson: capabilities on the web will always be here, but they will never be evenly distributed.

But let’s take a step back from the surface-level differences between web and native. Just as happened with CD-ROMs and Flash, the web is catching up with native when it comes to motion design, visual feedback, and gestures like swiping and dragging. I don’t think those are where the fundamental differences lie. I don’t even think the fundamental differences lie in accessing device APIs like cameras, microphones, and offline storage—the web is (slowly) catching up in those areas too.

What if the fundamental differences lie deeper than the technical implementation? What if the web is suited to some things more than others, not because of technical limitations, but because of philosophical mismatches?

The web was born at CERN, an amazing environment that’s free of many of the economic and hierarchical pressures that shape technology decisions elsewhere. The web’s heritage as a hypertext document sharing system for pure scientific research is often treated as a handicap, something that must be overcome in this age of applications and monetisation. But I see this heritage as a feature, not a bug. It promotes ideals of universal access above individual convenience, creation above consumption, and sharing above financial gain.

In yet another great article by Baldur, called The new age of HTML: the web is being torn apart, he opens with this:

For web development to grow as a craft and as an industry, we have to follow the money. Without money the craft becomes a hobby and unmaintained software begins to rot.

But I think there’s a danger here. If we allow the web to be led by money-making, we may end up changing the fundamental nature of the web, and not for the better.

Now, personally, I believe that it’s entirely possible to run a profitable business on the web. There are plenty of them out there. But suppose we allow that other avenues are more profitable. Let’s assume that there’s more friction in making money on the web than there is in, say, making money on iOS (or Android, or Facebook, or some other monolithic stack). If that were the case …would that be so bad?

Suppose, to use PPK’s phrase, we “concede defeat” to Apple, Google, Microsoft, and Facebook. When you think about it, it makes sense that platforms born from profit-driven companies are going to be better at generating profit than something created by a bunch of idealistic scientists trying to improve the knowledge of the human race. Suppose we acknowledged that the web isn’t that well-suited to capitalism.

I think I’d be okay with that.

Would the web become little more than a hobbyist’s playground? A place for amateurs rather than professional businesses?


I’d be okay with that too.

Y’see, what attracted me to the web—to the point where I have this blind spot—wasn’t the opportunity to make money. What attracted me to the web was its remarkable ability to allow anyone to share anything, not just for the here and now, but for the future too.

If you’ve been reading my journal or following my links for any time, you’ll be aware that two of my biggest interests are progressive enhancement and digital preservation. In my mind, these two things are closely intertwingled.

For me, progressive enhancement is a means of practicing universal design, a way of providing access to as many people as possible. That includes access across time, hence the crossover with digital preservation. I’ve noticed again and again that what’s good for accessibility is also good for longevity, and vice versa.

Bret Victor writes:

Whenever the ephemerality of the web is mentioned, two opposing responses tend to surface. Some people see the web as a conversational medium, and consider ephemerality to be a virtue. And some people see the web as a publication medium, and want to build a “permanent web” where nothing can ever disappear.

I don’t want a web where “nothing can ever disappear” but I also don’t want the default lifespan of a resource on the web to be ephemeral. I think that whoever published that resource should get to decide how long or short its lifespan is. The problem, as Maciej points out, is in the mismatch of expectations:

I’ve come to believe that a lot of what’s wrong with the Internet has to do with memory. The Internet somehow contrives to remember too much and too little at the same time, and it maps poorly on our concepts of how memory should work.

I completely agree with Bret’s woeful assessment of the web when it comes to link rot:

It is this common record of public thought — the “great conversation” — whose stability and persistence is crucial, both for us alive today and for those who will come after.

I believe we can and should do better. But I completely and utterly disagree with him when he says:

Photos from your friend’s party are not part of the common record.

Nor are most casual conversations. Nor are search histories, commercial transactions, “friend networks”, or most things that might be labeled “personal data”. These are not deliberate publications like a bound book; they are not intended to be lasting contributions to the public discourse.

We can agree when it comes to search histories and commercial transactions, but it makes no sense to lump those in with the ordinary plenty that I’ve written about before:

My words might not be as important as the great works of print that have survived thus far, but because they are digital, and because they are online, they can and should be preserved …along with all the millions of other words by millions of other historical nobodies like me out there on the web.

For me, this lies at the heart of what the web does. The web removes the need for tastemakers who get to decide what gets published. The web removes the need for gatekeepers who get to decide what gets saved.

Other avenues of expression will always be more powerful than the web in the short term: CD-ROMs, Flash, and now native. But they all come with gatekeepers. The collective output of the human race—from the most important scholarly papers to the most trivial blog post—is too important to put in the hands of the gatekeepers of today who may not even be around tomorrow: Apple, Google, Microsoft, et al.

The web has no gatekeepers. The web has no quality control. The web is a mess. The web is for everyone.

I have a blind spot. It’s the web.

Forgetting again

In an article entitled The future of loneliness, Olivia Laing writes about the promises and disappointments provided by the internet as a means of sharing and communicating. This isn’t particularly new ground and she readily acknowledges the work of Sherry Turkle in this area. The article is the vanguard of a forthcoming book called The Lonely City. I’m hopeful that the book won’t be just another baseless luddite reactionary moral panic as exemplified by the likes of Andrew Keen and Susan Greenfield.

But there’s one section of the article where Laing stops providing any data (or even anecdotal evidence) and presents a supposition as though it were unquestionably fact:

With this has come the slowly dawning realisation that our digital traces will long outlive us.

Citation needed.

I recently wrote a short list of three things that are not true, but are constantly presented as if they were beyond question:

  1. Personal publishing is dead.
  2. JavaScript is ubiquitous.
  3. Privacy is dead.

But I didn’t include the most pernicious and widespread lie of all:

The internet never forgets.

This truism is so pervasive that it can be presented as a fait accompli, without any data to back it up. If you were to seek out the data to back up the claim, you would find that the opposite is true—the internet is in a constant state of forgetting.

Laing writes:

Faced with the knowledge that nothing we say, no matter how trivial or silly, will ever be completely erased, we find it hard to take the risks that togetherness entails.

Really? Suppose I said my trivial and silly thing on Friendfeed. Everything that was ever posted to Friendfeed disappeared three days ago:

You will be able to view your posts, messages, and photos until April 9th. On April 9th, we’ll be shutting down FriendFeed and it will no longer be available.

What if I shared on Posterous? Or Vox (back when that domain name was a social network hosting 6 million URLs)? What about Pownce? Geocities?

These aren’t the exceptions—this is routine. And yet somehow, despite all the evidence to the contrary, we still keep a completely straight face and say “Be careful what you post online; it’ll be there forever!”

The problem here is a mismatch of expectations. We expect everything that we post online, no matter how trivial or silly, to remain forever. When instead it is callously destroyed, our expectation—which was fed by the “knowledge” that the internet never forgets—is turned upside down. That’s where the anger comes from; the mismatch between expected behaviour and the reality of this digital dark age.

Being frightened of an internet that never forgets is like being frightened of zombies or vampires. These things do indeed sound frightening, and there’s something within us that readily responds to them, but they bear no resemblance to reality.

If you want to imagine a truly frightening scenario, imagine an entire world in which people entrust their thoughts, their work, and pictures of their family to online services in the mistaken belief that the internet never forgets. Imagine the devastation when all of those trivial, silly, precious moments are wiped out. For some reason we have a hard time imagining that dystopia even though it has already played out time and time again.

I am far more frightened by an internet that never remembers than I am by an internet that never forgets.

And worst of all, by propagating the myth that the internet never forgets, we are encouraging people to focus in exactly the wrong area. Nobody worries about preserving what they put online. Why should they? They’re constantly being told that it will be there forever. The result is that their history is taken from them:

If we lose the past, we will live in an Orwellian world of the perpetual present, where anybody that controls what’s currently being put out there will be able to say what is true and what is not. This is a dreadful world. We don’t want to live in this world.

Brewster Kahle

Cerf rocks

After I wrote about digital preservation and the need to save everything, not just the so-called “important” stuff, Jason wrote a lovely piece with his own thoughts on the matter:

In order to write a history, you need evidence of what happened. When we talk about preserving the stuff we make on the web, it isn’t because we think a Facebook status update, or those GeoCities sites have such significance now. It’s because we can’t know.

In a timely coincidence, Vint Cerf also spoke about the importance of digital preservation:

When you think about the quantity of documentation from our daily lives that is captured in digital form, like our interactions by email, people’s tweets, and all of the world wide web, it’s clear that we stand to lose an awful lot of our history.

He warns of the dangers of rapidly-obsoleting file formats:

We are nonchalantly throwing all of our data into what could become an information black hole without realising it. We digitise things because we think we will preserve them, but what we don’t understand is that unless we take other steps, those digital versions may not be any better, and may even be worse, than the artefacts that we digitised.

It was a little weird that the Guardian headline refers to Vint Cerf as “Google boss”. On the BBC he’s labelled as “Google’s Vint Cerf”. Considering he’s one of the creators of the internet itself, it’s a bit like referring to Neil Armstrong as a NASA employee.

I have to say, I just love listening to him talk. He’s so smooth. I’m sure that the character of The Architect from The Matrix Reloaded is modelled on him.

Vint Cerf knows a thing or two about long-term thinking when it comes to data formats. He has written many RFCs for the IETF (my favourite being RFC 2468). Back in 1969, he wrote RFC 20, proposing the ASCII format for network interchange. If you’ve ever used the keypress event in JavaScript and wondered why, for example, the number 13 corresponds to a carriage return, this is where all those numbers come from.
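To make those numbers concrete, here is a minimal sketch in Python rather than JavaScript. The table name `CONTROL_CHARACTERS` is purely illustrative; the code points themselves come from ASCII, which Python's built-in `ord` and `chr` expose directly.

```python
# A few control characters and the ASCII code points assigned to them.
# CONTROL_CHARACTERS is a hypothetical name used only for this illustration.
CONTROL_CHARACTERS = {
    "tab": "\t",
    "line feed": "\n",
    "carriage return": "\r",
    "escape": "\x1b",
}

# ord() maps a character to its code point.
codes = {name: ord(char) for name, char in CONTROL_CHARACTERS.items()}

for name, code in codes.items():
    print(f"{name}: {code}")

# chr() goes the other way: from a code point back to a character.
letter = chr(65)
print(letter)
```

The carriage return comes out as 13, the same number a key handler reports for the return key.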

Last month, over 45 years after the RFC’s original publication, it became an official standard.

So when Vint Cerf warns about the dangers of digitising into file formats that could become unreadable, I think we should pay attention to him.

Ordinary plenty

Aaron asked a while back “What do we own?”

I love the idea of owning your content and then syndicating it out to social networks, photo sites, and the like. It makes complete sense… Web-based services have a habit of disappearing, so we shouldn’t rely on them. The only Web that is permanent is the one we control.

But he quite rightly points out that we never truly own our own domains: we rent them. And when it comes to our servers, most of us are renting those too.

It looks like print is a safer bet for long-term storage. Although when someone pointed out that print isn’t any guarantee of perpetuity either, Aaron responded:

Sure, print pieces can be destroyed, but important works can be preserved in places like the Beinecke

Ah, but there’s the crux—that adjective, “important”. Print’s asset—the fact that it is made of atoms, not bits—is also its weak point: there are only so many atoms to go around. And so we pick and choose what we save. Inevitably, we choose to save the works that we deem to be important.

The problem is that we can’t know today what the future value of a work will be. A future president of the United States is probably updating their Facebook page right now. The first person to set foot on Mars might be posting a picture to her Instagram feed at this very moment.

One of the reasons that I love the Internet Archive is that they don’t try to prioritise what to save—they save it all. That’s in stark contrast to many national archival schemes that only attempt to save websites from their own specific country. And because the Internet Archive isn’t a profit-driven enterprise, it doesn’t face the business realities that caused Google to back-pedal from its original mission. Or, as Andy Baio put it, never trust a corporation to do a library’s job.

But even the Internet Archive, wonderful as it is, suffers from the same issue that Aaron brought up with the domain name system—it’s centralised. As long as there is just one Internet Archive organisation, all of our preservation eggs are in one magnificent basket:

Should we be concerned that the technical expertise and infrastructure for doing this work is becoming consolidated in a single organization?

Which brings us back to Aaron’s original question. Perhaps it’s less about “What do we own?” and more about “What are we responsible for?” If we each take responsibility for our own words, our own photos, our own hopes, our own dreams, we might not be able to guarantee that they’ll survive forever, but we can still try everything in our power to keep them online. Maybe by acknowledging that responsibility to preserve our own works, instead of looking for some third party to do it for us, we’re taking the most important first step.

My words might not be as important as the great works of print that have survived thus far, but because they are digital, and because they are online, they can and should be preserved …along with all the millions of other words by millions of other historical nobodies like me out there on the web.

There was a beautiful moment in Cory Doctorow’s closing keynote at last year’s dConstruct. It was an aside to his main argument but it struck like a hammer. Listen in at the 20 minute mark:

They’re the raw stuff of communication. Same for tweets, and Facebook posts, and the whole bit. And this is where some cynic usually says, “Pah! This is about preserving all that rubbish on Facebook? All that garbage on Twitter? All those pictures of cats?” This is the emblem of people who want to dismiss all the stuff that happens on the internet.

And I’m supposed to turn around and say “No, no, there’s noble things on the internet too. There’s people talking about surviving abuse, and people reporting police violence, and so on.” And all that stuff is important but I’m going to speak for the banal and the trivial here for a moment.

Because when my wife comes down in the morning—and I get up first; I get up at 5am; I’m an early riser—when my wife comes down in the morning and I ask her how she slept, it’s not because I want to know how she slept. I sleep next to my wife. I know how my wife slept. The reason I ask how my wife slept is because it is a social signal that says:

I see you. I care about you. I love you. I’m here.

And when someone says something big and meaningful like “I’ve got cancer” or “I won” or “I lost my job”, the reason those momentous moments have meaning is because they’ve been built up out of this humus of a million seemingly-insignificant transactions. And if someone else’s insignificant transactions seem banal to you, it’s because you’re not the audience for that transaction.

The medieval scribes of Ireland, out on the furthermost edges of Europe, worked to preserve the “important” works. But occasionally they would also note down their own marginalia like:

Pleasant is the glint of the sun today upon these margins, because it flickers so.

Short observations of life in fewer than 140 characters. Like this lovely example written in ogham, a Morse-like system of encoding the western alphabet in lines and scratches. It reads simply “latheirt”, which translates to something along the lines of “massive hangover.”

I’m glad that those “unimportant” words have also been preserved.

Centuries later, the Irish poet Patrick Kavanagh would write about the desire to “wallow in the habitual, the banal”:

Wherever life pours ordinary plenty.

Isn’t that a beautiful description of the web?

For Chloe

We all grieve in different ways. We all find solace and comfort in different places.

There can be solace in walking. There can be comfort in music. Tears. Rage. Sadness. Whatever it takes.

Personally, I have found comfort in reading what others have written about Chloe …but I know Chloe would be really embarrassed. She never liked getting attention.

Chloe must have known that people would want to commemorate her in some way. She didn’t want a big ceremony. She didn’t want any fuss. She left specific instructions (her suicide was not a spur-of-the-moment decision).

If you would like to mourn the death—and celebrate the life—of Chloe Weil, she asked that you contribute to one or both of these institutions:

  1. The Oregon Humane Society. This is where Chloe found FACE, her constant companion.
  2. The Internet Archive. Chloe cared deeply about the web and digital preservation.

If you choose to make a donation, thank you. It’s what Chloe wanted.

I still can’t believe she’s gone.

The telescope in the woods

I met Sandijs of Froont fame when I was in Austin for Artifact back in May. He mentioned how he’d like to put on an event in his home city of Riga, and I said I’d be up for that. So last weekend I popped over to Latvia to speak at an event he organised at a newly-opened co-working space in the heart of Riga.

That was on Friday, so Jessica and I had the rest of the weekend to be tourists. Sandijs rented a car and took us out into the woods. There, in the middle of a forest, was an observatory: the Baldone Schmidt telescope.

[Photos: Baldone Schmidt Telescope]

The day we visited was the summer solstice and we were inside the observatory getting a tour of the telescope at the precise moment that the astronomical summer began.

It’s a beautiful piece of machinery. It has been cataloging and analysing carbon stars since the ’60s.

[Photos: Controls]

Nowadays, the images captured by the telescope go straight into a computer, but they used to be stored on glass plates. Those glass plates are now getting digitised too. There’s one person doing all the digitising. It takes about forty minutes to digitise one glass plate. There are approximately 22,000 glass plates in the archive.

[Photos: Archives; Glass plates]

It’s going to be a long process. But once all that data is available in a machine-readable format, there will inevitably be some interesting discoveries to be made from mining that treasure trove.
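Just how long? Here is a back-of-the-envelope sketch using the two figures from the tour: forty minutes per plate and roughly 22,000 plates. The eight-hour working day and the 250 working days a year are my own assumptions, not anything the observatory told us.

```python
# Figures mentioned on the tour.
PLATES = 22_000
MINUTES_PER_PLATE = 40

# Assumptions for illustration only: one person, eight hours a day,
# about 250 working days a year.
HOURS_PER_DAY = 8
WORKING_DAYS_PER_YEAR = 250

total_minutes = PLATES * MINUTES_PER_PLATE
total_hours = total_minutes / 60
working_days = total_hours / HOURS_PER_DAY
years = working_days / WORKING_DAYS_PER_YEAR

print(f"{total_minutes:,} minutes")
print(f"{total_hours:,.0f} hours")
print(f"{working_days:,.0f} working days")
print(f"{years:.1f} years")
```

Under those assumptions, that is 880,000 minutes of scanning, or a little over seven years of full-time work for one person.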

The telescope has already been used to discover a dwarf planet in the asteroid belt. It’s about 1.5 kilometers wide. Its name is Baldone.

The tragedy of the commons

Flickr Commons is a wonderful thing. That’s why I’m concerned:

Y’know, I’m worried about what will happen to my own photos when Flickr inevitably goes down the tubes (there are still some good people there fighting the good fight, but they’re in the minority and they’re battling against the douchiest of Silicon Valley managerial types who have been brought in to increase “engagement” by stripping away everything that makes Flickr special) …but what really worries me is what’s going to happen to Flickr Commons. It’s an unbelievably important and valuable resource.

The Brooklyn Museum is taking pre-emptive measures:

As of today, we have left Flickr (including The Commons).

Unfortunately, they didn’t just leave their Flickr collection; they razed it to the ground. All those links, all those comments, and all those annotations have been wiped out.

They’ve moved their images over to Wikimedia Commons …for now. It turns out that they have a very cavalier attitude towards online storage (a worrying trait for a museum). They’re jumping out of the frying pan of Flickr and into the fire of Tumblr:

In the past few months, we’ve been testing Tumblr and it’s been a much better channel for this type of content.

Audio and video is being moved around to where the eyeballs and earholes currently are:

We have left iTunesU in favor of sharing content via YouTube and SoundCloud.

I find this quite disturbing. A museum should be exactly the kind of institution that should be taking a thoughtful, considered approach to how it stores content online. Digital preservation should be at the heart of its activities. Instead, it takes a back seat to chasing the fleeting thrill of “engagement.”

Leaving Flickr Commons could have been the perfect opportunity to invest in long-term self-hosting. Instead they’re abandoning the Titanic by hitching a ride on the Hindenburg.

9,125 days later

The World Wide Web turned 25 last week. Happy birthday!

As is so often the case when web history is being discussed, there is much conflating of “the web” and “the internet” in some mainstream media outlets. The internet—the network of networks that allows computers to talk to each other across the globe—is older than 25 years. The web—a messy collection of HTML files linked via URLs and delivered with the Hypertext Transfer Protocol (HTTP)—is just one of the many types of information services that uses the pipes of the internet (yes, pipes …or tubes, if you prefer—anything but “cloud”).

Now, some will counter that although the internet and the web are technically different things, for most people they are practically the same, because the web is by far the most common use-case for the internet in everyday life. But I’m not so sure that’s true. Email is a massive part of the everyday life of many people—for some poor souls, email usage outweighs web usage. Then there’s streaming video services like Netflix, and voice-over-IP services like Skype. These sorts of proprietary protocols make up an enormous chunk of the internet’s traffic.

The reason I’m making this pedantic distinction is that there’s been a lot of talk in the past year about keeping the web open. I’m certainly in agreement on that front. But if you dig deeper, it turns out that most of the attack vectors are at the level of the internet, not the web.

Net neutrality is hugely important for the web …but it’s hugely important for every other kind of traffic on the internet too.

The Snowden revelations have shown just how shockingly damaging the activities of the NSA and GCHQ are …to the internet. But most of the communication protocols they’re intercepting are not web-based. The big exception is SSL, and the fact that they even thought it would be desirable to attempt to break it shows just how badly they need to be stopped—that’s the mindset of a criminal organisation, pure and simple.

So, yes, we are under attack, but let’s be clear about where those attacks are targeted. The internet is under attack, not the web. Not that that’s a very comforting thought; without a free and open internet, there can be no World Wide Web.

But by and large, the web trundles along, making incremental improvements to itself: expanding the vocabulary of HTML, updating the capabilities of HTTP, clarifying the documentation of URLs. Forgive my anthropomorphism. The web, of course, does nothing to itself; people are improving the web. But the web always has been—and always will be—people.

For some time now, my primary concern for the web has centred around what I see as its killer feature—the potential for long-term storage of knowledge. Yes, the web can be (and is) used for real-time planet-spanning communication, but there are plenty of other internet technologies that can do that. But the ability to place a resource at a URL and then to access that same resource at that same URL after many years have passed …that’s astounding!

Using any web browser on any internet-enabled device, you can instantly reach the first web page ever published. 23 years on, it’s still accessible. That really is something special. Digital information is not usually so long-lived.

On the 25th anniversary of the web, I was up in London with the rest of the Clearleft gang. Some of us were lucky enough to get a behind-the-scenes peek at the digital preservation work being done at the British Library:

In a small, unassuming office, entire hard drives, CD-ROMs and floppy disks are archived, with each item meticulously photographed to ensure any handwritten notes are retained. The wonderfully named ‘ancestral computing’ corner of the office contains an array of different computer drives, including 8-inch, 5 1⁄4-inch, and 3 1⁄2-inch floppy disks.

Most of the data that they’re dealing with isn’t much older than the web, but it’s an order of magnitude more difficult to access; trapped in old proprietary word-processing formats, stuck on dying storage media, readable only by specialised hardware.

Standing there looking at how much work it takes to rescue our cultural heritage from its proprietary digital shackles, I was struck once again by the potential power of the web. With such simple components—HTML, HTTP, and URLs—we have the opportunity to take full advantage of the planet-spanning reach of the internet, without sacrificing long-term access.

As long as we don’t screw it up.

Right now, we’re screwing it up all the time. The simplest way that we screw it up is by taking it for granted. Every time we mindlessly repeat the fallacy that “the internet never forgets,” we are screwing it up. Every time we trust some profit-motivated third-party service to be custodian of our writings, our images, our hopes, our fears, our dreams, we are screwing it up.

The evening after the 25th birthday of the web, I was up in London again. I managed to briefly make it along to the 100th edition of Pub Standards. It was a long time coming. In fact, there was a listing on Upcoming.org for the event. The listing was posted on February 5th, 2007.

Of course, you can’t see the original URL of that listing. Upcoming.org was “sunsetted” by Yahoo, the same company that “sunsetted” Geocities in much the same way that the Enola Gay sunsetted Hiroshima. But here’s a copy of that listing.

Fittingly, there was an auction held at Pub Standards 100 in aid of the Internet Archive. The schwag of many a “sunsetted” startup was sold off to the highest bidder. I threw some of my old T-shirts into the ring and managed to raise around £80 for Brewster Kahle’s excellent endeavour. My old Twitter shirt went for a pretty penny.

I was originally planning to bring my old Pownce T-shirt along too. But at the last minute, I decided I couldn’t part with it. The pain is still too fresh. Also, it serves as a nice reminder for me. Trusting any third-party service—even one as lovely as Pownce—inevitably leads to destruction and disappointment.

That’s another killer feature of the web: you don’t need anyone else. You can publish to this world-changing creation without asking anyone for permission. I wish it were easier for people to do this: entrusting your heritage to the Yahoos and Pownces of the world is seductively simple …but only in the short term.

In 25 years’ time, I want to be able to access these words at this URL. I’m going to work to make that happen.


It was a crazy time in Brighton last week: Reasons To Be Creative followed by Improving Reality followed by dConstruct followed by Maker Faire and Indie Web Camp. After getting some hacking done, I had to duck out of Indie Web Camp before the demos so that I could hop on a plane to Germany for Smashing Conference—the geek party was relocating from Brighton to Freiburg.

I was there to deliver the closing keynote and I had planned to reprise a talk that I had already given once or twice. But then Vitaly opened up proceedings by declaring that the event should be full of stories …and not just stories of success either; stories of failure. Then Elliot opened the show by showing some of his embarrassing early Flash websites. I decided that, in the spirit of Vitaly’s entreaty, I would try something similar. After all, I didn’t have anything quite as embarrassing as Atomic Kitten or Hilary Duff e-cards in my closet.

So I threw away my slidedeck and went Keynote commando. My laptop was connected to the projector but I only used it to bring up a browser to show embarrassing old sites like the first version of adactio.com complete with frames, tables for layout, and gratuitous DHTML animation. But I spent most of the time just talking, telling the story of how I first started making websites back when I used to live in Freiburg, and describing the evolution of The Session—a long-term project that’s given me a lot of perspective on how we often approach our work from too short a timescale.

It was fun. It was nice to be able to ditch the safety net of slides and talk off-the-cuff to a group of fellow geeks in the intimate surroundings of Freiburg’s medieval merchant’s hall.

Preparing to speak Leaving Smashingconf

I finished by encouraging people to look out the window of the merchant’s hall across to the splendid cathedral. The Freiburger Münster is a beautiful, magnificent creation …just like the web. But it’s made of sandstone and so it requires constant upkeep …just like the web. The Münsterbauverein are responsible for repairing and maintaining the building. They can only ever work on small parts at a time, but the overall result—over many generations—is a monument that’s protected for the future.

I hope that when we work on the web, we are also contributing to a magnificent treasure for the future.


August in America, day ten

Today was another sunny day in Arizona.

I saw a snake; it had a rattle. I admired prickly pear cacti, and when I picked up a prickly pear that had fallen to the ground, I discovered exactly why it’s called a prickly pear.

Prickly pear cactus

But I spent much of this sunny Arizona day in the dark.

We went to Kartchner Caverns, a series of limestone caves fifty mega-years old. It was quite beautiful.

The caverns might be ancient, but the state park is relatively young. The caves were first discovered in 1974. The story of what happened next is quite fascinating. The cavers who discovered the caverns teamed up with the landowner to negotiate with the State about creating a publicly-accessible state park (negotiations that had to happen in secret so that the caverns wouldn’t be despoiled if word got out).

They had come to the conclusion that the best chance of preserving the caverns was not to keep them secret, but to make them public under appropriate stewardship. It reminded me of the mantra of the Internet Archive:

Access drives preservation.

Not tumbling, but spiralling

Tumblr is traditionally the home of fun and frivolous blogs: Moustair, Kim Jong-Il Looking At Things, Missed High Fives, Selleck Waterfall Sandwich, and the weird but wonderful Consume Consume (warning: you may lose an entire day in there).

But there are also some more thoughtful collections on Tumblr:

  • Abandonedography documents the strangely hypnotic lure of abandoned man-made structures, as does Abandoned Playgrounds.
  • Adiphany shows some of the cleverer pieces from the world of advertising.
  • Histories Past is a collection of fascinating historical photographs.
  • Found is also a collection of photographs, all of them from the archives of National Geographic, many of them hitherto-unpublished.

It’s going to be a real shame when Tumblr shuts down and deletes all that content.

Of course that will never happen. Just like that never would’ve happened to Posterous or Pownce or Vox or GeoCities — publishing platforms where millions of people published a panoply of posts from the frivolous to the sublime, all of them now destroyed, their URLs purged from the web.

That reminds me: there’s one other Tumblr-hosted blog I came across recently: Our Incredible Journey documents those vile and disgusting announcements that start-ups make when they get acquired by a larger company, right before they flush their users’ content (and trust) down the toilet.

Oh, and I’ve got a Tumblr blog too. I just use it for silly pictures, YouTube videos, and quotes. I don’t want it to hurt too much when it gets destroyed.

Clearleft.com past and present

We finally launched the long-overdue redesign of the Clearleft website last week. We launched it late on Friday afternoon, because, hey! that’s not a stupid time to push something live or anything.

The actual moment of launch was initiated by Josh who had hacked together a physical launch button containing a teensy USB development board.

The launch button Preparing to launch

So nerdy.

Mind you, just because the site is now live doesn’t mean the work is done. Far from it, as Paul pointed out:

But it’s nice to finally have something new up. We were all getting quite embarrassed by the old site.

Still, rather than throw the old design away and never speak of it again, we’ve archived it. We’ve archived every iteration of the site:

  1. Version 1 launched in 2005. I wrote about it back then. It looked very much of its time. This was before responsive design, but it was, of course, nice and liquid.
  2. Version 2 came a few years later. There were some little bits I liked about it but it always felt a bit “off”.
  3. Version 3 was more of a re-alignment than a full-blown redesign: an attempt to fix some of the things that felt “off” about the previous version.
  4. Version 4 is where we are now. We don’t love it, but we don’t hate it either. Considering how long it took to finally get this one done, we should probably start planning the next iteration now.

I’m glad that we’ve kept the archived versions online. I quite enjoy seeing the progression of the visual design and the technologies used under the hood.


I went out to The Albert the other night to see Twilight Hotel play. They were really good, so after the show I bought their CD, When The Wolves Go Blind.

It was only when I got home that I realised that I had no device that could play Compact Discs. I play all my music on my iPod or iPhone connected to a speaker dock. And my computer is a MacBook Air …no disc drive. So I had to bring the CD into work with me, stick it into my iMac and rip the songs from there.

It’s funny how format (or storage medium) obsolescence creeps up on you like that. I wonder how long it will be until I’m not using any kind of magnetic medium at all.

Of Time and the Network and the Long Bet

When I went to Webstock, I prepared a new presentation called Of Time And The Network:

Our perception and measurement of time has changed as our civilisation has evolved. That change has been driven by networks, from trade routes to the internet.

I was pretty happy with how it turned out. It was a 40 minute talk that was pretty evenly split between the past and the future. The first 20 minutes spanned from 5,000 years ago to the present day. The second 20 minutes looked towards the future, first in years, then decades, and eventually in millennia. I was channeling my inner James Burke for the first half and my inner Jason Scott for the second half, when I went off on a digital preservation rant.

You can watch the video and I had the talk transcribed so you can read the whole thing.

It’s also on Huffduffer, if you’d rather listen to it.

Adactio: Articles—Of Time And The Network on Huffduffer

Webstock: Jeremy Keith

During the talk, I pointed to my prediction on the Long Bets site:

The original URL for this prediction (www.longbets.org/601) will no longer be available in eleven years.

I made the prediction on February 22nd last year (a terrible day for New Zealand). The prediction will reach fruition on 02022-02-22 …I quite like the alliteration of that date.

Here’s how I justified the prediction:

“Cool URIs don’t change” wrote Tim Berners-Lee in 01999, but link rot is the entropy of the web. The probability of a web document surviving in its original location decreases greatly over time. I suspect that even a relatively short time period (eleven years) is too long for a resource to survive.

Well, during his excellent Webstock talk Matt announced that he would accept the challenge. He writes:

Though much of the web is ephemeral in nature, now that we have surpassed the 20 year mark since the web was created and gone through several booms and busts, technology and strategies have matured to the point where keeping a site going with a stable URI system is within reach of anyone with moderate technological knowledge.

The prediction has now officially been added to the list of bets.

We’re playing for $1000. If I win, that money goes to the Bletchley Park Trust. If Matt wins, it goes to The Internet Archive.

The sysadmin for the Long Bets site is watching this bet with great interest. I am, of course, treating this bet in much the same way that Paul Gilster is treating this optimistic prediction about interstellar travel: I would love to be proved wrong.

The detailed terms of the bet have been set as follows:

On February 22nd, 2022 from 00:01 UTC until 23:59 UTC,
entering the characters http://www.longbets.org/601 into the address bar of a web browser or command line tool (like curl)
OR
using a web browser to follow a hyperlink that points to http://www.longbets.org/601
MUST
return an HTML document that still contains the following text:
“The original URL for this prediction (www.longbets.org/601) will no longer be available in eleven years.”
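Those terms are concrete enough to sketch as a small script. What follows is only my own rough illustration in Python, not anything the Long Bets site or either party to the bet has published: the function names are made up, it assumes the page is served as UTF-8, and it assumes that following redirects is acceptable (the terms don't say either way). It also looks for the sentence verbatim, so a page that encoded any of those characters as HTML entities would be wrongly counted as a failure.

```python
from datetime import datetime, timezone
from urllib.request import urlopen

# The URL and the sentence come straight from the terms of the bet.
BET_URL = "http://www.longbets.org/601"
REQUIRED_TEXT = (
    "The original URL for this prediction (www.longbets.org/601) "
    "will no longer be available in eleven years."
)

# The window named in the terms: February 22nd, 2022, 00:01 to 23:59 UTC.
WINDOW_START = datetime(2022, 2, 22, 0, 1, tzinfo=timezone.utc)
WINDOW_END = datetime(2022, 2, 22, 23, 59, tzinfo=timezone.utc)


def within_window(moment: datetime) -> bool:
    """Return True if a timezone-aware moment falls inside the bet's window."""
    return WINDOW_START <= moment <= WINDOW_END


def contains_required_text(html: str) -> bool:
    """Return True if the document still contains the prediction's text.

    Whitespace is collapsed first so that line wrapping in the markup
    doesn't cause a false negative.
    """
    return REQUIRED_TEXT in " ".join(html.split())


def url_still_works(url: str = BET_URL, timeout: float = 10.0) -> bool:
    """Fetch the URL and report whether the response satisfies the terms.

    Assumption: urlopen follows redirects, and that is treated as fine here.
    Network and HTTP errors all count as the URL being unavailable.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            html = response.read().decode("utf-8", errors="replace")
    except OSError:
        return False
    return contains_required_text(html)
```

On the day itself, something like `within_window(datetime.now(timezone.utc)) and url_still_works()` would give a rough answer, though I'd still want a human being to look at the page in a browser before any money changes hands.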

The suspense is killing me!

Space by Botwest

I had a whole day of good talks yesterday at South By Southwest …and none of them were in the Austin Convention Center. In a very real sense, the good stuff at this event is getting pushed to the periphery.

The day started off in the Driskill Hotel with the New Aesthetic panel that James assembled. It was great, like a mini-conference packed into one hour with wonderfully dense knowledge bombs lobbed from all concerned. Joanne McNeil gave us the literary background, Ben searched for meaning (and humour) in advertising trends, Russell looked at how machines are changing what we read and write, and Aaron …um, talked about the helium-balloon predator drone in the corner of the room.

With our brains primed for the intersections where humans and machines meet, it wasn’t hard to keep pattern-matching for it. In fact, the panel right afterwards on technology and fashion was filled with wonderful wearable expressions of the New Aesthetic.

Alas, I wasn’t able to attend that panel because I had to get to the green room to prepare for my own appearance on Get Excited and Make Things With Science with Ariel and Matt. It was a lot of fun and it was a real pleasure to be on a panel with such smart people.

I basically used the panel as an opportunity to geek out about some of my favourite science-related hacks and websites:

After that I stayed in the Driskill for a panel on robots and AI. One of the panelists was Bina48.

I had heard about Bina48 from a Radiolab episode.

Radiolab - Talking to Machines on Huffduffer

Jon Ronson described the strange experience of interviewing her—how the questions always tended to the profound and meaningful rather than trivial and chatty. Sure enough, once Bina was (literally) unveiled on the panel—a move that was wisely left till halfway through because, as the panelists said, “after that, you’re not going to pay attention to a word we say”—people started asking questions like “Do you dream?” and “What is the meaning of life?”

I asked her “Where were you before you were here?” She calmly answered that she was made in Texas. The New Aesthetic panelists would’ve loved her.

I was surprised by how much discussion of digital preservation there was on the robots/AI panel. Then again, the panel was hosted by a researcher from The Digital Beyond.

Bina48’s personality is based on the mind file of a real person containing exactly the kind of data that we are publishing every day to third-party sites. The question of what happens to that data was the subject of the final panel I attended, Saying Goodbye to Your Digital Self, featuring representatives from The Internet Archive, Archive Team, and Google’s Data Liberation Front.

Digital preservation is an incredibly important topic—one close to my heart—but the panel (in the Omni hotel) was, alas, sparsely attended.

Like I said, at this year’s South by Southwest, a lot of the good stuff was at the edges.

Cool your eyes don’t change

At last November’s Build conference I gave a talk on digital preservation called All Our Yesterdays:

Our communication methods have improved over time, from stone tablets, papyrus, and vellum through to the printing press and the World Wide Web. But while the web has democratised publishing, allowing anyone to share ideas with a global audience, it doesn’t appear to be the best medium for preserving our cultural resources: websites and documents disappear down the digital memory hole every day. This presentation will look at the scale of the problem and propose methods for tackling our collective data loss.

The video is now on vimeo.

The audio has been huffduffed.

Adactio: Articles—All Our Yesterdays on Huffduffer

I’ve published a transcription over in the “articles” section.

I blogged a list of relevant links shortly after the presentation.

You can also download the slides or view them on speakerdeck but, as usual, they won’t make much sense out of context.

I hope you’ll enjoy watching or reading or listening to the talk as much as I enjoyed presenting it.

The forgotten house

The Never Forgotten House is a beautifully-written piece with a central premise that is utterly, utterly flawed. Once again the truism that “the internet never forgets” is presented as though it needed no verification.

Someday soon, the internet will fulfill its promise as a time machine. It will provide images for every space and moment so we can fact check our memories. Flickr and Facebook albums will only accumulate.

Citation needed. Badly.

Read the article. Enjoy it. But question its unquestioningness. It made me sad for exactly the opposite reasons that the author intended.

Every essential moment of a child’s life is documented if he was born in the West. With digital album after album for every birthday, every Christmas, he will never struggle to remember what his childhood home looked like.

I wish that were true.


It’s hard to believe that it’s been half a decade since The Show from Ze Frank graced our tubes with its daily updates. Five years ago to the day, he recorded the greatest three minutes of speech ever committed to video.

In the midst of his challenge to find the ugliest MySpace page ever, he received this comment:

Having an ugly Myspace contest is like having a contest to see who can eat the most cheeseburgers in 24 hours… You’re mocking people who, for the most part, have no taste or artistic training.

Ze’s response is a manifesto to the democratic transformative disruptive power of the web. It is magnificent.

In Myspace, millions of people have opted out of pre-made templates that “work” in exchange for ugly. Ugly when compared to pre-existing notions of taste is a bummer. But ugly as a representation of mass experimentation and learning is pretty damn cool.

Regardless of what you might think, the actions you take to make your Myspace page ugly are pretty sophisticated. Over time as consumer-created media engulfs the other kind, it’s possible that completely new norms develop around the notions of talent and artistic ability.

Spot on.

That’s one of the reasons why I dread the inevitable GeoCities-style shutdown of MySpace. Let’s face it, it’s only a matter of time. And when it does get shut down, we will forever lose a treasure trove of self-expression on a scale never seen before in the history of the planet. That’s so much more important than whether it’s ugly or not. As Phil wrote about the ugly and neglected fragments of Geocities:

GeoCities is an awful, ugly, decrepit mess. And this is why it will be sorely missed. It’s not only a fine example of the amateur web vernacular but much of it is an increasingly rare example of a period web vernacular. GeoCities sites show what normal, non-designer, people will create if given the tools available around the turn of the millennium.

Substitute MySpace for GeoCities and you get an idea of the loss we are facing.

Let’s not make the same mistake twice.

Digital Deathwatch

The Deathwatch page on the Archive Team website makes for depressing reading, filled as it is with an ongoing list of sites that are going to be—or have already been—shut down. There are a number of corporations that are clearly repeat offenders: Yahoo!, AOL, Microsoft. As Aaron said last year when speaking of Museums and the Web:

Whether or not they asked to be, entire communities are now assuming that those companies will not only preserve and protect the works they’ve entrusted or the comments and other metadata they’ve contributed, but also foster their growth and provide tools for connecting the threads.

These are not mandates that most businesses take up willingly, but many now find themselves being forced to embrace them because to do otherwise would be to invite a betrayal of the trust of their users, from which they might never recover.

But occasionally there is a glimmer of hope buried in the constant avalanche of shit from these deletionist third-party custodians of our collective culture. Take Google Video, for example.

Earlier this year, Google sent out emails to Google Video users telling them the service was going to be shut down and their videos deleted as of April 29th. There was an outcry from people who rightly felt that Google were betraying their stated goal to organize the world’s information and make it universally accessible and useful. Google backtracked:

Google Video users can rest assured that they won’t be losing any of their content and we are eliminating the April 29 deadline. We will be working to automatically migrate your Google Videos to YouTube. In the meantime, your videos hosted on Google Video will remain accessible on the web and existing links to Google Videos will remain accessible.

This gives me hope. If the BBC wish to remain true to their mission to enrich people’s lives with programmes and services that inform, educate and entertain, then they will have to abandon their plan to destroy 172 websites.

There has been a stony silence from the BBC on this issue for months now. Ian Hunter—who so proudly boasted of the planned destruction—hasn’t posted to the BBC blog since writing a follow-up “clarification” that did nothing to reassure any of us.

It could be that they’re just waiting for a nice quiet moment to carry out the demolition. Or maybe they’ve quietly decided to drop their plans. I sincerely hope that it’s the second scenario. But, just in case, I’ve begun to create my own archive of just some of the sites that are on the BBC’s death list.

By the way, if you’re interested in hearing more about the story of Archive Team, I recommend checking out these interviews and talks from Jason Scott that I’ve huffduffed.

All Our Yesterdays: the links

If you were at An Event Apart in Boston and you want to follow up on some of the things I mentioned in my talk, here are some links:

Here are some related posts of my own:

More recently, Nora Young interviewed Jason Scott on online video and digital heritage.

Full Interview: Jason Scott on online video and digital heritage | Spark | CBC Radio on Huffduffer