Tags: linkrot

Tuesday, August 3rd, 2021

A Few Notes on A Few Notes on The Culture

When I post a link, I do it for two reasons.

First of all, it’s me pointing at something and saying “Check this out!”

Secondly, it’s a way for me to stash something away that I might want to return to. I tag all my links so when I need to find one again, I just need to think “Now what would past me have tagged it with?” Then I type the appropriate URL: adactio.com/links/tags/whatever

There are some links that I return to again and again.

Back in 2008, I linked to a document called A Few Notes on The Culture. It’s a copy of a post by Iain M Banks to a newsgroup back in 1994.

Alas, that link is dead. Linkrot, innit?

But in 2013 I linked to the same document on a different domain. That link still works even though I believe it was first published around twenty(!) years ago (view source for some pre-CSS markup nostalgia).

Anyway, A Few Notes On The Culture is a fascinating look at the world-building of Iain M Banks’s Culture novels. He talks about the in-world engineering, education, biology, and belief system of his imagined utopia. The part that sticks in my mind is when he talks about economics:

Let me state here a personal conviction that appears, right now, to be profoundly unfashionable; which is that a planned economy can be more productive - and more morally desirable - than one left to market forces.

The market is a good example of evolution in action; the try-everything-and-see-what-works approach. This might provide a perfectly morally satisfactory resource-management system so long as there was absolutely no question of any sentient creature ever being treated purely as one of those resources. The market, for all its (profoundly inelegant) complexities, remains a crude and essentially blind system, and is — without the sort of drastic amendments liable to cripple the economic efficacy which is its greatest claimed asset — intrinsically incapable of distinguishing between simple non-use of matter resulting from processal superfluity and the acute, prolonged and wide-spread suffering of conscious beings.

It is, arguably, in the elevation of this profoundly mechanistic (and in that sense perversely innocent) system to a position above all other moral, philosophical and political values and considerations that humankind displays most convincingly both its present intellectual immaturity and — through grossly pursued selfishness rather than the applied hatred of others — a kind of synthetic evil.

Those three paragraphs might be the most succinct critique of unfettered capitalism I’ve come across. The invisible hand as a paperclip maximiser.

Like I said, it’s a fascinating document. In fact I realised that I should probably store a copy of it for myself.

I have a section of my site called “extras” where I dump miscellaneous stuff. Most of it is unlinked. It’s mostly for my own benefit. That’s where I’ve put my copy of A Few Notes On The Culture.

Here’s a funny thing …for all the times that I’ve revisited the link, I never knew anything about the site it was hosted on—vavatch.co.uk—so this most recent time, I did a bit of clicking around. Clearly it’s the personal website of a sci-fi-loving college student from the early 2000s. But what came as a revelation to me was that the site belonged to …Adrian Hon!

I’m impressed that he kept his old website up even after moving over to the domain mssv.net, founding Six To Start, and writing A History Of The Future In 100 Objects. That’s a great snackable book, by the way. Well worth a read.

Tuesday, July 20th, 2021

Hope

My last long-distance trip before we were all grounded by The Situation was to San Francisco at the end of 2019. I attended Indie Web Camp while I was there, which gave me the opportunity to add a little something to my website: an “on this day” page.

I’m glad I did. While it’s probably of little interest to anyone else, I enjoy scrolling back to see how the same date unfolded over the years.

’Sfunny, when I look back at older journal entries they’re often written out of frustration, usually when something in the dev world is bugging me. But when I look back at all the links I’ve bookmarked the vibe is much more enthusiastic, like I’m excitedly pointing at something and saying “Check this out!” I feel like sentiment analyses of those two sections of my site would yield two different results.

But when I scroll down through my “on this day” page, it also feels like descending deeper into the dark waters of linkrot. For each year back in time, the probability of a link still working decreases until there’s nothing but decay.

Sadly this is nothing new. I’ve been lamenting the state of digital preservation for years now. More recently Jonathan Zittrain penned an article in The Atlantic on the topic:

Too much has been lost already. The glue that holds humanity’s knowledge together is coming undone.

In one sense, linkrot is the price we pay for the web’s particular system of hypertext. We don’t have two-way linking, which means there’s no centralised repository of links which would be prohibitively complex to maintain. So when you want to link to something on the web, you just do it. An a element with an href attribute. That’s it. You don’t need to check with the owner of the resource you’re linking to. You don’t need to check with anyone. You have complete freedom to link to any URL you want to.

But it’s that same simple system that makes the act of linking a gamble. If the URL you’ve linked to goes away, you’ll have no way of knowing.
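Because nothing reports back when a target disappears, the only way to find out is to go and ask each URL yourself. Here is a minimal sketch of that idea in Python, not a description of how this site actually works. The function names `fetch_status` and `find_dead_links` are made up for illustration, and treating any status of 400 or above as "dead" is an assumption; some servers reject HEAD requests even when the page is fine.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if nothing answers.

    Uses a HEAD request to avoid downloading the body. Redirects are
    followed, so the status reported is that of the final destination.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 410.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None


def find_dead_links(urls, get_status=fetch_status):
    """Return the URLs that no longer lead to a successful response.

    Assumption: "dead" means no answer at all, or a 4xx/5xx status.
    get_status can be swapped out, which makes the logic testable offline.
    """
    dead = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            dead.append(url)
    return dead
```

Run on a schedule over every outbound link, something like this would at least turn silent decay into a list of casualties. It can't bring anything back.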

As I scroll down my “on this day” page, I come across more and more dead links that have been snapped off from the fabric of the web.

If I stop and think about it, it can get quite dispiriting. Why bother making hyperlinks at all? It’s only a matter of time until those links break.

And yet I still keep linking. I still keep pointing to things and saying “Check this out!” even though I know that over a long enough timescale, there’s little chance that the link will hold.

In a sense, every hyperlink on the World Wide Web is a little act of hope. Even though I know that when I link to something, it probably won’t last, I still harbour that hope.

If hyperlinks are built on hope, and the web is made of hyperlinks, then in a way, the World Wide Web is quite literally made out of hope.

I like that.

Saturday, July 3rd, 2021

The Internet Is Rotting - The Atlantic

A terrific piece by Jonathan Zittrain on bitrot and online digital preservation:

Too much has been lost already. The glue that holds humanity’s knowledge together is coming undone.

Sunday, February 28th, 2021

Robin Rendle ・ Inheritance

My work shouldn’t be presented in the Smithsonian behind glass or anything, I’m just pointing at this enormous flaw in the architecture of the web itself: you’re renting servers and renting URLs. Nothing is permanent because on the web we don’t really own any space, we’re just borrowing land temporarily.

Friday, November 23rd, 2018

FlickrJubilee (@FlickrJubilee) / Twitter

Flickr is removing anything over 1,000 photos on accounts that are not “pro” (paid for) in 2019. We highlight large and amazing accounts that could use a gift to go pro. We take nominations and track when these accounts are saved.

Thursday, November 8th, 2018

The Commons: The Past Is 100% Part of Our Future | Flickr Blog

This is very, very good news. Following on from the recent announcement that a huge swathe of Flickr photos would soon be deleted, there’s now an update: any photos that are Creative Commons licensed won’t be deleted after all. Phew!

I wonder if I can get a refund for that pro account I just bought last week to keep my Creative Commons licensed Flickr pictures online.

Sunday, November 4th, 2018

Why we’re changing Flickr free accounts | Flickr Blog

I’ve got a lot of photos on Flickr (even though I don’t use it directly much these days) and I’ve paid up for a pro account to protect those photos, but I’m very worried about this:

Beginning January 8, 2019, Free accounts will be limited to 1,000 photos and videos.

That in itself is fine, but any existing non-pro accounts with more than 1000 photos will have older photos deleted until the total comes down to 1000. This means that anyone linking to those photos (or embedding them in blog posts or articles) will have broken links and images.

Tears in the rain.

Monday, October 8th, 2018

Monday, November 7th, 2016

“If it’s not curlable, it’s not on the web.” by Tantek Çelik

It was fun spelunking with Tantek, digging into some digital archeology in an attempt to track down a post by Ben Ward that I remembered reading years ago.

Thursday, January 28th, 2016

AMBER

This is intriguing—a Pinboard-like service that will create local copies of pages you link to from your site. There are plug-ins for WordPress and Drupal, and modules for Apache and Nginx.

Amber is an open source tool for websites to provide their visitors persistent routes to information. It automatically preserves a snapshot of every page linked to on a website, giving visitors a fallback option if links become inaccessible.

Wednesday, October 14th, 2015

The Internet’s Dark Ages - The Atlantic

The promise of the web is that Alexandria’s library might be resurrected for the modern world. But today’s great library is being destroyed even as it is being built.

A fascinating account of one story’s linkrot that mirrors the woeful state of our attitude to cultural preservation on the web.

Historians and digital preservationists agree on this fact: The early web, today’s web, will be mostly lost to time.

Tuesday, January 6th, 2015

HTTPS

Tim Berners-Lee is quite rightly worried about linkrot:

The disappearance of web material and the rotting of links is itself a major problem.

He brings up an interesting point that I hadn’t fully considered: as more and more sites migrate from HTTP to HTTPS (A Good Thing), and the W3C encourages this move, isn’t there a danger of creating even more linkrot?

…perhaps doing more damage to the web than any other change in its history.

I think that may be a bit overstated. As many others point out, almost all sites making the switch are conscientious about maintaining redirects with a 301 status code.

(There’s also a similar 308 status code that I hadn’t come across, but after a bit of investigating, that looks to be a bit of a mess.)
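To make the redirect idea concrete, here is a rough sketch of what "maintaining redirects" amounts to: every old HTTP URL answers with a 301 pointing at the same host, path, and query string over HTTPS. The function name `https_redirect` is hypothetical, and real sites would normally do this in server configuration rather than application code.

```python
from urllib.parse import urlsplit


def https_redirect(url):
    """Return (status, location) for an http:// URL, or None otherwise.

    Only the scheme changes. Host, path, query string and fragment are
    carried over untouched, so existing inbound links keep working.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None  # already HTTPS (or not a web URL), nothing to do
    return 301, parts._replace(scheme="https").geturl()
```

The important detail is that the rest of the URL survives intact. A redirect that dumps every old address onto the new home page is linkrot with extra steps.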

Anyway, the discussion does bring up some interesting points. Transport Layer Security is something that’s handled between the browser and the server—does it really need to be visible in the protocol portion of the URL? Or is that visibility a positive attribute that makes it clear that the URL is “good”?

And as more sites move to HTTPS, should browsers change their default behaviour? Right now, typing “example.com” into a browser’s address bar will cause it to automatically expand to http://example.com …shouldn’t browsers look for https://example.com first?
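As a thought experiment, the "look for HTTPS first" behaviour could be sketched as an ordered list of candidates to try. This is an illustration of the idea only, not how any particular browser is implemented, and `candidate_urls` is a made-up name.

```python
def candidate_urls(typed):
    """Return the URLs to try, in order, for text typed into an address bar.

    If the user typed an explicit scheme, respect it. Otherwise try the
    secure version first and fall back to plain HTTP.
    """
    typed = typed.strip()
    if "://" in typed:
        return [typed]
    return ["https://" + typed, "http://" + typed]
```

With this ordering, typing “example.com” yields https://example.com first and http://example.com only as a fallback.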

All good food for thought.

There’s a Google Doc out there with some advice for migrating to HTTPS. Unfortunately, the trickiest part—getting and installing certificates—is currently an owl-drawing tutorial, but hopefully it will get expanded.

If you’re looking for even more reasons why enabling TLS for your site is a good idea, look no further than the latest shenanigans from ISPs in the UK (we lost the battle for net neutrality in this country some time ago).

They can’t do that to pages served over HTTPS.

Friday, October 17th, 2014

What is still on the web after 10 years of archiving? - UK Web Archive blog

The short answer: not much.

The UK Web Archive at The British Library outlines its process for determining just how bad the linkrot is after just one decade.

Tuesday, July 22nd, 2014

“The Internet Never Forgets” — sixtwothree.org

The Internet forgets every single day.

I’m with Jason.

I encourage you all to take a moment and consider the importance of preserving your online creations for yourself, your family, and for future generations.

Wednesday, September 25th, 2013

Perma: Scoping and addressing the problem of “link rot” :: Future of the Internet – And how to stop it.

Lawrence Lessig and Jonathan Zittrain are uncovering disturbing data on link rot in Supreme Court documents: 50% of the links cited no longer work.

Monday, July 1st, 2013

The spread of link rot by Felix Salmon

What I fear is that the entire web is basically becoming a slow-motion Snapchat, where content lives for some unknowable amount of time before it dies, lost forever.

Friday, March 23rd, 2012

» Long Bets Bet – How Durable Are URLs? - Blog of the Long Now

The Long Now blog is featuring the bet between myself and Matt on URL longevity. Just being mentioned on that site gives me a warm glow.

Sunday, February 20th, 2011

Voice of the Beeb hive

Ian Hunter at the BBC has written a follow-up post to his initial announcement of the plans to axe 172 websites. The post is intended to clarify and reassure. It certainly clarifies, but it is anything but reassuring.

He clarifies that, yes, these websites will be taken offline. But, he reassures us, they will be stored …offline. Not on the web. Without URLs. Basically, they’ll be put in a hole in the ground. But it’s okay; it’s a hole in the ground operated by the BBC, so that’s alright then.

The most important question in all of this is why the sites are being removed at all. As I said, the BBC’s online mothballing policy has—up till now—been superb. Well, now we have an answer. Here it is:

But there still may come a time when people interested in the site are better served by careful offline storage.

There may be a parallel universe where that sentence makes sense, but it would have to be one in which the English language is used very differently.

As an aside, the use of language in the “explanation” is quite fascinating. The post is filled with the kind of mealy-mouthed filler words intended to appease those of us who are concerned that this is a terrible mistake. For example, the phrase “we need to explore a range of options including offline storage” can be read as “the sites are going offline; live with it.”

That’s one of the most heartbreaking aspects of all of this: the way that it is being presented as a fait accompli: these sites are going to be ripped from the fabric of the network to be tossed into a single offline point of failure and there’s nothing that we—the license-payers—can do about it.

I know that there are many people within the BBC who do not share this vision. I’ve received some emails from people who worked on some of the sites scheduled for deletion and needless to say, they’re not happy. I was contacted by an archivist at the BBC, for whom this plan was unwelcome news that he first heard about here on adactio.com. The subsequent reaction was:

It was OK to put a videotape on a shelf, but putting web pages offline isn’t OK.

I hope that those within the BBC who disagree with the planned destruction will make their voices heard. For those of us outside the BBC, it isn’t clear how we can best voice our concerns. You could make a complaint to the BBC, though that seems to be intended more for complaints about programme content.

In the meantime, you can download all or some of the 172 sites and plop them elsewhere on the web. That’s not an ideal solution—ideally, the BBC shouldn’t be practicing a deliberate policy of link rot—but it allows us to prepare for the worst.

I hope that whoever at the BBC has responsibility for this decision will listen to reason. Failing that, I hope that we can get a genuine explanation as to why this is happening, because what’s currently being offered up simply doesn’t cut it. Perhaps the truth behind this decision lies not so much with the BBC, but with their technology partner, Siemens, who have a notorious track record for shafting the BBC, charging ludicrous amounts of money to execute the most trivial of technical changes.

If this decision is being taken for political reasons, I would hope that someone at the BBC would have the honesty to say so rather than simply churning out more mealy-mouthed blog posts devoid of any genuine explanation.

Tuesday, February 8th, 2011

Linkrotting

Yesterday’s account of the BBC’s decision to cull 172 websites caused quite a stir on Twitter.

Most people were as saddened as I was, although Emma described my post as being “anti-BBC.” For the record, I’m a big fan of the BBC—hence my disappointment at this decision. And, also for the record, I believe anyone should be allowed to voice their criticism of an organisational decision without being labelled “anti” said organisation …just as anyone should be allowed to criticise a politician without being labelled unpatriotic.

It didn’t take long for people to start discussing an archiving effort, which was heartening. I started to think about the best way to coordinate such an effort; probably a wiki. As well as listing handy archiving tools, it could serve as a place for people to claim which sites they want to adopt, and point to their mirrors once they’re up and running. Marko already has a head start. Let’s do this!

But something didn’t feel quite right.

I reached out to Jason Scott for advice on coordinating an effort like this. He has plenty of experience. He’s currently trying to figure out how to save the more than 500,000 videos that Yahoo is going to delete on March 15th. He’s more than willing to chat, but he had some choice words about the British public’s relationship to the BBC:

This is the case of a government-funded media group deleting. In other words, this is something for The People, and by The People I mean The Media and the British and the rest to go HEY BBC STOP

He’s right.

Yes, we can and should mirror the content of those 172 sites—lots of copies keep stuff safe—but fundamentally what we want is to keep the fabric of the web intact. Cool URIs don’t change.

The BBC has always been an excellent citizen of the web. Their own policy on handling outdated content explains the situation beautifully:

We don’t want to delete pages which users may have bookmarked or linked to in other ways.

Moving a site to a different domain will save the content but it won’t preserve the inbound connections; the hyperlinks that weave the tapestry of the web together.

Don’t get me wrong: I love the Internet Archive. I think it is doing fantastic work. But let’s face it; once a site only exists in the archive, it is effectively no longer a part of the living web. Yet, whenever a site is threatened with closure, we invoke the Internet Archive as a panacea.

So, yes, let’s make and host copies of the 172 sites scheduled for termination, but let’s not get distracted from the main goal here. What we are fighting against is linkrot.

I don’t want the BBC to take any particular action. Quite the opposite: I want them to continue with their existing policy. It will probably take more effort for them to remove the sites than to simply let them sit there. And let’s face it, it’s not like the bandwidth costs are going to be a factor for these sites.

Instead, many believe that the BBC’s decision is politically motivated: the need to be seen to “cut” top level directories, as though cutting content equated to cutting costs. I can’t comment on that. I just know how I feel about the decision:

I don’t want them to archive it. I just want them to leave it the fuck alone.

“What do we want?” “Inaction!”

“When do we want it?” “Continuously!”

Monday, February 7th, 2011

Erase and rewind

In the 1960s and ’70s, it was common practice at the BBC to reuse video tapes. Old recordings were taped over with new shows. Some Doctor Who episodes have been lost forever. Jimi Hendrix’s unruly performance on Happening for Lulu would have also been lost if a music-loving engineer hadn’t sequestered the tapes away, preventing them from being over-written.

Except - a VT engineer called Bob Pratt, who really ought to get a medal, was in the habit of saving stuff he liked. Even then, the BBC policy of wiping practically everything was notorious amongst those who’d made it. Bob had the job of changing the heads on 2” VT machines. He’d be in at 0600 before everyone else and have two hours to sort the equipment before anyone else came in. Rock music was his passion, and knowing everything would soon disappear, would spend some of that time dubbing off the thing he liked onto junk tapes, which would disappear under the VT department floor.

To be fair to the BBC, the tape-wiping policy wasn’t entirely down to crazy internal politics—there were convoluted rights issues involving the actors’ union, Equity.

Those issues have since been cleared up. I’m sure the BBC has learned from the past. I’m sure they wouldn’t think of mindlessly throwing away content, when they have such an impressive archive.

And yet, when it comes to the web, the BBC is employing a slash-and-burn policy regarding online content. 172 websites are going to disappear down the memory hole.

Just to be clear, these sites aren’t going to be archived. They are going to be deleted from the web. Server space is the new magnetic tape.

This callous attitude appears to be based entirely on the fact that these sites occupy URLs in top-level directories—repeatedly referred to incorrectly as top level domains on the BBC internet blog—a space that the decision-makers at the BBC are obsessed with.

Instead of moving the sites to, say, bbc.co.uk/archive and employing a little bit of .htaccess redirection, the BBC (and their technology partner, Siemens) would rather just delete the lot.
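The redirection really is a small amount of logic. The sketch below expresses the rule in Python rather than actual .htaccess syntax, purely to show the shape of it: if a request's first path segment is one of the retired directories, answer with a 301 to the same path under /archive. The directory names in `RETIRED_DIRECTORIES` and the function name `archive_redirect` are placeholders, not the BBC's real list or setup.

```python
# Hypothetical examples of retired top-level directories.
RETIRED_DIRECTORIES = {"politics97", "ww2peopleswar"}


def archive_redirect(path, retired=RETIRED_DIRECTORIES, archive_prefix="/archive"):
    """Return (301, new_path) if path sits in a retired directory, else None.

    "/politics97/index.html" splits into ["", "politics97", "index.html"],
    so the first real segment is at index 1.
    """
    segments = path.split("/")
    if len(segments) > 1 and segments[1] in retired:
        return 301, archive_prefix + path
    return None
```

Under this rule the top-level directory is freed up, every old URL still resolves, and nothing has to be deleted.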

Martin Belam is suitably flabbergasted by the vandalism of the BBC’s online history:

I’m really not sure who benefits from deleting the Politics 97 site from the BBC’s servers in 2011. It seems astonishing that for all the BBC’s resources, it may well be my blog posts from 5 years ago that provide a more accurate picture of the BBC’s early internet days than the Corporation does itself - and that it will have done so by choice.

Many of the 172 sites scheduled for deletion are currently labelled with a banner across the top indicating that the site hasn’t been updated for a while. There’s a link to a help page with the following questions and answers:

It’ll be interesting to see how those answers will be updated to reflect the change in policy. Presumably, the new answers will read something along the lines of “Fuck ’em.”

Kiss them all goodbye. And perhaps most egregious of all, you can also kiss goodbye to WW2 People’s War:

The BBC asked the public to contribute their memories of World War Two to a website between June 2003 and January 2006. This archive of 47,000 stories and 15,000 images is the result.

I’m very saddened to see the BBC join the ranks of online services that don’t give a damn for posterity. That attitude might be understandable, if not forgivable, from a corporation like Yahoo or AOL, driven by short-term profits for shareholders, as summarised by Connor O’Brien in his superb piece on link rot:

We push our lives into the internet, expecting the web to function as a permanent and ever-expanding collective memory, only to discover the web exists only as a series of present moments, every one erasing the last. If your only photo album is Facebook, ask yourself: since when did a gratis web service ever demonstrate giving a flying fuck about holding onto the past?

I was naive enough to think that the BBC was above that kind of short-sighted approach. Looks like I was wrong.

Sad face.