A profile of the Internet Archive, but this time focusing on its physical space.
The Archive is a third place unlike any other.
History, as the future will know it, is happening today on the web. And so it is the web that we must capture, package, and preserve for future generations to see who we are today.
Digital archivists run up against mismatched expectations:
But did you know that a large majority of web users think that when sharing their thoughts, images, and videos online they are going to be preserved in perpetuity? No matter how many licenses the general population clicks “Agree” to, or however many governing policies are developed that state the contrary, the millions of people sharing their content on websites still believe that there is an implicit accountability that should be upheld by the site owners.
A really good explanation of how a peer-to-peer model for the web would differ from the current location-centric approach.
What really interests me is the idea of having both models co-exist.
You just have to think about the ways in which our location-centrism is contributing to the problems we are hitting, from the rise of Facebook, to the lack of findability of OER, to the Wikipedia Edit Wars.
Science fiction as a means of energising climatic and economic change:
Fiction, and science fiction in particular, can help us imagine many futures, and in particular can help us to direct our imaginations towards the futures we want. Imagining a particular kind of future isn’t just day dreaming: it’s an important and active framing that makes it possible for us to construct a future that approaches that imagined vision. In other words, imagining the future is one way of making that future happen.
But it’s important that these visions are preserved:
It’s very likely that our next Octavia Butler is today writing on WattPad or Tumblr or Facebook. When those servers cease to respond, what will we lose? More than the past is at stake—all our imagined futures are at risk, too.
360 terabytes of data stored for over 13 billion years:
Coined as the ‘Superman memory crystal’, as the glass memory has been compared to the “memory crystals” used in the Superman films, the data is recorded via self-assembled nanostructures created in fused quartz. The information encoding is realised in five dimensions: the size and orientation in addition to the three dimensional position of these nanostructures.
A note of optimism for digital preservation:
Where a lack of action may have been more of the case in the 01990s, it is certainly less so today. In the early days, there were just a handful of pioneers talking about and working on digital preservation, but today there are hundreds of tremendously intelligent and skilled people focused on preserving access to the yottabytes of digital cultural heritage and science data we have and will create.
This is intriguing—a Pinboard-like service that will create local copies of pages you link to from your site. There are plug-ins for WordPress and Drupal, and modules for Apache and Nginx.
Amber is an open source tool for websites to provide their visitors persistent routes to information. It automatically preserves a snapshot of every page linked to on a website, giving visitors a fallback option if links become inaccessible.
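The fallback behaviour described above can be sketched roughly as follows. This is an illustration only and not Amber's actual code: the `choose_link` function, the `snapshots` mapping, and the `is_reachable` callback are all hypothetical names made up for this example.

```python
from typing import Callable, Dict, Optional


def choose_link(
    url: str,
    snapshots: Dict[str, str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the live URL if it responds, else a stored snapshot, else None.

    Hypothetical helper: `snapshots` maps an original URL to the location of
    a locally preserved copy; `is_reachable` is supplied by the caller.
    """
    if is_reachable(url):
        return url
    # The live page is gone: fall back to the preserved copy, if one exists.
    return snapshots.get(url)
```

Passing the reachability check in as a callback keeps the sketch free of network code; a real deployment would presumably check links on a schedule, not on every page view.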
Such a vividly nostalgic project. Choose an obsolete browser. Enter a URL. Select which slice of the past you want to see.
Digital archives in action. Access drives preservation.
Had anyone from the archive been in touch with ESPN? Was there any hope that the treasured collection of Grantland stories might remain accessible?
“We don’t ‘get in touch,’” Jason Scott, a digital historian at the Internet Archive, told me in an email. “We act.”
The significant challenges in archiving audio.
The promise of the web is that Alexandria’s library might be resurrected for the modern world. But today’s great library is being destroyed even as it is being built.
A fascinating account of one story’s linkrot that mirrors the woeful state of our attitude to cultural preservation on the web.
Historians and digital preservationists agree on this fact: The early web, today’s web, will be mostly lost to time.
I absolutely love the way that my archive is presented here. Matt and Hannah have set the bar in how to shut down a service in an honest, dignified way.
This is a wonderful, wonderful description of what it feels like to publish on your own site.
When my writing is on my own server, it will always be there. I may forget about it for a while, but eventually I’ll run into it again. I can torch those posts or save them, rewrite them or repost them. But they’re mine to rediscover.
The title is hyperbolic, and while I certainly think that the criticisms of HTTP here are justified, I don’t think it will be swept aside by IPFS—I imagine more of a peaceful coexistence. Still, there’s some really good thinking in here and this is well worth paying attention to.
Will the Big Think piece you just posted to Medium be there in 2035? That may sound like it’s very far off in the future, and who could possibly care, but if there’s any value to your writing, you should care. Having good records is how knowledge builds.
It’s a real shame that Hannah and Matt are shutting down This Is My Jam—it’s such a lovely little service—but their reliance on ever-changing third-party APIs sounds like no fun, and the way they’re handling the shutdown is exemplary: the site is going into read-only mode, and of course all of your data is exportable.
Yahoo, Google, and other destroyers could learn a thing or two from this—things like “dignity” and “respect”.
Exemplars proposing various solutions for the resilience of digital data and computation over long timeframes include the Internet Archive; redundantly distributed storage platforms such as GlusterFS, LOCKSS, and BitTorrent Sync; and the Lunar supercomputer proposal of Ouliang Chang.
Each of these differs in its approach and its focus; yet each shares with Vessel and with one another a key understanding: The prospects of Earth-originating life in the future, whether vast or diminishing, depend upon our actions and our foresight in this current cultural moment of opportunity, agency, awareness, ability, capability, and willpower.
I really like this impassioned love letter to the web. This resonates:
The web is a worthy monument for society. It cannot be taken away by apps in the app store or link bait on Facebook, but it can be lost if we don’t continue to steward this creation of ours. The web is a garden that needs constant tending to thrive. And in the true fashion of the world wide web, this is no task for one person or entity. It will require vigilance and work from us all.
I had a lot of fun recording this episode with Andrew and Jeffrey. It is occasionally surreal.
Stick around for the sizzling hot discussion of advertising at the end in which we compare and contrast Mad Men and Triumph Of The Will.
Much of the web’s early cultural and design history is at risk, despite efforts by the Internet Archive and renegade archivists. One of our realizations after 20 years on the web is that our responsibility isn’t just to the new; we also need to preserve what’s been built in the past.
Because in 10 years nothing you built today that depends on JS for the content will be available, visible, or archived anywhere on the web.
The most ambitious project from Archive Team yet: backing up the Internet Archive.
We can do this, people! Moore’s Law and all that.
Brewster Kahle’s short presentation at NetGain.
Remember Aaron’s dConstruct talk? Well, the Atlantic has more details of his work at the Cooper Hewitt museum in this wide-ranging piece that investigates the role of museums, the value of APIs, and the importance of permanent URLs.
As I was leaving, Cope recounted how, early on, a curator had asked him why the collections website and API existed. Why are you doing this?
His retrospective answer wasn’t about scholarship or data-mining or huge interactive exhibits. It was about the web.
I find this incredibly inspiring.
A profile of the wonderful Internet Archive.
No one believes any longer, if anyone ever did, that “if it’s on the Web it must be true,” but a lot of people do believe that if it’s on the Web it will stay on the Web. Chances are, though, that it actually won’t.
Brewster Kahle is my hero.
Kahle is a digital utopian attempting to stave off a digital dystopia. He views the Web as a giant library, and doesn’t think it ought to belong to a corporation, or that anyone should have to go through a portal owned by a corporation in order to read it. “We are building a library that is us,” he says, “and it is ours.”
Dropping our films down the memory hole. Welcome to the digital dark age.
A look at long-term cultural and linguistic preservation through the lens of Egyptology.
Language death and digital preservation.
Aaron raises a point that I’ve discussed before with regard to the indie web (and indeed, the web in general): we don’t buy domain names; we rent them.
It strikes me that all the good things about the web are decentralised (one-way linking, no central authority required to add a node), but all the sticking points are centralised: ICANN, DNS.
Aaron also points out that we are beholden to our hosting companies, although—having moved hosts a number of times myself—that’s an issue that DNS (and URLs in general) helps alleviate. And there’s now some interesting work going on in literally owning your own website: a web server in the home.
The transcript of Owen’s talk at The Web Is. It’s a wonderful, thoughtful meditation on writing, web design, and long-term thinking.
One of the promises of the web is to act as a record, a repository for everything we put there. Yet the web forgets constantly, despite that somewhat empty promise of digital preservation: articles and data are sacrificed to expediency, profit and apathy; online attention, acknowledgement and interest wax and wane in days, hours even.
This fracturing of context is, I suspect, peculiar to these early decades of online writing. It’s possible that, in the future, webmentions and the like may heal that up to some extent. But everything from the 90s to today is going to remain mostly broken in that respect. Most of what we said and did had ephemerality long before apps started selling us ephemeral nature as a positive advertising point. Possibly no other generation threw so many words at such velocity into a deep dark well of ghosts.
The short answer: not much.
The UK Web Archive at The British Library outlines its process for determining just how bad the linkrot is after just one decade.
I’d go along with pretty much everything Anil says here. Wise words from someone who’s been writing on their own website for fifteen years (congratulations!).
Link to everything you create elsewhere on the web. And if possible, save a copy of it on your own blog. Things disappear so quickly, and even important work can slip your mind months or years later when you want to recall it. If it’s in one, definitive place, you’ll be glad for it.
A documentary on our digital dark age. Remember this the next time someone trots out the tired old lie that “the internet never forgets.”
If we lose the past, we will live in an Orwellian world of the perpetual present, where anybody that controls what’s currently being put out there will be able to say what is true and what is not. This is a dreadful world. We don’t want to live in this world. —Brewster Kahle
It’s a terrible indictment of where our priorities were for the last 20 years that we depend essentially on children and maniacs to save our history of this sort. —Jason Scott
Glenn eloquently gives his reasons for building Transmat:
When I was a child, my brothers and I all had a shoebox each. In these we kept our mementoes. A seashell from a summer holiday where I played for hours in the rock pools, the marble from the schoolyard victory against a bully and a lot of other objects that told a story.
Now this is how to shut down a service: switch to a read-only archive, and make the codebase (without user credentials) available on Github.
Over 700 screenshots of ZX Spectrum games, captured by Jason Scott. Some of these bring back memories.
The first Lunar Orbiter, Andy Warhol’s Amiga, and George R.R. Martin’s WordStar …the opening address to the Digital Preservation 2014 conference July 22 in Washington, DC.
Just as early filmmakers couldn’t have predicted the level of ongoing interest in their work over a hundred years later, who can say what future generations will find important to know and preserve about the early history of software?
(Mind you, I can’t help but feel that the chances of this particular text having a long life at a Medium URL are pretty slim.)
The Internet forgets every single day.
I’m with Jason.
I encourage you all to take a moment and consider the importance of preserving your online creations for yourself, your family, and for future generations.
On the fifth anniversary of Pinboard, Maciej reflects on working on long-term projects:
Avoiding burnout is difficult to write about, because the basic premise is obnoxious. Burnout is a rich man’s game. Rice farmers don’t get burned out and spend long afternoons thinking about whether to switch to sorghum.
The good news is, as you get older, you gain perspective. Perspective helps alleviate burnout.
The bad news is, you gain perspective by having incredibly shitty things happen to you and the people you love. Nature has made it so that perspective is only delivered in bulk quantities. A railcar of perspective arrives and dumps itself on your lawn when all you needed was a microgram.
Some good ideas from Matt on the importance of striving to maintain digital works. I find it very encouraging to see other people writing about this, especially when it’s this thoughtful.
A truly wonderful piece by Mandy detailing why and how she writes, edits, and publishes on her own website:
No one owns this domain but me, and no one but me can take it down. I will not wake up one morning to discover that my service has been “sunsetted” and I have some days or weeks to export my data (if I have that at all). These URLs will never break.
A thoughtful in-depth piece that pulls together my hobby horses of independent publishing, responsive design, and digital preservation, all seen through the lens of music:
Music, Publishing, Art and Memory in the Age of the Internet
We need a web design museum.
I am, unsurprisingly, in complete agreement. And let’s make lots of copies while we’re at it.
A short video featuring Jason Scott and Brewster Kahle. The accompanying text has a shout-out to the line-mode browser hack event at CERN.
Having experienced the death of a friend, I wonder how many have considered the ghosts in the machine.
The video of my closing talk at this year’s Full Frontal conference, right here in Brighton.
I had a lot of fun with this, although I was surprisingly nervous before I started: I think it was because I didn’t want to let Remy down.
Lawrence Lessig and Jonathan Zittrain are uncovering disturbing data on link rot in Supreme Court documents: 50% of the links cited no longer work.
An epic tale of data recovery.
Of course Jason Scott was involved.
Some good advice on how to mothball (rather than destroy) a project when it reaches the end of its useful life. In short, build a switch so that, when the worst comes to the worst, you can output static files and walk away.
In all your excitement starting a new project, spend a little time thinking about the end.
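One way to picture the "switch" mentioned above is a single function that writes every rendered page to disk as a plain file. This is a minimal sketch under stated assumptions, not the approach from the linked article: `export_static` and the `pages` mapping are hypothetical names, and the sketch assumes the pages have already been rendered to HTML strings.

```python
from pathlib import Path
from typing import Dict


def export_static(pages: Dict[str, str], out_dir: str) -> int:
    """Write each rendered page to out_dir as a plain HTML file.

    `pages` maps a relative path such as "about/index.html" to its final
    HTML (an assumption of this sketch). Returns the number of files written.
    """
    root = Path(out_dir)
    for rel_path, html in pages.items():
        target = root / rel_path
        # Create intermediate directories so nested URLs keep their shape.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
    return len(pages)
```

Because the output mirrors the original URL paths, the files could be served from any static host without breaking existing links.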
I took a little time out of the hacking here at CERN to answer a few questions about the line-mode browser project.
A heartfelt response from Vitaly to .net magazine’s digital destruction.
This is what I’m working on today (where by “working on”, I mean “watching other far more talented people work on”).
Michael Chabon muses on The Future, prompted by the Clock of the Long Now.
Aaron Straup-Cope and Seb Chan on the challenges of adding (and keeping) code to the Cooper-Hewitt collection:
The distinction between preservation and access is increasingly blurred. This is especially true for digital objects.
The internet never forgets? Bollocks!
We were told — warned, even — that what we put on the internet would be forever; that we should think very carefully about what we commit to the digital page. And a lot of us did. We put thought into it, we put heart into, we wrote our truths. We let our real lives bleed onto the page, onto the internet, onto the blog. We were told, “Once you put this here, it will remain forever.” And we acted accordingly.
This is a beautiful love-letter to the archival web, and a horrifying description of its betrayal:
When they’re erased by a company abruptly and without warning, it’s something of a new-age arson.
Oh, dear. An otherwise perfectly well-reasoned article makes this claim:
But the internet is peculiarly adapted to deftly pricking pomposity. This is partly because nothing dies online, meaning your past indiscretions are never yesterday’s news, wrapped round the proverbial fish and chips.
Bollocks. Show me the data to back up this claim.
The insidious truism that “the internet never forgets” is extremely harmful. The true problem is the opposite: the internet forgets all the time.
Geocities, Pownce, Posterous, Vox, and thousands more sites are very much yesterday’s news, wrapped round the proverbial fish and chips.
A beauty of a post by Jason giving you even more reasons to donate to Archive.org.
Seriously. Do it now. It would mean a lot to me.
Related: I’m going to be in San Francisco next week and by hook or by crook, I plan to visit the Internet Archive’s HQ.
The Internet, day one. A sad tale of data loss.
This is why the Internet Archive matters. It is now the public record of Obama’s broken promise to protect whistleblowers.
I feel very bad for the smart, passionate, talented people who worked their asses off on change.gov, only to see their ideals betrayed.
A good article on Medium on Medium.
What I fear is that the entire web is basically becoming a slow-motion Snapchat, where content lives for some unknowable amount of time before it dies, lost forever.
A great history lesson from Dave.
Ah, I remember when the CSS Zen Garden was all fields. Now get off my CSS lawn.
I gave the opening keynote at the Beyond Tellerrand conference a few weeks back. I talked about the web from my own perspective, so expect excitement and anger in equal measure.
This was a new talk but it went down well, and I’m quite happy with it.
Ben proposes an alternative to archive.org: changing the fundamental nature of DNS.
Regarding the boo-hooing of how hard companies have it maintaining unprofitable URLs, I think Ben hasn’t considered the possibility of a handover to a cooperative of users, something that might yet happen with MySpace (at least there’s a campaign to that effect; it will probably come to naught). As Ben rightly points out, domain names are leased, not bought, so the idea of handing them over to better caretakers isn’t that crazy.
This is a breath of fresh air: a blogging platform that promises to keep its URLs online in perpetuity.
Mark writes about his work with CERN to help restore the first website to its original URL.
I have two young children and I want them to experience the early web and understand how it came to be. To understand that the early web wasn’t that rudimentary but incredibly advanced in many ways.
A beautiful short film on the amazing work being done at the Internet Archive, produced on the occasion of their 10 petabyte celebration.
A profile in The Guardian of the Internet Archive and my hero, Brewster Kahle (who also pops up in the comments).
Heartbreaking and angry-making.
The story of one site’s disgraceful handling of acquisition and shutdown (Punchfork, acquired by Pinterest) and how its owner actively tried to block efforts to preserve users’ data.
A collection of those appalling doublespeak announcements that sites and services give when they get acquired. You know the ones: they begin with “We’re excited to announce…” and end with people’s data being flushed down the toilet.
Charles Arthur analyses the data from Google’s woeful history of shutting down its services.
So if you want to know when Google Keep, opened for business on 21 March 2013, will probably shut - again, assuming Google decides it’s just not working - then, the mean suggests the answer is: 18 March 2017. That’s about long enough for you to cram lots of information that you might rely on into it; and also long enough for Google to discover that, well, people aren’t using it to the extent that it hoped.
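As a rough check on the arithmetic in that quote, the two dates it gives imply a mean service lifespan of just under four years. The sketch below only subtracts the quoted dates; it does not reproduce Charles Arthur's underlying data, and the variable names are chosen for this example.

```python
from datetime import date

opened = date(2013, 3, 21)           # Google Keep launch date, from the quote
predicted_close = date(2017, 3, 18)  # shutdown date the quote derives from the mean

implied_lifespan = predicted_close - opened
print(implied_lifespan.days)  # 1458
```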
A wonderful rallying cry from Drew.
Ever since the halcyon days of Web 2.0, we’ve been netting our butterflies and pinning them to someone else’s board.
Hope that what you’ve created never has to die. Make sure that if something has to die, it’s you that makes that decision. Own your own data, friends, and keep it safe.
A really lovely piece on the repositories of information that aren’t catalogued—a fourth quadrant in the Rumsfeldian taxonomy, these dark archives are the unknown knowns.
Honestly, if you value the content you create and put online, then you need to be in control of your own stuff.
What an Orwellian title for a blog post announcing the wholesale destruction of users’ content. Oh, Yahoo, you sound so proud of your cavalier attitude towards the collective culture that you have harvested.
A fascinating discussion on sharecropping vs. homesteading. Josh Miller from Branch freely admits that he’s only ever known a web where your content is held by someone else. Gina Trapani’s response is spot-on:
For me, publishing on a platform I have some ownership and control over is a matter of future-proofing my work. If I’m going to spend time making something I really care about on the web—even if it’s a tweet, brevity doesn’t mean it’s not meaningful—I don’t want to do it somewhere that will make it inaccessible after a certain amount of time, or somewhere that might go away, get acquired, or change unrecognizably.
When you get old and your memory is long and you lose parents and start having kids, you value your own and others’ personal archive much more.
From the cave paintings at Lascaux to the Pioneer plaques and Voyager golden records to Trevor Paglen’s “The Last Pictures” project, Paul Gilster examines the passage and preservation of art and information through time. Fascinating.
Or perhaps, as Paglen envisions, those who find a Pioneer Plaque, a Voyager Record, or one of our electromagnetic transmissions will be interested enough to search us out, coming upon a future Earth where all that is left of humanity are our terrestrial ruins and that artificial ring of geosynchronous satellites, with one of them having a particular golden artifact bolted to its pitted hull. In that scenario, about all that would be left for the visiting ETI to do in terms of learning about us would be grand-scale dumpster diving.
I hereby declare that this song is my official anthem.
I want some files that last, data that will not stray.
Files just as fresh tomorrow as they were yesterday.
Investigating the options for off-world backups.
Data is only as safe as the planet it sits on. It only takes one rock, not too big, not moving that fast, to hit the Earth at a certain angle and: WHAM! Most living species are done for.
How the hell is your Twitter archive supposed to survive that?
Here’s a treasure trove of web history: an archive of the www-talk list dating back to 1991. Watch as HTML gets hammered out by a small group of early implementors: Tim Berners-Lee, Dave Raggett, Marc Andreessen, Dan Connolly…
Jason goes into detail describing the File Format problem that he and others are going to tackle in the effort known as Just Solve The Problem.
Live in or near San Francisco? Interested in preserving computer history? Then you should meet up with Jason this Friday:
This Friday, October 5th, the Internet Archive has an open lunch where there’s tours of the place, including the scanning room, and people get up and talk about what they’re up to. The Internet Archive is at 300 Funston Street. I’m here all week and into next.
This ticks all my boxes: a podcast by Eric and Jen about the history of the web. I can’t wait for this to start!
This is an important subject (and one very close to my heart) so I’m very glad to see these data protection guidelines nailed to the wall of the web over at Contents Magazine.
A cautionary tale from Dave Winer of not considering digital preservation from the outset. We must learn from the past. We must.
Kellan explains the tech behind Old Tweets …and also the thinking behind it:
I think our history is what makes us human, and the push to ephemerality and disposability “as a feature” is misguided. And a key piece of our personal histories is becoming “the story we want to remember”, aka what we’ve shared.
A beautiful short film about The Long Now Foundation’s Rosetta Project.
An introduction to the important work of digital archivists:
Much like the family member that collects, organizes, and identifies old family photos to preserve one’s heritage, digital archivists seek to do the same for all mankind.