Tags: storage

Friday, March 30th, 2018

Putting Civilization in a Box Means Choosing Our Legacy

A run-down of digital preservation technologies for very, very long-term storage …in space.

Tuesday, March 6th, 2018

Brendan Dawes - Back your sh*t up!

My back-up strategy is similar to Brendan’s (using Super Duper and Backblaze):

In backup parlance there’s a thing called 3-2-1. That is, you should have three copies of your files: two locally on different devices and one off-site.

But I only do my local back-ups once a week (eek!)—I should do better.

Monday, February 26th, 2018

The Internet Isn’t Forever

A terrific piece by Maria Bustillos on digital preservation and the power of archives, backed up with frightening real-world examples.

Because history is a fight we’re having every day. We’re battling to make the truth first by living it, and then by recording and sharing it, and finally, crucially, by preserving it. Without an archive, there is no history.

Friday, February 9th, 2018

Workers at Your Service | WebKit

Here’s an interesting insight on how WebKit is going to handle the cleanup of unused service workers and caches:

Service worker and Cache API stored information will grow as a user is browsing content. To keep only the stored information that is useful to the user, WebKit will remove unused service worker registrations after a period of a few weeks. Caches that do not get opened after a few weeks will also be removed.

Friday, January 26th, 2018

Arch Mission

Off-site backups of humanity’s knowledge and culture, stored in different media (including pyramidal crystals) placed in near-Earth orbit, the moon, and Mars.

We are developing specialized next-generation devices that we call Archs™ (pronounced “Arks”), which are designed to hold and transmit large amounts of data over long periods of time in extreme environments, including outer space and on the surfaces of other planetary bodies.

Our goal is to collect and curate important data sets and to install them on Archs™ that will be delivered to as many locations as possible for safekeeping.

To increase the chances that Archs™ will be found in the future, we aim for durability and massive redundancy across a broad diversity of locations and materials – a strategy that nature itself has successfully employed.

Monday, October 2nd, 2017

eBay’s Font Loading Strategy | eBay Tech Blog

Here’s the flow that eBay use for the font-loading. They’ve decided that on the very first page view, seeing a system font is an acceptable trade-off. I think that makes sense for their situation.

Interestingly, they set a flag for subsequent visits using localStorage rather than a cookie. I wonder why that is? For me, the ability to read cookies on the server as well as the client make them quite handy for situations like this.
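
For instance, a flag like that might look something like this sketch (using document.fonts and a hypothetical fonts-loaded class, not necessarily eBay’s exact code):

if (localStorage.getItem('fontsLoaded')) {
  // Fonts were cached on a previous visit: apply them straight away.
  document.documentElement.classList.add('fonts-loaded');
} else if (document.fonts) {
  // First visit: wait for the fonts, then set the flag for next time.
  document.fonts.ready.then( () => {
    document.documentElement.classList.add('fonts-loaded');
    localStorage.setItem('fontsLoaded', 'true');
  });
}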

Friday, September 22nd, 2017

How much storage space is my Progressive Web App using? | Dean Hume

You can use navigator.storage.estimate() to get a (vague) idea of how much space is available on a device for your service worker caches.
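
Here’s a quick feature-detected sketch of the API in action:

if (navigator.storage && navigator.storage.estimate) {
  navigator.storage.estimate().then( estimate => {
    // Both values are in bytes, and deliberately imprecise.
    console.log(`Using ${ estimate.usage } of ${ estimate.quota } bytes.`);
  });
}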

Tuesday, May 23rd, 2017

Going offline at Indie Web Camp Düsseldorf

I’ve just come back from a ten-day trip to Germany. The trip kicked off with Indie Web Camp Düsseldorf over the course of a weekend.

IndieWebCamp Düsseldorf 2017

Once again the wonderful people at Sipgate hosted us in their beautiful building, and once again myself and Aaron helped facilitate the two days.

IndieWebCamp Düsseldorf 2017

Saturday was the BarCamp-like discussion day. Plenty of interesting topics were covered. I led a session on service workers, and that’s also what I decided to work on for the second day—that’s when the talking is done and we get down to making.

IndieWebCamp Düsseldorf 2017

I like what Ethan is doing on his offline page. He shows a list of pages that have been cached, but instead of just listing URLs, he shows a title and description for each page.

I’ve already got a separate cache for pages that gets added to as the user browses around my site. I needed to figure out a way to store the metadata for those pages so that I could then display it on the offline page. I came up with a workable solution, and interestingly, it involved no changes to the service worker script at all.

When you visit any blog post, I put metadata about the page into localStorage (after first checking that there’s an active service worker):

// Only do this if a service worker is already in control of the page.
if (navigator.serviceWorker && navigator.serviceWorker.controller) {
  window.addEventListener('load', function() {
    var data = {
      "title": "A minority report on artificial intelligence",
      "description": "Revisiting Spielberg’s films after a decade and a half.",
      "published": "May 7th, 2017",
      "timestamp": "1494171049"
    };
    // The page’s URL is the key: the same key used in the cache.
    localStorage.setItem(
      window.location.href,
      JSON.stringify(data)
    );
  });
}

In my case, I’m outputting the metadata from the server, but you could just as easily grab some from the DOM like this:

var data = {
  "title": document.querySelector("title").innerText,
  "description": document.querySelector("meta[name='description']").getAttribute("content")
};

Meanwhile in my service worker, when you visit that same page, it gets added to a cache called “pages”. Both localStorage and the cache API use URLs as keys. I take advantage of that on my offline page.

The nice thing about writing JavaScript on my offline page is that I know the page will only be seen by modern browsers that support service workers, so I can use all sorts of fancy features from ES6, or whatever we’re calling it now.

I start by looping through the keys of the “pages” cache (that’s right—the cache API isn’t just for service workers; you can access it from any script). Then I check to see if there is a corresponding localStorage key with the same string (a URL). If there is, I pull the metadata out of local storage and add it to an array called browsingHistory:

const browsingHistory = [];
caches.open('pages')
.then( cache => {
  cache.keys()
  .then( keys => {
    keys.forEach( request => {
      let data = JSON.parse(localStorage.getItem(request.url));
      if (data) {
        data['url'] = request.url;
        browsingHistory.push(data);
      }
    });

Then I sort the list of pages in reverse chronological order:

browsingHistory.sort( (a,b) => {
  return b.timestamp - a.timestamp;
});

Now I loop through each page in the browsing history list and construct a link to each URL, complete with title and description:

let markup = '';
browsingHistory.forEach( data => {
  markup += `
<h2><a href="${ data.url }">${ data.title }</a></h2>
<p>${ data.description }</p>
<p class="meta">${ data.published }</p>
`;
});

Finally I dump the constructed markup into a waiting div in the page with an ID of “history”:

let container = document.getElementById('history');
container.insertAdjacentHTML('beforeend', markup);

All those steps need to be wrapped inside the then clause attached to caches.open("pages") because the cache API is asynchronous.
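
Putting it all together, the overall shape is something like this skeleton:

caches.open('pages')
.then( cache => {
  cache.keys()
  .then( keys => {
    // gather the metadata from localStorage,
    // sort it, build the markup,
    // and insert it into the page
  });
});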

There you have it. Now if you’re browsing adactio.com and your network connection drops (or my server goes offline), you can choose from a list of pages you’ve previously visited.

The current situation isn’t ideal though. I’ve got a clean-up operation in my service worker to limit the number of items stored in my “pages” cache. The cache never gets bigger than 35 items. But there’s no corresponding clean-up of metadata stored in localStorage. So there could be a lot more bits of metadata in local storage than there are pages in the cache. It’s not harmful, but it’s a bit wasteful.
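
For what it’s worth, a cache-trimming function along these lines does that job. This is a sketch of the general technique, not necessarily the exact code in my service worker:

function trimCache(cacheName, maxItems) {
  caches.open(cacheName)
  .then( cache => {
    cache.keys()
    .then( keys => {
      if (keys.length > maxItems) {
        // Delete the oldest item, then check the count again.
        cache.delete(keys[0])
        .then( () => {
          trimCache(cacheName, maxItems);
        });
      }
    });
  });
}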

I can’t do a clean-up of localStorage from my service worker because service workers can’t access localStorage. There’s a very good reason for that: the localStorage API is synchronous, and everything that happens in a service worker needs to be asynchronous.
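
The clean-up could happen in the page instead, where both APIs are available. Here’s a rough sketch of what that might look like, removing any metadata whose URL is no longer in the cache (the check against the origin is just a guard so that only URL-keyed items get touched):

caches.open('pages')
.then( cache => cache.keys() )
.then( keys => {
  const cachedURLs = keys.map( request => request.url );
  Object.keys(localStorage).forEach( key => {
    // Remove metadata for any page that is no longer cached.
    if (key.startsWith(window.location.origin) && !cachedURLs.includes(key)) {
      localStorage.removeItem(key);
    }
  });
});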

Service workers can access indexedDB: it’s asynchronous. I could use indexedDB instead of localStorage, but I’m not a masochist. My best bet would be to use the localForage library, which wraps indexedDB in the simple syntax of localStorage.
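
The switch would be fairly painless. With localForage loaded, storing the metadata looks almost identical, except that it returns a promise and there’s no need to stringify:

// Assuming the localForage library has been loaded:
localforage.setItem(window.location.href, data)
.then( () => {
  // The data has been stored asynchronously in indexedDB.
});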

Maybe I’ll do that at the next Homebrew Website Club here in Brighton.

Wednesday, May 3rd, 2017

The Lost Picture Show: Hollywood Archivists Can’t Outpace Obsolescence - IEEE Spectrum

There are three parts to digital preservation: format, medium, and licensing. Film and television archives are struggling with all three.

Format:

Codecs—the software used to compress and decompress digital video files—keep changing, as do the hardware and software for playback.

Medium:

As each new generation of LTO comes to market, an older generation of LTO becomes obsolete. LTO manufacturers guarantee at most two generations of backward compatibility. What that means for film archivists with perhaps tens of thousands of LTO tapes on hand is that every few years they must invest millions of dollars in the latest format of tapes and drives and then migrate all the data on their older tapes—or risk losing access to the information altogether.

Licensing:

Studios didn’t see any revenue potential in their past work. They made money by selling movie tickets; absent the kind of follow-on markets that exist today, long-term archiving didn’t make sense economically.

It adds up to a potential cultural disaster:

If technology companies don’t come through with a long-term solution, it’s possible that humanity could lose a generation’s worth of filmmaking, or more.

Monday, May 1st, 2017

The People’s Cloud

A documentary by Matt Parker (brother of Andy) that follows in the footsteps of people like Andrew Blum, James Bridle, and Ingrid Burrington, going in search of the physical locations of the internet, and talking to the people who maintain it. Steven Pemberton makes an appearance in the first and last of five episodes:

  1. What is the Cloud vs What Existed Before?
  2. Working out the Internet: it’s a volume game
  3. The Submarine Cable Network
  4. How Much Data Is There?
  5. Convergence

The music makes it feel quite sinister.

Thursday, January 19th, 2017

Memory of Mankind: All of Human Knowledge Buried in a Salt Mine - The Atlantic

Like cuneiform crossed with the Long Now Foundation’s Rosetta Project.

He will laser-print a microscopic font onto 1-mm-thick ceramic sheets, encased in wafer-thin layers of glass. One 20 cm piece of this microfilm can store 5 million characters; whole libraries of information—readable with a 10x-magnifying lens—could be slotted next to each other and hardly take up any space.

Saturday, October 29th, 2016

Is DNA the Future of Data Storage? - WSJ

It’s still many years away from being a viable storage option, but here’s the latest on using DNA to back up our collective data.

Magnetic tape may survive a few decades, and DVDs even longer, but they are by no means immortal. Data stored in DNA, provided it’s kept cold and dry, could last for thousands of years.

Friday, July 1st, 2016

.generation on Vimeo

A cautionary tale of digital preservation.

.generation is a short film that intimately documents three millennials in the year 2054 - uncovering their relationships with technology in the aftermath of the information age.

Monday, June 27th, 2016

Persistent Storage | Web Updates - Google Developers

Here’s an interesting proposal from Google for a user-initiated way of declaring that a site’s offline assets should be prioritised (and not wiped out in a clean-up). Also interesting: the way that this idea is being tried out is through a token that you can request …sure beats prefixes!
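
The API surface is tiny. Here’s a feature-detected sketch (the details were still subject to change while the trial was running):

if (navigator.storage && navigator.storage.persist) {
  navigator.storage.persist().then( granted => {
    // If granted, this origin’s data is exempt from automatic clean-up.
    console.log(granted ? 'Storage will persist.' : 'Storage may be cleared.');
  });
}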

Saturday, February 20th, 2016

Eternal 5D data storage could record the history of humankind

360 terabytes of data stored for over 13 billion years:

Coined as the ‘Superman memory crystal’, as the glass memory has been compared to the “memory crystals” used in the Superman films, the data is recorded via self-assembled nanostructures created in fused quartz. The information encoding is realised in five dimensions: the size and orientation in addition to the three dimensional position of these nanostructures.

Tuesday, March 3rd, 2015

localFont - A localStorage solution for web font loading

A quick drag’n’drop way to base 64 encode your web fonts so you can stick ‘em in local storage.
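
The underlying technique goes something like this sketch, where fonts.css is a hypothetical stylesheet containing base64-encoded @font-face rules:

function addStyles(css) {
  var style = document.createElement('style');
  style.textContent = css;
  document.head.appendChild(style);
}

var cached = localStorage.getItem('fontCSS');
if (cached) {
  // Repeat visit: apply the stored font styles immediately.
  addStyles(cached);
} else {
  // First visit: fetch the data URI font styles and cache them.
  fetch('/fonts.css')
  .then( response => response.text() )
  .then( text => {
    localStorage.setItem('fontCSS', text);
    addStyles(text);
  });
}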

Wednesday, January 21st, 2015

The Smithsonian’s Cooper Hewitt: Finally, the Museum of the Future Is Here - The Atlantic

Remember Aaron’s dConstruct talk? Well, the Atlantic has more details of his work at the Cooper Hewitt museum in this wide-ranging piece that investigates the role of museums, the value of APIs, and the importance of permanent URLs.

As I was leaving, Cope recounted how, early on, a curator had asked him why the collections website and API existed. Why are you doing this?

His retrospective answer wasn’t about scholarship or data-mining or huge interactive exhibits. It was about the web.

I find this incredibly inspiring.

Monday, September 22nd, 2014

[this is aaronland] upload.js

A really handy bit of code from Aaron for building a robust file uploader. A way to make your web-based photo sharing more Instagrammy-clever.