Journal tags: storage

4

sparkline

Apple’s attack on service workers

Apple aren’t the best at developer relations. But, bad as their communications can be, I’m willing to cut them some slack. After all, they’re not used to talking with the developer community.

John Wilander wrote a blog post that starts with some excellent news: Full Third-Party Cookie Blocking and More. Safari is catching up to Firefox and disabling third-party cookies by default. Wonderful! I’ve had third-party cookies disabled for a few years now, and while something occassionally breaks, it’s honestly a pretty great experience all around. Denying companies the ability to track users across sites is A Good Thing.

In the same blog post, John said that client-side cookies will be capped to a seven-day lifespan, as previously announced. Just to be clear, this only applies to client-side cookies. If you’re setting a cookie on the server, using PHP or some other server-side language, it won’t be affected. So persistent logins are still doable.

Then, in an audacious example of burying the lede, towards the end of the blog post, John announces that a whole bunch of other client-side storage technologies will also be capped to seven days. Most of the technologies are APIs that, like cookies, can be used to store data: Indexed DB, Local Storage, and Session Storage (though there’s no mention of the Cache API). At the bottom of the list is this:

Service Worker registrations

Okay, let’s clear up a few things here (because they have been so poorly communicated in the blog post)…

The seven day timer refers to seven days of Safari usage, not seven calendar days (although, given how often most people use their phones, the two are probably interchangable). So if someone returns to your site within a seven day period of using Safari, the timer resets to zero, and your service worker gets a stay of execution. Lucky you.

This only applies to Safari. So if your site has been added to the home screen and your web app manifest has a value for the “display” property like “standalone” or “full screen”, the seven day timer doesn’t apply.

That piece of information was missing from the initial blog post. Since the blog post was updated to include this clarification, some people have taken this to mean that progressive web apps aren’t affected by the upcoming change. Not true. Only progressive web apps that have been added to the home screen (and that have an appropriate “display” value) will be spared. That’s a vanishingly small percentage of progressive web apps, especially on iOS. To add a site to the home screen on iOS, you need to dig and scroll through the share menu to find the right option. And you need to do this unprompted. There is no ambient badging in Safari to indicate that a site is installable. Chrome’s install banner isn’t perfect, but it’s better than nothing.

Just a reminder: a progressive web app is a website that

  • runs on HTTPS,
  • has a service worker,
  • and a web manifest.

Adding to the home screen is something you can do with a progressive web app (or any other website). It is not what defines progressive web apps.

In any case, this move to delete service workers after seven days of using Safari is very odd, and I’m struggling to find the connection to the rest of the blog post, which is about technologies that can store data.

As I understand it, with the crackdown on setting third-party cookies, trackers are moving to first-party technologies. So whereas in the past, a tracking company could tell its customers “Add this script element to your pages”, now they have to say “Add this script element and this script file to your pages.” That JavaScript file can then store a unique idenitifer on the client. This could be done with a cookie, with Local Storage, or with Indexed DB, for example. But I’m struggling to understand how a service worker script could be used in this way. I’d really like to see some examples of this actually happening.

The best explanation I can come up with for this move by Apple is that it feels like the neatest solution. That’s neat as in tidy, not as in nifty. It is definitely not a nifty solution.

If some technologies set by a specific domain are being purged after seven days, then the tidy thing to do is purge all technologies from that domain. Service workers are getting included in that dragnet.

Now, to be fair, browsers and operating systems are free to clean up storage space as they see fit. Caches, Local Storage, Indexed DB—all of those are subject to eventually getting cleaned up.

So I was curious. Wanting to give Apple the benefit of the doubt, I set about trying to find out how long service worker registrations currently last before getting deleted. Maybe this announcement of a seven day time limit would turn out to be not such a big change from current behaviour. Maybe currently service workers last for 90 days, or 60, or just 30.

Nope:

There was no time limit previously.

This is not a minor change. This is a crippling attack on service workers, a technology specifically designed to improve the user experience for return visits, whether it’s through improved performance or offline access.

I wouldn’t be so stunned had this announcement come with an accompanying feature that would allow Safari users to know when a website is a progressive web app that can be added to the home screen. But Safari continues to ignore the existence of progressive web apps. And now it will actively discourage people from using service workers.

If you’d like to give feedback on this ludicrous development, you can file a bug (down in the cellar in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying “Beware of the Leopard”).

No doubt there will still be plenty of Apple apologists telling us why it’s good that Safari has wished service workers into the cornfield. But make no mistake. This is a terrible move by Apple.

I will say this though: given The Situation we’re all living in right now, some good ol’ fashioned Hot Drama by a browser vendor behaving badly feels almost comforting.

Going offline at Indie Web Camp Düsseldorf

I’ve just come back from a ten-day trip to Germany. The trip kicked off with Indie Web Camp Düsseldorf over the course of a weekend.

IndieWebCamp Düsseldorf 2017

Once again the wonderful people at Sipgate hosted us in their beautiful building, and once again myself and Aaron helped facilitate the two days.

IndieWebCamp Düsseldorf 2017

Saturday was the BarCamp-like discussion day. Plenty of interesting topics were covered. I led a session on service workers, and that’s also what I decided to work on for the second day—that’s when the talking is done and we get down to making.

IndieWebCamp Düsseldorf 2017 IndieWebCamp Düsseldorf 2017 IndieWebCamp Düsseldorf 2017 IndieWebCamp Düsseldorf 2017

I like what Ethan is doing on his offline page. He shows a list of pages that have been cached, but instead of just listing URLs, he shows a title and description for each page.

I’ve already got a separate cache for pages that gets added to as the user browses around my site. I needed to figure out a way to store the metadata for those pages so that I could then display it on the offline page. I came up with a workable solution, and interestingly, it involved no changes to the service worker script at all.

When you visit any blog post, I put metadata about the page into localStorage (after first checking that there’s an active service worker):

if (navigator.serviceWorker && navigator.serviceWorker.controller) {
  window.addEventListener('load', function() {
    var data = {
      "title": "A minority report on artificial intelligence",
      "description": "Revisiting Spielberg’s films after a decade and a half.",
      "published": "May 7th, 2017",
      "timestamp": "1494171049"
    };
    localStorage.setItem(
      window.location.href,
      JSON.stringify(data)
    );
  });
}

In my case, I’m outputting the metadata from the server, but you could just as easily grab some from the DOM like this:

var data = {
  "title": document.querySelector("title").innerText,
  "description": document.querySelector("meta[name='description']").getAttribute("contents")
}

Meanwhile in my service worker, when you visit that same page, it gets added to a cache called “pages”. Both localStorage and the cache API are using URLs as keys. I take advantage of that on my offline page.

The nice thing about writing JavaScript on my offline page is that I know the page will only be seen by modern browsers that support service workers, so I can use all sorts of fancy from ES6, or whatever we’re calling it now.

I start by looping through the keys of the “pages” cache (that’s right—the cache API isn’t just for service workers; you can access it from any script). Then I check to see if there is a corresponding localStorage key with the same string (a URL). If there is, I pull the metadata out of local storage and add it to an array called browsingHistory:

const browsingHistory = [];
caches.open('pages')
.then( cache => {
  cache.keys()
  .then(keys => {
    keys.forEach( request => {
      let data = JSON.parse(localStorage.getItem(request.url));
        if (data) {
          data['url'] = request.url;
          browsingHistory.push(data);
      }
    });

Then I sort the list of pages in reverse chronological order:

browsingHistory.sort( (a,b) => {
  return b.timestamp - a.timestamp;
});

Now I loop through each page in the browsing history list and construct a link to each URL, complete with title and description:

let markup = '';
browsingHistory.forEach( data => {
  markup += `
<h2><a href="${ data.url }">${ data.title }</a></h2>
<p>${ data.description }</p>
<p class="meta">${ data.published }</p>
`;
});

Finally I dump the constructed markup into a waiting div in the page with an ID of “history”:

let container = document.getElementById('history');
container.insertAdjacentHTML('beforeend', markup);

All those steps need to be wrapped inside the then clause attached to caches.open("pages") because the cache API is asynchronous.

There you have it. Now if you’re browsing adactio.com and your network connection drops (or my server goes offline), you can choose from a list of pages you’ve previously visited.

The current situation isn’t ideal though. I’ve got a clean-up operation in my service worker to limit the number of items stored in my “pages” cache. The cache never gets bigger than 35 items. But there’s no corresponding clean-up of metadata stored in localStorage. So there could be a lot more bits of metadata in local storage than there are pages in the cache. It’s not harmful, but it’s a bit wasteful.

I can’t do a clean-up of localStorage from my service worker because service workers can’t access localStorage. There’s a very good reason for that: the localStorage API is synchronous, and everything that happens in a service worker needs to be asynchronous.

Service workers can access indexedDB: it’s asynchronous. I could use indexedDB instead of localStorage, but I’m not a masochist. My best bet would be to use the localForage library, which wraps indexedDB in the simple syntax of localStorage.

Maybe I’ll do that at the next Homebrew Website Club here in Brighton.

Twilight

I went out to The Albert the other night to see Twilight Hotel play. There were really good, so after the show I bought their CD, When The Wolves Go Blind.

It was only when I got home that I realised that I had no device that could play Compact Discs. I play all my music on my iPod or iPhone connected to a speaker dock. And my computer is a Macbook Air …no disc drive. So I had to bring the CD into work with me, stick into my iMac and rip the songs from there.

It’s funny how format (or storage medium) obsolescence creeps up on you like that. I wonder how long it will be until I’m not using any kind of magnetic medium at all.

Simple Storage Service

Amazon’s new S3 service looks very interesting indeed. At first glance, it just looks like a very cheap way of storing and retrieving files — which it is — but the really fascinating aspect is that there is no user interface. It is purely a web service. As Sam Newman says:

When you get down to it, Amazon S3 is simply a large, distributed hash map with an API. Unless people build applications on top of it, it’s useless.

The creators of S3 have gone out of their way to keep the architecture as simple as possible. This is a smart move. I’m a great believer in the power of stupid networks.

Leaving aside the underlying technology, S3 is good news in purely practical terms. If nothing else, this should start a price war for data storage. Yet another barrier to entry has been lowered for anyone looking to publish anything online. Odeo and YouTube are good for audio and video respectively, but the agnostic nature of S3 means that you can store and stream on your own terms.

Hardware has been getting cheaper and cheaper for some time. Now it looks like bandwidth is heading the same way (for some amusing anecdotes on bandwidth issues, be sure to listen to Bernie Burns’ keynote from SXSW).

I’m looking forward to playing around with S3. For a service with no face, it sure looks like it’s got legs.