Tags: caching

Thursday, December 6th, 2018

Introducing Background Fetch  |  Web  |  Google Developers

I’m going to have to read through this article by Jake a few times before I begin to wrap my head around this background fetch thing, but it looks like it would be perfect for something like the dConstruct Audio Archive, where fairly large files can be saved for offline listening.

Saturday, November 24th, 2018

PushAPI without Notifications | Seblog

Remember when I wrote about using push without notifications? Sebastiaan has written up the details of the experiment he conducted at Indie Web Camp Berlin.

Tuesday, November 20th, 2018

Push and ye shall receive | CSS-Tricks

Imagine a PWA podcast app that works offline and silently receives and caches new podcasts. Sweet. Now we need a permissions model that allows for silent notifications.

Tuesday, November 13th, 2018

Inlining or Caching? Both Please! | Filament Group, Inc., Boston, MA

This just blew my mind! A fiendishly clever pattern that allows you to inline resources (like critical CSS) and cache that same content for later retrieval by a service worker.

Crazy clever!

Sunday, November 11th, 2018

Push without notifications

On the first day of Indie Web Camp Berlin, I led a session on going offline with service workers. This covered all the usual use-cases: pre-caching; custom offline pages; saving pages for offline reading.

But on the second day, Sebastiaan spent a fair bit of time investigating a more complex use of service workers with the Push API.

The Push API is what makes push notifications possible on the web. There are a lot of moving parts—browser, server, service worker—and, frankly, it’s way over my head. But I’m familiar with the general gist of how it works. Here’s a typical flow:

  1. A website prompts the user for permission to send push notifications.
  2. The user grants permission.
  3. A whole lot of complicated stuff happens behind the scenes.
  4. Next time the website publishes something relevant, it fires a push message containing the details of the new URL.
  5. The user’s service worker receives the push message (even if the site isn’t open).
  6. The service worker creates a notification linking to the URL, interrupting the user, and generally adding to the weight of information overload.

Here’s what Sebastiaan wanted to investigate: what if that last step weren’t so intrusive? Here’s the alternate flow he wanted to test:

  1. A website prompts the user for permission to send push notifications.
  2. The user grants permission.
  3. A whole lot of complicated stuff happens behind the scenes.
  4. Next time the website publishes something relevant, it fires a push message containing the details of the new URL.
  5. The user’s service worker receives the push message (even if the site isn’t open).
  6. The service worker fetches the contents of the URL provided in the push message and caches the page. Silently.

It worked.
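
Here’s a rough sketch of what that silent-caching step might look like inside a service worker (the payload format and the cache name are my assumptions, not Sebastiaan’s actual code):

// Sketch only: assumes the push payload is JSON with a 'url' property
// and that silently-fetched pages go into a cache called 'pushed-pages'
addEventListener('push', pushEvent => {
  const payload = pushEvent.data.json();
  pushEvent.waitUntil(
    caches.open('pushed-pages')
    .then(pushCache => {
      // Fetch the freshly published URL and store it—no notification shown
      return fetch(payload.url)
      .then(responseFromFetch => pushCache.put(payload.url, responseFromFetch));
    })
  );
});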

I think this could be a real game-changer. I don’t know about you, but I’m very, very wary of granting websites the ability to send me push notifications. In fact, I don’t think I’ve ever given a website permission to interrupt me with push notifications.

You’ve seen the annoying permission dialogues, right?

In Firefox, it looks like this:

Will you allow name-of-website to send notifications?

[Not Now] [Allow Notifications]

In Chrome, it’s:

name-of-website wants to

Show notifications

[Block] [Allow]

But in actual fact, these dialogues are asking for permission to do two things:

  1. Receive messages pushed from the server.
  2. Display notifications based on those messages.

There’s no way to ask for permission just to do the first part. That’s a shame. While I’m very unwilling to grant permission to be interrupted by intrusive notifications, I’d be more than willing to grant permission to allow a website to silently cache timely content in the background. It would be a more calm technology.

Think of the use cases:

  • I grant push permission to a magazine. When the magazine publishes a new article, it’s cached on my device.
  • I grant push permission to a podcast. Whenever a new episode is published, it’s cached on my device.
  • I grant push permission to a blog. When there’s a new blog post, it’s cached on my device.

Then when I’m on a plane, or in the subway, or in any other situation without a network connection, I could still visit these websites and get content that’s fresh to me. It’s kind of like background sync in reverse.

There’s plenty of opportunity for abuse—the cache could get filled up with content the user never asked for. But websites can already do that, and they don’t need to be granted any permissions to do so; just by visiting a website, it can add multiple files to a cache.

So it seems that the reason for the permissions dialogue is all about displaying notifications …not so much about receiving push messages from the server.

I wish there were a way to implement this background-caching pattern without requiring the user to grant permission via a dialogue that contains the word “notification.”

I wonder if the act of adding a site to the home screen could implicitly grant permission to allow use of the Push API without notifications?

In the meantime, the proposal for periodic synchronisation (using background sync) could achieve similar results, but in a less elegant way; periodically polling for new content instead of receiving a push message when new content is published. Also, it requires permission. But at least in this case, the permission dialogue should be more specific, and wouldn’t include the word “notification” anywhere.

Friday, September 28th, 2018

Thoughts on Offline-first | Trys Mudford

Service Workers have such huge potential power, and I feel like we (developers on the web) have barely scratched the surface with what’s possible.

Needless to say, I couldn’t agree more!

Trys is thinking through some of the implications of service workers, like how we refresh stale content, and how we deal with slow networks—something that’s actually more of a challenge than dealing with no network connection at all.

There’s some good food for thought here.

I’m so excited to see how we can use Service Workers to improve the web.

Saturday, September 1st, 2018

Offline Content with Service Worker — Chris Ruppel

A step-by-step walkthrough of a really useful service worker pattern: allowing users to save articles for offline reading at the click of a button (kind of like adding the functionality of Instapaper or Pocket to your own site).

Thursday, August 16th, 2018

On HTTPS and Hard Questions - TimKadlec.com

A great post by Tim following on from the post by Eric I linked to last week.

Is a secure site you can’t access better than an insecure one you can?

He rightly points out that security without performance is exclusionary.

…we’ve made a move to increase the security of the web by doing everything we can to get everything running over HTTPS. It’s undeniably a vital move to make. However this combination—poor performance but good security—now ends up making the web inaccessible to many.

Security. Performance. Accessibility. All three matter.

Tuesday, August 7th, 2018

Securing Web Sites Made Them Less Accessible – Eric’s Archived Thoughts

This is a heartbreaking observation by Eric. He’s not anti-HTTPS by any stretch, but he is pointing out that caching servers become a thing of the past on a more secure web.

Can we do anything? For users of up-to-date browsers, yes: service workers create a “good” man in the middle that sidesteps the HTTPS problem, so far as I understand. So if you’re serving content over HTTPS, creating a service worker should be one of your top priorities right now, even if it’s just to do straightforward local caching and nothing fancier.

Tuesday, July 10th, 2018

When 7 KB Equals 7 MB - Cloud Four

I remember Jason telling me about this weird service worker caching behaviour a little while back. This piece is a great bit of sleuthing in tracking down the root causes of this strange issue, followed up with a sensible solution.

Thursday, July 5th, 2018

The trimCache function in Going Offline

Paul Yabsley wrote to let me know about an error in Going Offline. It’s rather embarrassing because it’s code that I’m using in the service worker for adactio.com but for some reason I messed it up in the book.

It’s the trimCache function in Chapter 7: Tidying Up. That’s the reusable piece of code that recursively reduces the number of items in a specified cache (cacheName) to a specified amount (maxItems). On pages 95 and 96 I describe the process of creating the function which, in the book, ends up like this:

 function trimCache(cacheName, maxItems) {
   cacheName.open( cache => {
     cache.keys()
     .then( items => {
       if (items.length > maxItems) {
         cache.delete(items[0])
         .then(
           trimCache(cacheName, maxItems)
         ); // end delete then
       } // end if
     }); // end keys then
   }); // end open
 } // end function

See the problem? It’s right there at the start when I try to open the cache like this:

cacheName.open( cache => {

That won’t work. The open method only works on the caches object—I should be passing the name of the cache into the caches.open method. So the code should look like this:

caches.open( cacheName )
.then( cache => {

Everything else remains the same. The corrected trimCache function is here:

function trimCache(cacheName, maxItems) {
  caches.open(cacheName)
  .then( cache => {
    cache.keys()
    .then(items => {
      if (items.length > maxItems) {
        cache.delete(items[0])
        .then(
          trimCache(cacheName, maxItems)
        ); // end delete then
      } // end if
    }); // end keys then
  }); // end open then
} // end function
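
The function doesn’t run by itself—it needs to be called from somewhere in the service worker. Purely as an illustration (the cache names, limits, and message text here are invented), you could trigger it whenever the page posts a message to the service worker:

// Illustration only: the cache names, limits, and message text are invented
addEventListener('message', messageEvent => {
  if (messageEvent.data === 'clean up caches') {
    trimCache('pages', 35);
    trimCache('images', 20);
  }
});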

Sorry about that! I must’ve had some kind of brainfart when I was writing (and describing) that one line of code.

You may want to deface your copy of Going Offline by taking a pen to that code example. Normally I consider the practice of writing in books to be barbarism, but in this case …go for it.

Tuesday, June 5th, 2018

Clearleft.com is a progressive web app

What’s that old saying? The cobbler’s children have no shoes that work offline. Or something.

It’s been over a year since the Clearleft site relaunched and I listed some of the next steps I had planned:

Service worker. It’s a no-brainer. Now that the Clearleft site is (finally!) running on HTTPS, having a simple service worker to cache static assets like CSS, JavaScript and some images seems like the obvious next step.

You know how it is. Those no-brainer tasks are exactly the kind of thing that end up on a to-do list without ever quite getting to-done. Meanwhile I’ve been writing and speaking about how any website can be a progressive web app. I think Alanis Morissette used to sing about this sort of situation.

Enough is enough! Clearleft.com is now a progressive web app. It has a manifest file and a service worker script.

The service worker logic is fairly straightforward, and taken almost verbatim from Going Offline. As you navigate around the site, the service worker applies different logic depending on the kind of file you’re requesting:

  • Pages are served fresh from the network, falling back to the cache when there’s a problem.
  • Everything else is served from the cache where possible, resorting to the network only if there’s no match in the cache—quite the performance boost!

In both cases, if a page or a file is retrieved from the network, it gets put into a cache. I’ve got one cache for pages, and another for everything else. And even if a file is retrieved from that cache, I still fire off a fetch request to grab a fresh copy for the cache. So while there’s a chance that a stale file might be served up, it will only ever be slightly stale, and the next time it’s requested, it’ll be fresh.
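
As a rough sketch (the cache names and details are simplified guesses, not the actual Clearleft code), the fetch handler looks something like this:

// Sketch only: the 'pages' and 'assets' cache names are assumptions
addEventListener('fetch', fetchEvent => {
  const request = fetchEvent.request;
  // Pages: try the network first, falling back to the cache
  if ((request.headers.get('Accept') || '').includes('text/html')) {
    fetchEvent.respondWith(
      fetch(request)
      .then(responseFromFetch => {
        const copy = responseFromFetch.clone();
        fetchEvent.waitUntil(
          caches.open('pages')
          .then(pagesCache => pagesCache.put(request, copy))
        );
        return responseFromFetch;
      })
      .catch(() => caches.match(request))
    );
    return;
  }
  // Everything else: cache first, but fetch a fresh copy for next time
  fetchEvent.respondWith(
    caches.open('assets')
    .then(assetsCache => {
      return assetsCache.match(request)
      .then(responseFromCache => {
        const fetchPromise = fetch(request)
        .then(responseFromFetch => {
          assetsCache.put(request, responseFromFetch.clone());
          return responseFromFetch;
        });
        fetchEvent.waitUntil(fetchPromise);
        return responseFromCache || fetchPromise;
      });
    })
  );
});

The waitUntil calls keep the service worker alive long enough to finish updating the cache even after a response has already been handed back to the browser.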

In the worst-case scenario, when a page can’t be retrieved from the network or the cache, you end up seeing a custom offline page. There you can see a list of any pages that are cached (meaning you can revisit them even without an internet connection).

A custom offline page showing a list of URLs.

It’s not ideal—page titles would be friendlier than URLs—but it’s a start. I’m sure I’ll revisit it soon. Honest.

Oh, and after a year of procrastinating about doing this, guess how long it took? About half a day. Admittedly, this isn’t my first progressive web app, and the more you build ‘em, the easier it gets. Still, it’s a classic example of a small investment of time leading to a big improvement in performance and user experience.

If you think your company’s website could benefit from being a progressive web app (and believe me, it definitely could), you have a couple of options:

  1. Arm yourself with a copy of Going Offline and give it a go yourself. Or
  2. Get in touch with Clearleft. We can help you. (See, I can say that with a straight face now that we’re practicing what we preach.)

Either way, don’t dilly dally …like I did.

Fresher service workers, by default

“Ah, this is good news!”, I thought, reading this update about how service worker scripts won’t be cached.

And that was the moment when I realised what an utter nerd I had become.

Tuesday, March 6th, 2018

Minimal viable service worker

I really, really like service workers. They’re one of those technologies that have such clear benefits to users that it seems like a no-brainer to add a service worker to just about any website.

The thing is, every website is different. So the service worker strategy for every website needs to be different too.

Still, I was wondering if it would be possible to create a service worker script that would work for most websites. Here’s the script I came up with.

The logic works like this:

  • If there’s a request for an HTML page, fetch it from the network and store a copy in a cache (but if the network request fails, try looking in the cache instead).
  • For any other files, look for a copy in the cache first but meanwhile fetch a fresh version from the network to update the cache (and if there’s no existing version in the cache, fetch the file from the network and store a copy of it in the cache).

So HTML files are served network-first, while all other files are served cache-first, but in both cases a fresh copy is always put in the cache. The idea is that HTML content will always be fresh (unless there’s a problem with the network), while all other content—images, style sheets, scripts—might be slightly stale, but get refreshed with every request.

My original attempt was riddled with errors. Jake came to my rescue and we revised the script into something that actually worked. In the process, my misunderstanding of how await works led Jake to write a great blog post on await vs return vs return await.

I got there in the end and the script seems solid enough. It’s a fairly simplistic strategy that could work for quite a few sites, but it has some issues…

Service workers don’t perform any automatic cleanup of caches—that’s up to you to do (usually during the activate event). This script doesn’t do any cleanup so the cache might grow and grow and grow. For that reason, I think the script is best suited for fairly small sites.

The strategy also assumes that a file will either be fetched from the network or the cache. There’s no contingency for when both attempts fail. So there’s no fallback offline page, for example.

I decided to test it in the wild, but I expanded it slightly to fix the fallback issue. The version on the Ampersand 2018 website includes a worst-case-scenario option to show a custom offline page that has been pre-cached. (By the way, if you haven’t got a ticket for Ampersand yet, get a ticket now—it’s going to be a superb day of web typography nerdery.)

Anyway, this fairly basic script seems to be delivering some good performance improvements. If you’ve got a site that you think would benefit from this network/caching strategy, and it’s served over HTTPS, then:

  1. Feel free to download the script or copy and paste it into a file called serviceworker.js,
  2. Put that file in the root directory of your website,
  3. Add this in a script element at the bottom of your HTML pages:

if (navigator.serviceWorker && !navigator.serviceWorker.controller) {
  navigator.serviceWorker.register('/serviceworker.js');
}

You can also use the script as a starting point. You might find issues specific to your particular website. That’s okay—you can tweak and adjust the script to suit your needs.

If this minimal service worker script proves in any way useful to you, thank Jake.

Thursday, November 2nd, 2017

The dConstruct Audio Archive works offline

The dConstruct conference is as old as Clearleft itself. We put on the first event back in 2005, the year of our founding. The last dConstruct was in 2015. It had a good run.

I’m really proud of the three years I ran the show—2012, 2013, and 2014—and I have great memories from each event. I’m inordinately pleased that the individual websites are still online after all these years. I’m equally pleased with the dConstruct audio archive that we put online in 2012. Now that the event itself is no longer running, it truly is an archive—a treasury of voices from the past.

I think that these kinds of online archives are eminently suitable for some offline design. So I’ve added a service worker script to the dConstruct archive.

Caching

To start with, there’s the no-brainer: as soon as someone hits the website, pre-cache static assets like CSS, JavaScript, the logo, and icon images. Now subsequent page loads will be quicker—those assets are taken straight from the cache.
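
That pre-caching happens in the install event. It’s the standard pattern—something along these lines, although the file names here are placeholders rather than the archive’s actual assets:

// Placeholder file names—the real list is the archive's own CSS, JavaScript, logo, and icons
addEventListener('install', installEvent => {
  installEvent.waitUntil(
    caches.open('static-assets')
    .then(staticCache => staticCache.addAll([
      '/css/styles.css',
      '/js/scripts.js',
      '/img/logo.svg'
    ]))
  );
});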

But what about the individual pages? For something like Resilient Web Design—another site that won’t be updated—I pre-cache everything. I could do that with the dConstruct archive. All of the pages with all of the images add up to less than two megabytes; the entire site weighs less than a single page on Wired.com or The Verge.

In the end, I decided to go with a cache-as-you-go strategy. Every time a page or an image is fetched from the network, it is immediately put in a cache. The next time that page or image is requested, the file is served from that cache instead of the network.

Here’s the logic for fetch requests:

  1. First, look to see if the file is in a cache. If it is, great! Serve that.
  2. If the file isn’t in a cache, make a network request and serve the response …but put a copy of the file in the cache.
  3. The next time that file is requested, go to step one.
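
In service-worker terms, those three steps translate into a fetch handler shaped roughly like this (a simplified sketch with an invented cache name, not the archive’s exact code):

// Simplified sketch of cache-as-you-go: cache first, then network, caching as we go
addEventListener('fetch', fetchEvent => {
  fetchEvent.respondWith(
    caches.match(fetchEvent.request)
    .then(responseFromCache => {
      if (responseFromCache) {
        return responseFromCache;
      }
      return fetch(fetchEvent.request)
      .then(responseFromFetch => {
        const copy = responseFromFetch.clone();
        fetchEvent.waitUntil(
          caches.open('pages-and-images')
          .then(cache => cache.put(fetchEvent.request, copy))
        );
        return responseFromFetch;
      });
    })
  );
});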

Save for offline

That caching strategy works great for pages, images, and other assets. But there’s one kind of file on the dConstruct archive that’s a bit different: the audio files. They can be fairly big, so I don’t want to cache those unless the user specifically requests it.

If you end up on the page for a particular talk, and your browser supports service workers, you’ll get an additional UI element in the list of options: a toggle to “save offline” (under the hood, it’s a checkbox). If you activate that option, then the audio file gets put into a cache.
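
Under the hood, the page-side script for that checkbox can be very small—something like this, although the element, data attribute, and cache name here are invented for illustration:

// Illustration only: the element, data attribute, and cache name are made up
const saveToggle = document.querySelector('#saveoffline');
saveToggle.addEventListener('change', () => {
  const audioURL = saveToggle.dataset.audiourl;
  if (saveToggle.checked) {
    // cache.add fetches the file and stores the response
    caches.open('audiofiles')
    .then(audioCache => audioCache.add(audioURL));
  } else {
    caches.open('audiofiles')
    .then(audioCache => audioCache.delete(audioURL));
  }
});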

Now if you lose your network connection while browsing the site, you’ll get a custom offline page with the option to listen to any audio files you saved for offline listening. You’ll also see this collection of talks on the homepage, regardless of whether you’ve got an internet connection or not.

So if you’ve got a long plane journey ahead of you, have a browse around the dConstruct archive and select some talks for your offline listening pleasure.

Or just enjoy the speediness of browsing the site.

Turning another website into a Progressive Web App.

Sunday, October 29th, 2017

The meaning of AMP

Ethan quite rightly points out some semantic sleight of hand by Google’s AMP team:

But when I hear AMP described as an open, community-led project, it strikes me as incredibly problematic, and more than a little troubling. AMP is, I think, best described as nominally open-source. It’s a corporate-led product initiative built with, and distributed on, open web technologies.

But so what, right? Tom-ay-to, tom-a-to. Well, here’s a pernicious example of where it matters: in a recent announcement of their intent to ship a new addition to HTML, the Google Chrome team cited the mood of the web development community thusly:

Web developers: Positive (AMP team indicated desire to start using the attribute)

If AMP were actually the product of working web developers, this justification would make sense. As it is, we’ve got one team at Google citing the preference of another team at Google but representing it as the will of the people.

This is just one example of AMP’s sneaky marketing where some finely-shaved semantics allows them to appear far more reasonable than they actually are.

At AMP Conf, the Google Search team were at pains to repeat over and over that AMP pages wouldn’t get any preferential treatment in search results …but they appear in a carousel above the search results. Now, if you were to ask any right-thinking person whether they think having their page appear right at the top of a list of search results would be considered preferential treatment, I think they would say hell, yes! This is the only reason why The Guardian, for instance, even have AMP versions of their content—it’s not for the performance benefits (their non-AMP pages are faster); it’s for that prime real estate in the carousel.

The same semantic nit-picking can be found in their defence of caching. See, they’ve even got me calling it caching! It’s hosting. If I click on a search result, and I am taken to page that has a URL beginning with https://www.google.com/amp/s/... then that page is being hosted on the domain google.com. That is literally what hosting means. Now, you might argue that the original version was hosted on a different domain, but the version that the user gets sent to is the Google copy. You can call it caching if you like, but you can’t tell me that Google aren’t hosting AMP pages.

That’s a particularly low blow, because it’s such a bait’n’switch. One of the reasons why AMP first appeared to be different to Facebook Instant Articles or Apple News was the promise that you could host your AMP pages yourself. That’s the very reason I first got interested in AMP. But if you actually want the benefits of AMP—appearing in the not-search-results carousel, pre-rendered performance, etc.—then your pages must be hosted by Google.

So, to summarise, here are three statements that Google’s AMP team are currently peddling as being true:

  1. AMP is a community project, not a Google project.
  2. AMP pages don’t receive preferential treatment in search results.
  3. AMP pages are hosted on your own domain.

I don’t think those statements are even truthy, much less true. In fact, if I were looking for the right term to semantically describe any one of those statements, the closest in meaning would be this:

A statement used intentionally for the purpose of deception.

That is the dictionary definition of a lie.

Update: That last part was a bit much. Sorry about that. I know it’s a bit much because The Register got all gloaty about it.

I don’t think the developers working on the AMP format are intentionally deceptive (although they are engaging in some impressive cognitive gymnastics). The AMP ecosystem, on the other hand, that’s another story—the preferential treatment of Google-hosted AMP pages in the carousel and in search results; that’s messed up.

Still, I would do well to remember that there are well-meaning people working on even the fishiest of projects.

Except for the people working at the shitrag that is The Register.

(The other strong signal that I overstepped the bounds of decency was that this post attracted the pond scum of Hacker News. That’s another place where the “well-meaning people work on even the fishiest of projects” rule definitely doesn’t apply.)

Monday, September 25th, 2017

Network Information API

It looks like this is landing in Chrome. The navigator.connection.type property will allow us to progressively enhance based on connection type:

A web application that makes use of a service worker to cache resources during installation might have different bundles of assets that it might cache: a list of crucial assets that are cached unconditionally, and a bundle of larger, optional assets that are only cached ahead of time when navigator.connection.type is 'ethernet' or 'wifi'.
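
For example, that optional pre-caching might look something like this in a service worker’s install handler (the asset lists are placeholders, and support for navigator.connection varies between browsers):

// Placeholder asset lists; navigator.connection is not available everywhere
addEventListener('install', installEvent => {
  const crucialAssets = ['/css/styles.css', '/js/scripts.js'];
  const optionalAssets = ['/video/intro.mp4', '/img/gallery.jpg'];
  installEvent.waitUntil(
    caches.open('assets')
    .then(assetCache => {
      return assetCache.addAll(crucialAssets)
      .then(() => {
        const connectionType = navigator.connection && navigator.connection.type;
        if (connectionType === 'ethernet' || connectionType === 'wifi') {
          return assetCache.addAll(optionalAssets);
        }
      });
    })
  );
});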

There are potential security issues around fingerprinting that are addressed in this document.

Friday, September 22nd, 2017

How much storage space is my Progressive Web App using? | Dean Hume

You can use navigator.storage.estimate() to get a (vague) idea of how much space is available on a device for your service worker caches.
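
It returns a promise that resolves with rough usage and quota figures (in bytes):

// The numbers are deliberately approximate to reduce fingerprinting
navigator.storage.estimate()
.then(estimate => {
  console.log('Using ' + estimate.usage + ' of ' + estimate.quota + ' bytes.');
});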

Thursday, May 18th, 2017

It’s not working! Should I blame caching?

Finally, the answer to one of the two hard questions in computer science.

Monday, May 1st, 2017

Offline Web Applications | Udacity

This is a free online video course recorded by Jake a couple of years back. It’s got a really good step-by-step introduction to service workers, delivered in Jake’s typically witty way. Some of the details are a bit out of date, and I must admit that I bailed when it got to IndexedDB, but I highly recommend giving this a go.

There’s also a free course on web accessibility I’m planning to check out.