Saturday, September 21st, 2019

Going offline with microformats

For the offline page on my website, I’ve been using a mixture of the Cache API and the localStorage API. My service worker script uses the Cache API to store copies of pages for offline retrieval. But I used the localStorage API to store metadata about the page—title, description, and so on. Then, my offline page would rifle through the pages stored in a cache, and retrieve the corresponding metadata from localStorage.

It all worked fine, but as soon as I read Remy’s post about the forehead-slappingly brilliant technique he’s using, I knew I’d be switching my code over. Instead of using localStorage—or any other browser API—to store and retrieve metadata, he uses the pages themselves! Using the Cache API, you can examine the contents of the pages you’ve stored, and get at whatever information you need:

I realised I didn’t need to store anything. HTML is the API.

Refactoring the code for my offline page felt good for a couple of reasons. First of all, I was able to remove a dependency—localStorage—and simplify the JavaScript. That always feels good. But the other reason for the warm fuzzies is that I was able to use data instead of metadata.

Many years ago, Cory Doctorow wrote a piece called Metacrap. In it, he enumerates the many issues with metadata—data about data. The source of many problems is when the metadata is stored separately from the data it describes. The data may get updated, without a corresponding update happening to the metadata. Metadata tends to rot because it’s invisible—out of sight and out of mind.

In fact, that’s always been at the heart of one of the core principles behind microformats. Instead of duplicating information—once as data and again as metadata—repurpose the visible data; mark it up so its meta-information is directly attached to the information itself.

So if you have a person’s contact details on a web page, rather than repeating that information somewhere else—in the head of the document, say—you could instead attach some kind of marker to indicate which bits of the visible information are contact details. In the case of microformats, that’s done with class attributes. You can mark up a page that already has your contact information with classes from the h-card microformat.
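
For example, a chunk of contact information might be marked up something like this (the values here are placeholders; the class names are what matter):

<div class="h-card">
  <span class="p-name">Jane Doe</span>
  <a class="u-url" href="https://example.com">example.com</a>
  <a class="u-email" href="mailto:jane@example.com">jane@example.com</a>
</div>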

Here on my website, I’ve marked up my blog posts, articles, and links using the h-entry microformat. These classes explicitly mark up the content to say “this is the title”, “this is the content”, and so on. This makes it easier for other people to repurpose my content. If, for example, I reply to a post on someone else’s website, and ping them with a webmention, they can retrieve my post and know which bit is the title, which bit is the content, and so on.
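
Stripped right down, a post marked up with h-entry looks something like this (the element choices are illustrative—it’s the class names that carry the meaning):

<article class="h-entry">
  <h1 class="p-name">The title of the post</h1>
  <time class="dt-published" datetime="2019-09-21T12:00:00+01:00">September 21st, 2019</time>
  <div class="e-content">
    <p>The content of the post…</p>
  </div>
</article>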

When I read Remy’s post about using the Cache API to retrieve information directly from cached pages, I knew I wouldn’t have to do much work. Because all of my posts are already marked up with h-entry classes, I could use those hooks to create a nice offline page.

The markup for my offline page looks like this:

<h1>Offline</h1>
<p>Sorry. It looks like the network connection isn’t working right now.</p>
<div id="history">
</div>

I’ll populate that “history” div with information from a cache called “pages” that I’ve created using the Cache API in my service worker.
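
The service worker itself isn’t the focus here, but for context, the relevant part looks roughly like this—a sketch rather than my actual script—stashing a copy of each HTML page it fetches into that “pages” cache:

addEventListener('fetch', fetchEvent => {
  const request = fetchEvent.request;
  // Only interested in requests for pages
  if ((request.headers.get('Accept') || '').includes('text/html')) {
    fetchEvent.respondWith(
      fetch(request)
      .then(responseFromFetch => {
        // Put a copy of the page in the "pages" cache
        const copy = responseFromFetch.clone();
        caches.open('pages')
        .then(pagesCache => pagesCache.put(request, copy));
        return responseFromFetch;
      })
      .catch(() => caches.match(request)) // fall back to the cache when the network fails
    );
  }
});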

I’m going to use async/await to do this because there are lots of steps that rely on the completion of the step before. “Open this cache, then get the keys of that cache, then loop through the pages, then…” All of those thens would lead to some serious indentation without async/await.

Async functions don’t strictly have to have a name, but giving this one a name makes it easier to talk about. I’m calling it listPages, just like Remy is doing. I’m making the listPages function execute immediately:

(async function listPages() {
...
})();

Now for the code to go inside that immediately-invoked function.

I create an array called browsingHistory that I’ll populate with the data I’ll use for that “history” div.

const browsingHistory = [];

I’m going to be parsing web pages later on, so I’m going to need a DOM parser. I give it the imaginative name of …parser.

const parser = new DOMParser();

Time to open up my “pages” cache. This is the first await statement. When the cache is opened, this promise will resolve and I’ll have access to this cache using the variable …cache (again with the imaginative naming).

const cache = await caches.open('pages');

Now I get the keys of the cache—that’s a list of all the page requests in there. This is the second await. Once the keys have been retrieved, I’ll have a variable that’s got a list of all those pages. You’ll never guess what I’m calling the variable that stores the keys of the cache. That’s right …keys!

const keys = await cache.keys();

Time to get looping. I’m getting each request in the list of keys using a for/of loop:

for (const request of keys) {
...
}

Inside the loop, I pull the page out of the cache using the match() method of the Cache API. I’ll store what I get back in a variable called response. As with everything involving the Cache API, this is asynchronous so I need to use the await keyword here.

const response = await cache.match(request);

I’m not interested in the headers of the response. I’m specifically looking for the HTML itself. I can get at that using the text() method. Again, it’s asynchronous and I want this promise to resolve before doing anything else, so I use the await keyword. When the promise resolves, I’ll have a variable called html that contains the body of the response.

const html = await response.text();

Now I can use that DOM parser I created earlier. I’ve got a string of text in the html variable. I can generate a Document Object Model from that string using the parseFromString() method. This isn’t asynchronous so there’s no need for the await keyword.

const dom = parser.parseFromString(html, 'text/html');

Now I’ve got a DOM, which I have creatively stored in a variable called …dom.

I can poke at it using DOM methods like querySelector. I can test to see if this particular page has an h-entry on it by looking for an element with a class attribute containing the value “h-entry”:

if (dom.querySelector('.h-entry h1.p-name')) {
...
}

In this particular case, I’m also checking to see if the h1 element of the page is the title of the h-entry. That’s so that index pages (like my home page) won’t get past this if statement.

Inside the if statement, I’m going to store the data I retrieve from the DOM. I’ll save the data into an object called …data!

const data = new Object;

Well, the first piece of data isn’t actually in the markup: it’s the URL of the page. I can get that from the request variable in my for loop.

data.url = request.url;

I’m going to store the timestamp for this h-entry. I can get that from the datetime attribute of the time element marked up with a class of dt-published.

data.timestamp = new Date(dom.querySelector('.h-entry .dt-published').getAttribute('datetime'));

While I’m at it, I’m going to grab the human-readable date from the innerText property of that same time.dt-published element.

data.published = dom.querySelector('.h-entry .dt-published').innerText;

The title of the h-entry is in the innerText of the element with a class of p-name.

data.title = dom.querySelector('.h-entry .p-name').innerText;

At this point, I am actually going to use some metacrap instead of the visible h-entry content. I don’t output a description of the post anywhere in the body of the page, but I do put it in the head in a meta element. I’ll grab that now.

data.description = dom.querySelector('meta[name="description"]').getAttribute('content');

Alright. I’ve got a URL, a timestamp, a publication date, a title, and a description, all retrieved from the HTML. I’ll stick all of that data into my browsingHistory array.

browsingHistory.push(data);

My if statement and my for/of loop are finished at this point. Here’s how the whole loop looks:

for (const request of keys) {
  const response = await cache.match(request);
  const html = await response.text();
  const dom = parser.parseFromString(html, 'text/html');
  if (dom.querySelector('.h-entry h1.p-name')) {
    const data = new Object;
    data.url = request.url;
    data.timestamp = new Date(dom.querySelector('.h-entry .dt-published').getAttribute('datetime'));
    data.published = dom.querySelector('.h-entry .dt-published').innerText;
    data.title = dom.querySelector('.h-entry .p-name').innerText;
    data.description = dom.querySelector('meta[name="description"]').getAttribute('content');
    browsingHistory.push(data);
  }
}

That’s the data collection part of the code. Now I’m going to take all that yummy information and output it onto the page.

First of all, I want to make sure that the browsingHistory array isn’t empty. There’s no point going any further if it is.

if (browsingHistory.length) {
...
}

Within this if statement, I can do what I want with the data I’ve put into the browsingHistory array.

I’m going to arrange the data by date published. I’m not sure if this is the right thing to do. Maybe it makes more sense to show the pages in the order in which you last visited them. I may end up removing this at some point, but for now, here’s how I sort the browsingHistory array according to the timestamp property of each item within it:

browsingHistory.sort( (a,b) => {
  return b.timestamp - a.timestamp;
});

Now I’m going to concatenate some strings. This is the string of HTML text that will eventually be put into the “history” div. I’m storing the markup in a string called …markup (my imagination knows no bounds).

let markup = '<p>But you still have something to read:</p>';

I’m going to add a chunk of markup for each item of data.

browsingHistory.forEach( data => {
  markup += `
<h2><a href="${ data.url }">${ data.title }</a></h2>
<p>${ data.description }</p>
<p class="meta">${ data.published }</p>
`;
});

With my markup assembled, I can now insert it into the “history” part of my offline page. I’m using the handy insertAdjacentHTML() method to do this.

document.getElementById('history').insertAdjacentHTML('beforeend', markup);

Here’s what my finished JavaScript looks like:

<script>
(async function listPages() {
  const browsingHistory = [];
  const parser = new DOMParser();
  const cache = await caches.open('pages');
  const keys = await cache.keys();
  for (const request of keys) {
    const response = await cache.match(request);
    const html = await response.text();
    const dom = parser.parseFromString(html, 'text/html');
    if (dom.querySelector('.h-entry h1.p-name')) {
      const data = new Object;
      data.url = request.url;
      data.timestamp = new Date(dom.querySelector('.h-entry .dt-published').getAttribute('datetime'));
      data.published = dom.querySelector('.h-entry .dt-published').innerText;
      data.title = dom.querySelector('.h-entry .p-name').innerText;
      data.description = dom.querySelector('meta[name="description"]').getAttribute('content');
      browsingHistory.push(data);
    }
  }
  if (browsingHistory.length) {
    browsingHistory.sort( (a,b) => {
      return b.timestamp - a.timestamp;
    });
    let markup = '<p>But you still have something to read:</p>';
    browsingHistory.forEach( data => {
      markup += `
<h2><a href="${ data.url }">${ data.title }</a></h2>
<p>${ data.description }</p>
<p class="meta">${ data.published }</p>
`;
    });
    document.getElementById('history').insertAdjacentHTML('beforeend', markup);
  }
})();
</script>

I’m pretty happy with that. It’s not too long but it’s still quite readable (I hope). It shows that the Cache API and the h-entry microformat are a match made in heaven.

If you’ve got an offline strategy for your website, and you’re using h-entry to mark up your content, feel free to use that code.

If you don’t have an offline strategy for your website, there’s a book for that.

Thursday, September 19th, 2019

An HTML attribute potentially worth $4.4M to Chipotle - Cloud Four

When I liveblogged Jason’s talk at An Event Apart in Chicago, I included this bit of reporting:

Jason proceeds to relate a long and involved story about buying burritos online from Chipotle.

Well, here is that story. It’s a good one, with some practical takeaways (if you’ll pardon the pun):

  1. Use HTML5 input features
  2. Support autofill
  3. Make autofill part of your test plans

Friday, September 13th, 2019

5G Will Definitely Make the Web Slower, Maybe | Filament Group, Inc.

The Jevons Paradox in action:

Faster networks should fix our performance problems, but so far, they have had an interesting if unintentional impact on the web. This is because historically, faster network speed has enabled developers to deliver more code to users—in particular, more JavaScript code.

And because it’s JavaScript we’re talking about:

Even if folks are on a new fast network, they’re very likely choking on the code we’re sending, rendering the potential speed improvements of 5G moot.

The longer I spend in this field, the more convinced I am that web performance is not a technical problem; it’s a people problem.

Tuesday, September 10th, 2019

Request mapping

The Request Map Generator is a terrific tool. It’s made by Simon Hearne and uses the WebPageTest API.

You pop in a URL, it fetches the page and maps out all the subsequent requests in a nifty interactive diagram of circles, showing how many requests third-party scripts are themselves generating. I’ve found it to be a very effective way of showing the impact of third-party scripts to people who aren’t interested in looking at waterfall diagrams.

I was wondering… Wouldn’t it be great if this were built into browsers?

We already have a “Network” tab in our developer tools. The purpose of this tab is to show requests coming in. The browser already has all the information it needs to make a diagram of requests in the same way that the request map generator does.

In Firefox, there’s a little clock icon in the bottom left corner of the “Network” tab. Clicking that shows a pie-chart view of requests. That’s useful, but I’d love it if there were the option to also see the connected circles that the request map generator shows.

Just a thought.

Website Carbon Calculator – Calculate your websites carbon emissions

Get an idea of how much your website is contributing to the climate crisis.

In total, the internet produces 2% of global carbon emissions, roughly the same as that bad boy of climate change, the aviation industry.

Saturday, September 7th, 2019

How Video Games Inspire Great UX – Scott Jenson

Six UX lessons from game design:

  1. Story vs Narrative (Think in terms of story arcs)
  2. Games are fractal (Break up the journey from big to small to tiny)
  3. Learning loop (figure out your core mechanic)
  4. Affordances (Prompt for known loops)
  5. Hintiness (Move to new loops)
  6. Pacing (Be sure to start here)

Tuesday, September 3rd, 2019

How Web Content Can Affect Power Usage | WebKit

The way you build web pages—using IntersectionObserver, for example—can have a direct effect on the climate emergency.

Webpages can be good citizens of battery life.

It’s important to measure the battery impact in Web Inspector and drive those costs down.
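
The gist is to do work only when it’s needed. For instance, rather than polling on a timer to find out whether an element has scrolled into view, let the browser tell you—a rough sketch (the element and its data attribute are placeholders):

const video = document.querySelector('.lazy-video');
const observer = new IntersectionObserver(entries => {
  entries.forEach(entry => {
    if (entry.isIntersecting) {
      // Only start the expensive work once the element is actually visible
      video.src = video.dataset.src;
      observer.unobserve(entry.target);
    }
  });
});
observer.observe(video);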

Saturday, August 31st, 2019

Less Data Doesn’t Mean a Lesser Experience | TimKadlec.com

If you treat data as a constraint in your design and development process, you’ll likely be able to brainstorm a large number of different ways to keep data usage to a minimum while still providing an excellent experience. Doing less doesn’t mean it has to feel broken.

Tuesday, August 27th, 2019

The Weight of the WWWorld is Up to Us by Patty Toland

It’s Patty Toland’s first time at An Event Apart! She’s from the fantabulous Filament Group. They’re dedicated to making the web work for everyone.

A few years ago, a good friend of Patty’s had a medical diagnosis that required everyone to pull together. Another friend shared an article about how not to say the wrong thing. This is ring theory. In a moment of crisis, the person involved is in the centre. You need to understand where you are in this ring structure, and only ever help and comfort inwards and dump concerns and problems outwards.

At the same time, Patty spent time with her family at the beach. Everyone read the same books together. There was a book about a platoon leader in Vietnam. 80% of the story was literally a litany of stuff—what everyone was carrying. This was peppered with the psychic and emotional loads that they were carrying.

A month later there was a lot of coverage of Syrian refugees arriving in Europe. People were outraged to see refugees carrying smartphones as though that somehow showed they weren’t in a desperate situation. But smartphones are absolutely a necessity in that situation, and most of the phones were less expensive, lower-end devices. Refugeeinfo.eu was a useful site for people in crisis, but the navigation was designed to require JavaScript.

When people think about mobile, they think about freedom and mobility. But with that JavaScript decision, the developers piled baggage on to the users.

There was a common assertion that slow networks were a third-world challenge. Remember Facebook’s network challenges? They always talked about new markets in India and Africa. The implication is that this isn’t our problem in, say, Omaha or New York.

Pew Research provided a lot of data back then that showed that this thinking was wrong. Use of cell phones, especially smartphones and tablets, escalated dramatically in the United States. There was a trend towards mobile-only usage. This was in low-income households—about one third of the population. Among 5,400 panelists, 15% did not have a JavaScript-enabled device.

Pew Research provided updated data this year. The research shows an increase in those trends. Half of the population access the web primarily on mobile. The cost of a broadband subscription is too expensive for many people. Sometimes broadband access simply isn’t available.

There’s a term called “the homework gap.” Two thirds of teachers assign broadband-dependent homework, while one third of students have no access to broadband.

At most 37% of people have unlimited data. Most people run out of data on a frequent basis.

Speed also varies wildly. 4G doesn’t really mean anything. The data is all over the place.

This shows that network issues are definitely not just a third world challenge.

On the 25th anniversary of the web, Tim Berners-Lee said the web’s potential was only just beginning to be glimpsed. Everyone has a role to play to ensure that the web serves all of humanity. In his contract for the web, Tim outlined what governments, companies, and users need to do. This reminded Patty of ring theory. The user is at the centre. Designers and developers are in the next circle out. Then there’s the circle of companies. Then there are platforms, browsers, and frameworks. Finally there’s the outer circle of governments.

Are we helping in or dumping in? If you look at the data for the average web page size (2 megabytes), we are definitely dumping in. The size of third-party JavaScript has octupled.

There’s no way for a user to know before clicking a link how big and bloated the page is going to be. Even if they abandon the page load, they’ve still used (and wasted) a lot of data.

Third party scripts—like ads—are really bad at dumping in (to use the ring theory model). The best practices for ads suggest that up to 100 additional HTTP requests is totally acceptable. Unbelievable! It doesn’t matter how performant you’ve made a site when this crap gets piled on top of it.

In 2018, the internet’s data centres alone may already have had the same carbon footprint as all global air travel. This will probably triple in the next seven years. The amount of carbon it takes to train a single AI algorithm is more than the entire life cycle of a car. Then there’s fucking Bitcoin. A single Bitcoin transaction could power 21 US households. It is designed to use—specifically, waste—more and more energy over time.

What should we be doing?

Accessibility should be at the heart of what we build. Plan, test, educate, and advocate. If advocacy doesn’t work, fear can be a motivator. There’s an increase in accessibility lawsuits.

Our websites should be as light as possible. Ask, measure, monitor, and optimise. RequestMap is a great tool for visualising requests. You can see the size and scale of third-party requests. You can also see when images are far, far bigger than they need to be.

Take a critical eye to everything and pare everything down. Set performance budgets—file size budgets, for example. Optimise images, subset custom fonts, lazyload images and videos, get third-party tools out of the critical path (or out completely), and seek out lighter frameworks.

Test on real devices that real people are using. See Alex Russell’s data on the differences between the kind of devices we use and typical low-end devices. We literally need to stop drowning people in JavaScript.

Push the boundaries. See the amazing work that Adrian Holovaty did with Soundslice. He had to make on-the-fly sheet music generation work on old iPads that musicians like to use. He recommends keeping old devices around to see how poorly your product works on them.

If you have some power, then your job is to empower somebody else.

—Toni Morrison

Web Forms: Now You See Them, Now You Don’t! by Jason Grigsby

Jason is on stage at An Event Apart Chicago in a tuxedo. He wants to talk about how we can make web forms magical. Oh, I see. That explains the get-up.

We’re always being told to make web forms shorter. Luke Wroblewski has highlighted the work of companies that have reduced form fields and increased conversion.

But what if we could get rid of forms altogether? Wouldn’t that be magical!

Jason will reveal the secrets to this magic. But first—a volunteer from the audience, please! Please welcome Joe to the stage.

Joe will now log in on a phone. He types in the username. Then the password. The password is a hodge-podge of special characters, numbers, and uppercase and lowercase letters. Joe starts typing. Jason takes the phone and logs in without typing anything!

The secret: Jason was holding an NFC security key in his hand. That works with a new web standard called WebAuthn.

Passwords are terrible. People share them across sites, but who can blame them? It’s hard to remember lots of passwords. The only people who love usernames and passwords are hackers. So sites are developing other methods to try to keep people secure. Two factor authentication helps, although it doesn’t help us with phishing attacks. The hacker gets the password from the phished user …and then gets the one-time code from the phished user too.

But a physical device like a security key solves this problem. So why aren’t we all using security keys (apart from the fear of losing the key)? Well, until WebAuthn, there wasn’t a way for websites to use the keys.

A web server generates a challenge—a long string—that gets sent to the browser and passed along to the user’s device. The user’s device generates a credential ID and public and private keys for that domain. The web site stores the public key and credential ID. From then on, the credential ID is used by the website in challenges to users logging in.
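
In code, that registration step looks roughly like this—a sketch only; the challenge and user details would come from the server, and this assumes it’s running inside an async function:

const credential = await navigator.credentials.create({
  publicKey: {
    challenge: challengeFromServer, // a Uint8Array supplied by the web server
    rp: {name: 'Example Site'}, // the "relying party", i.e. the website
    user: {
      id: userIdFromServer, // a Uint8Array identifying the account
      name: 'user@example.com',
      displayName: 'Example User'
    },
    pubKeyCredParams: [{type: 'public-key', alg: -7}] // ES256
  }
});
// credential.rawId (the credential ID) and the attestation response
// then get sent to the server, which stores the public key.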

There were three common ways that we historically proved who we claimed to be.

  1. Something you know (e.g. a password).
  2. Something you have (e.g. a security key).
  3. Something you are (e.g. biometric information).

These are factors of identification. So two-factor identification is the combination of any of those two. If you use a security key combined with a fingerprint scanner, there’s no need for passwords.

The browser support for the web authentication API (WebAuthn) is a bit patchy right now but you can start playing around with it.

There are a few other options for making logging in faster. There’s the Credential Management API. It allows someone to access passwords stored in their browser’s password manager. But even though it’s newer, there’s actually better browser support for WebAuthn than Credential Management.
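
Using the Credential Management API looks something like this—a sketch, inside an async function; signIn is a stand-in for whatever your site’s log-in code happens to be:

if ('credentials' in navigator) {
  const storedCredential = await navigator.credentials.get({
    password: true, // ask the browser's password manager for a stored credential
    mediation: 'optional' // let the browser decide whether to show a chooser
  });
  if (storedCredential) {
    signIn(storedCredential.id, storedCredential.password);
  }
}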

Then there’s federated login, or social login. Jason has concerns about handing over log-in to a company like Facebook, Twitter, or Google, but then again, it means fewer passwords. As a site owner, there’s actually a lot of value in not storing log-in information—you won’t be accountable for data breaches. The problem is that you’ve got to decide which providers you’re going to support.

Also keep third-party password managers in mind. These tools—like 1Password—are great. In iOS they’re now nicely integrated at the operating system level, meaning Safari can use them. Finally it’s possible to log in to websites easily on a phone …until you encounter a website that prevents you logging in this way. Some websites get far too clever about detecting autofilled passwords.

Time for another volunteer from the audience. This is Tyler. Tyler will help Jason with a simple checkout form. Shipping information, credit card information, and so on. Jason will fill out this form blindfolded. Tyler will first verify that the dark goggles that Jason will be wearing don’t allow him to see the phone screen. Jason will put the goggles on and Tyler will hand him the phone with the checkout screen open.

Jason dons the goggles. Tyler hands him the phone. Jason does something. The form is filled in and submitted!

What was the secret? The goggles prevented Jason from seeing the phone …but they didn’t prevent the screen from seeing Jason. The goggles block everything but infrared. The iPhone uses infrared for Face ID. So to the iPhone, it just looked like Jason was wearing funky sunglasses. Face ID then triggered the Payment Request API.

The Payment Request API allows us to use various payment methods that are built in to the operating system, but without having to make separate implementations for each payment method. The site calls the Payment Request API if it’s supported (use feature detection and progressive enhancement), then triggers the payment UI in the browser. The browser—not the website!—then makes a call to the payment processing provider, e.g. Stripe.
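
A stripped-down sketch of that flow, using feature detection (the payment method and amounts are placeholders, and this assumes an async function):

if (window.PaymentRequest) {
  const request = new PaymentRequest(
    [{supportedMethods: 'basic-card'}], // placeholder payment method
    {
      total: {
        label: 'Order total',
        amount: {currency: 'USD', value: '10.00'}
      }
    }
  );
  const paymentResponse = await request.show(); // the browser shows its payment UI
  // hand the details off to your payment processor, then…
  await paymentResponse.complete('success');
} else {
  // fall back to the regular checkout form
}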

E-commerce sites using the Payment Request API have seen a big drop in abandonment and a big increase in completed payments. The browser support is pretty good, especially on mobile. And remember, you can use it as a progressive enhancement. It’s kind of weird that we don’t encounter it more often—it’s been around for a few years now.

Jason read the fine print for Apple Pay, Google Pay, Microsoft Pay, and Samsung Pay. It doesn’t look like there’s anything onerous in there that would stop you using them.

On some phones, you can now scan credit cards using the camera. This is built in to the operating system so as a site owner, you’ve just got to make sure not to break it. It’s really an extension of autofill. You should know what values the autocomplete attribute can take. There are 48 different values; it’s not just for checkouts. When users use autofill, they fill out forms 30% faster. So make sure you don’t put obstacles in the way of autofill in your forms.

Jason proceeds to relate a long and involved story about buying burritos online from Chipotle. The upshot is: use the autocomplete, type, maxlength, and pattern attributes correctly on input elements. Test autofill with your forms. Make it part of your QA process.
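
In practice that means markup along these lines—the attribute values are the important bit; the fields shown here are just examples:

<label for="ccnumber">Card number</label>
<input id="ccnumber" type="text" inputmode="numeric" autocomplete="cc-number" pattern="[0-9 ]+" maxlength="19">

<label for="postcode">Postal code</label>
<input id="postcode" type="text" autocomplete="shipping postal-code">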

So, to summarise, here’s how you make your forms disappear:

  1. Start by reducing the number of form fields.
  2. Use the correct HTML to support autofill. Support password managers and password-pasting. At least don’t break that behaviour.
  3. Provide alternate ways of logging in. Federated login or the Credentials API.
  4. Test autofill and other form features.
  5. Look for opportunities to replace forms entirely with biometrics.

Any sufficiently advanced technology is indistinguishable from magic.

—Arthur C. Clarke’s Third Law

Don’t our users deserve magical experiences?

Monday, August 26th, 2019

Opening up the AMP cache

I have a proposal that I think might alleviate some of the animosity around Google AMP. You can jump straight to the proposal or get some of the back story first…

The AMP format

Google AMP is exactly the kind of framework I’d like to get behind. Unlike most front-end frameworks, its components take a declarative approach—no knowledge of JavaScript required. I think Lea’s excellent Mavo is the only other major framework that takes this inclusive approach. All the configuration happens in markup, and all the styling happens in CSS. Excellent!

But I cannot get behind AMP.

Instead of competing on its own merits, AMP is unfairly propped up by the search engine of its parent company, Google. That makes it very hard to evaluate whether AMP is being used on its own merits. Instead, the evidence suggests that most publishers of AMP pages are doing so because they feel they have to, rather than because they want to. That’s a real shame, because as a library of web components, AMP seems pretty good. But there’s just no way to evaluate AMP-the-format without taking into account AMP-the-ecosystem.

The AMP ecosystem

Google AMP ostensibly exists to make the web faster. Initially the focus was specifically on mobile performance, but that distinction has since fallen by the wayside. The idea is that by using AMP’s web components, your pages will be speedy. Though, as Andy Davies points out, this isn’t always the case:

This is where I get confused… https://independent.co.uk only have an AMP site yet it’s performance is awful from a user perspective - isn’t AMP supposed to prevent this?

See also: Google AMP lowered our page speed, and there’s no choice but to use it:

According to Google’s own Page Speed Insights audit (which Google recommends to check your performance), the AMP version of articles got an average performance score of 87. The non-AMP versions? 95.

Publishers who already have fast web pages—like The Guardian—are still compelled to make AMP versions of their stories because of the search benefits reserved for AMP. As Terence Eden reported from a meeting of the AMP advisory committee:

We heard, several times, that publishers don’t like AMP. They feel forced to use it because otherwise they don’t get into Google’s news carousel — right at the top of the search results.

Some people felt aggrieved that all the hard work they’d done to speed up their sites was for nothing.

The Google AMP team are at pains to point out that AMP is not a ranking factor in search. That’s true. But it is unfairly privileged in other ways. Only AMP pages can appear in the Top Stories carousel …which appears above any other search results. As I’ve said before:

Now, if you were to ask any right-thinking person whether they think having their page appear right at the top of a list of search results would be considered preferential treatment, I think they would say hell, yes! This is the only reason why The Guardian, for instance, even have AMP versions of their content—it’s not for the performance benefits (their non-AMP pages are faster); it’s for that prime real estate in the carousel.

From A letter about Google AMP:

Content that “opts in” to AMP and the associated hosting within Google’s domain is granted preferential search promotion, including (for news articles) a position above all other results.

That’s not the only way that AMP pages get preferential treatment. It turns out that the secret to the speed of AMP pages isn’t the web components. It’s the prerendering.

The AMP cache

If you’ve ever seen an AMP page in a list of search results, you’ll have noticed the little lightning icon. If you’ve ever tapped on that search result, you’ll have noticed that the page loads blazingly fast!

That’s not down to AMP-the-format, alas. That’s down to the fact that the page has been prerendered by Google before you even went to it. If any page were prerendered that way, it would load blazingly fast. But currently, this privilege is reserved for AMP pages only.

If, after tapping through to that AMP page, you looked at the address bar of your browser, you might have noticed something odd. Even though you might have thought you were visiting The Washington Post, or The New York Times, the URL of the (blazingly fast) page you’re looking at is still under Google’s domain. That’s because Google hosts any AMP pages that it prerenders.

Google calls this “the AMP cache”, but it would be better described as “AMP hosting”. The web page sent down the wire is hosted on Google’s domain.

Here’s that AMP letter again:

When a user navigates from Google to a piece of content Google has recommended, they are, unwittingly, remaining within Google’s ecosystem.

Through gritted teeth, I will refer to this as “the AMP cache”, because that’s what everyone else calls it. But make no mistake, Google is hosting—not caching—these pages.

But why host the pages on a Google domain? Why not prerender the original URLs?

Prerendering and privacy

Scott summed up the situation with AMP nicely:

The pitch I think site owners are hearing is: let us host your pages on our domain and we’ll promote them in search results AND preload them so they feel “instant.” To opt-in, build pages using this component syntax.

But perhaps we could de-couple the AMP format from the AMP cache.

That’s what Terence suggests:

My recommendation is that Google stop requiring that organisations use Google’s proprietary mark-up in order to benefit from Google’s promotion.

The AMP letter, too:

Instead of granting premium placement in search results only to AMP, provide the same perks to all pages that meet an objective, neutral performance criterion such as Speed Index.

Scott reiterates:

It’s been said before but it would be so good for the web if pages with a Lighthouse score over say, 90 could get into that top search result area, even if they’re not built using Google’s AMP framework. Feels wrong to have to rebuild/reproduce an already-fast site just for SEO.

This was also what I was calling for. But then Malte pointed out something that stumped me. Privacy.

Here’s the problem…

Let’s say Google do indeed prerender already-fast pages when they’re listed in search results. You, a search user, type something into Google. A list of results comes back. Google begins pre-rendering some of them. But you don’t end up clicking through to those pages. Nonetheless, the servers those pages are hosted on have received a GET request coming from a Google search. Those publishers now know that a particular (cookied?) user could have clicked through to their site. That’s very different from knowing when someone has actually arrived at a particular site.

And that’s why Google host all the AMP pages that they prerender. Given the privacy implications of prerendering non-Google URLs, I must admit that I see their point.

Still, it’s a real shame to miss out on the speed benefit of prerendering:

Prerendering AMP documents leads to substantial improvements in page load times. Page load time can be measured in different ways, but they consistently show that prerendering lets users see the content they want faster. For now, only AMP can provide the privacy preserving prerendering needed for this speed benefit.

A modest proposal

Why is Google’s AMP cache just for AMP pages? (Y’know, apart from the obvious answer that it’s in the name.)

What if Google were allowed to host non-AMP pages? Google search could then prerender those pages just like it currently does for AMP pages. There would be no privacy leaks; everything would happen on the same domain—google.com or ampproject.org or whatever—just as currently happens with AMP pages.

Don’t get me wrong: I’m not suggesting that Google should make a 1:1 model of the web just to prerender search results. I think that the implementation would need to have two important requirements:

  1. Hosting needs to be opt-in.
  2. Only fast pages should be prerendered.

Opting in

Currently, by publishing a page using the AMP format, publishers give implicit approval to Google to host that page on Google’s servers and serve up this Google-hosted version from search results. This has always struck me as being legally iffy. I’ve looked in the AMP documentation to try to find any explicit granting of hosting permission (e.g. “By linking to this JavaScript file, you hereby give Google the right to serve up our copies of your content.”), but no luck. So even with the current situation, I think a clear opt-in for hosting would be beneficial.

This could be a meta element. Maybe something like:

<meta name="caches-allowed" content="google">

This would have the nice benefit of allowing comma-separated values:

<meta name="caches-allowed" content="google, yandex">

(The name is just a strawman, by the way—I’m not suggesting that this is what the final implementation would actually look like.)

If not a meta element, then perhaps this could be part of robots.txt? Although my feeling is that this needs to happen on a document-by-document basis rather than site-wide.

Many people will, quite rightly, never want Google—or anyone else—to host and serve up their content. That’s why it’s so important that this behaviour needs to be opt-in. It’s kind of appalling that the current hosting of AMP pages is opt-in-by-proxy-sort-of.

Criteria for prerendering

Which pages should be blessed with hosting and prerendering? The fast ones. That’s sorta the whole point of AMP. But right now, there’s a lot of resentment by people with already-fast websites who quite rightly feel they shouldn’t have to use the AMP format to benefit from the AMP ecosystem.

Page speed is already a ranking factor. It doesn’t seem like too much of a stretch to extend its benefits to hosting and prerendering. As mentioned above, there are already a few possible metrics to use:

  • Page Speed Index
  • Lighthouse
  • Web Page Test

Ah, but what if a page has a good score when it’s indexed, but then gets worse afterwards? Not a problem! The version of the page that’s measured is the same version of the page that gets hosted and prerendered. Google can confidently say “This page is fast!” After all, they’re the ones serving up the page.

That does raise the question of how often Google should check back with the original URL to see if it has changed/worsened/improved. The answer to that question is however long it currently takes to check back in on AMP pages:

Each time a user accesses AMP content from the cache, the content is automatically updated, and the updated version is served to the next user once the content has been cached.

Issues

This proposal does not solve the problem with the address bar. You’d still find yourself looking at a page from The Washington Post or The New York Times (or adactio.com) but seeing a completely different URL in your browser. That’s not good, for all the reasons outlined in the AMP letter.

In fact, this proposal could potentially make the situation worse. It would allow even more sites to be impersonated by Google’s URLs. Where currently only AMP pages are bad actors in terms of URL confusion, opening up the AMP cache would allow equal opportunity URL confusion.

What I’m suggesting is definitely not a long-term solution. The long-term solutions currently being investigated are technically tricky and will take quite a while to come to fruition—web packages and signed exchanges. In the meantime, what I’m proposing is a stopgap solution that’s technically a lot simpler. But it won’t solve all the problems with AMP.

This proposal solves one problem—AMP pages being unfairly privileged in search results—but does nothing to solve the other, perhaps more serious problem: the erosion of site identity.

Measuring

Currently, Google can assess whether a page should be hosted and prerendered by checking to see if it’s a valid AMP page. That test would need to be widened to include a different measurement of performance, but those measurements already exist.

I can see how this assessment might not be as quick as checking for AMP validity. That might affect whether non-AMP pages could be measured quickly enough to end up in the Top Stories carousel, which is, by its nature, time-sensitive. But search results are not necessarily as time-sensitive. Let’s start there.

Assets

Currently, AMP pages can be prerendered without fetching anything other than the markup of the AMP page itself. All the CSS is inline. There are no initial requests for other kinds of content like images. That’s because there are no img elements on the page: authors must use amp-img instead. The image itself isn’t loaded until the user is on the page.

If the AMP cache were to be opened up to non-AMP pages, then any content required for prerendering would also need to be hosted on that same domain. Otherwise, there’s privacy leakage.

This definitely introduces an extra level of complexity. Paths to assets within the markup might need to be re-written to point to the Google-hosted equivalents. There would almost certainly need to be a limit on the number of assets allowed. Though, for performance, that’s no bad thing.

Make no mistake, figuring out what to do about assets—style sheets, scripts, and images—is very challenging indeed. Luckily, there are very smart people on the Google AMP team. If that brainpower were to focus on this problem, I am confident they could solve it.

Summary

  1. Prerendering of non-Google URLs is problematic for privacy reasons, so Google needs to be able to host pages in order to prerender them.
  2. Currently, that’s only done for pages using the AMP format.
  3. The AMP cache—and with it, prerendering—should be decoupled from the AMP format, and opened up to other fast web pages.

There will be technical challenges, but hopefully nothing insurmountable.

I honestly can’t see what Google have to lose here. If their goal is genuinely to reward fast pages, then opening up their AMP cache to fast non-AMP pages will actively encourage people to make fast web pages (without having to switch over to the AMP format).

I’ve deliberately kept the details vague—what the opt-in should look like; what the speed measurement should be; how to handle assets—I’m sure smarter folks than me can figure that stuff out.

I would really like to know what other people think about this proposal. Obviously, I’d love to hear from members of the Google AMP team. But I’d also love to hear from publishers. And I’d very much like to know what people in the web performance community think about this. (Write a blog post and send me a webmention.)

What am I missing here? What haven’t I thought of? What are the potential pitfalls (and are they any worse than the current acrimonious situation with Google AMP)?

I would really love it if someone with a fast website were in a position to say, “Hey Google, I’m giving you permission to host this page so that it can be prerendered.”

I would really love it if someone with a slow website could say, “Oh, shit! We’d better make our existing website faster or Google won’t host our pages for prerendering.”

And I would dearly love to finally be able to embrace AMP-the-format with a clear conscience. But as long as prerendering is joined at the hip to the AMP format, the injustice of the situation only harms the AMP project.

Google, open up the AMP cache.

Saturday, August 24th, 2019

“Never-Slow Mode” (a.k.a. “Slightly-Fast Mode”) Explained

I would very much like this to become a reality.

Never-Slow Mode (“NSM”) is a mode that sites can opt-into via HTTP header. For these sites, the browser imposes per-interaction resource limits, giving users a better user experience, potentially at the cost of extra developer work. We believe users are happier and more engaged on fast sites, and NSM attempts to make it easier for sites to guarantee speed to users. In addition to user experience benefits, sites might want to opt in because browsers could provide UI to users to indicate they are in “fast mode” (a TLS lock icon but for speed).

Friday, August 23rd, 2019

Is client side A/B testing always a bad idea in your experience? · Issue #53 · csswizardry/ama

Harry enumerates the reasons why client-side A/B testing is terrible:

  • It typically blocks rendering.
  • Providers are almost always off-site.
  • It happens on every page load.
  • No user-benefitting reuse.
  • They likely skip any governance process.

While your engineers are subject to linting, code-reviews, tests, auditors, and more, your marketing team have free rein of the front-end.

Note that the problem here is not A/B testing per se, it’s client-side A/B testing. For some reason, we seem to have collectively decided that A/B testing—like analytics—is something we should offload to the JavaScript parser in the user’s browser.

Thursday, August 22nd, 2019

Paint Holding - reducing the flash of white on same-origin navigations | Web | Google Developers

This is an excellent UX improvement in Chrome. For sites like The Session, where page loads are blazingly fast, this really makes them feel like single page apps.

Our goal with this work was for navigations in Chrome between two pages that are of the same origin to be seamless and thus deliver a fast default navigation experience with no flashes of white/solid-color background between old and new content.

This is exactly the kind of area where browsers can innovate and compete on the UX of the browser itself, rather than trying to compete on proprietary additions to what’s being rendered.

Wednesday, August 21st, 2019

Accessibility and web performance are not features, they’re the baseline | CSS-Tricks

Performance and accessibility aren’t features that can linger at the bottom of a Jira board to be considered later when it’s convenient.

Instead we must start to see inaccessible and slow websites for what they are: a form of cruelty. And if we want to build a web that is truly a World Wide Web, a place for all and everyone, a web that is accessible and fast for as many people as possible, and one that will outlive us all, then first we must make our websites something else altogether; we must make them kind.

Saturday, August 10th, 2019

Server Timing

Harry wrote a really good article all about the performance measurement Time To First Byte. Time To First Byte: What It Is and Why It Matters:

While a good TTFB doesn’t necessarily mean you will have a fast website, a bad TTFB almost certainly guarantees a slow one.

Time To First Byte has been the chink in my armour over at thesession.org, especially on the home page. Every time I ran Lighthouse, or some other performance testing tool, I’d get a high score …with some points deducted for taking too long to get that first byte from the server.

Harry’s proposed solution is to set up some Server Timing headers:

With a little bit of extra work spent implementing the Server Timing API, we can begin to measure and surface intricate timings to the front-end, allowing web developers to identify and debug potential bottlenecks previously obscured from view.

I remembered that Drew wrote an excellent article on Smashing Magazine last year called Measuring Performance With Server Timing:

The job of Server Timing is not to help you actually time activity on your server. You’ll need to do the timing yourself using whatever toolset your backend platform makes available to you. Rather, the purpose of Server Timing is to specify how those measurements can be communicated to the browser.

He even provides some PHP code, which I was able to take wholesale and drop into the codebase for thesession.org. Then I was able to put start/stop points in my code for measuring how long some operations were taking. Then I could output the results of these measurements into Server Timing headers that I could inspect in the “Network” tab of a browser’s dev tools (Chrome is particularly good for displaying Server Timing, so I used that while I was conducting this experiment).
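
For reference, a Server-Timing header is just a comma-separated list of named metrics, each with an optional duration and description—something like this (the metric names here are made up):

Server-Timing: db;dur=53.2;desc="Database", tmpl;dur=12.1;desc="Templating"

Those same metrics are also exposed to JavaScript through the Performance API, so you could log them like this (a sketch):

const [navigationEntry] = performance.getEntriesByType('navigation');
navigationEntry.serverTiming.forEach(metric => {
  console.log(`${metric.description || metric.name}: ${metric.duration}ms`);
});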

I started with overall database requests. Sure enough, that was where most of the time in time-to-first-byte was being spent.

Then I got more granular. I put start/stop points around specific database calls. By doing this, I was able to zero in on which operations were particularly costly. Once I had done that, I had to figure out how to make the database calls go faster.

Spoiler: I did it by adding an extra index on one particular table. It’s almost always indexes, in my experience, that make the biggest difference to database performance.

I don’t know why it took me so long to get around to messing with Server Timing headers. It has paid off in spades. I wish I had done it sooner.

And now thesession.org is positively zipping along!

Friday, August 9th, 2019

Redux: Lazy loading youtube embeds

Remy has an excellent improvement on that article I linked to yesterday on using srcdoc with iframes. Rather than using srcdoc instead of src, you can use srcdoc as well as src. That way you can support older browsers too!
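
The general idea, roughly sketched (the video ID is a placeholder): the srcdoc content is a lightweight thumbnail that links to the real embed, and browsers that don’t understand srcdoc fall back to the src attribute.

<iframe
  src="https://www.youtube.com/embed/VIDEO_ID"
  srcdoc="<style>a{display:flex;width:100%;height:100%}</style>
    <a href='https://www.youtube.com/embed/VIDEO_ID?autoplay=1'>
      <img src='https://img.youtube.com/vi/VIDEO_ID/hqdefault.jpg' alt='Play the video'>
    </a>"
  allowfullscreen>
</iframe>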

Time to First Byte: What It Is and Why It Matters by Harry Roberts

Harry takes a deep dive into the performance metric of “time to first byte”, or TTFB if you’re using initialisms that take as long to say as the thing they’re abbreviating.

This makes a great companion piece to Drew’s article on server timing headers.

Thursday, August 8th, 2019

Lazy load embedded YouTube videos - DEV Community 👩‍💻👨‍💻

This is a clever use of the srcdoc attribute on iframes.

Native lazy-loading for the web | web.dev

The title is somewhat misleading—currently it’s about native lazy-loading for Chrome, which is not (yet) the web.

I’ve just been adding loading="lazy" to most of the iframes and many of the images on adactio.com, and it’s working a treat …in Chrome.
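
For what it’s worth, it’s a one-attribute progressive enhancement—browsers that don’t understand the attribute simply ignore it (the paths here are placeholders):

<img src="/images/photo.jpg" alt="A description of the photo" width="800" height="600" loading="lazy">
<iframe src="https://www.youtube.com/embed/VIDEO_ID" loading="lazy" allowfullscreen></iframe>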