The transcript of Andy’s talk from this year’s State Of The Browser conference.
I don’t think using scale as an excuse for over-engineering stuff—especially CSS—is acceptable, even for huge teams that work on huge products.
The transcript of Andy’s talk from this year’s State Of The Browser conference.
I don’t think using scale as an excuse for over-engineering stuff—especially CSS—is acceptable, even for huge teams that work on huge products.
When your only tool seems like a smartphone, everything looks like an app.
Amber writes on Ev’s blog about products that deliberately choose to be dependent on smartphone connectivity:
We read service outage stories like these seemingly every week, and have become numb to the fundamental reality: The idea of placing the safety of yourself, your child, or another loved one in the hands of an app dependent on a server you cannot touch, control, or know the status of, is utterly unacceptable.
Back in the late 2000s, I used to go to Copenhagen every for an event called Reboot. It was a fun, eclectic mix of talks and discussions, but alas, the last one was over a decade ago.
It was organised by Thomas Madsen-Mygdal. I hadn’t seen Thomas in years, but then, earlier this year, our paths crossed when I was back at CERN for the 30th anniversary of the web. He got a real kick out of the browser recreation project I was part of.
I few months ago, I got an email from Thomas about the new event he’s running in Copenhagen called Techfestival. He was wondering if there was some way of making the WorldWideWeb project part of the event. We ended up settling on having a stand—a modern computer running a modern web browser running a recreation of the first ever web browser from almost three decades ago.
So I showed up at Techfestival and found that the computer had been set up in a Shoreditchian shipping container. I wasn’t exactly sure what I was supposed to do, so I just hung around nearby until someone wandering by would pause and start tentatively approaching the stand.
“Would you like to try the time machine?” I asked. Nobody refused the offer. I explained that they were looking at a recreation of the world’s first web browser, and then showed them how they could enter a URL to see how the oldest web browser would render a modern website.
Lots of people entered
google.com, but some people had their own websites, either personal or for their business. They enjoyed seeing how well (or not) their pages held up. They’d take photos of the screen.
People asked lots of questions, which I really enjoyed answering. After a while, I was able to spot the themes that came up frequently. Some people were confusing the origin story of the internet with the origin story of the web, so I was more than happy to go into detail on either or both.
The experience helped me clarify in my own mind what was exciting and interesting about the birth of the web—how much has changed, and how much and stayed the same.
All of this very useful fodder for a conference talk I’m putting together. This will be a joint talk with Remy at the Fronteers conference in Amsterdam in a couple of weeks. We’re calling the talk How We Built the World Wide Web in Five Days:
The World Wide Web turned 30 years old this year. To mark the occasion, a motley group of web nerds gathered at CERN, the birthplace of the web, to build a time machine. The first ever web browser was, confusingly, called WorldWideWeb. What if we could recreate the experience of using it …but within a modern browser! Join (Je)Remy on a journey through time and space and code as they excavate the foundations of Tim Berners-Lee’s gloriously ambitious and hacky hypertext system that went on to conquer the world.
Neither of us is under any illusions about the nature of a joint talk. It’s not half as much work; it’s more like twice the work. We’ve both seen enough uneven joint presentations to know what we want to avoid.
We’ve been honing the material and doing some run-throughs at the Clearleft HQ at 68 Middle Street this week. The talk has a somewhat unusual structure with two converging timelines. I think it’s going to work really well, but I won’t know until we actually deliver the talk in Amsterdam. I’m excited—and a bit nervous—about it.
Whether it’s in a shipping container in Copenhagen or on a stage in Amsterdam, I’m starting to realise just how much I enjoy talking about web history.
Drag this to your browser’s bookmark bar now!
The video of a talk in which Mark discusses pace layers, dogs, and design systems. He concludes:
It’s true many design systems are the blueprints for manufacturing and large scale application. But in almost every instance I can think of, once you move from design to manufacturing the horse has bolted. It’s very difficult to move back into design because the results of the system are in the wild. The more strict the system, the less able you are to change it. That’s why broad principles, just enough governance, and directional examples are far superior to locked-down cookie cutters.
When you ever had to fix just a few lines of CSS and it took two hours to get an ancient version of Gulp up and running, you know what I’m talking about.
I feel seen.
When everything works, it feels like magic. When something breaks, it’s hell.
I concur with Bastian’s advice:
I have a simple rule of thumb when it comes to programming:
less code === less potential issues
And this observation rings very true:
This dependency hell is also the reason why old projects are almost like sealed capsules. You can hardly let a project lie around for more than a year, because afterwards it’s probably broken.
An interesting proposal to allow websites to detect certain SMS messages. The UX implications are fascinating.
If you haven’t done so already, you should really switch to Firefox.
Then encourage your friends and family to switch to Firefox too.
Ignore the clickbaity headline and have a read of Whitney Kimball’s obituaries of Friendster, MySpace, Bebo, OpenSocial, ConnectU, Tribe.net, Path, Yik Yak, Ello, Orkut, Google+, and Vine.
I’m sure your content on Facebook, Twitter, and Instagram is perfectly safe.
If you treat data as a constraint in your design and development process, you’ll likely be able to brainstorm a large number of different ways to keep data usage to a minimum while still providing an excellent experience. Doing less doesn’t mean it has to feel broken.
A few years ago, a good friend of Patty’s had a medical diagnosis that required everyone to pull together. Another friend shared an article about how not to say the wrong thing. This is ring theory. In a moment of crisis, the person involved is in the centre. You need to understand where you are in this ring structure, and only ever help and comfort inwards and dump concerns and problems outwards.
At the same time, Patty spent time with her family at the beach. Everyone reads the same books together. There was a book about a platoon leader in Vietnam. 80% of the story was literally a litany of stuff—what everyone was carrying. This was peppered with the psychic and emotional loads that they were carrying.
There was a common assertion that slow networks were a third-world challenge. Remember Facebook’s network challenges? They always talked about new markets in India and Africa. The implication is that this isn’t our problem in, say, Omaha or New York.
Pew Research provided updated data this year. The research shows an increase in those trends. Half of the population access the web primarily on mobile. The cost of a broadband subscription is too expensive for many people. Sometimes broadband access simply isn’t available.
There’s a term called “the homework gap.” Two thirds of teachers assign broadband-dependent homework, while one third of students have no access to broadband.
At most 37% of people have unlimited data. Most people run out of data on a frequent basis.
Speed also varies wildly. 4G doesn’t really mean anything. The data is all over the place.
This shows that network issues are definitely not just a third world challenge.
On the 25th anniversary of the web, Tim Berners-Lee said the web’s potential was only just beginning to be glimpsed. Everyone has a role to play to ensure that the web serves all of humanity. In his contract for the web, Tim outlined what governments, companies, and users need to do. This reminded Patty of ring theory. The user is at the centre. Designers and developers are in the next circle out. Then there’s the circle of companies. Then there are platforms, browsers, and frameworks. Finally there’s the outer circle of governments.
There’s no way for a user to know before clicking a link how big and bloated the page is going to be. Even if they abandon the page load, they’ve still used (and wasted) a lot of data.
Third party scripts—like ads—are really bad at dumping in (to use the ring theory model). The best practices for ads suggest that up to 100 additional HTTP requests is totally acceptable. Unbelievable! It doesn’t matter how performant you’ve made a site when this crap gets piled on top of it.
In 2018, the internet’s data centres alone may already have had the same carbon footprint as all global air travel. This will probably triple in the next seven years. The amount of carbon it takes to train a single AI algorithm is more than the entire life cycle of a car. Then there’s fucking Bitcoin. A single Bitcoin transaction could power 21 US households. It is designed to use—specifically, waste—more and more energy over time.
What should we be doing?
Accessibility should be at the heart of what we build. Plan, test, educate, and advocate. If advocacy doesn’t work, fear can be a motivator. There’s an increase in accessibility lawsuits.
Our websites should be as light as possible. Ask, measure, monitor, and optimise. RequestMap is a great tool for visualising requests. You can see the size and scale of third-party requests. You can also see when images are far, far bigger than they need to be.
Take a critical guide to everything and pare everything down. Set perforance budgets—file size budgets, for example. Optimise images, subset custom fonts, lazyload images and videos, get third-party tools out of the critical path (or out completely), and seek out lighter frameworks.
Push the boundaries. See the amazing work that Adrian Holovaty did with Soundslice. He had to make on-the-fly sheet music generation work on old iPads that musicians like to use. He recommends keeping old devices around to see how poorly your product is working on it.
If you have some power, then your job is to empower somebody else.
Jason is on stage at An Event Apart Chicago in a tuxedo. He wants to talk about how we can make web forms magical. Oh, I see. That explains the get-up.
We’re always being told to make web forms shorter. Luke Wroblewski has highlighted the work of companies that have reduced form fields and increased conversion.
But what if we could get rid of forms altogether? Wouldn’t that be magical!
Jason will reveal the secrets to this magic. But first—a volunteer from the audience, please! Please welcome Joe to the stage.
Joe will now log in on a phone. He types in the username. Then the password. The password is hodge-podge of special characters, numbers and upper and lowercase letters. Joe starts typing. Jason takes the phone and logs in without typing anything!
The secret: Jason was holding an NFC security key in his hand. That works with a new web standard called WebAuthn.
Passwords are terrible. People share them across sites, but who can blame them? It’s hard to remember lots of passwords. The only people who love usernames and passwords are hackers. So sites are developing other methods to try to keep people secure. Two factor authentication helps, although it doesn’t help us with phishing attacks. The hacker gets the password from the phished user …and then gets the one-time code from the phished user too.
But a physical device like a security key solves this problem. So why aren’t we all using security keys (apart from the fear of losing the key)? Well, until WebAuthn, there wasn’t a way for websites to use the keys.
A web server generates a challenge—a long string—that gets sent to a website and passed along to the user. The user’s device generates a credential ID and public and private keys for that domain. The web site stores the public key and credential ID. From then on, the credential ID is used by the website in challenges to users logging in.
There were three common ways that we historically proved who we claimed to be.
These are factors of identification. So two-factor identification is the combination of any of those two. If you use a security key combined with a fingerprint scanner, there’s no need for passwords.
The browser support for the web authentication API (WebAuthn) is a bit patchy right now but you can start playing around with it.
There are a few other options for making logging in faster. There’s the Credential Management API. It allows someone to access passwords stored in their browser’s password manager. But even though it’s newer, there’s actually better browser support for WebAuthn than Credential Management.
Then there’s federated login, or social login. Jason has concerns about handing over log-in to a company like Facebook, Twitter, or Google, but then again, it means fewer passwords. As a site owner, there’s actually a lot of value in not storing log-in information—you won’t be accountable for data breaches. The problem is that you’ve got to decide which providers you’re going to support.
Also keep third-party password managers in mind. These tools—like 1Password—are great. In iOS they’re now nicely integrated at the operating system level, meaning Safari can use them. Finally it’s possible to log in to websites easily on a phone …until you encounter a website that prevents you logging in this way. Some websites get far too clever about detecting autofilled passwords.
Time for another volunteer from the audience. This is Tyler. Tyler will help Jason with a simple checkout form. Shipping information, credit card information, and so on. Jason will fill out this form blindfolded. Tyler will first verify that the dark goggles that Jason will be wearing don’t allow him to see the phone screen. Jason will put the goggles on and Tyler will hand him the phone with the checkout screen open.
Jason dons the goggles. Tyler hands him the phone. Jason does something. The form is filled in and submitted!
What was the secret? The goggles prevented Jason from seeing the phone …but they didn’t prevent the screen from seeing Jason. The goggles block everything but infrared. The iPhone uses infrared for Face ID. So the iPhone, it just looked like Jason was wearing funky sunglasses. Face ID then triggered the Payment Request API.
The Payment Request API allows us to use various payment methods that are built in to the operating system, but without having to make separate implementations for each payment method. The site calls the Payment Request API if it’s supported (use feature detection and progressive enhancement), then trigger the payment UI in the browser. The browser—not the website!—then makes a call to the payment processing provider e.g. Stripe.
E-commerce sites using the Payment Request API have seen a big drop in abandonment and a big increase in completed payments. The browser support is pretty good, especially on mobile. And remember, you can use it as a progressive enhancement. It’s kind of weird that we don’t encounter it more often—it’s been around for a few years now.
Jason read the fine print for Apple Pay, Google Pay, Microsoft Pay, and Samsung Pay. It doesn’t like there’s anything onerous in there that would stop you using them.
On some phones, you can now scan credit cards using the camera. This is built in to the operating system so as a site owner, you’ve just got to make sure not to break it. It’s really an extension of autofill. You should know what values the
autocomplete attribute can take. There are 48 different values; it’s not just for checkouts. When users use autofill, they fill out forms 30% faster. So make sure you don’t put obstacles in the way of autofill in your forms.
Jason proceeds to relate a long and involved story about buying burritos online from Chipotle. The upshot is: use the
pattern attributes correctly on
input elements. Test autofill with your forms. Make it part of your QA process.
So, to summarise, here’s how you make your forms disappear:
Any sufficiently advanced technology is indistinguishable from magic.
—Arthur C. Clarke’s Third Law
Don’t our users deserve magical experiences?
Successful voice interfaces aren’t necessarily solving new problems. They’re used to solve problems that other devices have already solved. Think about kitchen timers. There are lots of ways to set a timer. Your oven might have one. Your phone has one. Why use a $200 device to solve this mundane problem? Same goes for listening to music, news, and weather.
People are using voice interfaces for solving ordinary problems. Why? Context matters. If you’re carrying a toddler, then setting a kitchen timer can be tricky so a voice-activated timer is quite appealing. But why is voice is happening now?
Humans have been developing the art of conversation for thousands of years. It’s one of the first skills we learn. It’s deeply instinctual. Most humans use speach instinctively every day. You can’t necessarily say that about using a keyboard or a mouse.
Voice-based user interfaces are not new. Not just the idea—which we’ve seen in Star Trek—but the actual implementation. Bell Labs had Audrey back in 1952. It recognised ten words—the digits zero through nine. Why did it take so long to get to Alexa?
In the late 70s, DARPA issued a challenge to create a voice-activated system. Carnagie Mellon came up with Harpy (with a thousand word grammar). But none of the solutions could respond in real time. In conversation, we expect a break of no more than 200 or 300 milliseconds.
In the 1980s, computing power couldn’t keep up with voice technology, so progress kind of stopped. Time passed. Things finally started to catch up in the 90s with things like Dragon Naturally Speaking. But that was still about vocabulary, not grammar. By the 2000s, small grammars were starting to show up—starting an X-Box or pausing Netflix. In 2008, Google Voice Search arrived on the iPhone and natural language interaction began to arrive.
What makes natural language interactions so special? It requires minimal training because it uses the conversational muscles we’ve been working for a lifetime. It unlocks the ability to have more forgiving, less robotic conversations with devices. There might be ten different ways to set a timer.
Natural language interactions can also free us from “screen magnetism”—that tendency to stay on a device even when our original task is complete. Voice also enables fast and forgiving searches of huge catalogues without time spent typing or browsing. You can pick a needle straight out of a haystack.
Natural language interactions are excellent for older customers. These interfaces don’t intimidate people without dexterity, vision, or digital experience. Voice input often leads to more inclusive experiences. Many customers with visual or physical disabilities can’t use traditional graphical interfaces. Voice experiences throw open the door of opportunity for some people. However, voice experience can exclude people with speech difficulties.
There’s a misconception that you need to work at Amazon, Google, or Apple to work on a voice interface, or at least that you need to have a big product team. But Cheryl was able to make her first Alexa “skill” in a week. If you’re a web developer, you’re good to go. Your voice “interaction model” is just JSON.
How do you get your product team on board? Find the customers (and situations) you might have excluded with traditional input. Tell the stories of people whose hands are full, or who are vision impaired. You can also point to the adoption rate numbers for smart speakers.
You’ll need to show your scenario in context. Otherwise people will ask, “why can’t we just build an app for this?” Conduct research to demonstrate the appeal of a voice interface. Storyboarding is very useful for visualising the context of use and highlighting existing pain points.
You’ve got to understand how the technology works in order to adapt to how it fails. Here are a few basic concepts.
Utterance. A word, phrase, or sentence spoken by a customer. This is the true form of what the customer provides.
Intent. This is the meaning behind a customer’s request. This is an important distinction because one intent could have thousands of different utterances.
Prompt. The text of a system response that will be provided to a customer. The audio version of a prompt, if needed, is generated separately using text to speech.
Grammar. A finite set of expected utterances. It’s a list. Usually, each entry in a grammar is paired with an intent. Many interfaces start out as being simple grammars before moving on to a machine-learning model later once the concept has been proven.
Here’s the general idea with “artificial intelligence”…
There’s a human with a core intent to do something in the real world, like knowing when the cookies in the oven are done. This is translated into an intent like, “set a 15 minute timer.” That’s the utterance that’s translated into a string. But it hasn’t yet been parsed as language. That string is passed into a natural language understanding system. What comes is a data structure that represents the customers goal e.g. intent=timer; duration=15 minutes. That’s sent to the business logic where a timer is actually step. For a good voice interface, you also want to send back a response e.g. “setting timer for 15 minutes starting now.”
That seems simple enough, right? What’s so hard about designing for voice?
Natural language interfaces are a form of artifical intelligence so it’s not deterministic. There’s a lot of ruling out false positives. Unlike graphical interfaces, voice interfaces are driven by probability.
How do you turn a sound wave into an understandable instruction? It’s a lot like teaching a child. You feed a lot of data into a statistical model. That’s how machine learning works. It’s a probability game. That’s where it gets interesting for design—given a bunch of possible options, we need to use context to zero in on the most correct choice. This is where confidence ratings come in: the system will return the probability that a response is correct. Effectively, the system is telling you how sure or not it is about possible results. If the customer makes a request in an unusual or unexpected way, our system is likely to guess incorrectly. That’s because the system is being given something new.
Designing a conversation is relatively straightforward. But 80% of your voice design time will be spent designing for what happens when things go wrong. In voice recognition, edge cases are front and centre.
Here’s another challenge. Interaction with most voice interfaces is part conversation, part performance. Most interactions are not private.
Humans don’t distinguish digital speech fom human speech. That means these devices are intrinsically social. Our brains our wired to try to extract social information, even form digital speech. See, for example, why it’s such a big question as to what gender a voice interface has.
Storyboards help depict the context of use. Sample dialogues are your new wireframes. These are little scripts that not only cover the happy path, but also your edge case. Then you reverse engineer from there.
Flow diagrams communicate customer states, but don’t use the actual text in them.
Prompt lists are your final deliverable.
Functional prototypes are really important for voice interfaces. You’ll learn the real way that customers will ask for things.
If you build a working prototype, you’ll be building two things: a natural language interaction model (often a JSON file) and custom business logic (in a programming language).
Eventually voice design will become a core competency, much like mobile, which was once separate.
Ask yourself what tasks your customers complete on your site that feel clunkly. Remember that voice desing is almost never about new scenarious. Start your journey into voice interfaces by tackling old problems in new, more inclusive ways.
May the voice be with you!
Research gets done …and then sits in a report, gathering dust.
Research matters. But how do we make it count? We need allies. Maybe we need more money. Perhaps we need more participation from people not on the product team.
If you’re doing real research on a schedule, sharing it on a regular basis, making people’s eyes light up …then you’ve won!
Research counts when it answers questions that people care about. But you probably don’t want to directly ask “Hey, what questions do you want answered?”
Research can explain oddities in analytics weird feedback from customers, unexpected uses of products, and strange hunches (not just your own).
Curious people with power are the most useful ones to influence. Not just hierarchical power. Engineers often have a lot of power. So ask, “Who is the most curious engineer, and how can I drag them out on a research session with me?”
At 18F, Cyd found that a lot of the nodes of power were in the mid level of the organisation who had been there a while—they know a lot of people up and down the chain. If you can get one of those people excited about research, they can spread it.
Open up your practice. Demystify it. Put as much effort into communicating as into practicing. Create opportunities for people to ask questions and learn.
You can think about communities of practice in the obvious way: people who do similar things to us, and other people who make design decisions. But really, everyone in the organisation is affected by design decisions.
Cyd likes to do office hours. People can come by and ask questions. You could open a Slack channel. You can run brown bag lunches to train people in basic user research techniques. In more conventional organisations, a newsletter is a surprisingly effective tool for sharing the latest findings from research. And use your walls to show work in progress.
Research counts when people can see it for themselves—not just when it’s reported from afar. Ask yourself: who in your organisation is disconnected from their user? It’s difficult for people to maintain their motivation in that position.
When someone has been in the field with you, the data doesn’t have to be explained.
Whoever’s curious. Whoever’s disconnected. Invite them along. Show them what you’re doing.
Think about the qualities of a good invitation (for a party, say). Make the rules clear. Make sure they want to come back. Design the experience of observing research. Make sure everyone has tools. Give everyone a responsibility. Be like Willy Wonka—he gave clear rules to the invitied guests. And sure, things didn’t go great when people broke the rules, but at the end, everyone still went home with the truckload of chocolate they were promised.
People who get to ask a question buy in to the results. Those people feel a sense of ownership for the research.
Research counts when methods fit the question. Think about what the right question is and how you might go about answering it.
You can mix your methods. Interviews. Diary studies. Card sorting. Shadowing. You can ground the user research in competitor analysis.
Back in 2008, Cyd was contacted by a company who wanted to know: how do people really use phones in their cars? Cyd’s team would ride along with people, interviewing them, observing them, taking pictures and video.
Later at the federal government, Cyd was asked: what are the best practices for government digital transformation? How to answer that? It’s so broad! Interviews? Who knows what?
They refined the question: what makes modern digital practices stick within a government entity? They looked at what worked when companies were going online, so see if there was anything that government could learn from. Then they created a set of really focused interview questions. What does digital transformation mean? How do you know when you’re done? What are the biggest obstacles to this work? How do you make changes last?
They used atechnique called cluster recruiting to figure out who else to talk to (by asking participants who else they should be talking to).
There is no one research method that will always work for you. Cutting the right corners at the right time lets you be fast and cheap. Cyd’s bare-bones research kit costs about $20: a notebook, a pen, a consent form, and the price of a cup of coffee. She also created a quick score sheet for when she’s not in a position to have research transcribed.
Always label your assumptions before beginning your research. Maybe you’re assuming that something is a frustrating experience that needs fixing, but it might emerge that it doesn’t need fixing—great! You’ve just saved a whole lotta money.
Research counts when researchers tell the story well. Synthesis works best as a conversational practice. It’s hard to do by yourself. You start telling stories when you come back from the field (sometimes it starts when you’re still out in the field, talking about the most interesting observations).
Miller’s Law is a great conceptual framework:
To understand what another person is saying, you must assume that it is true and try to imagine what it could be true of.
You’re probably familiar with the “five whys”. What about the “five ways”? If people talk about something five different ways, it’s virtually certain that one of them will be an apt metaphor. So ask “Can you say that in a different way?” five time.
Spend as much time on communicating outcomes as you did on executing the work.
After research, play back how many people you spoke to, the most valuable insight you gained, the themes that are emerging. Describe the question you wanted to answer, what answers you got, and what you’re going to do next. If you’re in an organisation that values memos, write a memo. Or you could make a video. Or you could write directly into backlog tickets. And don’t forget the wall work! GDS have wonderfully full walls in their research department.
In the end, the best tool for research is an illuminating story.
Cyd was doing research at the Bakersfield courthouse. The hypothesis was that a lot of people weren’t engaging with technology in the court system. She approached a man named Manuel who was positively quaking. He was going through a custody battle. He said, “I don’t know technology but it doesn’t scare me. I’m shaking because this paperwork just gets to me—it’s terrifying.” He said who would gladly pay for someone to help him with the paperwork. Cyd wrote a report on this story. Months later, they heard people in the organisation asking questions like “How would this help Manuel?”
Sometimes you do have to fight (nicely).
People will push back on the time spent on research—they’ll say it doesn’t fit the sprint plan. You can have a three day research plan. Day 1: write scripts. Day 2: go to the users and talk to them. Day 3: play it back. People on a project spend more time than that in Slack.
People will say you can’t talk to the customers. In that situation, you could talk to people who are in the same sector as your company’s customers.
People will question the return on investment for research. Do it cheaply and show the very low costs. Then people stop talking about the money and start talking about the results.
People will claim that qualitative user research is not statistically significant. That’s true. But research is something else. It answers different question.
People will question whether a senior person needs to be involved. It is not fair to ask the intern to do all the work involved in research.
People will say you can’t always do research. But Cyd firmly believes that there’s always room for some research.
One of the best things you can do is be there, non-judgementally, making friends. It takes time, but it works. Research is like a dandelion in flight. Once it’s out and about, taking root, the more that research counts.
Maciej goes marching.
The protests are intentionally decentralized, using a jury-rigged combination of a popular message board, the group chat app Telegram, and in-person huddles at the protests.
This sounds like it shouldn’t possibly work, but the protesters are too young to know that it can’t work, so it works.
I have a proposal that I think might alleviate some of the animosity around Google AMP. You can jump straight to the proposal or get some of the back story first…
But I cannot get behind AMP.
Instead of competing on its own merits, AMP is unfairly propped up by the search engine of its parent company, Google. That makes it very hard to evaluate whether AMP is being used on its own merits. Instead, the evidence suggests that most publishers of AMP pages are doing so because they feel they have to, rather than because they want to. That’s a real shame, because as a library of web components, AMP seems pretty good. But there’s just no way to evaluate AMP-the-format without taking into account AMP-the-ecosystem.
Google AMP ostensibly exists to make the web faster. Initially the focus was specifically on mobile performance, but that distinction has since fallen by the wayside. The idea is that by using AMP’s web components, your pages will be speedy. Though, as Andy Davies points out, this isn’t always the case:
This is where I get confused… https://independent.co.uk only have an AMP site yet it’s performance is awful from a user perspective - isn’t AMP supposed to prevent this?
According to Google’s own Page Speed Insights audit (which Google recommends to check your performance), the AMP version of articles got an average performance score of 87. The non-AMP versions? 95.
Publishers who already have fast web pages—like The Guardian—are still compelled to make AMP versions of their stories because of the search benefits reserved for AMP. As Terence Eden reported from a meeting of the AMP advisory committee:
We heard, several times, that publishers don’t like AMP. They feel forced to use it because otherwise they don’t get into Google’s news carousel — right at the top of the search results.
Some people felt aggrieved that all the hard work they’d done to speed up their sites was for nothing.
The Google AMP team are at pains to point out that AMP is not a ranking factor in search. That’s true. But it is unfairly privileged in other ways. Only AMP pages can appear in the Top Stories carousel …which appears above any other search results. As I’ve said before:
Now, if you were to ask any right-thinking person whether they think having their page appear right at the top of a list of search results would be considered preferential treatment, I think they would say hell, yes! This is the only reason why The Guardian, for instance, even have AMP versions of their content—it’s not for the performance benefits (their non-AMP pages are faster); it’s for that prime real estate in the carousel.
Content that “opts in” to AMP and the associated hosting within Google’s domain is granted preferential search promotion, including (for news articles) a position above all other results.
That’s not the only way that AMP pages get preferential treatment. It turns out that the secret to the speed of AMP pages isn’t the web components. It’s the prerendering.
If you’ve ever seen an AMP page in a list of search results, you’ll have noticed the little lightning icon. If you’ve ever tapped on that search result, you’ll have noticed that the page loads blazingly fast!
That’s not down to AMP-the-format, alas. That’s down to the fact that the page has been prerendered by Google before you even went to it. If any page were prerendered that way, it would load blazingly fast. But currently, this privilege is reserved for AMP pages only.
If, after tapping through to that AMP page, you looked at the address bar of your browser, you might have noticed something odd. Even though you might have thought you were visiting The Washington Post, or The New York Times, the URL of the (blazingly fast) page you’re looking at is still under Google’s domain. That’s because Google hosts any AMP pages that it prerenders.
Google calls this “the AMP cache”, but it would be better described as “AMP hosting”. The web page sent down the wire is hosted on Google’s domain.
Here’s that AMP letter again:
When a user navigates from Google to a piece of content Google has recommended, they are, unwittingly, remaining within Google’s ecosystem.
Through gritted teeth, I will refer to this as “the AMP cache”, because that’s what everyone else calls it. But make no mistake, Google is hosting—not caching—these pages.
But why host the pages on a Google domain? Why not prerender the original URLs?
The pitch I think site owners are hearing is: let us host your pages on our domain and we’ll promote them in search results AND preload them so they feel “instant.” To opt-in, build pages using this component syntax.
But perhaps we could de-couple the AMP format from the AMP cache.
That’s what Terence suggests:
My recommendation is that Google stop requiring that organisations use Google’s proprietary mark-up in order to benefit from Google’s promotion.
Instead of granting premium placement in search results only to AMP, provide the same perks to all pages that meet an objective, neutral performance criterion such as Speed Index.
It’s been said before but it would be so good for the web if pages with a Lighthouse score over say, 90 could get into that top search result area, even if they’re not built using Google’s AMP framework. Feels wrong to have to rebuild/reproduce an already-fast site just for SEO.
Here’s the problem…
Let’s say Google do indeed prerender already-fast pages when they’re listed in search results. You, a search user, type something into Google. A list of results come back. Google begins pre-rendering some of them. But you don’t end up clicking through to those pages. Nonetheless, the servers those pages are hosted on have received a GET request coming from a Google search. Those publishers now know that a particular (cookied?) user could have clicked through to their site. That’s very different from knowing when someone has actually arrived at a particular site.
And that’s why Google host all the AMP pages that they prerender. Given the privacy implications of prerendering non-Google URLs, I must admit that I see their point.
Still, it’s a real shame to miss out on the speed benefit of prerendering:
Prerendering AMP documents leads to substantial improvements in page load times. Page load time can be measured in different ways, but they consistently show that prerendering lets users see the content they want faster. For now, only AMP can provide the privacy preserving prerendering needed for this speed benefit.
Why is Google’s AMP cache just for AMP pages? (Y’know, apart from the obvious answer that it’s in the name.)
What if Google were allowed to host non-AMP pages? Google search could then prerender those pages just like it currently does for AMP pages. There would be no privacy leaks; everything would happen on the same domain—google.com or ampproject.org or whatever—just as currently happens with AMP pages.
Don’t get me wrong: I’m not suggesting that Google should make a 1:1 model of the web just to prerender search results. I think that the implementation would need to have two important requirements:
This could be a
meta element. Maybe something like:
<meta name="caches-allowed" content="google">
This would have the nice benefit of allowing comma-separated values:
<meta name="caches-allowed" content="google, yandex">
(The name is just a strawman, by the way—I’m not suggesting that this is what the final implementation would actually look like.)
If not a
meta element, then perhaps this could be part of
robots.txt? Although my feeling is that this needs to happen on a document-by-document basis rather than site-wide.
Many people will, quite rightly, never want Google—or anyone else—to host and serve up their content. That’s why it’s so important that this behaviour needs to be opt-in. It’s kind of appalling that the current hosting of AMP pages is opt-in-by-proxy-sort-of.
Which pages should be blessed with hosting and prerendering? The fast ones. That’s sorta the whole point of AMP. But right now, there’s a lot of resentment by people with already-fast websites who quite rightly feel they shouldn’t have to use the AMP format to benefit from the AMP ecosystem.
Page speed is already a ranking factor. It doesn’t seem like too much of a stretch to extend its benefits to hosting and prerendering. As mentioned above, there are already a few possible metrics to use:
Ah, but what if a page has good score when it’s indexed, but then gets worse afterwards? Not a problem! The version of the page that’s measured is the same version of the page that gets hosted and prerendered. Google can confidently say “This page is fast!” After all, they’re the ones serving up the page.
That does raise the question of how often Google should check back with the original URL to see if it has changed/worsened/improved. The answer to that question is however long it currently takes to check back in on AMP pages:
Each time a user accesses AMP content from the cache, the content is automatically updated, and the updated version is served to the next user once the content has been cached.
This proposal does not solve the problem with the address bar. You’d still find yourself looking at a page from The Washington Post or The New York Times (or adactio.com) but seeing a completely different URL in your browser. That’s not good, for all the reasons outlined in the AMP letter.
In fact, this proposal could potentially make the situation worse. It would allow even more sites to be impersonated by Google’s URLs. Where currently only AMP pages are bad actors in terms of URL confusion, opening up the AMP cache would allow equal opportunity URL confusion.
What I’m suggesting is definitely not a long-term solution. The long-term solutions currently being investigated are technically tricky and will take quite a while to come to fruition—web packages and signed exchanges. In the meantime, what I’m proposing is a stopgap solution that’s technically a lot simpler. But it won’t solve all the problems with AMP.
This proposal solves one problem—AMP pages being unfairly privileged in search results—but does nothing to solve the other, perhaps more serious problem: the erosion of site identity.
Currently, Google can assess whether a page should be hosted and prerendered by checking to see if it’s a valid AMP page. That test would need to be widened to include a different measurement of performance, but those measurements already exist.
I can see how this assessment might not be as quick as checking for AMP validity. That might affect whether non-AMP pages could be measured quickly enough to end up in the Top Stories carousel, which is, by its nature, time-sensitive. But search results are not necessarily as time-sensitive. Let’s start there.
Currently, AMP pages can be prerendered without fetching anything other than the markup of the AMP page itself. All the CSS is inline. There are no initial requests for other kinds of content like images. That’s because there are no
img elements on the page: authors must use
amp-img instead. The image itself isn’t loaded until the user is on the page.
If the AMP cache were to be opened up to non-AMP pages, then any content required for prerendering would also need to be hosted on that same domain. Otherwise, there’s privacy leakage.
This definitely introduces an extra level of complexity. Paths to assets within the markup might need to be re-written to point to the Google-hosted equivalents. There would almost certainly need to be a limit on the number of assets allowed. Though, for performance, that’s no bad thing.
Make no mistake, figuring out what to do about assets—style sheets, scripts, and images—is very challenging indeed. Luckily, there are very smart people on the Google AMP team. If that brainpower were to focus on this problem, I am confident they could solve it.
There will be technical challenges, but hopefully nothing insurmountable.
I honestly can’t see what Google have to lose here. If their goal is genuinely to reward fast pages, then opening up their AMP cache to fast non-AMP pages will actively encourage people to make fast web pages (without having to switch over to the AMP format).
I’ve deliberately kept the details vague—what the opt-in should look like; what the speed measurement should be; how to handle assets—I’m sure smarter folks than me can figure that stuff out.
I would really like to know what other people think about this proposal. Obviously, I’d love to hear from members of the Google AMP team. But I’d also love to hear from publishers. And I’d very much like to know what people in the web performance community think about this. (Write a blog post and send me a webmention.)
What am I missing here? What haven’t I thought of? What are the potential pitfalls (and are they any worse than the current acrimonious situation with Google AMP)?
I would really love it if someone with a fast website were in a position to say, “Hey Google, I’m giving you permission to host this page so that it can be prerendered.”
I would really love it if someone with a slow website could say, “Oh, shit! We’d better make our existing website faster or Google won’t host our pages for prerendering.”
And I would dearly love to finally be able to embrace AMP-the-format with a clear conscience. But as long as prerendering is joined at the hip to the AMP format, the injustice of the situation only harms the AMP project.
Google, open up the AMP cache.
The web embodies principles of openness and portability and access that best align with the needs, and frankly the purpose, of the cultural heritage sector.
Aaron’s talk from the 2019 Museums and the Web conference.
In 2019 the web is not “sexy” anymore and compared to native platforms it can sometimes seems lacking, but I think that speaks as much to people’s desire for something “new” as it does to any apples to apples comparison. On measure – and that’s the important part: on measure – the web affords a better and more sustainable framework for the cultural heritage to work in than any of the shifting agendas of the various platform vendors.
Brendan describes the software he’s using to get away from Adobe’s mafia business model.