In Search of Lost Time, by Tom Vanderbilt
Temporal standards bodies.
Temporal standards bodies.
Ethan highlights a classic case of the McNamara Fallacy—measuring adoption of design system components.
This describes how I iterate on The Session:
It comes down to this annoying, upsetting, stupid fact: the only way to build a great product is to use it every day, to stare at it, to hold it in your hands to feel its lumps. The data and customers will lie to you but the product never will.
This whole post reminded of the episode of the Clearleft podcast on measuring design.
The problem underlying all this is that when it comes to building a product, all data is garbage, a lie, or measuring the wrong thing. Folks will be obsessed with clicks and charts and NPS scores—the NFTs of product management—and in this sea of noise they believe they can see the product clearly. There are courses and books and talks all about measuring happiness and growth—surveys! surveys! surveys!—with everyone in the field believing that they’ve built a science when they’ve really built a cult.
Spoiler: the answer to the question in the title is a resounding “hell yeah!”
Scott brings receipts.
If you’re travelling around Ireland, you may come across some odd pieces of 19th century architecture—walls, bridges, buildings and roads that serve no purpose. They date back to The Great Hunger of the 1840s. These “famine follies” were the result of a public works scheme.
The thinking went something like this: people are starving so we should feed them but we can’t just give people food for nothing so let’s make people do pointless work in exchange for feeding them (kind of like an early iteration of proof of work for cryptobollocks on blockchains …except with a blockchain, you don’t even get a wall or a road, just ridiculous amounts of wasted energy).
This kind of thinking seems reprehensible from today’s perspective. But I still see its echo in the work ethic espoused by otherwise smart people.
Here’s the thing: there’s good work and there’s working hard. What matters is doing good work. Often, to do good work you need to work hard. And so people naturally conflate the two, thinking that what matters is working hard. But whether you work hard or not isn’t actually what’s important. What’s important is that you do good work.
If you can do good work without working hard, that’s not a bad thing. In fact, it’s great—you’ve managed to do good work and do it efficiently! But often this very efficiency is treated as laziness.
Sensible managers are rightly appalled by so-called productivity tracking because it measures exactly the wrong thing. Those instruments of workplace surveillance measure inputs, not outputs (and even measuring outputs is misguided when what really matters are outcomes).
They can attempt to measure how hard someone is working, but they don’t even attempt to measure whether someone is producing good work. If anything, they actively discourage good work; there’s plenty of evidence to show that more hours equates to less quality.
I used to think that must be some validity to the belief that hard work has intrinsic value. It was a position that was espoused so often by those around me that it seemed a truism.
But after a few decades of experience, I see no evidence for hard work as an intrinsically valuable activity, much less a useful measurement. If anything, I’ve seen the real harm that can be caused by tying your self-worth to how much you’re working. That way lies burnout.
We no longer make people build famine walls or famine roads. But I wonder how many of us are constructing little monuments in our inboxes and calendars, filling those spaces with work to be done in an attempt to chase the rewards we’ve been told will result from hard graft.
I’d rather spend my time pursuing the opposite: the least work for the most people.
A fascinating interactive journey through biometrics using your face.
One of my favourite episodes of the Clearleft podcast is on measuring design. This post from Chris is a complements that episode in a sensible and practical style.
What gets measured gets done. You are what you measure. Measurement eliminates argument. If you work in an environment that puts store in these oft-quoted business adages then I urge you to take a moment to challenge your calculations. Let’s review our metrics to ensure they can stand up and be counted.
Eric’s response to Chris’s question—“What is one thing people can do to make their website better?”—dovetails nicely with my own answer:
The two real problems here are:
- Third-party assets, such as the very analytics and CRM packages you use to determine who is using your product and how they go about it. There’s no real control over the quality or amount of code they add to your site, and setting up the logic to block them loading their own third-party resources is difficult to do.
- The people who tell you to add these third-party assets. These people typically aren’t aware of the performance issues caused by the ask, or don’t care because it’s not part of the results they’re judged by.
We’ve got click rates, impressions, conversion rates, open rates, ROAS, pageviews, bounces rates, ROI, CPM, CPC, impression share, average position, sessions, channels, landing pages, KPI after never ending KPI.
That’d be fine if all this shit meant something and we knew how to interpret it. But it doesn’t and we don’t.
The reality is much simpler, and therefore much more complex. Most of us don’t understand how data is collected, how these mechanisms work and most importantly where and how they don’t work.
A good post by Andy on “the language of business,” which is most cases turns out to be numbers, numbers, numbers.
While it seems reasonable and fair to expect a modicum of self-awareness of why you’re employed and what business value you drive in the the context of the work you do, sometimes the incessant self-flagellation required to justify and explain this to those who hired you may be a clue to a much deeper and more troubling question at the heart of the organisation you work for.
This pairs nicely with the Clearleft podcast episode on measuring design.
I’ve noticed a trend in recent years—a trend that I’ve admittedly been part of myself—where performance-minded developers will rebuild a site and then post a screenshot of their Lighthouse score on social media to show off how fast it is.
Mea culpa! I should post my CrUX reports too.
But I’m going to respectfully decline Phil’s advice to use any of the RUM analytics providers he recommends that require me to put another script
element on my site. One third-party script is one third-party script too many.
I remember trying to convince people to use semantic markup because it’s good for accessibility. That tactic didn’t always work. When it didn’t, I would add “By the way, Google’s searchbot is indistinguishable from a screen-reader user so semantic markup is good for SEO.”
That usually worked. It always felt unsatisfying though. I don’t know why. It doesn’t matter if people do the right thing for the wrong reasons. The end result is what matters. But still. It never felt great.
It happened with responsive design and progressive enhancement too. If I couldn’t convince people based on user experience benefits, I’d pull up some official pronouncement from Google recommending those techniques.
Even AMP, a dangerously ill-conceived project, has one very handy ace in the hole. You can’t add third-party JavaScript cruft to AMP pages. That’s useful:
Beleaguered developers working for publishers of big bloated web pages have a hard time arguing with their boss when they’re told to add another crappy JavaScript tracking script or bloated library to their pages. But when they’re making AMP pages, they can easily refuse, pointing out that the AMP rules don’t allow it. Google plays the bad cop for us, and it’s a very valuable role.
AMP is currently dying, which is good news. Google have announced that core web vitals will be used to boost ranking instead of requiring you to publish in their proprietary AMP format. The really good news is that the political advantage that came with AMP has also been ported over to core web vitals.
Take user-hostile obtrusive overlays. Perhaps, as a contientious developer, you’ve been arguing for years that they should be removed from the site you work on because they’re so bad for the user experience. Perhaps you have been met with the same indifference that I used to get regarding semantic markup.
Well, now you can point out how those annoying overlays are affecting, for example, the cumulative layout shift for the site. And that number is directly related to SEO. It’s one thing for a department to over-ride UX concerns, but I bet they’d think twice about jeopardising the site’s ranking with Google.
I know it doesn’t feel great. It’s like dealing with a bully by getting an even bigger bully to threaten them. Still. Needs must.
Core web vitals from Google are the ingredients for an alphabet soup of exlusionary intialisms. But once you get past the unnecessary jargon, there’s a sensible approach underpinning the measurements.
From May—no, June—these measurements will be a ranking signal for Google search so performance will become more of an SEO issue. This is good news. This is what Google should’ve done years ago instead of pissing up the wall with their dreadful and damaging AMP project that blackmailed publishers into using a proprietary format in exchange for preferential search treatment. It was all done supposedly in the name of performance, but in reality all it did was antagonise users and publishers alike.
Core web vitals are an attempt to put numbers on user experience. This is always a tricky balancing act. You’ve got to watch out for the McNamara fallacy. Harry has already started noticing this:
A new and unusual phenomenon: clients reluctant (even refusing) to fix performance issues unless they directly improve Vitals.
Once you put a measurement on something, there’s a danger of focusing too much on the measurement. Chris is worried that we’re going to see tips’n’tricks for gaming core web vitals:
This feels like the start of a weird new era of web performance where the metrics of web performance have shifted to user-centric measurements, but people are implementing tricky strategies to game those numbers with methods that, if anything, slightly harm user experience.
The map is not the territory. The numbers are a proxy for user experience, but it’s notoriously difficult to measure intangible ideas like pain and frustration. As Laurie says:
This is 100% the downside of automatic tools that give you a “score”. It’s like gameification. It’s about hitting that perfect score instead of the holistic experience.
And Ethan has written about the power imbalance that exists when Google holds all the cards, whether it’s AMP or core web vitals:
Google used its dominant position in the marketplace to force widespread adoption of a largely proprietary technology for creating websites. By switching to Core Web Vitals, those power dynamics haven’t materially changed.
We would do well to remember:
When you measure, include the measurer.
But if we’re going to put numbers to user experience, the core web vitals are a pretty good spread of measurements: largest contentful paint, cumulative layout shift, and first input delay.
(If you prefer using initialisms, remember that CFP is Certified Financial Planner, CLS is Community Legal Services, and FID is Flame Ionization Detector. Together they form CWV, Catholic War Veterans.)
Developers, particularly in Silicon Valley firms, are definitionally wealthy and enfranchised by world-historical standards. Like upper classes of yore, comfort (“DX”) comes with courtiers happy to declare how important comfort must surely be. It’s bunk, or at least most of it is.
As frontenders, our task is to make services that work well for all, not just the wealthy. If improvements in our tools or our comfort actually deliver improvements in that direction, so much the better. But we must never forget that measurable improvement for users is the yardstick.
Cloudfare’s alternative to Google Analytics is now available—for free—regardless of whether your a Cloudflare customer or not:
Being privacy-first means we don’t track individual users for the purposes of serving analytics. We don’t use any client-side state (like cookies or localStorage) for analytics purposes. Cloudflare also doesn’t track users over time via their IP address, User Agent string, or any other immutable attributes for the purposes of displaying analytics — we consider “fingerprinting” even more intrusive than cookies, because users have no way to opt out.
Goodhart’s Law applied to Google’s core web vitals:
If developers start to focus solely on Core Web Vitals because it is important for SEO, then some folks will undoubtedly try to game the system.
Personally, my beef with core web vitals is that they introduce even more uneccessary initialisms (see, for example, Harry’s recent post where he uses CWV metrics like LCP, FID, and CLS—alongside TTFB and SI—to look at PLPs, PDPs, and SRPs. I mean, WTF?).
This is an interesting project to try to rank web hosts by performance:
Real-world server response (Time to First Byte) latencies, as experienced by real-world users navigating the web.
Progressive disclosure interface patterns categorised and evaluated:
I really like the hypertext history invoked in this article.
The piece finishes with a great note on the MacNamara fallacy:
Everyone thinks metrics let us measure results. But, actually, they don’t. They measure only what they are measuring. Engagement, for example, is not something that can be measured, so we use an analogue for it. Time on page. Or clicks.
We often end up measuring what is quick, cheap, and easy to measure. Therefore, few organizations regularly conduct usability testing or customer-satisfaction surveys, but lots use analytics.
Even today, organizations often use clicks as a measure of engagement. So, all too often, they design user interfaces to generate clicks, so the system can measure them.
Pages are often designed so that they’re hard or impossible to read if some dependency fails to load. On a slow connection, it’s quite common for at least one depedency to fail.
Fire up Reader Mode and read this excellent article informed by data from using a typically slow connection in rural USA today. Two findings are:
- A large fraction of the web is unusable on a bad connection. Even on a good (0% packetloss, no ping spike) dialup connection, some sites won’t load.
- Some sites will use a lot of data!
This is so useful! Get instant results from Google’s Chrome User Experience Report without having to wait (or pay) for BigQuery.
Here’s an example of my site’s metrics over the last few months, complete with nice charts.