Journal tags: learning

35

sparkline

Steam

Picture someone tediously going through a spreadsheet that someone else has filled in by hand and finding yet another error.

“I wish to God these calculations had been executed by steam!” they cry.

The year was 1821 and technically the spreadsheet was a book of logarithmic tables. The frustrated cry came from Charles Babbage, who channeled his frustration into a scheme to create the world’s first computer.

His difference engine didn’t work out. Neither did his analytical engine. He’d spend his later years taking his frustrations out on street musicians, which—as a former busker myself—earns him a hairy eyeball from me.

But we’ve all been there, right? Some tedious task that feels soul-destroying in its monotony. Surely this is exactly what machines should be doing?

I have a hunch that this is where machine learning and large language models might turn out to be most useful. Not in creating breathtaking works of creativity, but in menial tasks that nobody enjoys.

Someone was telling me earlier today about how they took a bunch of haphazard notes in a client meeting. When the meeting was done, they needed to organise those notes into a coherent summary. Boring! But ChatGPT handled it just fine.

I don’t think that use-case is going to appear on the cover of Wired magazine anytime soon but it might be a truer glimpse of the future than any of the breathless claims being eagerly bandied about in Silicon Valley.

You know the way we no longer remember phone numbers, because, well, why would we now that we have machines to remember them for us? I’d be quite happy if machines did that for the annoying little repetitive tasks that nobody enjoys.

I’ll give you an example based on my own experience.

Regular expressions are my kryptonite. I’m rubbish at them. Any time I have to figure one out, the knowledge seeps out of my brain before long. I think that’s because I kind of resent having to internalise that knowledge. It doesn’t feel like something a human should have to know. “I wish to God these regular expressions had been calculated by steam!”

Now I can get a chatbot with a large language model to write the regular expression for me. I still need to describe what I want, so I need to write the instructions clearly. But all the gobbledygook that I’m writing for a machine now gets written by a machine. That seems fair.

Mind you, I wouldn’t blindly trust the output. I’d take that regular expression and run it through a chatbot, maybe a different chatbot running on a different large language model. “Explain what this regular expression does,” would be my prompt. If my input into the first chatbot matches the output of the second, I’d have some confidence in using the regular expression.

A friend of mine told me about using a large language model to help write SQL statements. He described his database structure to the chatbot, and then described what he wanted to select.

Again, I wouldn’t use that output without checking it first. But again, I might use another chatbot to do that checking. “Explain what this SQL statement does.”

Playing chatbots off against each other like this is kinda how machine learning works under the hood: generative adverserial networks.

Of course, the task of having to validate the output of a chatbot by checking it with another chatbot could get quite tedious. “I wish to God these large language model outputs had been validated by steam!”

Sounds like a job for machines.

Disclosure

You know how when you’re on hold to any customer service line you hear a message that thanks you for calling and claims your call is important to them. The message always includes a disclaimer about calls possibly being recorded “for training purposes.”

Nobody expects that any training is ever actually going to happen—surely we would see some improvement if that kind of iterative feedback loop were actually in place. But we most certainly want to know that a call might be recorded. Recording a call without disclosure would be unethical and illegal.

Consider chatbots.

If you’re having a text-based (or maybe even voice-based) interaction with a customer service representative that doesn’t disclose its output is the result of large language models, that too would be unethical. But, at the present moment in time, it would be perfectly legal.

That needs to change.

I suspect the necessary legislation will pass in Europe first. We’ll see if the USA follows.

In a way, this goes back to my obsession with seamful design. With something as inherently varied as the output of large language models, it’s vital that people have some way of evaluating what they’re told. I believe we should be able to see as much of the plumbing as possible.

The bare minimum amount of transparency is revealing that a machine is in the loop.

This shouldn’t be a controversial take. But I guarantee we’ll see resistance from tech companies trying to sell their “AI” tools as seamless, indistinguishable drop-in replacements for human workers.

Guessing

The last talk at the last dConstruct was by local clever clogs Anil Seth. It was called Your Brain Hallucinates Your Conscious Reality. It’s well worth a listen.

Anil covers a lot of the same ground in his excellent book, Being You. He describes a model of consciousness that inverts our intuitive understanding.

We tend to think of our day-to-day reality in a fairly mechanical cybernetic manner; we receive inputs through our senses and then make decisions about reality informed by those inputs.

As another former dConstruct speaker, Adam Buxton, puts it in his interview with Anil, it feels like that old Beano cartoon, the Numskulls, with little decision-making homonculi inside our head.

But Anil posits that it works the other way around. We make a best guess of what the current state of reality is, and then we receive inputs from our senses, and then we adjust our model accordingly. There’s still a feedback loop, but cause and effect are flipped. First we predict or guess what’s happening, then we receive information. Rinse and repeat.

The book goes further and applies this to our very sense of self. We make a best guess of our sense of self and then adjust that model constantly based on our experiences.

There’s a natural tendency for us to balk at this proposition because it doesn’t seem rational. The rational model would be to make informed calculations based on available data …like computers do.

Maybe that’s what sets us apart from computers. Computers can make decisions based on data. But we can make guesses.

Enter machine learning and large language models. Now, for the first time, it appears that computers can make guesses.

The guess-making is not at all like what our brains do—large language models require enormous amounts of inputs before they can make a single guess—but still, this should be the breakthrough to be shouted from the rooftops: we’ve taught machines how to guess!

And yet. Almost every breathless press release touting some revitalised service that uses AI talks instead about accuracy. It would be far more honest to tout the really exceptional new feature: imagination.

Using AI, we will guess who should get a mortgage.

Using AI, we will guess who should get hired.

Using AI, we will guess who should get a strict prison sentence.

Reframed like that, it’s easy to see why technologists want to bury the lede.

Alas, this means that large language models are being put to use for exactly the wrong kind of scenarios.

(This, by the way, is also true of immersive “virtual reality” environments. Instead of trying to accurately recreate real-world places like meeting rooms, we should be leaning into the hallucinatory power of a technology that can generate dream-like situations where the pleasure comes from relinquishing control.)

Take search engines. They’re based entirely on trust and accuracy. Introducing a chatbot that confidentally conflates truth and fiction doesn’t bode well for the long-term reputation of that service.

But what if this is an interface problem?

Currently facts and guesses are presented with equal confidence, hence the accurate descriptions of the outputs as bullshit or mansplaining as a service.

What if the more fanciful guesses were marked as such?

As it is, there’s a “temperature” control that can be adjusted when generating these outputs; the more the dial is cranked, the further the outputs will stray from the safest predictions. What if that could be reflected in the output?

I don’t know what that would look like. It could be typographic—some markers to indicate which bits should be taken with pinches of salt. Or it could be through content design—phrases like “Perhaps…”, “Maybe…” or “It’s possible but unlikely that…”

I’m sure you’ve seen the outputs when people request that ChatGPT write their biography. Perfectly accurate statements are generated side-by-side with complete fabrications. This reinforces our scepticism of these tools. But imagine how differently the fabrications would read if they were preceded by some simple caveats.

A little bit of programmed humility could go a long way.

Right now, these chatbots are attempting to appear seamless. If 80% or 90% of their output is accurate, then blustering through the other 10% or 20% should be fine, right? But I think the experience for the end user would be immensely more empowering if these chatbots were designed seamfully. Expose the wires. Show the workings-out.

Mind you, that only works if there is some way to distinguish between fact and fabrication. If there’s no way to tell how much guessing is happening, then that’s a major problem. If you can’t tell me whether something is 50% true or 75% true or 25% true, then the only rational response is to treat the entire output as suspect.

I think there’s a fundamental misunderstanding behind the design of these chatbots that goes all the way back to the Turing test. There’s this idea that the way to make a chatbot believable and trustworthy is to make it appear human, attempting to hide the gears of the machine. But the real way to gain trust is through honesty.

I want a machine to tell me when it’s guessing. That won’t make me trust it less. Quite the opposite.

After all, to guess is human.

Like

We use metaphors all the time. To quote George Lakoff, we live by them.

We use analogies some of the time. They’re particularly useful when we’re wrapping our heads around something new. By comparing something novel to something familiar, we can make a shortcut to comprehension, or at least, categorisation.

But we need a certain amount of vigilance when it comes to analogies. Just because something is like something else doesn’t mean it’s the same.

With that in mind, here are some ways that people are describing generative machine learning tools. Large language models are like…

Media queries with display-mode

It’s said that the best way to learn about something is to teach it. I certainly found that to be true when I was writing the web.dev course on responsive design.

I felt fairly confident about some of the topics, but I felt somewhat out of my depth when it came to some of the newer modern additions to browsers. The last few modules in particular were unexplored areas for me, with topics like screen configurations and media features. I learned a lot about those topics by writing about them.

Best of all, I got to put my new-found knowledge to use! Here’s how…

The Session is a progressive web app. If you add it to the home screen of your mobile device, then when you launch the site by tapping on its icon, it behaves just like a native app.

In the web app manifest file for The Session, the display-mode property is set to “standalone.” That means it will launch without any browser chrome: no address bar and no back button. It’s up to me to provide the functionality that the browser usually takes care of.

So I added a back button in the navigation interface. It only appears on small screens.

Do you see the assumption I made?

I figured that the back button was most necessary in the situation where the site had been added to the home screen. That only happens on mobile devices, right?

Nope. If you’re using Chrome or Edge on a desktop device, you will be actively encourged to “install” The Session. If you do that, then just as on mobile, the site will behave like a standalone native app and launch without any browser chrome.

So desktop users who install the progressive web app don’t get any back button (because in my CSS I declare that the back button in the interface should only appear on small screens).

I was alerted to this issue on The Session:

It downloaded for me but there’s a bug, Jeremy - there doesn’t seem to be a way to go back.

Luckily, this happened as I was writing the module on media features. I knew exactly how to solve this problem because now I knew about the existence of the display-mode media feature. It allows you to write media queries that match the possible values of display-mode in a web app manifest:

.goback {
  display: none;
}
@media (display-mode: standalone) {
  .goback {
    display: inline;
  }
}

Now the back button shows up if you “install” The Session, regardless of whether that’s on mobile or desktop.

Previously I made the mistake of inferring whether or not to show the back button based on screen size. But the display-mode media feature allowed me to test the actual condition I cared about: is this user navigating in standalone mode?

If I hadn’t been writing about media features, I don’t think I would’ve been able to solve the problem. It’s a really good feeling when you’ve just learned something new, and then you immediately find exactly the right use case for it!

Faulty logic

I’m a fan of logical properties in CSS. As I wrote in the responsive design course on web.dev, they’re crucial for internationalisation.

Alaa Abd El-Rahim has written articles on CSS tricks about building multi-directional layouts and controlling layout in a multi-directional website. Not having to write separate stylesheets—or even separate rules—for different writing modes is great!

More than that though, I think understanding logical properties is the best way to truly understand CSS layout tools like grid and flexbox.

It’s like when you’re learning a new language. At some point your brain goes from translating from your mother tongue into the other language, and instead starts thinking in that other language. Likewise with CSS, as some point you want to stop translating “left” and “right” into “inline-start” and “inline-end” and instead start thinking in terms of inline and block dimensions.

As is so often the case with CSS, I think new features like these are easier to pick up if you’re new to the language. I had to unlearn using floats for layout and instead learn flexbox and grid. Someone learning layout from scatch can go straight to flexbox and grid without having to ditch the cognitive baggage of floats. Similarly, it’s going to take time for me to shed the baggage of directional properties and truly grok logical properties, but someone new to CSS can go straight to logical properties without passing through the directional stage.

Except we’re not quite there yet.

In order for logical properties to replace directional properties, they need to be implemented everywhere. Right now you can’t use logical properties inside a media query, for example:

@media (min-inline-size: 40em)

That wont’ work. You have to use the old-fashioned syntax:

@media (min-width: 40em)

Now you could rightly argue that in this instance we’re talking about the physical dimensions of the viewport. So maybe width and height make more sense than inline and block.

But then take a look at how the syntax for container queries is going to work. First you declare the axis that you want to be contained using the syntax from logical properties:

main {
  container-type: inline-size;
}

But then when you go to declare the actual container query, you have to use the corresponding directional property:

@container (min-width: 40em)

This won’t work:

@container (min-inline-size: 40em)

I kind of get why it won’t work: the syntax for container queries should match the syntax for media queries. But now the theory behind disallowing logical properties in media queries doesn’t hold up. When it comes to container queries, the physical layout of the viewport isn’t what matters.

I hope that both media queries and container queries will allow logical properties sooner rather than later. Until they fall in line, it’s impossible to make the jump fully to logical properties.

There are some other spots where logical properties haven’t been fully implemented yet, but I’m assuming that’s a matter of time. For example, in Firefox I can make a wide data table responsive by making its container side-swipeable on narrow screens:

.table-container {
  max-inline-size: 100%;
  overflow-inline: auto;
}

But overflow-inline and overflow-block aren’t supported in any other browsers. So I have to do this:

.table-container {
  max-inline-size: 100%;
  overflow-x: auto;
}

Frankly, mixing and matching logical properties with directional properties feels worse than not using logical properties at all. The inconsistency is icky. This feels old-fashioned but consistent:

.table-container {
  max-width: 100%;
  overflow-x: auto;
}

I don’t think there are any particular technical reasons why browsers haven’t implemented logical properties consistently. I suspect it’s more a matter of priorities. Fully implementing logical properties in a browser may seem like a nice-to-have bit of syntactic sugar while there are other more important web standard fish to fry.

But from the perspective of someone trying to use logical properties, the patchy rollout is frustrating.

User agents

I was on the podcast A Question Of Code recently. It was fun! The podcast is aimed at people who are making a career change into web development, so it’s right up my alley.

I sometimes get asked about what a new starter should learn. On the podcast, I mentioned a post I wrote a while back with links to some great resources and tutorials. As I said then:

For web development, start with HTML, then CSS, then JavaScript (and don’t move on to JavaScript too quickly—really get to grips with HTML and CSS first).

That’s assuming you want to be a good well-rounded web developer. But it might be that you need to get a job as quickly as possible. In that case, my advice would be very different. I would advise you to learn React.

Believe me, I take no pleasure in giving that advice. But given the reality of what recruiters are looking for, knowing React is going to increase your chances of getting a job (something that’s reflected in the curricula of coding schools). And it’s always possible to work backwards from React to the more fundamental web technologies of HTML, CSS, and JavaScript. I hope.

Regardless of your initial route, what’s the next step? How do you go from starting out in web development to being a top-notch web developer?

I don’t consider myself to be a top-notch web developer (far from it), but I am very fortunate in that I’ve had the opportunity to work alongside some tippety-top-notch developers at ClearleftTrys, Cassie, Danielle, Mark, Graham, Charlotte, Andy, and Natalie.

They—and other top-notch developers I’m fortunate to know—have something in common. They prioritise users. Sure, they’ll all have their favourite technologies and specialised areas, but they don’t lose sight of who they’re building for.

When you think about it, there’s quite a power imbalance between users and developers on the web. Users can—ideally—choose which web browser to use, and maybe make some preference changes if they know where to look, but that’s about it. Developers dictate everything else—the technology that a website will use, the sheer amount of code shipped over the network to the user, whether the site will be built in a fragile or a resilient way. Users are dependent on developers, but developers don’t always act in the best interests of users. It’s a classic example of the principal-agent problem:

The principal–agent problem, in political science and economics (also known as agency dilemma or the agency problem) occurs when one person or entity (the “agent”), is able to make decisions and/or take actions on behalf of, or that impact, another person or entity: the “principal”. This dilemma exists in circumstances where agents are motivated to act in their own best interests, which are contrary to those of their principals, and is an example of moral hazard.

A top-notch developer never forgets that they are an agent, and that the user is the principal.

But is it realistic to expect web developers to be so focused on user needs? After all, there’s a whole separate field of user experience design that specialises in this focus. It hardly seems practical to suggest that a top-notch developer needs to first become a good UX designer. There’s already plenty to focus on when it comes to just the technology side of front-end development.

So maybe this is too simplistic a way of defining the principle-agent relationship between users and developers:

user :: developer

There’s something that sits in between, mediating that relationship. It’s a piece of software that in the world of web standards is even referred to as a “user agent”: the web browser.

user :: web browser :: developer

So if making the leap to understanding users seems too much of a stretch, there’s an intermediate step. Get to know how web browsers work. As a web developer, if you know what web browsers “like” and “dislike”, you’re well on the way to making great user experiences. If you understand the pain points for browser when they’re parsing and rendering your code, you’ve got a pretty good proxy for understanding the pain points that your users are experiencing.

Getting started

I got an email recently from a young person looking to get into web development. They wanted to know what languages they should start with, whether they should a Mac or a Windows PC, and what some places to learn from.

I wrote back, saying this about languages:

For web development, start with HTML, then CSS, then JavaScript (and don’t move on to JavaScript too quickly—really get to grips with HTML and CSS first).

And this is what I said about hardware and software:

It doesn’t matter whether you use a Mac or a Windows PC, as long as you’ve got an internet connection, some web browsers (Chrome, Firefox, for example) and a text editor. There are some very good free text editors available for Mac and PC:

For resources, I had a trawl through links I’ve tagged with “learning” and “html” and sent along some links to free online tutorials:

After sending that email, I figured that this list might be useful to anyone else looking to start out in web development. If you know of anyone in that situation, I hope this list might help.

Voice User Interface Design by Cheryl Platz

Cheryl Platz is speaking at An Event Apart Chicago. Her inaugural An Event Apart presentation is all about voice interfaces, and I’m going to attempt to liveblog it…

Why make a voice interface?

Successful voice interfaces aren’t necessarily solving new problems. They’re used to solve problems that other devices have already solved. Think about kitchen timers. There are lots of ways to set a timer. Your oven might have one. Your phone has one. Why use a $200 device to solve this mundane problem? Same goes for listening to music, news, and weather.

People are using voice interfaces for solving ordinary problems. Why? Context matters. If you’re carrying a toddler, then setting a kitchen timer can be tricky so a voice-activated timer is quite appealing. But why is voice is happening now?

Humans have been developing the art of conversation for thousands of years. It’s one of the first skills we learn. It’s deeply instinctual. Most humans use speach instinctively every day. You can’t necessarily say that about using a keyboard or a mouse.

Voice-based user interfaces are not new. Not just the idea—which we’ve seen in Star Trek—but the actual implementation. Bell Labs had Audrey back in 1952. It recognised ten words—the digits zero through nine. Why did it take so long to get to Alexa?

In the late 70s, DARPA issued a challenge to create a voice-activated system. Carnagie Mellon came up with Harpy (with a thousand word grammar). But none of the solutions could respond in real time. In conversation, we expect a break of no more than 200 or 300 milliseconds.

In the 1980s, computing power couldn’t keep up with voice technology, so progress kind of stopped. Time passed. Things finally started to catch up in the 90s with things like Dragon Naturally Speaking. But that was still about vocabulary, not grammar. By the 2000s, small grammars were starting to show up—starting an X-Box or pausing Netflix. In 2008, Google Voice Search arrived on the iPhone and natural language interaction began to arrive.

What makes natural language interactions so special? It requires minimal training because it uses the conversational muscles we’ve been working for a lifetime. It unlocks the ability to have more forgiving, less robotic conversations with devices. There might be ten different ways to set a timer.

Natural language interactions can also free us from “screen magnetism”—that tendency to stay on a device even when our original task is complete. Voice also enables fast and forgiving searches of huge catalogues without time spent typing or browsing. You can pick a needle straight out of a haystack.

Natural language interactions are excellent for older customers. These interfaces don’t intimidate people without dexterity, vision, or digital experience. Voice input often leads to more inclusive experiences. Many customers with visual or physical disabilities can’t use traditional graphical interfaces. Voice experiences throw open the door of opportunity for some people. However, voice experience can exclude people with speech difficulties.

Making the case for voice interfaces

There’s a misconception that you need to work at Amazon, Google, or Apple to work on a voice interface, or at least that you need to have a big product team. But Cheryl was able to make her first Alexa “skill” in a week. If you’re a web developer, you’re good to go. Your voice “interaction model” is just JSON.

How do you get your product team on board? Find the customers (and situations) you might have excluded with traditional input. Tell the stories of people whose hands are full, or who are vision impaired. You can also point to the adoption rate numbers for smart speakers.

You’ll need to show your scenario in context. Otherwise people will ask, “why can’t we just build an app for this?” Conduct research to demonstrate the appeal of a voice interface. Storyboarding is very useful for visualising the context of use and highlighting existing pain points.

Getting started with voice interfaces

You’ve got to understand how the technology works in order to adapt to how it fails. Here are a few basic concepts.

Utterance. A word, phrase, or sentence spoken by a customer. This is the true form of what the customer provides.

Intent. This is the meaning behind a customer’s request. This is an important distinction because one intent could have thousands of different utterances.

Prompt. The text of a system response that will be provided to a customer. The audio version of a prompt, if needed, is generated separately using text to speech.

Grammar. A finite set of expected utterances. It’s a list. Usually, each entry in a grammar is paired with an intent. Many interfaces start out as being simple grammars before moving on to a machine-learning model later once the concept has been proven.

Here’s the general idea with “artificial intelligence”…

There’s a human with a core intent to do something in the real world, like knowing when the cookies in the oven are done. This is translated into an intent like, “set a 15 minute timer.” That’s the utterance that’s translated into a string. But it hasn’t yet been parsed as language. That string is passed into a natural language understanding system. What comes is a data structure that represents the customers goal e.g. intent=timer; duration=15 minutes. That’s sent to the business logic where a timer is actually step. For a good voice interface, you also want to send back a response e.g. “setting timer for 15 minutes starting now.”

That seems simple enough, right? What’s so hard about designing for voice?

Natural language interfaces are a form of artifical intelligence so it’s not deterministic. There’s a lot of ruling out false positives. Unlike graphical interfaces, voice interfaces are driven by probability.

How do you turn a sound wave into an understandable instruction? It’s a lot like teaching a child. You feed a lot of data into a statistical model. That’s how machine learning works. It’s a probability game. That’s where it gets interesting for design—given a bunch of possible options, we need to use context to zero in on the most correct choice. This is where confidence ratings come in: the system will return the probability that a response is correct. Effectively, the system is telling you how sure or not it is about possible results. If the customer makes a request in an unusual or unexpected way, our system is likely to guess incorrectly. That’s because the system is being given something new.

Designing a conversation is relatively straightforward. But 80% of your voice design time will be spent designing for what happens when things go wrong. In voice recognition, edge cases are front and centre.

Here’s another challenge. Interaction with most voice interfaces is part conversation, part performance. Most interactions are not private.

Humans don’t distinguish digital speech fom human speech. That means these devices are intrinsically social. Our brains our wired to try to extract social information, even form digital speech. See, for example, why it’s such a big question as to what gender a voice interface has.

Delivering a voice interface

Storyboards help depict the context of use. Sample dialogues are your new wireframes. These are little scripts that not only cover the happy path, but also your edge case. Then you reverse engineer from there.

Flow diagrams communicate customer states, but don’t use the actual text in them.

Prompt lists are your final deliverable.

Functional prototypes are really important for voice interfaces. You’ll learn the real way that customers will ask for things.

If you build a working prototype, you’ll be building two things: a natural language interaction model (often a JSON file) and custom business logic (in a programming language).

Eventually voice design will become a core competency, much like mobile, which was once separate.

Ask yourself what tasks your customers complete on your site that feel clunkly. Remember that voice desing is almost never about new scenarious. Start your journey into voice interfaces by tackling old problems in new, more inclusive ways.

May the voice be with you!

Going Offline—the talk of the book

I gave a new talk at An Event Apart in Seattle yesterday morning. The talk was called Going Offline, which the eagle-eyed amongst you will recognise as the title of my most recent book, all about service workers.

I was quite nervous about this talk. It’s very different from my usual fare. Usually I have some big sweeping arc of history, and lots of pretentious ideas joined together into some kind of narrative arc. But this talk needed to be more straightforward and practical. I wasn’t sure how well I would manage that brief.

I knew from pretty early on that I was going to show—and explain—some code examples. Those were the parts I sweated over the most. I knew I’d be presenting to a mixed audience of designers, developers, and other web professionals. I couldn’t assume too much existing knowledge. At the same time, I didn’t want to teach anyone to such eggs.

In the end, there was an overarching meta-theme to talk, which was this: logic is more important than code. In other words, figuring out what you’re trying to accomplish (and describing it clearly) is more important than typing curly braces and semi-colons. Programming is an act of translation. Before you can translate something, you need to be able to articulate it clearly in your own language first. By emphasising that point, I hoped to make the code less overwhelming to people unfamilar with it.

I had tested the talk with some of my Clearleft colleagues, and they gave me great feedback. But I never know until I’ve actually given a talk in front of a real conference audience whether the talk is any good or not. Now that I’ve given the talk, and received more feedback, I think I can confidentally say that it’s pretty damn good.

My goal was to explain some fairly gnarly concepts—let’s face it: service workers are downright weird, and not the easiest thing to get your head around—and to leave the audience with two feelings:

  1. This is exciting, and
  2. This is something I can do today.

I deliberately left time for questions, bribing people with free copies of my book. I got some great questions, and I may incorporate some of them into future versions of this talk (conference organisers, if this sounds like the kind of talk you’d like at your event, please get in touch). Some of the points brought up in the questions were:

  • Is there some kind of wizard for creating a typical service worker script for any site? I didn’t have a direct answer to this, but I have attempted to make a minimal viable service worker that could be used for just about any site. Mostly I encouraged the questioner to roll their sleeves up and try writing a bespoke script. I also mentioned the Workbox library, but I gave my opinion that if you’re going to spend the time to learn the library, you may as well spend the time to learn the underlying language.
  • What are some state-of-the-art progressive web apps for offline user experiences? Ooh, this one kind of stumped me. I mean, the obvious poster children for progressive webs apps are things like Twitter, Instagram, and Pinterest. They’re all great but the offline experience is somewhat limited. To be honest, I think there’s more potential for great offline experiences by publishers. I especially love the pattern on personal sites like Una’s and Sara’s where people can choose to save articles offline to read later—like a bespoke Instapaper or Pocket. I’d love so see that pattern adopted by some big publications. I particularly like that gives so much more control directly to the end user. Instead of trying to guess what kind of offline experience they want, we give them the tools to craft their own.
  • Do caches get cleaned up automatically? Great question! And the answer is mostly no—although browsers do have their own heuristics about how much space you get to play with. There’s a whole chapter in my book about being a good citizen and cleaning up your caches, but I didn’t include that in the talk because it isn’t exactly exciting: “Hey everyone! Now we’re going to do some housekeeping—yay!”
  • Isn’t there potential for abuse here? This is related to the previous question, and it’s another great question to ask of any technology. In short, yes. Bad actors could use service workers to fill up caches uneccesarily. I’ve written about back door service workers too, although the real problem there is with iframes rather than service workers—iframes and cookies are technologies that are already being abused by bad actors, and we’re going to see more and more interventions by ethical browser makers (like Mozilla) to clamp down on those technologies …just as browsers had to clamp down on the abuse of pop-up windows in the early days of JavaScript. The cache API could become a tragedy of the commons. I liken the situation to regulation: we should self-regulate, but if we prove ourselves incapable of that, then outside regulation (by browsers) will be imposed upon us.
  • What kind of things are in the future for service workers? Excellent question! If you think about it, a service worker is kind of a conduit that gives you access to different APIs: the Cache API and the Fetch API being the main ones now. A service worker is like an airport and the APIs are like the airlines. There are other APIs that you can access through service workers. Notifications are available now on desktop and on Android, and they’ll be coming to iOS soon. Background Sync is another powerful API accessed through service workers that will get more and more browser support over time. The great thing is that you can start using these APIs today even if they aren’t universally supported. Then, over time, more and more of your users will benefit from those enhancements.

If you attended the talk and want to learn more about about service workers, there’s my book (obvs), but I’ve also written lots of blog posts about service workers and I’ve linked to lots of resources too.

Finally, here’s a list of links to all the books, sites, and articles I referenced in my talk…

Books

Sites

Progressive Web Apps

Optimise without a face

I’ve been playing around with the newly-released Squoosh, the spiritual successor to Jake’s SVGOMG. You can drag images into the browser window, and eyeball the changes that any optimisations might make.

On a project that Cassie is working on, it worked really well for optimising some JPEGs. But there were a few images that would require a bit more fine-grained control of the optimisations. Specifically, pictures with human faces in them.

I’ve written about this before. If there’s a human face in image, I open that image in a graphics editing tool like Photoshop, select everything but the face, and add a bit of blur. Because humans are hard-wired to focus on faces, we’ll notice any jaggy artifacts on a face, but we’re far less likely to notice jagginess in background imagery: walls, materials, clothing, etc.

On the face of it (hah!), a browser-based tool like Squoosh wouldn’t be able to optimise for faces, but then Cassie pointed out something really interesting…

When we were both at FFConf on Friday, there was a great talk by Eleanor Haproff on machine learning with JavaScript. It turns out there are plenty of smart toolkits out there, and one of them is facial recognition. So I wonder if it’s possible to build an in-browser tool with this workflow:

  • Drag or upload an image into the browser window,
  • A facial recognition algorithm finds any faces in the image,
  • Those portions of the image remain crisp,
  • The rest of the image gets a slight blur,
  • Download the optimised image.

Maybe the selecting/blurring part would need canvas? I don’t know.

Anyway, I thought this was a brilliant bit of synthesis from Cassie, and now I’ve got two questions:

  1. Does this exist yet? And, if not,
  2. Does anyone want to try building it?

Praise for Going Offline

I’m very, very happy to see that my new book Going Offline is proving to be accessible and unintimidating to a wide audience—that was very much my goal when writing it.

People have been saying nice things on their blogs, which is very gratifying. It’s even more gratifying to see people use the knowledge gained from reading the book to turn those blogs into progressive web apps!

Sara Soueidan:

It doesn’t matter if you’re a designer, a junior developer or an experienced engineer — this book is perfect for anyone who wants to learn about Service Workers and take their Web application to a whole new level.

I highly recommend it. I read the book over the course of two days, but it can easily be read in half a day. And as someone who rarely ever reads a book cover to cover (I tend to quit halfway through most books), this says a lot about how good it is.

Eric Lawrence:

I was delighted to discover a straightforward, very approachable reference on designing a ServiceWorker-backed application: Going Offline by Jeremy Keith. The book is short (I’m busy), direct (“Here’s a problem, here’s how to solve it“), opinionated in the best way (landmine-avoiding “Do this“), and humorous without being confusing. As anyone who has received unsolicited (or solicited) feedback from me about their book knows, I’m an extremely picky reader, and I have no significant complaints on this one. Highly recommended.

Ben Nadel:

If you’re interested in the “offline first” movement or want to learn more about Service Workers, Going Offline by Jeremy Keith is a really gentle and highly accessible introduction to the topic.

Daniel Koskine:

Jeremy nails it again with this beginner-friendly introduction to Service Workers and Progressive Web Apps.

Donny Truong

Jeremy’s technical writing is as superb as always. Similar to his first book for A Book Apart, which cleared up all my confusions about HTML5, Going Offline helps me put the pieces of the service workers’ puzzle together.

People have been saying nice things on Twitter too…

Aaron Gustafson:

It’s a fantastic read and a simple primer for getting Service Workers up and running on your site.

Ethan Marcotte:

Of course, if you’re looking to take your website offline, you should read @adactio’s wonderful book

Lívia De Paula Labate:

Ok, I’m done reading @adactio’s Going Offline book and as my wife would say, it’s the bomb dot com.

If that all sounds good to you, get yourself a copy of Going Offline in paperbook, or ebook (or both).

Frustration

I had some problems with my bouzouki recently. Now, I know my bouzouki pretty well. I can navigate the strings and frets to make music. But this was a problem with the pickup under the saddle of the bouzouki’s bridge. So it wasn’t so much a musical problem as it was an electronics problem. I know nothing about electronics.

I found it incredibly frustrating. Not only did I have no idea how to fix the problem, but I also had no idea of the scope of the problem. Would it take five minutes or five days? Who knows? Not me.

My solution to a problem like this is to pay someone else to fix it. Even then I have to go through the process of having the problem explained to me by someone who understands and cares about electronics much more than me. I nod my head and try my best to look like I’m taking it all in, even though the truth is I have no particular desire to get to grips with the inner workings of pickups—I just want to make some music.

That feeling of frustration I get from having wiring issues with a musical instrument is the same feeling I get whenever something goes awry with my web server. I know just enough about servers to be dangerous. When something goes wrong, I feel very out of my depth, and again, I have no idea how long it will take the fix the problem: minutes, hours, days, or weeks.

I had a very bad day yesterday. I wanted to make a small change to the Clearleft website—one extra line of CSS. But the build process for the website is quite convoluted (and clever), automatically pulling in components from the site’s pattern library. Something somewhere in the pipeline went wrong—I still haven’t figured out what—and for a while there, the Clearleft website was down, thanks to me. (Luckily for me, Danielle saved the day …again. I’d be lost without her.)

I was feeling pretty down after that stressful day. I felt like an idiot for not knowing or understanding the wiring beneath the site.

But, on the other hand, considering I was only trying to edit a little bit of CSS, maybe the problem didn’t lie entirely with me.

There’s a principle underlying the architecture of the World Wide Web called The Rule of Least Power. It somewhat counterintuitively states that you should:

choose the least powerful language suitable for a given purpose.

Perhaps, given the relative simplicity of the task I was trying to accomplish, the plumbing was over-engineered. That complexity wouldn’t matter if I could circumvent it, but without the build process, there’s no way to change the markup, CSS, or JavaScript for the site.

Still, most of the time, the build process isn’t a hindrance, it’s a help: concatenation, minification, linting and all that good stuff. Most of my frustration when something in the wiring goes wrong is because of how it makes me feel …just like with the pickup in my bouzouki, or the server powering my website. It’s not just that I find this stuff hard, but that I also feel like it’s stuff I’m supposed to know, rather than stuff I want to know.

On that note…

Last week, Paul wrote about getting to grips with JavaScript. On the very same day, Brad wrote about his struggle to learn React.

I think it’s really, really, really great when people share their frustrations and struggles like this. It’s very reassuring for anyone else out there who’s feeling similarly frustrated who’s worried that the problem lies with them. Also, this kind of confessional feedback is absolute gold dust for anyone looking to write explanations or documentation for JavaScript or React while battling the curse of knowledge. As Paul says:

The challenge now is to remember the pain and anguish I endured, and bare that in mind when helping others find their own path through the knotted weeds of JavaScript.

Technical balance

Two technical editors worked with me on Going Offline.

Jake was one of the tech editors. He literally (co-)wrote the spec on service workers. There ain’t nuthin’ he don’t know about the code involved. His job was to catch any technical inaccuracies in my writing.

The other tech editor was Amber. She’s relatively new to web development. While I was writing the book, she had a solid grounding in HTML and CSS, but not much experience in JavaScript. That made her the perfect archetypal reader. Her job was to point out whenever I wasn’t explaining something clearly enough.

My job was to satisfy both of them. I needed to explain service workers and all its associated APIs. I also needed to make it approachable and understandable to people who haven’t dived deeply into JavaScript.

I deliberately didn’t wait until I was an expert in this topic before writing Going Offline. I knew that the more familiar I became with the ins-and-outs of getting a service worker up and running, the harder it would be for me to remember what it was like not to know that stuff. I figured the best way to avoid the curse of knowledge would be not to accrue too much of it. But then once I started researching and writing, I inevitably became more au fait with the topic. I had to try to battle against that, trying to keep a beginner’s mind.

My watchword was this great piece of advice from Codebar:

Assume that anyone you’re teaching has no knowledge but infinite intelligence.

It was tricky. I’m still not sure if I managed to pull off the balancing act, although early reports are very, very encouraging. You’ll be able to judge for yourself soon enough. The book is shipping at the start of next week. Get your order in now.

The audience for Going Offline

My new book, Going Offline, starts with no assumption of JavaScript knowledge, but by the end of the book the reader is armed with enough code to make any website work offline.

I didn’t want to overwhelm the reader with lots of code up front, so I’ve tried to dole it out in manageable chunks. The amount of code ramps up a little bit in each chapter until it peaks in chapter five. After that, it ramps down a bit with each subsequent chapter.

This tweet perfectly encapsulates the audience I had in mind for the book:

Some people have received advance copies of the PDF, and I’m very happy with the feedback I’m getting.

Honestly, that is so, so gratifying to hear!

Words cannot express how delighted I am with Sara’s reaction:

She’s walking the walk too:

That gives me a warm fuzzy glow!

If you’ve been nervous about service workers, but you’ve always wanted to turn your site into a progressive web app, you should get a copy of this book.

Understandable excitement

An Event Apart Seattle just wrapped. It was a three-day special edition and it was really rather good. Lots of the speakers (myself included) were unveiling brand new talks, so there was a real frisson of excitement.

It was interesting to see repeating, overlapping themes. From a purely technical perspective, three technologies that were front and centre were:

  • CSS grid,
  • variable fonts, and
  • service workers.

From listening to other attendees, the overwhelming message received was “These technologies are here—they’ve arrived.” Now, depending on your mindset, that understanding can be expressed as “Oh shit! These technologies are here!” or “Yay! Finally! These technologies are here!”

My reaction is very firmly the latter. That in itself is an interesting data-point, because (as discussed in my talk) my reaction towards new technological advances isn’t always one of excitement—quite often it’s one of apprehension, even fear.

I’ve been trying to self-analyse to figure out which kinds of technologies trigger which kind of reaction. I don’t have any firm answers yet, but it’s interesting to note that the three technologies mentioned above (CSS grid, variable fonts, and service workers) are all additions to the core languages of the web—the materials we use to build the web. Frameworks, libraries, build tools, and other such technologies are more like tools than materials. I tend to get less excited about advances in those areas. Sometimes advances in those areas not only fail to trigger excitement, they make me feel overwhelmed and worried about falling behind.

Since figuring out this split between materials and tools, it has helped me come to terms with my gut emotional reaction to the latest technological advances on the web. I think it’s okay that I don’t get excited about everything. And given the choice, I think maybe it’s healthier to be more excited about advances in the materials—HTML, CSS, and JavaScript APIs—than advances in tooling …although, it is, of course, perfectly possible to get equally excited about both (that’s just not something I seem to be able to do).

Another split I’ve noticed is between technologies that directly benefit users, and technologies that directly benefit developers. I think there was a bit of a meta-thread running through the talks at An Event Apart about CSS grid, variable fonts, and service workers: all of those advances allow us developers to accomplish more with less. They’re good for performance, in other words. I get much more nervous about CSS frameworks and JavaScript libraries that allow us to accomplish more, but require the user to download the framework or library first. It feels different when something is baked into browsers—support for CSS features, or JavaScript APIs. Then it feels like much more of a win-win situation for users and developers. If anything, the onus is on developers to take the time and do the work and get to grips with these browser-native technologies. I’m okay with that.

Anyway, all of this helps me understand my feelings at the end of An Event Apart Seattle. I’m fired up and eager to make something with CSS grid, variable fonts, and—of course—service workers.

Just change it

Amber and I often have meta conversations about the nature of learning and teaching. We swap books and share ideas and experiences whenever we’re trying to learn something or trying to teach something. A topic that comes up again and again is the idea of “the curse of knowledge“—it’s the focus of Steven Pinker’s book The Sense Of Style. That’s when the author/teacher can’t remember what it’s like not to know something, which makes for a frustrating reading/learning experience.

This is one of the reasons why I encourage people to blog about stuff as they’re learning it; not when they’ve internalised it. The perspective that comes with being in the moment of figuring something out is invaluable to others. I honestly think that most explanatory books shouldn’t be written by experts—the “curse of knowledge” can become almost insurmountable.

I often think about this when I’m reading through the installation instructions for frameworks, libraries, and other web technologies. I find myself put off by documentation that assumes I’ve got a certain level of pre-existing knowledge. But now instead of letting it get me down, I use it as an opportunity to try and bridge that gap.

The brilliant Safia Abdalla wrote a post a while back called How do I get started contributing to open source?. I definitely don’t have the programming chops to contribute much to a codebase, but I thoroughly agree with Safia’s observation:

If you’re interested in contributing to open source to improve your communication and empathy skills, you’re definitely making the right call. A lot of open source tools could definitely benefit from improvements in the documentation, accessibility, and evangelism departments.

What really jumps out at me is when instructions use words like “simply” or “just”. I’m with Brad:

“Just” makes me feel like an idiot. “Just” presumes I come from a specific background, studied certain courses in university, am fluent in certain technologies, and have read all the right books, articles, and resources. “Just” is a dangerous word.

But rather than letting that feeling overwhelm me, I now try to fix the text. Here are a few examples of changes I’ve suggested, usually via pull requests on Github repos:

They all have different codebases in different programming languages, but they’re all intended for humans, so having clear and kind documentation is a shared goal.

I like suggesting these kinds of changes. That initial feeling of frustration I get from reading the documentation gets turned into a warm fuzzy feeling from lending a helping hand.

Getting griddy with it

I had the great pleasure of attending An Event Apart Seattle last week. It was, as always, excellent.

It’s always interesting to see themes emerge during an event, especially when those thematic overlaps haven’t been planned in advance. Jen noticed this one:

I remember that being a theme at An Event Apart San Francisco too, when it seemed like every speaker had words to say about ill-judged use of Bootstrap. That theme was certainly in my presentation when I talked about “the fallacy of assumed competency”:

  1. large company X uses technology Y,
  2. company X must know what they are doing because they are large,
  3. therefore technology Y must be good.

Perhaps “the fallacy of assumed suitability” would be a better term. Heydon calls it “the ‘made at Facebook’ fallacy.” But I also made sure to contrast it with the opposite extreme: “Not Invented Here syndrome”.

As well as over-arching themes, it was also interesting to see which technologies were hot topics at An Event Apart. There was one clear winner here—CSS Grid Layout.

Microsoft—a sponsor of the event—used An Event Apart as the place to announce that Grid is officially moving into development for Edge. Jen talked about Grid (of course). Rachel talked about Grid (of course). And while Eric and Una didn’t talk about it on stage, they’ve both been writing about the fun they’ve been having having with Grid. Una wrote about 3 CSS Grid Features That Make My Heart Flutter. Eric is documenting the overall of his site with Grid. So when we were all gathered together, that’s what we were nerding out about.

The CSS Squad.

There are some great resources out there for levelling up in Grid-fu:

With Jen’s help, I’ve been playing with CSS Grid on a little site that I’m planning to launch tomorrow (he said, foreshadowingly). I took me a while to get my head around it, but once it clicked I started to have a lot of fun. “Fun” seems to be the overall feeling around this technology. There’s something infectious about the excitement and enthusiasm that’s returning to the world of layout on the web. And now that the browser support is great pretty much across the board, we can start putting that fun into production.

Balance

This year’s Render conference just wrapped up in Oxford. It was a well-run, well-curated event, right up my alley: two days of a single track of design and development talks (see also: An Event Apart and Smashing Conference for other events in this mold that get it right).

One of my favourite talks was from Frances Ng. She gave a thoroughly entertaining account of her journey from aerospace engineer to front-end engineer, filled with ideas about how to get started, and keep from getting overwhelmed in the world of the web.

She recommended taking the time to occasionally dive deep into a foundational topic, pointing to another talk as a perfect example; Ana Balica gave a great presentation all about HTTP. The second half of the talk was about HTTP 2 and was filled with practical advice, but the first part was a thoroughly geeky history of the Hypertext Transfer Protocol, which I really loved.

While I’m mentoring Amber, we’ve been trying to find a good balance between those deep dives into the foundational topics and the hands-on day-to-day skills needed for web development. So far, I think we’ve found a good balance.

When Amber is ‘round at the Clearleft office, we sit down together and work on the practical aspects of HTML, CSS, and (soon) JavaScript. Last week, for example, we had a really great day diving into CSS selectors and specificity—I watched Amber’s knowledge skyrocket over the course of the day.

But between those visits—which happen every one or two weeks—I’ve been giving Amber homework of sorts. That’s where the foundational building blocks come in. Here are the questions I’ve asked so far:

  • What is the difference between the internet and the web?
  • What is the difference between GET and POST?
  • What are cookies?

The first question is a way of understanding the primacy of URLs on the web. Amber wrote about her research. The second question was getting at an understanding of HTTP. Amber wrote about that too. The third and current question is about state on the web. I’m looking forward to reading a write-up of that soon.

We’re still figuring out this whole mentorship thing but I think this balance of research and exercises is working out well.

Amber

I really enjoyed teaching in Porto last week. It was like having a week-long series of CodeBar sessions.

Whenever I’m teaching at CodeBar, I like to be paired up with people who are just starting out. There’s something about explaining the web and HTML from first principles that I really like. And people often have lots and lots of questions that I enjoy answering (if I can). At CodeBar—and at The New Digital School—I found myself saying “Great question!” multiple times. The really great questions are the ones that I respond to with “I don’t know …let’s find out!”

CodeBar is always a very rewarding experience for me. It has given me the opportunity to try teaching. And having tried it, I can now safely say that I like it. It’s also a great chance to meet people from all walks of life. It gets me out of my bubble.

I can’t remember when I was first paired up with Amber at CodeBar. It must have been sometime last year. I do remember that she had lots of great questions—at some point I found myself explaining how hexadecimal colours work.

I was impressed with Amber’s eagerness to learn. I also liked that she was making her own website. I told her about Homebrew Website Club and she started coming along to that (along with other CodeBar people like Cassie and Alice).

I’ve mentioned to multiple CodeBar students that there’s pretty much an open-door policy at Clearleft when it comes to shadowing: feel free to come along and sit with a front-end developer while they’re working on client projects. A few people have taken up the offer and enjoyed observing myself or Charlotte at work. Amber was one of those people. Again, I was very impressed with her drive. She’s got a full-time job (with sometimes-crazy hours) but she’s so determined to get into the world of web design and development that she’s willing to spend her free time visiting Clearleft to soak up the atmosphere of a design studio.

We’ve decided to turn this into something more structured. Amber and I will get together for a couple of hours once a week. She’s given me a list of some of the areas she wants to explore, and I think it’s a fine-looking list:

  • I want to gather base, structural knowledge about the web and all related aspects. Things seem to float around in a big cloud at the moment.
  • I want to adhere to best practices.
  • I want to learn more about what direction I want to go in, find a niche.
  • I’d love to opportunity to chat with the brilliant people who work at Clearleft and gain a broad range of knowledge from them.

My plan right now is to take a two-track approach: one track about the theory, and another track about the practicalities. The practicalities will be HTML, CSS, JavaScript, and related technologies. The theory will be about understanding the history of the web and its strengths and weaknesses as a medium. And I want to make sure there’s plenty of UX, research, information architecture and content strategy covered too.

Seeing as we’ll only have a couple of hours every week, this won’t be quite like the masterclass I just finished up in Porto. Instead I imagine I’ll be laying some groundwork and then pointing to topics to research. I guess it’s a kind of homework. For example, after we talked today, I set Amber this little bit of research for the next time we meet: “What is the difference between the internet and the World Wide Web?”

I’m excited to see where this will lead. I find Amber’s drive and enthusiasm very inspiring. I also feel a certain weight of responsibility—I don’t want to enter into this lightly.

I’m not really sure what to call this though. Is it mentorship? Or is it coaching? Or training? All of the above?

Whatever it is, I’m looking forward to documenting the journey. Amber will be writing about it too. She is already demonstrating a way with words.