Thursday, October 3rd, 2019
Tuesday, September 10th, 2019
You pop in a URL, it fetches the page and maps out all the subsequent requests in a nifty interactive diagram of circles, showing how many requests third-party scripts are themselves generating. I’ve found it to be a very effective way of showing the impact of third-party scripts to people who aren’t interested in looking at waterfall diagrams.
I was wondering… Wouldn’t it be great if this were built into browsers?
We already have a “Network” tab in our developer tools. The purpose of this tab is to show requests coming in. The browser already has all the information it needs to make a diagram of requests in the same that the request map generator does.
In Firefox, there’s a little clock icon in the bottom left corner of the “Network” tab. Clicking that shows a pie-chart view of requests. That’s useful, but I’d love it if there were the option to also see the connected circles that the request map generator shows.
Just a thought.
Tuesday, August 27th, 2019
Voice User Interface Design by Cheryl Platz
Why make a voice interface?
Successful voice interfaces aren’t necessarily solving new problems. They’re used to solve problems that other devices have already solved. Think about kitchen timers. There are lots of ways to set a timer. Your oven might have one. Your phone has one. Why use a $200 device to solve this mundane problem? Same goes for listening to music, news, and weather.
People are using voice interfaces for solving ordinary problems. Why? Context matters. If you’re carrying a toddler, then setting a kitchen timer can be tricky so a voice-activated timer is quite appealing. But why is voice is happening now?
Humans have been developing the art of conversation for thousands of years. It’s one of the first skills we learn. It’s deeply instinctual. Most humans use speach instinctively every day. You can’t necessarily say that about using a keyboard or a mouse.
Voice-based user interfaces are not new. Not just the idea—which we’ve seen in Star Trek—but the actual implementation. Bell Labs had Audrey back in 1952. It recognised ten words—the digits zero through nine. Why did it take so long to get to Alexa?
In the late 70s, DARPA issued a challenge to create a voice-activated system. Carnagie Mellon came up with Harpy (with a thousand word grammar). But none of the solutions could respond in real time. In conversation, we expect a break of no more than 200 or 300 milliseconds.
In the 1980s, computing power couldn’t keep up with voice technology, so progress kind of stopped. Time passed. Things finally started to catch up in the 90s with things like Dragon Naturally Speaking. But that was still about vocabulary, not grammar. By the 2000s, small grammars were starting to show up—starting an X-Box or pausing Netflix. In 2008, Google Voice Search arrived on the iPhone and natural language interaction began to arrive.
What makes natural language interactions so special? It requires minimal training because it uses the conversational muscles we’ve been working for a lifetime. It unlocks the ability to have more forgiving, less robotic conversations with devices. There might be ten different ways to set a timer.
Natural language interactions can also free us from “screen magnetism”—that tendency to stay on a device even when our original task is complete. Voice also enables fast and forgiving searches of huge catalogues without time spent typing or browsing. You can pick a needle straight out of a haystack.
Natural language interactions are excellent for older customers. These interfaces don’t intimidate people without dexterity, vision, or digital experience. Voice input often leads to more inclusive experiences. Many customers with visual or physical disabilities can’t use traditional graphical interfaces. Voice experiences throw open the door of opportunity for some people. However, voice experience can exclude people with speech difficulties.
Making the case for voice interfaces
There’s a misconception that you need to work at Amazon, Google, or Apple to work on a voice interface, or at least that you need to have a big product team. But Cheryl was able to make her first Alexa “skill” in a week. If you’re a web developer, you’re good to go. Your voice “interaction model” is just JSON.
How do you get your product team on board? Find the customers (and situations) you might have excluded with traditional input. Tell the stories of people whose hands are full, or who are vision impaired. You can also point to the adoption rate numbers for smart speakers.
You’ll need to show your scenario in context. Otherwise people will ask, “why can’t we just build an app for this?” Conduct research to demonstrate the appeal of a voice interface. Storyboarding is very useful for visualising the context of use and highlighting existing pain points.
Getting started with voice interfaces
You’ve got to understand how the technology works in order to adapt to how it fails. Here are a few basic concepts.
Utterance. A word, phrase, or sentence spoken by a customer. This is the true form of what the customer provides.
Intent. This is the meaning behind a customer’s request. This is an important distinction because one intent could have thousands of different utterances.
Prompt. The text of a system response that will be provided to a customer. The audio version of a prompt, if needed, is generated separately using text to speech.
Grammar. A finite set of expected utterances. It’s a list. Usually, each entry in a grammar is paired with an intent. Many interfaces start out as being simple grammars before moving on to a machine-learning model later once the concept has been proven.
Here’s the general idea with “artificial intelligence”…
There’s a human with a core intent to do something in the real world, like knowing when the cookies in the oven are done. This is translated into an intent like, “set a 15 minute timer.” That’s the utterance that’s translated into a string. But it hasn’t yet been parsed as language. That string is passed into a natural language understanding system. What comes is a data structure that represents the customers goal e.g. intent=timer; duration=15 minutes. That’s sent to the business logic where a timer is actually step. For a good voice interface, you also want to send back a response e.g. “setting timer for 15 minutes starting now.”
That seems simple enough, right? What’s so hard about designing for voice?
Natural language interfaces are a form of artifical intelligence so it’s not deterministic. There’s a lot of ruling out false positives. Unlike graphical interfaces, voice interfaces are driven by probability.
How do you turn a sound wave into an understandable instruction? It’s a lot like teaching a child. You feed a lot of data into a statistical model. That’s how machine learning works. It’s a probability game. That’s where it gets interesting for design—given a bunch of possible options, we need to use context to zero in on the most correct choice. This is where confidence ratings come in: the system will return the probability that a response is correct. Effectively, the system is telling you how sure or not it is about possible results. If the customer makes a request in an unusual or unexpected way, our system is likely to guess incorrectly. That’s because the system is being given something new.
Designing a conversation is relatively straightforward. But 80% of your voice design time will be spent designing for what happens when things go wrong. In voice recognition, edge cases are front and centre.
Here’s another challenge. Interaction with most voice interfaces is part conversation, part performance. Most interactions are not private.
Humans don’t distinguish digital speech fom human speech. That means these devices are intrinsically social. Our brains our wired to try to extract social information, even form digital speech. See, for example, why it’s such a big question as to what gender a voice interface has.
Delivering a voice interface
Storyboards help depict the context of use. Sample dialogues are your new wireframes. These are little scripts that not only cover the happy path, but also your edge case. Then you reverse engineer from there.
Flow diagrams communicate customer states, but don’t use the actual text in them.
Prompt lists are your final deliverable.
Functional prototypes are really important for voice interfaces. You’ll learn the real way that customers will ask for things.
If you build a working prototype, you’ll be building two things: a natural language interaction model (often a JSON file) and custom business logic (in a programming language).
Eventually voice design will become a core competency, much like mobile, which was once separate.
Ask yourself what tasks your customers complete on your site that feel clunkly. Remember that voice desing is almost never about new scenarious. Start your journey into voice interfaces by tackling old problems in new, more inclusive ways.
May the voice be with you!
Thursday, August 8th, 2019
I recently wrote about a web-specific example of a general principle for choosing the right tool for the job:
I was—yet again—talking about appropriateness. Use the right technology for the task at hand. Here’s the example I gave:
Surprisingly, I got some pushback on this. Šime Vidas wrote:
Based on my experience, this is not necessarily the case.
Going from server-side rendering and progressive enhancement via JS to a single-page app powered by a JS framework was a enormous reduction in complexity for me (so the opposite of over-engineering).
(Emphasis mine.) He goes on to say:
My main concerns are ease of use & maintainability. If you get those things right, you’re good and it’s not over-engineering.
There’s no doubt that maintainability is a desirable goal. And ease of use for the developer is also important …but I think they pale in comparison to ease of use for the end user.
To be fair, the specific use-case I mentioned was making a blog. And a blog is a personal thing. You can do whatever the heck you like on your own website and don’t let anyone tell you otherwise.
So I probably chose a poor example to illustrate my point. I was thinking more about when you’re making websites for a living. You’re being paid money to make something available on the web. In that situation, I strongly believe that user needs should win out over developer convenience.
As a user-centred developer, my priority is doing what’s best for end users. That’s not to say I don’t value developer convenience. I do. But I prioritise user needs over developer needs. And in any case, those two needs don’t even come into conflict most of the time.
That’s why I responded to Šime, saying:
Your main concern should be user needs—not your own.
When I talk about over-engineering, I’m speaking from the perspective of end users, not developers.
Before considering your ease of use, and maintainability, consider users first.
In fairness to Šime, he’s being very open and honest about his priorities. I admire that. I’ve seen too many developers try to provide user experience justifications for decisions made for developer convenience. Once again I recommend Alex’s excellent article, The “Developer Experience” Bait-and-Switch:
The swap is executed by implying that by making things better for developers, users will eventually benefit equivalently. The unstated agreement is that developers share all of the same goals with the same intensity as end users and even managers. This is not true.
Now I worry I wasn’t specific enough when I talked about choosing appropriate technology:
Appropriateness is something I keep coming back to when it comes to evaluating web technologies. I don’t think there are good tools and bad tools; just tools that are appropriate or inapropriate for the task at hand.
I should have made it clear that I was talking about what is appropriate or inapropriate for users. I think I made the mistake of assuming that this was obvious, and didn’t need saying. I’ll try not to make that mistake again.
If you’re in that situation—you are being paid money to make websites, and you are making technology decisions—I urge you to remember Charlie’s words: it isn’t about you.
Wednesday, July 31st, 2019
Charlie’s thoughts on dev perception:
People speak about “the old guard” and “stupid backwards techniques”, forgetting that it’s real humans, with real constraints who are working on these solutions. Most of us are working in a “stupid backwards way” because that “backwardsness” WORKS. It is something that is proven and is clearly documented. We can implement it confident that it will not disappear from fashion within a couple of years.
Thursday, July 25th, 2019
My main concern about this new generation of tools is that they require a specific toolchain in order to function. “If you just use this version of React and just use this styling library and configure things in exactly this way, your designers can play around with coded components.” It worries me that teams would end up choosing (and subsequently holding onto) specific tools not because they’re the best choices for our users but because the designers’ and developers’ workflow depends on a specific toolchain to work properly.
You can spot the less useful design principles after a while. They tend to be wishy-washy; more like empty aspirational exhortations than genuinely useful guidelines for alignment. I’ve written about what makes for good design principles before. Matthew Ström also asked—and answered—What makes a good design principle?
- Good design principles are memorable.
- Good design principles help you say no.
- Good design principles aren’t truisms.
- Good design principles are applicable.
I like those. They’re like design principles for design principles.
One set of design principles that I’ve included in my collection is from gov.uk: government design principles . I think they’re very well thought-through (although I’m always suspicious when I see a nice even number like 10 for the amount of items in the list). There’s a great line in design principle number two—Do less:
Government should only do what only government can do.
This wasn’t a theoretical issue. The multiple departmental websites that preceded gov.uk were notorious for having too much irrelevant content—content that was readily available elsewhere. It was downright wasteful to duplicate that content on a government site. It wasn’t appropriate.
I think that the design principle from GDS could be abstracted into a general technology principle:
Any particular technology should only do what only that particular technology can do.
Choose the least powerful language suitable for a given purpose.
Or, as Derek put it:
In the web front-end stack — HTML, CSS, JS, and ARIA — if you can solve a problem with a simpler solution lower in the stack, you should. It’s less fragile, more foolproof, and just works.
ARIA should only do what only ARIA can do.
CSS should only do what only CSS can do.
HTML should only do what only HTML can do.
Sunday, July 21st, 2019
A case study from Twitter on the benefits of using a design system:
With component-based design, development becomes an act of composition, rather than constantly reinventing the wheel.
I think that could be boiled down to this:
Component-based design favours composition over invention.
I’m not saying that’s good. I’m not saying that’s bad. I’m also not saying it’s neutral.
Friday, July 19th, 2019
Monday, July 15th, 2019
Correlation does not imply causation.
Sunday, June 30th, 2019
Lighthouses of the world, mapped.
Saturday, June 29th, 2019
Thursday, June 27th, 2019
How to speak Silicon Valley: decoding tech bros from microdosing to privacy | US news | The Guardian
Every one of these is chef’s kiss.
Wednesday, June 26th, 2019
Over the last fifty years, we have come to recognize that the fuel of our civilizational expansion has become the main driver of our extinction, and that of many of the species we share the planet with. We are now coming to realize that is as true of our cognitive infrastructure. Something is out of sync, felt everywhere: something amiss in the temporal order, and it is as related to political and technological shifts, shifts in our own cognition and attention, as it is to climatic ones. To think clearly in such times requires an intersectional understanding of time itself, a way of thinking that escapes the cognitive traps, ancient and modern, into which we too easily fall. Because our technologies, the infrastructures we have built to escape our past, have turned instead to cancelling our future.
James writes beautifully about rates of change.
The greatest trick our utility-directed technologies have performed is to constantly pull us out of time: to distract us from the here and now, to treat time as a kind of fossil fuel which can be endlessly extracted in the service of a utopian future which never quite arrives. If information is the new oil, we are already, in the hyper-accelerated way of present things, well into the fracking age, with tremors shuddering through the landscape and the tap water on fire. But this is not enough; it will never be enough. We must be displaced utterly in time, caught up in endless imaginings of the future while endlessly neglecting the lessons and potential actions of the present moment.
Sunday, June 16th, 2019
Although this piece is ostensibly about why we should be using web workers more, there’s a much, much bigger point about the growing power gap between the devices we developers use and the typical device used by the rest of the planet.
While we are getting faster flagship phones every cycle, the vast majority of people can’t afford these. The more affordable phones are stuck in the past and have highly fluctuating performance metrics. These low-end phones will mostly likely be used by the massive number of people coming online in the next couple of years. The gap between the fastest and the slowest phone is getting wider, and the median is going down.
Tuesday, June 11th, 2019
A (possibly) Turing complete language:
As the validity and the semantics of a program depend on the structure of the London underground system, which is administered by London Underground Ltd, a subsidiary of Transport for London, who are likely unaware of the existence of this programming language, its future compatibility is uncertain. Programs may become invalid or subtly wrong as the transport company expands or retires some of the network, reroutes lines or renames stations. Features may be removed with no prior consultation with the programming community. For all we know, Mornington Crescent itself may at some point be closed, at which point this programming language will cease to exist.
Friday, June 7th, 2019
Language is not an invention. As best we can tell it is an evolved feature of the human brain. There have been almost countless languages humans have spoken. But they all follow certain rules that grow out of the wiring of the human brain and human cognition. Critically, it is something that is hardwired into us. Writing is an altogether different and artificial thing.
Monday, May 27th, 2019
Before leading the software project that put men on the moon, Margaret Hamilton worked on the equations that led to chaos theory, followed by Mount Holyoke graduate, Ellen Fetter.