Tags: action

Tuesday, October 27th, 2020

ARIA in CSS

Sara tweeted something recently that resonated with me:

Also, Pro Tip: Using ARIA attributes as CSS hooks ensures your component will only look (and/or function) properly if said attributes are used in the HTML, which, in turn, ensures that they will always be added (otherwise, the component will obv. be broken)

Yes! I didn’t mention it when I wrote about accessible interactions but this is my preferred way of hooking up CSS and JavaScript interactions. Here’s an old Codepen where you can see it in action:

[aria-hidden='true'] {
  display: none;
}

In order for the functionality to work for everyone—screen reader users or not—I have to make sure that I’m toggling the value of aria-hidden in my JavaScript.
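
In practice, that toggling can be as simple as the following sketch (the element IDs and the click handler are placeholders of my own, not code from the Codepen):

// Illustrative sketch only: the IDs are placeholder assumptions.
const trigger = document.getElementById('trigger');
const target = document.getElementById('target');

trigger.addEventListener('click', () => {
  // Flip the ARIA state; the [aria-hidden='true'] rule above
  // takes care of the showing and hiding.
  const isHidden = target.getAttribute('aria-hidden') === 'true';
  target.setAttribute('aria-hidden', String(!isHidden));
});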

There’s another advantage to this technique. Generally, ARIA attributes—like aria-hidden—are added by JavaScript at runtime (rather than being hard-coded in the HTML). If something goes wrong with the JavaScript, the aria-hidden value isn’t set to “true”, which means that the CSS never kicks in. So the default state is for content to be displayed. There’s no assumption that the JavaScript has to work in order for the CSS to make sense.

It’s almost as though accessibility and progressive enhancement are connected somehow…

Wednesday, October 21st, 2020

Accessible interactions

Accessibility on the web is easy. Accessibility on the web is also hard.

I think it’s one of those 80/20 situations. The most common accessibility problems turn out to be very low-hanging fruit. Take, for example, Holly Tuke’s list of the 5 most annoying website features she faces as a blind person every single day:

  • Unlabelled links and buttons
  • No image descriptions
  • Poor use of headings
  • Inaccessible web forms
  • Auto-playing audio and video

None of those problems are hard to fix. That’s what I mean when I say that accessibility on the web is easy. As long as you’re providing a logical page structure with sensible headings, associating form fields with labels, and providing alt text for images, you’re at least 80% of the way there (you’re also doing way better than the majority of websites, sadly).

Ah, but that last 20% or so—that’s where things get tricky. Instead of easy-to-follow rules (“Always provide alt text”, “Always label form fields”, “Use sensible heading levels”), you enter an area of uncertainty and doubt where there are no clear answers. Different combinations of screen readers, browsers, and operating systems might yield very different results.

This is the domain of interaction design. Here be dragons. ARIA can help you …but if you overuse its power, it may cause more harm than good.

When I start to feel overwhelmed by this, I find it’s helpful to take a step back. Instead of trying to imagine all the possible permutations of screen readers and browsers, I start with a more straightforward use case: keyboard users. Screen reader users are (usually) keyboard users too, so starting there covers a lot of ground.

The pattern that comes up the most is to do with toggling content. I suppose you could categorise this as progressive disclosure, but I’m talking about quite a wide range of patterns:

  • accordions,
  • menus (including mega menu monstrosities),
  • modal dialogs,
  • tabs.

In each case, there’s some kind of “trigger” that toggles the appearance of a “target”—some chunk of content.

The first question I ask myself is whether the trigger should be a button or a link (at the very least you can narrow it down to that shortlist—you can discount divs, spans, and most other elements immediately; use a trigger that’s focusable and interactive by default).

As is so often the case, the answer is “it depends”, but generally you can’t go wrong with a button. It’s an element designed for general-purpose interactivity. It carries the expectation that when it’s activated, something somewhere happens. That’s certainly true in all the examples I’ve listed above.

That said, I think that links can also make sense in certain situations. It’s related to the second question I ask myself: should the target automatically receive focus?

Again, the answer is “it depends”, but here’s the litmus test I give myself: how far away from each other are the trigger and the target?

If the target content is right after the trigger in the DOM, then a button is almost certainly the right element to use for the trigger. And you probably don’t need to automatically focus the target when the trigger is activated: the content already flows nicely.

<button>Trigger Text</button>
<div id="target">
  <p>Target content.</p>
</div>

But if the target is far away from the trigger in the DOM, I often find myself using a good old-fashioned hyperlink with a fragment identifier.

<a href="#target">Trigger Text</a>
…
<div id="target">
  <p>Target content.</p>
</div>

Let’s say I’ve got a “log in” link in the main navigation. But it doesn’t go to a separate page. The design shows it popping open a modal window. In this case, the markup for the log-in form might be right at the bottom of the page. This is when I think there’s a reasonable argument for using a link. If, for any reason, the JavaScript fails, the link still works. But if the JavaScript executes, then I can hijack that link and show the form in a modal window. I’ll almost certainly want to automatically focus the form when it appears.

The expectation with links (as opposed to buttons) is that you will be taken somewhere. Let’s face it, modal dialogs are like fake web pages so following through on that expectation makes sense in this context.
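
Sketched out, that enhancement might look something like this (the selector, the form markup, and the showAsModal() helper are all placeholders I’ve made up for illustration):

// A sketch of hijacking the log-in link when JavaScript is available.
// The selector and showAsModal() are made-up placeholders.
const loginLink = document.querySelector('a[href="#login"]');
const loginForm = document.getElementById('login');

if (loginLink && loginForm) {
  loginLink.addEventListener('click', (event) => {
    event.preventDefault();                      // don't jump to the fragment
    showAsModal(loginForm);                      // however the modal gets shown
    loginForm.querySelector('input').focus();    // move focus into the form
  });
}

// If this script never runs, the link still works: it takes the user
// to the form at the bottom of the page via the fragment identifier.
function showAsModal(element) {
  element.classList.add('is-modal');             // placeholder for real dialog logic
}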

So I can answer my first two questions:

  • “Should the trigger be a link or button?” and
  • “Should the target be automatically focused?”

…by answering a different question:

  • “How far away from each other are the trigger and the target?”

It’s not a hard and fast rule, but it helps me out when I’m unsure.

At this point I can write some JavaScript to make sure that both keyboard and mouse users can interact with the interactive component. There’ll certainly be an addEventListener(), some tabindex action, and maybe a focus() method.

Now I can start to think about making sure screen reader users aren’t getting left out. At the very least, I can toggle an aria-expanded attribute on the trigger that corresponds to whether the target is being shown or not. I can also toggle an aria-hidden attribute on the target.

When the target isn’t being shown:

  • the trigger has aria-expanded="false",
  • the target has aria-hidden="true".

When the target is shown:

  • the trigger has aria-expanded="true",
  • the target has aria-hidden="false".
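
Wired together, the whole thing might look something like this rough sketch (the IDs are placeholders, and a real component would need a bit more care):

// A rough sketch of the trigger/target toggle described above.
const trigger = document.getElementById('trigger');
const target = document.getElementById('target');

// Starting state: target hidden.
trigger.setAttribute('aria-expanded', 'false');
target.setAttribute('aria-hidden', 'true');

trigger.addEventListener('click', () => {
  const expanded = trigger.getAttribute('aria-expanded') === 'true';
  trigger.setAttribute('aria-expanded', String(!expanded));
  target.setAttribute('aria-hidden', String(expanded));
  // If the target is far away in the DOM and should receive focus:
  // if (!expanded) { target.setAttribute('tabindex', '-1'); target.focus(); }
});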

There’s also an aria-controls attribute that allows me to explicitly associate the trigger and the target:

<button aria-controls="target">Trigger Text</button>
<div id="target">
  <p>Target content.</p>
</div>

But don’t assume that’s going to help you. As Heydon put it, aria-controls is poop. Still, Léonie points out that you can go ahead and use it. Personally, I find it a useful “hook” in my JavaScript so I know which target is controlled by which trigger.
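
As a hook, it lets the JavaScript pair up triggers and targets without hard-coding IDs. Something like this sketch (assuming one target per trigger):

// A sketch of using aria-controls purely as a JavaScript hook.
document.querySelectorAll('button[aria-controls]').forEach((trigger) => {
  const target = document.getElementById(trigger.getAttribute('aria-controls'));
  if (!target) return;

  trigger.addEventListener('click', () => {
    const expanded = trigger.getAttribute('aria-expanded') === 'true';
    trigger.setAttribute('aria-expanded', String(!expanded));
    target.setAttribute('aria-hidden', String(expanded));
  });
});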

Here’s some example code I wrote a while back. And here are some old Codepens I made that use this pattern: one with a button and one with a link. See the difference? In the example with a link, the target automatically receives focus. But in this situation, I’d choose the example with a button because the trigger and target are close to each other in the DOM.

At this point, I’ve probably reached the limits of what can be abstracted into a single trigger/target pattern. Depending on the specific component, there might be much more work to do. If it’s a modal dialog, for example, you’ve got to figure out where to put the focus, how to trap the focus, and where the focus should return to when the modal dialog is closed.
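
As a very rough sketch of that focus management (the helper names are my own placeholders; a production dialog needs more, like handling the Escape key and making the background inert):

// Move focus in, keep Tab cycling inside, and return focus on close.
let previouslyFocused = null;

function openDialog(dialog) {
  previouslyFocused = document.activeElement;      // remember where focus came from
  dialog.removeAttribute('hidden');
  dialog.setAttribute('tabindex', '-1');
  dialog.focus();                                  // put focus into the dialog
  dialog.addEventListener('keydown', trapFocus);
}

function closeDialog(dialog) {
  dialog.setAttribute('hidden', '');
  dialog.removeEventListener('keydown', trapFocus);
  if (previouslyFocused) {
    previouslyFocused.focus();                     // return focus to the trigger
  }
}

function trapFocus(event) {
  if (event.key !== 'Tab') return;
  const focusable = event.currentTarget.querySelectorAll(
    'a[href], button, input, select, textarea, [tabindex]:not([tabindex="-1"])'
  );
  const first = focusable[0];
  const last = focusable[focusable.length - 1];
  if (event.shiftKey && document.activeElement === first) {
    event.preventDefault();
    last.focus();
  } else if (!event.shiftKey && document.activeElement === last) {
    event.preventDefault();
    first.focus();
  }
}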

I’ve mostly been talking about websites that have some interactive components. If you’re building a single page app, then pretty much every single interaction needs to be made accessible. Good luck with that. (Pro tip: consider not building a single page app—let the browser do what it has been designed to do.)

Anyway, I hope this little stroll through my thought process is useful. If nothing else, it shows how I attempt to cope with an accessibility landscape that looks daunting and ever-changing. Remember though, the fact that you’re even considering this stuff means you care more than most web developers. And you are not alone. There are smart people out there sharing what they learn. The A11y Project is a great hub for finding resources.

And when it comes to interactive patterns like the trigger/target examples I’ve been talking about, there’s one more question I ask myself: what would Heydon do?

Van11y: Accessibility and Vanilla JavaScript - ES2015

Van11y (for Vanilla-Accessibility) is a collection of accessible scripts for rich interface elements, built using progressive enhancement and customisable.

Monday, October 19th, 2020

Boring by default

More on battling entropy:

Ever needed to change “just a small thing” on an old page you built years ago? I recently had the pleasure, and the simple task of changing some colors in CSS led to a whole day of me wrangling with old deprecated Grunt tasks and trying to get the build task running.

The solution:

That’s why starting with HTML, CSS and JavaScript without the need to ever compile anything on your local machine is a good idea. Changing some colors on such a page would indeed only take minutes and not a whole day.

I like this mindset:

Be boring by default and enhance on the way.

Monday, October 12th, 2020

Cheating Entropy with Native Web Technologies - Jim Nielsen’s Weblog

This post really highlights one of the biggest issues with the convoluted build tools used for “modern” web development. If you return to a project after any length of time, this is what awaits:

I find entropy staring me back in the face: library updates, breaking API changes, refactored mental models, and possible downright obsolescence. An incredible amount of effort will be required to make a simple change, test it, and get it live.

Always bet on HTML:

Take a moment and think about this super power: if you write vanilla HTML, CSS, and JS, all you have to do is put that code in a web browser and it runs. Edit a file, refresh the page, you’ve got a feedback cycle. As soon as you introduce tooling, as soon as you introduce an abstraction not native to the browser, you may have to invent the universe for a feedback cycle.

Maintainability matters—if not for you, then for future you.

The more I author code as it will be run by the browser, the easier it will be to maintain that code over time, despite its perceived inferior developer ergonomics (remember, developer experience encompasses both the present and the future, i.e. “how simple are the ergonomics to build this now and maintain it into the future?”). I don’t mind typing some extra characters now if it means I don’t have to learn/relearn, setup, configure, integrate, update, maintain, and inevitably troubleshoot a build tool or framework later.

Tuesday, September 29th, 2020

Unobtrusive feedback

Ten years ago I gave a talk at An Event Apart all about interaction design. It was called Paranormal Interactivity. You can watch the video, listen to the audio or read the transcript if you like.

I think it holds up pretty well. There’s one interaction pattern in particular that I think has stood the test of time. In the talk, I introduce this pattern as something you can see in action on Huffduffer:

I was thinking about how to tell the user that something’s happened without distracting them from their task, and I thought beyond the web. I thought about places that provide feedback mechanisms on screens, and I thought of video games.

So we all know Super Mario, right? And if you think about when you’re collecting coins in Super Mario, it doesn’t stop the game and pop up an alert dialogue and say, “You have just collected ten points, OK, Cancel”, right? It just does it. It does it in the background, but it does provide you with a feedback mechanism.

The feedback you get in Super Mario is about the number of points you’ve just gained. When you collect an item that gives you more points, the number of points you’ve gained appears where the item was …and then drifts upwards as it disappears. It’s unobtrusive enough that it won’t distract you from the gameplay you’re concentrating on but it gives you the reassurance that, yes, you have just gained points.

I think this is a neat little feedback mechanism that we can borrow for subtle Ajax interactions on the web. These are actions that don’t change much of the content. The user needs to be able to potentially do lots of these actions on a single page without waiting for feedback every time.

On Huffduffer, for example, you might be looking at a listing of people that you can choose to follow or unfollow. The mechanism for doing that is a button per person. You might potentially be clicking lots of those buttons in quick succession. You want to know that each action has taken effect but you don’t want to be interrupted from your following/unfollowing spree.

You get some feedback in any case: the button changes. Maybe the text updates from “follow” to “unfollow” accompanied by a change in colour (this is what you’ll see on Twitter). The Super Mario style feedback is in addition to that, rather than instead of.

I’ve made a Codepen so you can see a reduced test case of the Super Mario feedback in action.

See the Pen Unobtrusive feedback by Jeremy Keith (@adactio) on CodePen.

Here’s the code available as a gist.

It’s a function that takes two arguments: the element that the feedback originates from (pass in a DOM node reference for this), and the contents of the feedback (this can be a string of text or it can be HTML …or SVG). When you call the function with those two arguments, this is what happens:

  1. The JavaScript generates a span element and puts the feedback contents inside it.
  2. Then it positions that element right over the element that the feedback originates from.
  3. Then there’s a CSS transform. The feedback gets a translateY applied so it drifts upward. At the same time it gets its opacity reduced from 1 to 0 so it’s fading away.
  4. Finally there’s a transitionend event that fires when the animation is over. Once that event fires, the generated span is destroyed.

When I first used this pattern on Huffduffer, I’m pretty sure I was using jQuery. A few years later I rewrote it in vanilla JavaScript. That was four years ago so I wonder if the code could be improved. Have a go if you fancy it.
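
For what it’s worth, here’s a rough sketch of how the four steps above could be written today (the function name, timing, and distance are placeholder choices of mine, not the code from the gist):

// A sketch of the Super Mario feedback pattern described above.
function unobtrusiveFeedback(origin, contents) {
  // 1. Generate a span and put the feedback contents inside it.
  const feedback = document.createElement('span');
  feedback.innerHTML = contents;                  // text, HTML …or SVG
  feedback.setAttribute('aria-hidden', 'true');   // purely visual reassurance

  // 2. Position it right over the element the feedback originates from.
  const rect = origin.getBoundingClientRect();
  feedback.style.position = 'absolute';
  feedback.style.left = `${rect.left + window.scrollX}px`;
  feedback.style.top = `${rect.top + window.scrollY}px`;
  feedback.style.transition = 'transform 1s ease-out, opacity 1s ease-out';
  document.body.appendChild(feedback);

  // 3. Drift upward while fading from 1 to 0.
  requestAnimationFrame(() => {
    requestAnimationFrame(() => {
      feedback.style.transform = 'translateY(-4em)';
      feedback.style.opacity = '0';
    });
  });

  // 4. Destroy the generated span once the transition is over.
  feedback.addEventListener('transitionend', () => feedback.remove(), { once: true });
}

// Example: confirm a follow/unfollow action.
// button.addEventListener('click', () => unobtrusiveFeedback(button, '+1'));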

Still, even if the code could benefit from an update, I’m pleased that the underlying pattern still holds true. I used it recently on The Session and it’s working a treat for a new Ajax interaction there (bookmarking or unbookmarking an item).

If you end up using this unobtrusive feedback pattern anywhere, please let me know—I’d love to see more examples of it in the wild.

Thursday, September 24th, 2020

15 years of Clearleft

Ah, look at this beautiful timeline that Cassie designed and built—so many beautiful little touches! It covers the fifteen years(!) of Clearleft so far.

But you can also contribute to it …by looking ahead to the next fifteen years:

Let’s imagine it’s 2035…

How do you hope the practice of design will have changed for the better?

Fill out an online postcard with your hopes for the future.

Friday, July 24th, 2020

MSEdgeExplainers/explainer.md at main · MicrosoftEdge/MSEdgeExplainers

This is great! Ideas for allowing more styling of form controls. I agree with the goals 100% and I like the look of the proposed solutions too.

The team behind this are looking for feedback so be sure to share your thoughts (I’ll probably formulate mine into a blog post).

Thursday, July 23rd, 2020

4 Design Patterns That Violate “Back” Button Expectations – 59% of Sites Get It Wrong - Articles - Baymard Institute

Some interesting research in here around user expectations with the back button:

Generally, we’ve observed that if a new view is sufficiently different visually, or if a new view conceptually feels like a new page, it will be perceived as one — regardless of whether it technically is a new page or not. This has consequences for how a site should handle common product-finding and -exploration elements like overlays, filtering, and sorting. For example, if users click a link and 70% of the view changes to something new, most will perceive this to be a new page, even if it’s technically still the same page, just with a new view loaded in.

Saturday, July 18th, 2020

An Introduction To Stimulus.js — Smashing Magazine

An intro to Stimulus, the lightweight JavaScript library from Basecamp that takes a progressive enhancement approach, as seen with HEY.

One aspect I really like about the approach Stimulus encourages is that I can focus on sending HTML down the wire to my users, which is then jazzed up a little with JavaScript.

I’ve always been a fan of using the first few milliseconds of a user’s attention to get what I have to share with them in front of them. Then worrying about setting up the interaction layer while the user can start processing what they’re seeing.

Furthermore, if the JavaScript were to fail for whatever reason, the user can still see the content and interact with it without JavaScript.

Tuesday, June 9th, 2020

Intent

There are intentions and there are outcomes. Sometimes bad outcomes are the result of good intentions. Less often, good outcomes can be the result of bad intentions. But generally we associate the two: we expect good outcomes to come from good intentions and we expect bad outcomes to come from bad intentions.

Perhaps it’s because of this conflation that we place too much emphasis on intentions. If, for example, someone is called out for causing a bad outcome, their first response is often to defend their intentions. That’s understandable. When someone says “you have created a bad outcome”, I understand why the person on the receiving end would receive that feedback as “you intended to create this bad outcome.” Cue a non-apology that clarifies the (good) intention without acknowledging the reality of the outcome (“It was never my intention to…”).

I get it. Intentions do matter …just not as much as we give them credit for. I mean, in general, I’d prefer bad outcomes to be the inadvertent result of good intentions. But in some ways, it really doesn’t matter: a bad outcome is a bad outcome.

Anyway, all of this is just to preface something I’m going to say about myself:

I am almost certainly racist.

I don’t intend to be racist, but like I said, intentions aren’t really what matter. Outcomes are.

Note, for example, the cliché of the gormless close-minded goon who begins a sentence with “I’m not racist, but…” before going on to say something clearly racist. It’s as though the racism could be defanged by disavowing bad intent.

The same defence mechanism is used to defend racist traditions. “Oh, it’s not racist—that’s just something we’ve always done.” Again, the defence is for the intention, not the outcome. And again, outcomes matter far, far more than intentions.

I really don’t intend to be racist. But how could I not be? I grew up in a small town in Ireland where literally everyone else looked like me. By the same token, I’m also almost certainly sexist. Growing up as a cisgender male in a patriarchal society guarantees that my mind has been shaped in ways I now wish it weren’t.

Acknowledging my racism—and sexism—doesn’t mean I’m okay with it. On the contrary. It’s a source of shame. But acknowledging my racism is a necessary step to changing it.

In any case, it doesn’t really matter how I feel about any of this. This isn’t meant to be a confessional. What matters are outcomes. Outcomes aren’t really the direct result of intentions—outcomes are the direct result of actions.

Most of my actions lately have been very passive. Listening. Watching. Because my actions are passive, they are indistinguishable from silence. That’s not good. Silence can be interpreted as acquiescence, acceptance. That’s not what I intend …but my intentions don’t matter.

So, even though this isn’t about me or my voice or my intentions, and even though this is something that is so self-evident that it shouldn’t need to be said, I want to say:

Black lives matter.

Tuesday, May 26th, 2020

This Website Will Self-destruct

You can send me messages using the form below. If I go 24 hours without receiving a message, I’ll permanently self-destruct, and everything will be wiped from my database.

Friday, May 8th, 2020

Designing for Progressive Disclosure by Steven Hoober

Progressive disclosure interface patterns categorised and evaluated:

  • popups,
  • drawers,
  • mouseover popups (just say no!),
  • accordions,
  • tabs,
  • new pages,
  • scrolling,
  • scrolling sideways.

I really like the hypertext history invoked in this article.

The piece finishes with a great note on the MacNamara fallacy:

Everyone thinks metrics let us measure results. But, actually, they don’t. They measure only what they are measuring. Engagement, for example, is not something that can be measured, so we use an analogue for it. Time on page. Or clicks.

We often end up measuring what is quick, cheap, and easy to measure. Therefore, few organizations regularly conduct usability testing or customer-satisfaction surveys, but lots use analytics.

Even today, organizations often use clicks as a measure of engagement. So, all too often, they design user interfaces to generate clicks, so the system can measure them.

Home | SofaConf 2020

You don’t want to miss this! A five-day online conference with a different theme each day:

  1. Monday: Product Strategy
  2. Tuesday: Research
  3. Wednesday: Service Design
  4. Thursday: Content Strategy
  5. Friday: Interaction Design

Speakers include Amy Hupe, Kelly Goto, Kristina Halvorson, Lou Downe, Leisa Reichelt and many more still to be announced, all for ludicrously cheap ticket prices.

I know it sounds like I’m blowing my own trumpet because this is a Clearleft event, but I had nothing to do with it. The trumpets of my talented co-workers should be blasting in harmonious chorus.

(It’s a truly lovely website too!)

Sunday, March 8th, 2020

The 3 Laws of Serverless - Burke Holland

“Serverless” is a buzzword. We can’t seem to agree on what it actually means, so it ends up meaning nothing at all. Much like “cloud” or “dynamic” or “synergy”. You just wait for the right time in a meeting to drop it, walk to the board and draw a Venn diagram, and then just sit back and wait for your well-deserved promotion.

That’s very true, and I do not like the term “serverless” for the rather obvious reason that it’s all about servers (someone else’s servers, that is). But these three principles are handy for figuring out if you’re building in a serverlessy kind of way:

  1. You have no knowledge of the underlying system where your code runs.
  2. Scaling is an intrinsic attribute of the technology; so much so that it just happens automatically.
  3. You only pay for what you use.

Abstraction; scale; consumption.

Friday, December 13th, 2019

Why `details` is Not an Accordion - daverupert.com

At the risk of being a broken record; HTML really needs <accordion>, <tabs>, <dialog>, <dropdown>, and <tooltip> elements. Not more “low-level primitives” but good ol’ fashioned, difficult-to-get-consensus-on elements.

Hear, hear!

I wish browsers would prioritize accessibility improvements over things like main thread scheduling optimization to unblock tracking pixels and the Sisyphean task of competing with native.

If we really want to win, let’s make it easy for everyone to access the Web.

Saturday, September 7th, 2019

How Video Games Inspire Great UX – Scott Jenson

Six UX lessons from game design:

  1. Story vs Narrative (Think in terms of story arcs)
  2. Games are fractal (Break up the journey from big to small to tiny)
  3. Learning loop (figure out your core mechanic)
  4. Affordances (Prompt for known loops)
  5. Hintiness (Move to new loops)
  6. Pacing (Be sure to start here)

Tuesday, September 3rd, 2019

Bottom Navigation Pattern On Mobile Web Pages: A Better Alternative? — Smashing Magazine

Making the case for moving your navigation to the bottom of the screen on mobile:

Phones are getting bigger, and some parts of the screen are easier to interact with than others. Having the hamburger menu at the top provides too big of an interaction cost, and we have a large number of amazing mobile app designs that utilize the bottom part of the screen. Maybe it’s time for the web design world to start using these ideas on websites as well?

Sunday, September 1st, 2019

Less… Is More? Apple’s Inconsistent Ellipsis Icons Inspire User Confusion - TidBITS

The ellipsis is the new hamburger.

It’s disappointing that Apple, supposedly a leader in interface design, has resorted to such uninspiring, and I’ll dare say, lazy design in its icons. I don’t claim to be a usability expert, but it seems to me that icons should represent a clear intention, followed by a consistent action.

Tuesday, August 27th, 2019

Voice User Interface Design by Cheryl Platz

Cheryl Platz is speaking at An Event Apart Chicago. Her inaugural An Event Apart presentation is all about voice interfaces, and I’m going to attempt to liveblog it…

Why make a voice interface?

Successful voice interfaces aren’t necessarily solving new problems. They’re used to solve problems that other devices have already solved. Think about kitchen timers. There are lots of ways to set a timer. Your oven might have one. Your phone has one. Why use a $200 device to solve this mundane problem? Same goes for listening to music, news, and weather.

People are using voice interfaces for solving ordinary problems. Why? Context matters. If you’re carrying a toddler, then setting a kitchen timer can be tricky, so a voice-activated timer is quite appealing. But why is voice happening now?

Humans have been developing the art of conversation for thousands of years. It’s one of the first skills we learn. It’s deeply instinctual. Most humans use speech instinctively every day. You can’t necessarily say that about using a keyboard or a mouse.

Voice-based user interfaces are not new. Not just the idea—which we’ve seen in Star Trek—but the actual implementation. Bell Labs had Audrey back in 1952. It recognised ten words—the digits zero through nine. Why did it take so long to get to Alexa?

In the late 70s, DARPA issued a challenge to create a voice-activated system. Carnegie Mellon came up with Harpy (with a thousand-word grammar). But none of the solutions could respond in real time. In conversation, we expect a break of no more than 200 or 300 milliseconds.

In the 1980s, computing power couldn’t keep up with voice technology, so progress kind of stopped. Time passed. Things finally started to catch up in the 90s with things like Dragon NaturallySpeaking. But that was still about vocabulary, not grammar. By the 2000s, small grammars were starting to show up—starting an Xbox or pausing Netflix. In 2008, Google Voice Search arrived on the iPhone and natural language interaction began to arrive.

What makes natural language interactions so special? It requires minimal training because it uses the conversational muscles we’ve been working for a lifetime. It unlocks the ability to have more forgiving, less robotic conversations with devices. There might be ten different ways to set a timer.

Natural language interactions can also free us from “screen magnetism”—that tendency to stay on a device even when our original task is complete. Voice also enables fast and forgiving searches of huge catalogues without time spent typing or browsing. You can pick a needle straight out of a haystack.

Natural language interactions are excellent for older customers. These interfaces don’t intimidate people without dexterity, vision, or digital experience. Voice input often leads to more inclusive experiences. Many customers with visual or physical disabilities can’t use traditional graphical interfaces. Voice experiences throw open the door of opportunity for some people. However, voice experiences can exclude people with speech difficulties.

Making the case for voice interfaces

There’s a misconception that you need to work at Amazon, Google, or Apple to work on a voice interface, or at least that you need to have a big product team. But Cheryl was able to make her first Alexa “skill” in a week. If you’re a web developer, you’re good to go. Your voice “interaction model” is just JSON.

How do you get your product team on board? Find the customers (and situations) you might have excluded with traditional input. Tell the stories of people whose hands are full, or who are vision impaired. You can also point to the adoption rate numbers for smart speakers.

You’ll need to show your scenario in context. Otherwise people will ask, “why can’t we just build an app for this?” Conduct research to demonstrate the appeal of a voice interface. Storyboarding is very useful for visualising the context of use and highlighting existing pain points.

Getting started with voice interfaces

You’ve got to understand how the technology works in order to adapt to how it fails. Here are a few basic concepts.

Utterance. A word, phrase, or sentence spoken by a customer. This is the true form of what the customer provides.

Intent. This is the meaning behind a customer’s request. This is an important distinction because one intent could have thousands of different utterances.

Prompt. The text of a system response that will be provided to a customer. The audio version of a prompt, if needed, is generated separately using text to speech.

Grammar. A finite set of expected utterances. It’s a list. Usually, each entry in a grammar is paired with an intent. Many interfaces start out as being simple grammars before moving on to a machine-learning model later once the concept has been proven.

Here’s the general idea with “artificial intelligence”…

There’s a human with a core intent to do something in the real world, like knowing when the cookies in the oven are done. That gets voiced as an utterance like “set a 15 minute timer.” The utterance is translated into a string, but it hasn’t yet been parsed as language. That string is passed into a natural language understanding system. What comes back is a data structure that represents the customer’s goal, e.g. intent=timer; duration=15 minutes. That’s sent to the business logic where a timer is actually set. For a good voice interface, you also want to send back a response, e.g. “setting timer for 15 minutes starting now.”
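
As a toy illustration of that flow (everything here is invented for the sketch; a real system would hand the utterance to a trained natural language understanding service rather than a regular expression):

// Utterance (string) -> NLU -> intent data structure -> business logic -> prompt.
function understand(utterance) {
  // A real system uses a statistical model; this toy version only
  // pattern-matches the timer example from the talk.
  const match = utterance.match(/set a (\d+) minute timer/i);
  if (match) {
    return { intent: 'timer', duration: Number(match[1]) }; // duration in minutes
  }
  return { intent: 'unknown' };
}

function handle(request) {
  if (request.intent === 'timer') {
    // Business logic: actually set the timer, then send back a prompt.
    return `Setting timer for ${request.duration} minutes starting now.`;
  }
  return 'Sorry, I did not catch that.'; // the failure cases are most of the work
}

console.log(handle(understand('Set a 15 minute timer')));
// -> "Setting timer for 15 minutes starting now."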

That seems simple enough, right? What’s so hard about designing for voice?

Natural language interfaces are a form of artificial intelligence, so they’re not deterministic. There’s a lot of ruling out false positives. Unlike graphical interfaces, voice interfaces are driven by probability.

How do you turn a sound wave into an understandable instruction? It’s a lot like teaching a child. You feed a lot of data into a statistical model. That’s how machine learning works. It’s a probability game. That’s where it gets interesting for design—given a bunch of possible options, we need to use context to zero in on the most correct choice. This is where confidence ratings come in: the system will return the probability that a response is correct. Effectively, the system is telling you how sure or not it is about possible results. If the customer makes a request in an unusual or unexpected way, our system is likely to guess incorrectly. That’s because the system is being given something new.

Designing a conversation is relatively straightforward. But 80% of your voice design time will be spent designing for what happens when things go wrong. In voice recognition, edge cases are front and centre.

Here’s another challenge. Interaction with most voice interfaces is part conversation, part performance. Most interactions are not private.

Humans don’t distinguish digital speech from human speech. That means these devices are intrinsically social. Our brains are wired to try to extract social information, even from digital speech. See, for example, why it’s such a big question as to what gender a voice interface has.

Delivering a voice interface

Storyboards help depict the context of use. Sample dialogues are your new wireframes. These are little scripts that cover not only the happy path, but also your edge cases. Then you reverse engineer from there.

Flow diagrams communicate customer states, but don’t use the actual text in them.

Prompt lists are your final deliverable.

Functional prototypes are really important for voice interfaces. You’ll learn the real way that customers will ask for things.

If you build a working prototype, you’ll be building two things: a natural language interaction model (often a JSON file) and custom business logic (in a programming language).

Eventually voice design will become a core competency, much like mobile, which was once separate.

Ask yourself what tasks your customers complete on your site that feel clunky. Remember that voice design is almost never about new scenarios. Start your journey into voice interfaces by tackling old problems in new, more inclusive ways.

May the voice be with you!