Tags: interaction

Voice User Interface Design by Cheryl Platz

Cheryl Platz is speaking at An Event Apart Chicago. Her inaugural An Event Apart presentation is all about voice interfaces, and I’m going to attempt to liveblog it…

Why make a voice interface?

Successful voice interfaces aren’t necessarily solving new problems. They’re used to solve problems that other devices have already solved. Think about kitchen timers. There are lots of ways to set a timer. Your oven might have one. Your phone has one. Why use a $200 device to solve this mundane problem? Same goes for listening to music, news, and weather.

People are using voice interfaces for solving ordinary problems. Why? Context matters. If you’re carrying a toddler, then setting a kitchen timer can be tricky so a voice-activated timer is quite appealing. But why is voice happening now?

Humans have been developing the art of conversation for thousands of years. It’s one of the first skills we learn. It’s deeply instinctual. Most humans use speech instinctively every day. You can’t necessarily say that about using a keyboard or a mouse.

Voice-based user interfaces are not new. Not just the idea—which we’ve seen in Star Trek—but the actual implementation. Bell Labs had Audrey back in 1952. It recognised ten words—the digits zero through nine. Why did it take so long to get to Alexa?

In the late 70s, DARPA issued a challenge to create a voice-activated system. Carnegie Mellon came up with Harpy (with a thousand word grammar). But none of the solutions could respond in real time. In conversation, we expect a break of no more than 200 or 300 milliseconds.

In the 1980s, computing power couldn’t keep up with voice technology, so progress kind of stopped. Time passed. Things finally started to catch up in the 90s with things like Dragon NaturallySpeaking. But that was still about vocabulary, not grammar. By the 2000s, small grammars were starting to show up: starting an Xbox or pausing Netflix. In 2008, Google Voice Search arrived on the iPhone and natural language interaction began to arrive.

What makes natural language interactions so special? It requires minimal training because it uses the conversational muscles we’ve been working for a lifetime. It unlocks the ability to have more forgiving, less robotic conversations with devices. There might be ten different ways to set a timer.

Natural language interactions can also free us from “screen magnetism”—that tendency to stay on a device even when our original task is complete. Voice also enables fast and forgiving searches of huge catalogues without time spent typing or browsing. You can pick a needle straight out of a haystack.

Natural language interactions are excellent for older customers. These interfaces don’t intimidate people without dexterity, vision, or digital experience. Voice input often leads to more inclusive experiences. Many customers with visual or physical disabilities can’t use traditional graphical interfaces. Voice experiences throw open the door of opportunity for some people. However, voice experience can exclude people with speech difficulties.

Making the case for voice interfaces

There’s a misconception that you need to work at Amazon, Google, or Apple to work on a voice interface, or at least that you need to have a big product team. But Cheryl was able to make her first Alexa “skill” in a week. If you’re a web developer, you’re good to go. Your voice “interaction model” is just JSON.

How do you get your product team on board? Find the customers (and situations) you might have excluded with traditional input. Tell the stories of people whose hands are full, or who are vision impaired. You can also point to the adoption rate numbers for smart speakers.

You’ll need to show your scenario in context. Otherwise people will ask, “why can’t we just build an app for this?” Conduct research to demonstrate the appeal of a voice interface. Storyboarding is very useful for visualising the context of use and highlighting existing pain points.

Getting started with voice interfaces

You’ve got to understand how the technology works in order to adapt to how it fails. Here are a few basic concepts.

Utterance. A word, phrase, or sentence spoken by a customer. This is the true form of what the customer provides.

Intent. This is the meaning behind a customer’s request. This is an important distinction because one intent could have thousands of different utterances.

Prompt. The text of a system response that will be provided to a customer. The audio version of a prompt, if needed, is generated separately using text to speech.

Grammar. A finite set of expected utterances. It’s a list. Usually, each entry in a grammar is paired with an intent. Many interfaces start out as being simple grammars before moving on to a machine-learning model later once the concept has been proven.

Here’s the general idea with “artificial intelligence”…

There’s a human with a core intent to do something in the real world, like knowing when the cookies in the oven are done. This is expressed as a request like, “set a 15 minute timer.” That’s the utterance, which is transcribed into a string. But it hasn’t yet been parsed as language. That string is passed into a natural language understanding system. What comes out is a data structure that represents the customer’s goal e.g. intent=timer; duration=15 minutes. That’s sent to the business logic where a timer is actually set. For a good voice interface, you also want to send back a response e.g. “setting timer for 15 minutes starting now.”
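To make that pipeline a little more concrete, here’s a minimal sketch in JavaScript. It assumes the speech has already been transcribed into a string, and it uses a tiny hand-written grammar rather than a trained model. The names (grammar, understand, handle) and the three timer phrasings are hypothetical; they aren’t from the talk or from any particular voice platform.

```javascript
// A tiny grammar: each expected utterance shape is paired with an intent.
// (Hypothetical phrasings, purely for illustration.)
const grammar = [
  { intent: 'timer', pattern: /^set (?:a |an )?(\d+)[ -]minute timer$/ },
  { intent: 'timer', pattern: /^set (?:a )?timer for (\d+) minutes?$/ },
  { intent: 'timer', pattern: /^remind me in (\d+) minutes?$/ },
];

// "Natural language understanding", in miniature: string in, data structure out.
function understand(utterance) {
  const text = utterance.trim().toLowerCase().replace(/[.!?]$/, '');
  for (const { intent, pattern } of grammar) {
    const match = text.match(pattern);
    if (match) {
      return { intent, durationMinutes: Number(match[1]) };
    }
  }
  return null; // nothing in the grammar matched this utterance
}

// "Business logic": act on the request and return the prompt text.
function handle(request) {
  if (!request) {
    return 'Sorry, I didn’t catch that.';
  }
  // A real system would actually start the timer here.
  const n = request.durationMinutes;
  return `Setting timer for ${n} minute${n === 1 ? '' : 's'} starting now.`;
}

console.log(handle(understand('Set a 15 minute timer')));
```

Even at this scale, the gap between utterance and intent shows up: three patterns all map to one intent, and anything outside the list falls through to the error prompt.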

That seems simple enough, right? What’s so hard about designing for voice?

Natural language interfaces are a form of artificial intelligence, so they’re not deterministic. There’s a lot of ruling out false positives. Unlike graphical interfaces, voice interfaces are driven by probability.

How do you turn a sound wave into an understandable instruction? It’s a lot like teaching a child. You feed a lot of data into a statistical model. That’s how machine learning works. It’s a probability game. That’s where it gets interesting for design—given a bunch of possible options, we need to use context to zero in on the most correct choice. This is where confidence ratings come in: the system will return the probability that a response is correct. Effectively, the system is telling you how sure or not it is about possible results. If the customer makes a request in an unusual or unexpected way, our system is likely to guess incorrectly. That’s because the system is being given something new.
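One way to design around that probability game is to decide up front what happens at different confidence levels. The sketch below is only an illustration: the hypothesis shape ({ intent, confidence }), the function name chooseResponse, and the 0.8 and 0.5 thresholds are all assumptions invented for the example, and real platforms expose confidence in their own ways.

```javascript
// Pick what to do with a list of recognition hypotheses, each of which
// carries a confidence between 0 and 1. Thresholds are assumed values.
function chooseResponse(hypotheses, { accept = 0.8, confirm = 0.5 } = {}) {
  if (hypotheses.length === 0) {
    return { action: 'reprompt' };
  }
  // Find the hypothesis the system is most sure about.
  const best = hypotheses.reduce((a, b) => (b.confidence > a.confidence ? b : a));

  if (best.confidence >= accept) {
    return { action: 'execute', intent: best.intent }; // sure enough to act
  }
  if (best.confidence >= confirm) {
    return { action: 'confirm', intent: best.intent }; // "Did you mean…?"
  }
  return { action: 'reprompt' }; // too unsure: ask the customer to try again
}

console.log(chooseResponse([
  { intent: 'timer', confidence: 0.93 },
  { intent: 'time', confidence: 0.41 },
]));
```

Two of the three branches are about what happens when the system isn’t sure, which is a fair preview of where the design effort goes.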

Designing a conversation is relatively straightforward. But 80% of your voice design time will be spent designing for what happens when things go wrong. In voice recognition, edge cases are front and centre.

Here’s another challenge. Interaction with most voice interfaces is part conversation, part performance. Most interactions are not private.

Humans don’t distinguish digital speech from human speech. That means these devices are intrinsically social. Our brains are wired to try to extract social information, even from digital speech. See, for example, why it’s such a big question as to what gender a voice interface has.

Delivering a voice interface

Storyboards help depict the context of use. Sample dialogues are your new wireframes. These are little scripts that not only cover the happy path, but also your edge cases. Then you reverse engineer from there.

Flow diagrams communicate customer states, but don’t use the actual text in them.

Prompt lists are your final deliverable.

Functional prototypes are really important for voice interfaces. You’ll learn the real way that customers will ask for things.

If you build a working prototype, you’ll be building two things: a natural language interaction model (often a JSON file) and custom business logic (in a programming language).

Eventually voice design will become a core competency, much like mobile, which was once separate.

Ask yourself what tasks your customers complete on your site that feel clunky. Remember that voice design is almost never about new scenarios. Start your journey into voice interfaces by tackling old problems in new, more inclusive ways.

May the voice be with you!

Marty’s mashup

While the Interaction 19 event was a bit of a mixed bag overall, there were some standout speakers.

Marty Neumeier was unsurprisingly excellent. I’d seen him speak before, at UX London a few years back, so I knew he’d be good. He has a very reassuring, avuncular manner when he’s speaking. You know the way that there are some people you could just listen to all day? He’s one of those.

Marty’s talk at Interaction 19 was particularly interesting because it was about his new book. Now, why would that be of particular interest? Well, this new book—Scramble—is a business book, but it’s written in the style of a thriller. He wanted it to be like one of those airport books that people read as a guilty pleasure.

One rainy night in December, young CEO David Stone is inexplicably called back to the office. The company’s chairman tells him that the board members have reached the end of their patience. If David can’t produce a viable turnaround plan in five weeks, he’s out of a job. His only hope is to try something new. But what?

I love this idea!

I’ve talked before about borrowing narrative structures from literature and film and applying them to blog posts and conference talks (techniques like flashback, in medias res, etc.) so I really like the idea of taking an entire genre and applying it to a technical topic.

The closest I’ve seen is the comic that Scott McCloud wrote for the release of Google Chrome back in 2008. But how about a romantic comedy about service workers? Or a detective novel about CSS grid?

I have a feeling I’ll be thinking about Marty Neumeier’s book next time I’m struggling to put a conference talk together.

In the meantime, if you want to learn from the master storyteller himself, Clearleft are running a two-day Brand Master Workshop with Marty on March 14th and 15th at The Barbican in London. Early bird tickets are on sale until this Thursday, so don’t dilly-dally if you were thinking about nabbing your spot.

Interaction 19

Right before heading to Geneva to spend the week hacking at CERN, I was in Seattle with a sizable Clearleft contingent to attend Interaction 19, the annual conference put on by the Interaction Design Association.

Ben has rounded up the highlights from my fellow Clearlefties. There are some good talks listed there: John Maeda, Nelly Ben Hayoun, and Jon Bell were thoroughly enjoyable. Some other talks were just okay, and there was one talk, by IXDA president Alok Nandi, that was almost impressive in how rambling and incoherent it was. It was like being in a scene from Silicon Valley. I remember clapping at the end; not out of appreciation, but out of relief.

If truth be told, Interaction 19 had about a day’s worth of really great content …spread out over three days. To be fair, that’s par for the course. When we went to Interaction 17 in New York, the hit/miss ratio was about the same:

There were some really good talks at the event, but alas, the multi-track format made it difficult to see all of them. Continuous partial FOMO was the order of the day.

And as I said at the time:

To be honest, the conference was only part of the motivation for the trip. Spending a week in New York with a gaggle of Clearlefties was its own reward.

So I’m willing to cut Interaction 19 a lot of slack. Even if quite a few of the talks were just so-so, getting to hang with Clearlefties in Seattle during snowmageddon was a lot of fun (and you’ll be pleased to hear that we didn’t even resort to cannibalism to survive).

But while the content of the conference was fair to middling, the organisation of it was a shambles:

Imagine the Fyre Festival but in downtown Seattle in winter. Welcome to @ixdconf. #ixd19

They sold more tickets than there were seats. I ended up watching the first morning’s keynotes being streamed to a screen in a conference room in a different building.

Now, I’ve been at events with keynotes that have overflow rooms—South by Southwest does this. But that’s at a different scale. This is a conference with a known number of attendees, each one of them spending over a thousand dollars to attend. I’m pretty sure that a first-come, first-served policy isn’t the best way of treating those attendees.

Anyway, here’s what I submitted for that round-up of the best talks, but which, for reasons of prudence, was omitted from the final post:

I really enjoyed the keynote by Liz Jackson on inclusive design. I would’ve enjoyed it even more if I could’ve seen it in person. Instead I watched it live-streamed to a meeting room two buildings over because the conference sold more tickets than they had seats for. This was after queueing in the cold for registration. So I feel like I learned a lot from Interaction 19 …about how not to organise a conference.

Still, as Ben notes:

We all enjoyed ourselves thoroughly, despite best efforts by the West Coast snow to disrupt the entire city.

I’m going to be back in Seattle in just under two weeks for An Event Apart. Now that’s a conference! It runs like a well-oiled machine, and every talk in its single track has been curated for excellence …with one exception.

Ubiquity and consistency

I keep thinking about this post from Baldur Bjarnason, Over-engineering is under-engineering. It took me a while to get my head around what he was saying, but now that (I think) I understand it, I find it to be very astute.

Let’s take a single interface element, say, a dropdown menu. This is the example Laura uses in her article for 24 Ways called Accessibility Through Semantic HTML. You’ve got two choices, broadly speaking:

  1. Use the HTML select element.
  2. Create your own dropdown widget using JavaScript (working with divs and spans).

The advantage of the first choice is that it’s lightweight, it works everywhere, and the browser does all the hard work for you.

But…

You don’t get complete control. Because the browser is doing the heavy lifting, you can’t craft the details of the dropdown to look identical on different browser/OS combinations.

That’s where the second option comes in. By scripting your own dropdown, you get complete control over the appearance and behaviour of the widget. The disadvantage is that, because you’re now doing all the work instead of the browser, it’s up to you to do all the work—that means lots of JavaScript, thinking about edge cases, and making the whole thing accessible.

This is the point that Baldur makes: no matter how much you over-engineer your own custom solution, there’ll always be something that falls between the cracks. So, ironically, the over-engineered solution—when compared to the simple under-engineered native browser solution—ends up being under-engineered.

Is it worth it? Rian Rietveld asks:

It is impossible to style select option. But is that really necessary? Is it worth abandoning the native browser behavior for a complete rewrite in JavaScript of the functionality?

The answer, as ever, is it depends. It depends on your priorities. If your priority is having consistent control over the details, then foregoing native browser functionality in favour of scripting everything yourself aligns with your goals.

But I’m reminded of something that Eric often says:

The web does not value consistency. The web values ubiquity.

Ubiquity; universality; accessibility—however you want to label it, it’s what lies at the heart of the World Wide Web. It’s the idea that anyone should be able to access a resource, regardless of technical or personal constraints. It’s an admirable goal, and what’s even more admirable is that the web succeeds in this goal! But sometimes something’s gotta give, and that something is control. Rian again:

The days that a website must be pixel perfect and must look the same in every browser are over. There are so many devices these days, that an identical design for all is not doable. Or we must take a huge effort for custom form elements design.

So far I’ve only been looking at the micro scale of a single interface element, but this tension between ubiquity and consistency plays out at larger scales too. Take page navigations. That’s literally what browsers do. Click on a link, and the browser fetches that URL, displaying progress as it goes. The alternative, as exemplified by single page apps, is to do all of that for yourself using JavaScript: figure out the routing, show some kind of progress, load some JSON, parse it, convert it into HTML, and update the DOM.

Personally, I tend to go for the first option. Partly that’s because I like to apply the rule of least power, but mostly it’s because I’m very lazy (I also have qualms about sending a whole lotta JavaScript down the wire just so the end user gets to do something that their browser would do for them anyway). But I get it. I understand why others might wish for greater control, even if it comes with a price tag of fragility.

I think Jake’s navigation transitions proposal is fascinating. What if there were a browser-native way to get more control over how page navigations happen? I reckon that would cover the justification of 90% of single page apps.

That’s a great way of examining these kinds of decisions and questioning how this tension could be resolved. If people are frustrated by the lack of control in browser-native navigations, let’s figure out a way to give them more control. If people are frustrated by the lack of styling for select elements, maybe we should figure out a way of giving them more control over styling.

Hang on though. I feel like I’ve painted a divisive picture, like you have to make a choice between ubiquity or consistency. But the rather wonderful truth is that, on the web, you can have your cake and eat it. That’s what I was getting at with the three-step approach I describe in Resilient Web Design:

  1. Identify core functionality.
  2. Make that functionality available using the simplest possible technology.
  3. Enhance!

Like, say…

  1. The user needs to select an item from a list of options.
  2. Use a select element.
  3. Use JavaScript to replace that native element with a widget of your own devising.

Or…

  1. The user needs to navigate to another page.
  2. Use an a element with an href attribute.
  3. Use JavaScript to intercept that click, add a nice transition, and pull in the content using Ajax.
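For that second example, step three might look something like the sketch below. It’s hedged in all the usual ways: shouldIntercept, enhanceLinks, and the updatePage callback are hypothetical names, and a production version would also need to handle popstate (for the back button), scroll position, and focus. The point of the sketch is that whenever a click looks unusual, or the request fails, the link goes back to being a plain old link.

```javascript
// Decide whether a click on a link should be handled with Ajax,
// or left alone so the browser performs a normal navigation.
function shouldIntercept(event, link, currentOrigin) {
  if (event.defaultPrevented) return false;
  if (event.button !== 0) return false; // only plain left clicks
  if (event.metaKey || event.ctrlKey || event.shiftKey || event.altKey) {
    return false; // the user is asking for a new tab or window
  }
  if (link.target && link.target !== '_self') return false;
  // Only hijack links that stay on the same origin.
  return new URL(link.href, currentOrigin).origin === currentOrigin;
}

// Layer the enhancement on top of ordinary links.
// updatePage is a hypothetical callback that swaps the new HTML into the page.
function enhanceLinks(doc, win, updatePage) {
  doc.addEventListener('click', async function (event) {
    const link = event.target.closest('a[href]');
    if (!link || !shouldIntercept(event, link, win.location.origin)) return;
    event.preventDefault();
    try {
      const response = await win.fetch(link.href);
      if (!response.ok) throw new Error('Request failed: ' + response.status);
      updatePage(await response.text());
      win.history.pushState({}, '', link.href);
    } catch (error) {
      // Something went wrong: fall back to a normal page navigation.
      win.location.href = link.href;
    }
  });
}
```

If the script never loads, or the browser doesn’t support what it needs, nothing is lost: step two is still sitting there in the markup.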

The pushback I get from people in the control/consistency camp is that this sounds like more work. It kinda is. But honestly, in my experience, it’s not that much more work. Also, and I realise I’m contradicting the part where I said I’m lazy, but that’s why it’s called work. This is our job. It’s not about what we prefer; it’s about serving the needs of the people who use what we build.

Anyway, if I were to rephrase my three-step process in terms of under-engineering and over-engineering, it might look something like this:

  1. Start with user needs.
  2. Build an under-engineered solution—one that might not offer you much control, but that works for everyone.
  3. Layer on a more over-engineered solution—one that might not work for everyone, but that offers you more control.

Ubiquity, then consistency.

Pseudo and pseudon’t

I like CSS pseudo-classes. They come in handy for adding little enhancements to interfaces based on interaction.

Take the form-related pseudo-classes, for example: :valid, :invalid, :required, :in-range, and many more.

Let’s say I want to adjust the appearance of an element based on whether it has been filled in correctly. I might have an input element like this:

<input type="email" required>

Then I can write some CSS to put a green border on it once it meets the minimum requirements for validity:

input:valid {
  border: 1px solid green;
}

That works, but somewhat annoyingly, the appearance will change while the user is still typing in the field (as soon as the user types an @ symbol, the border goes green). That can be distracting, or downright annoying.

I only want to display the green border when the input is valid and the field is not focused. Luckily for me, those last two words (“not focused”) map nicely to some more pseudo-classes: not and focus:

input:not(:focus):valid {
  border: 1px solid green;
}

If I want to get really fancy, I could display an icon next to form fields that have been filled in. But to do that, I’d need more than a pseudo-class; I’d need a pseudo-element, like ::after:

input:not(:focus):valid::after {
  content: '✓';
}

…except that won’t work. It turns out that you can’t add generated content to replaced elements like form fields. I’d have to add a regular element into my markup, like this:

<input type="email" required>
<span></span>

So I could style it with:

input:not(:focus):valid + span::after {
  content: '✓';
}

But that feels icky.

Update: See this clever flexbox technique by Hugo Giraudel for a potential solution.

A question of timing

I’ve been updating my collection of design principles lately, adding in some more examples from Android and Windows. Coincidentally, Vasilis unveiled a neat little page that grabs one list of principles at random —just keep refreshing to see more.

I also added this list of seven principles of rich web applications to the collection, although they feel a bit more like engineering principles than design principles per se. That said, they’re really, really good. Every single one is rooted in performance and the user’s experience, not developer convenience.

Don’t get me wrong: developer convenience is very, very important. Nobody wants to feel like they’re doing unnecessary work. But I feel very strongly that the needs of the end user should trump the needs of the developer in almost all instances (you may feel differently and that’s absolutely fine; we’ll agree to differ).

That push and pull between developer convenience and user experience is, I think, most evident in the first principle: server-rendered pages are not optional. Now before you jump to conclusions, the author is not saying that you should never do client-side rendering, but instead points out the very important performance benefits of having the server render the initial page. After that—if the user’s browser cuts the mustard—you can use client-side rendering exclusively.

The issue with that hybrid approach—as I’ve discussed before—is that it’s hard. Isomorphic JavaScript (terrible name) can theoretically help here, but I haven’t seen too many examples of it in action. I suspect that’s because this approach doesn’t yet offer enough developer convenience.

Anyway, I found myself nodding along enthusiastically with that first of seven design principles. Then I got to the second one: act immediately on user input. That sounds eminently sensible, and it’s backed up with sound reasoning. But it finishes with:

Techniques like PJAX or TurboLinks unfortunately largely miss out on the opportunities described in this section.

Ah. See, I’m a big fan of PJAX. It’s essentially the same thing as the Hijax technique I talked about many years ago in Bulletproof Ajax, but with the new addition of HTML5’s History API. It’s a quick’n’dirty way of giving the illusion of a fat client: all the work is actually being done in the server, which sends back chunks of HTML that update the interface. But it’s true that, because of that round-trip to the server, there’s a bit of a delay and so you often end up briefly displaying a loading indicator.

I contend that spinners or “loading indicators” should become a rarity

I agree …but I also like using PJAX/Hijax. Now how do I reconcile what’s best for the user experience with what’s best for my own developer convenience?

I’ve come up with a compromise, and you can see it in action on The Session. There are multiple examples of PJAX in action on that site, like pretty much any page that returns paginated results: new tune settings, the latest events, and so on. The steps for initiating an Ajax request used to be:

  1. Listen for any clicks on the page,
  2. If a “previous” or “next” button is clicked, then:
  3. Display a loading indicator,
  4. Request the new data from the server, and
  5. Update the page with the new data.

In one sense, I am acting immediately on user input, because I always display the loading indicator straight away. But because the loading indicator always appears, no matter how fast or slow the server responds, it sometimes only appears very briefly, just for a flash. In that situation, I wonder if it’s serving any purpose. It might even be doing the opposite to its intended purpose: it draws attention to the fact that there’s a round-trip to the server.

“What if”, I asked myself, “I only showed the loading indicator if the server is taking too long to send a response back?”

The updated flow now looks like this:

  1. Listen for any clicks on the page,
  2. If a “previous” or “next” button is clicked, then:
  3. Start a timer, and
  4. Request the new data from the server.
  5. If the timer reaches an upper limit, show a loading indicator.
  6. When the server sends a response, cancel the timer and
  7. Update the page with the new data.
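Here’s a rough sketch of that updated flow in JavaScript. It isn’t the actual code from The Session; createDelayedIndicator, the show and hide callbacks, and the 200 millisecond default are hypothetical choices for illustration. The timers argument only exists so the timing can be swapped out when testing.

```javascript
// Only show a loading indicator if the response takes longer than delayMs.
// show and hide are callbacks supplied by the caller.
function createDelayedIndicator(show, hide, delayMs = 200, timers = globalThis) {
  let timerId = null;
  let visible = false;

  return {
    // Call this when the request is sent.
    start() {
      timerId = timers.setTimeout(() => {
        timerId = null;
        visible = true;
        show(); // the server is taking too long
      }, delayMs);
    },
    // Call this when the response arrives (or the request fails).
    finish() {
      if (timerId !== null) {
        timers.clearTimeout(timerId); // fast response: never show anything
        timerId = null;
      }
      if (visible) {
        visible = false;
        hide();
      }
    },
  };
}

// Usage, assuming hypothetical showSpinner, hideSpinner and updatePage functions:
//
//   const indicator = createDelayedIndicator(showSpinner, hideSpinner, 200);
//   indicator.start();
//   fetch(url)
//     .then((response) => response.text())
//     .then(updatePage)
//     .finally(() => indicator.finish());
```

Calling finish from finally means the timer is cancelled (or the indicator hidden) whether the request succeeds or fails.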

Even though there are more steps, there’s actually less happening from the user’s perspective. Where previously you would experience this:

  1. I click on a button,
  2. I briefly see a loading indicator,
  3. I see the new data.

Now your experience is:

  1. I click on a button,
  2. I see the new data.

…unless the server or the network is taking too long, in which case the loading indicator appears as an interim step.

The question is: how long is too long? How long do I wait before showing the loading indicator?

The Nielsen Norman group offers this bit of research:

0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.

So I should set my timer to 100 milliseconds. In practice, I found that I can set it to as high as 200 to 250 milliseconds and keep it feeling very close to instantaneous. Anything over that, though, and it’s probably best to display a loading indicator: otherwise the interface starts to feel a little sluggish, and slightly uncanny. (“Did that click do any—? Oh, it did.”)

You can test the response time by looking at some of the simpler pagination examples on The Session: new recordings or new discussions, for example. To see examples of when the server takes a bit longer to send a response, you can try paginating through search results. These take longer because, frankly, I’m not very good at optimising some of those search queries.

There you have it: an interface that—under optimal conditions—reacts to user input instantaneously, but falls back to displaying a loading indicator when conditions are less than ideal. The result is something that feels like a client-side web thang, even though the actual complexity is on the server.

Now to see what else I can learn from the rest of those design principles.

Double tap delay

Even though my encounter with Ted yesterday was brief, we still managed to turn the conversation to browsers, standards, and all things web.

Specifically, we talked about this proposal in Blink related to the 300 millisecond delay that mobile browsers introduce after a tap event.

Why do browsers have this 300 millisecond delay? Well, you know when you’re looking at a fixed-width desktop-based website on a mobile phone, and everything is zoomed out, and one of the ways that you can zoom in to a specific portion of the page is to double tap on that content? A double tap is defined as two taps less than 300 milliseconds apart. So whenever you tap on something in a touch-based browser, it needs to wait for that length of time to see if you’re going to turn that single tap into a double tap.
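To see why that wait is unavoidable, here’s a rough sketch of the logic in JavaScript. It’s an illustration of the idea rather than how any browser actually implements it; createTapDetector and its parameters are hypothetical, and the timers argument is only there so the behaviour can be tested without real delays.

```javascript
// Tell single taps from double taps. A single tap can only be reported
// once the double-tap window has passed without a second tap.
function createTapDetector(onSingleTap, onDoubleTap, windowMs = 300, timers = globalThis) {
  let pendingTimer = null;

  return function tap() {
    if (pendingTimer !== null) {
      // A second tap arrived inside the window: it's a double tap.
      timers.clearTimeout(pendingTimer);
      pendingTimer = null;
      onDoubleTap();
      return;
    }
    // First tap: wait and see whether another one follows.
    pendingTimer = timers.setTimeout(() => {
      pendingTimer = null;
      onSingleTap(); // only now is it safe to treat this as a single tap
    }, windowMs);
  };
}
```

The single tap handler can’t run until the window has elapsed, and that wait is the lag you feel.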

The overall effect is that tap actions feel a little bit laggy on the web compared to native apps. You can fix this by using the fastclick code from FT Labs, but I always feel weird solving a problem on mobile by throwing more front-end code at it.

Hence the Blink proposal: if the author has used a meta viewport declaration to set width=device-width (effectively saying “hey, I know what I’m doing: this content doesn’t need to be zoomed”), then the 300 millisecond delay could be removed from tap events. Note: this only affects double taps—pinch zoom is unaffected.

This sounds like a sensible idea to me, but Ted says that he sometimes still likes to double tap to zoom even in responsive designs. He’d prefer a per-element solution rather than a per-document meta element. An attribute? Or maybe a CSS declaration similar to pointer events?

I thought for a minute, and then I spitballed this idea: what if the 300 millisecond delay only applied to non-focusable elements?

After all, the tap delay is only noticeable when you’re trying to tap on a focusable element: links, buttons, form fields. Double tapping tends to happen on text content: divs, paragraphs, sections. That’s assuming you are actually using buttons and links for buttons and links, not spans or divs à la Google.

And if the author decides they want to remove the tap delay on a non-focusable element, they can always make it focusable by adding tabindex=-1 (if that still works …does that still work? I don’t even know any more).

Anyway, that was my not-very-considered idea, but on first pass, it doesn’t strike me as being obviously stupid or crazy.

So, how about it, browser makers? Does removing the 300 millisecond delay on focusable elements—possibly in combination with the meta viewport declaration—make sense?

Progresponsive

Brad has done a great job in documenting navigation patterns for responsive designs. More recently I came across Erick Arbé’s similar collection of patterns for responsive navigation. And, of course, at the Responsive Day Out, David gave a presentation on the subject.

David Bushell: Responsive Navigation on Huffduffer

As I mentioned in the chat after David’s talk, choosing a pattern doesn’t need to be an either/or decision. You can start with a simple solution and progressively enhance to a more complex navigation pattern.

Take the footer-anchor pattern, for example. I really, really like this pattern. It doesn’t require any JavaScript whatsoever; just a simple hyperlink from the top of the page that links to the fragment identifier of the navigation at the bottom of the page. It works on just about every device.

But you don’t have to stop there. Now that you’ve got a simple solution that works everywhere, you can enhance it for more capable browsers.

Take a look at this example that applies the off-canvas pattern for browsers capable of handling the JavaScript and CSS required.

You can see the two patterns in action by looking at the source in JS Bin. If you toggle the “Auto-run JS” checkbox, you can see both behaviours. Without JavaScript you get the footer-anchor pattern. With JavaScript (and a capable browser) you get the off-canvas pattern.

I haven’t applied any media queries in this instance, but it would be pretty straightforward to apply absolute positioning or the display: table hack to display the navigation by default at wider screen sizes. I’ll leave that as an exercise for the reader (bonus points: apply the off-canvas from the right of the viewport rather than the left).

Feel free to peruse the somewhat simplistic code. I’m doing a bit of feature detection—or cutting the mustard—to test for querySelector and addEventListener. If a browser passes the test, a class is applied to the document root and some JavaScript is executed on page load to toggle the off-canvas behaviour.
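The general shape of that feature test is something like the following sketch. It isn’t the code from the JS Bin example: the class names (off-canvas-capable, nav-open) and the .nav-toggle selector are hypothetical, and it leans on classList for brevity, which is an extra assumption about the browser.

```javascript
// Feature detection: does this browser support what the enhancement needs?
function cutsTheMustard(doc, win) {
  return Boolean(doc) && Boolean(win) &&
    'querySelector' in doc && 'addEventListener' in win;
}

// Upgrade the footer-anchor pattern to off-canvas, but only in capable browsers.
function enhanceNavigation(doc, win) {
  if (!cutsTheMustard(doc, win)) {
    return false; // the plain footer-anchor link keeps working as it is
  }
  const root = doc.documentElement;
  root.classList.add('off-canvas-capable'); // CSS hook for the enhanced layout

  win.addEventListener('load', function () {
    const toggle = doc.querySelector('.nav-toggle');
    if (!toggle) return;
    toggle.addEventListener('click', function (event) {
      event.preventDefault(); // don't jump down to the footer navigation
      root.classList.toggle('nav-open');
    });
  });
  return true;
}

// Only run where there is a document and a window to enhance.
if (typeof document !== 'undefined' && typeof window !== 'undefined') {
  enhanceNavigation(document, window);
}
```

All the visual work would be done in CSS keyed off those two classes; the script just flips the switch.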

On a recent project, I found myself implementing a number of different navigation patterns: off-canvas, overlay, and progressive disclosure. But each one began as an instance of the simple footer-anchor pattern.

Progressive enhancement, baby. Still not dead, still important.

Off-canvas horizontal lists

There was a repeated rallying cry at the Responsive Day Out. It was the call for more sharing—more sharing of data, more sharing of case studies, more sharing of success stories, but also more sharing of failures.

In that spirit, I thought I’d share a pattern I’ve been working on. It didn’t work, but I’m not going to let that stop me putting it out there.

Here’s what I wanted to do…

Let’s say you’ve got a list of items; modular chunks of markup like an image and a caption, for example. By default these will display linearly on a small screen: a vertical list. I quite like the way that the Flickr iPhone app takes those lists and makes them horizontal: they go off-canvas (to the right), with a little bit of the next item peeking out to give some affordance. It’s like an off-canvas carousel.

I’d quite like to use that interaction in responsive designs. But I don’t want to do it by throwing a lot of JavaScript at the problem. So I thought I’d attempt to achieve it with a little bit of CSS.

So, let’s say I’ve got a list of six items like this:

<div class="items">
    <ul class="item-list">
        <li class="item"></li>
        <li class="item"></li>
        <li class="item"></li>
        <li class="item"></li>
        <li class="item"></li>
        <li class="item"></li>
    </ul><!-- /.item-list -->
</div><!-- /.items -->

Please pay no mind to the qualities of the class names: this is just a quick proof of concept.

Here’s how that looks. At larger screen sizes, I display the list items in groups of two or three, side by side. At smaller sizes, the items simply linearise vertically.

Okay, now within a small-screen media query I’m going to constrain the width of the container:

.items {
    width: 100%;
}

I’m going to make the list within that element stretch off-canvas for six screens wide (this depends on me knowing that there will be exactly six items in the list):

.items .item-list {
    width: 600%;
}

Now I’ll make each item one sixth of that size, which should be one screen’s worth. Actually, I’m going to make it a bit less than exactly one sixth (which would be 16.6666%) so that a bit of the next item peeks out:

.item-list .item {
    width: 15%;
}
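If those percentages seem arbitrary, the arithmetic can be sketched in a few lines of JavaScript. The function name and its parameters are made up for illustration: given the number of items and how much of the viewport each one should fill, it returns the width of the list (as a percentage of the container) and the width of each item (as a percentage of the list).

```javascript
// itemCount: how many items are in the list.
// visiblePercent: how much of the viewport one item should fill (e.g. 90).
function offCanvasWidths(itemCount, visiblePercent) {
  return {
    listWidth: itemCount * 100,            // % of the container
    itemWidth: visiblePercent / itemCount  // % of the list
  };
}

// Six items, each filling 90% of the screen so the next one shows a little.
const widths = offCanvasWidths(6, 90);
console.log(widths.listWidth + '%', widths.itemWidth + '%'); // 600% 15%
```

So 15% of a list that is 600% wide works out at 90% of the viewport, leaving the remaining 10% for the edge of the next item.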

My hope was that to make this scrollable/swipable, all I had to do was apply overflow: scroll to the containing element:

.items {
    width: 100%;
    overflow: scroll;
}

All of that is wrapped up in a small-viewport media query:

@media all and (max-width: 30em) {
    .items {
        width: 100%;
        overflow: scroll;
    }
    .items .item-list {
        width: 600%;
    }
    .items .item {
        width: 15%;
    }
}

It actually works …in some browsers. Alas, support for overflow: scroll doesn’t extend back as far as Android 2, still a very popular flavour of that operating system. That’s quite a showstopper.

There is a polyfill called Overthrow from those mad geniuses at Filament Group. But, as I said, I’d rather not throw more code at the problem. While I can imagine shovelling a polyfill at a desktop browser, I have a lot of qualms about trying to “support” an older mobile browser by giving it a chunk of JavaScript to chew on.

What I really need is a way to detect support for overflow: scroll. Alas, looking at the code for Overthrow, that isn’t so easy. Modernizr cannot help me here. We are in the realm of the undetectables.

My pattern is, alas, a failure.

Or, at least, it’s a failure for now. The @supports rule in CSS is tailor-made for this kind of situation. Basically, I don’t want any of those small-screen rules to apply unless the browser supports overflow: scroll. Here’s how I will be able to do that:

@media all and (max-width: 30em) {
  @supports (overflow: scroll) {
    .items {
        width: 100%;
        overflow: scroll;
    }
    .items .item-list {
        width: 600%;
    }
    .items .item {
        width: 15%;
    }
  }
}

This is really, really useful. It means that I can start implementing this pattern now even though very few browsers currently understand @supports. That’s okay. Browsers that don’t understand it will simply ignore the whole block of CSS, leaving the list items to display vertically. But as @supports gets more …um, support …then the pattern will kick in for those more capable browsers.
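There's a scripting counterpart to @supports too: CSS.supports(property, value). Here's a tentative sketch of how the same check might look from JavaScript. The wrapper function, its `env` argument (standing in for the browser's global object) and the `overflow-scroll` class name are all assumptions of mine, there only to keep the example self-contained.

```javascript
// Ask the browser whether it accepts a declaration.
// If CSS.supports is missing, return false, which mirrors how a browser
// that doesn't understand @supports skips the whole block.
function supportsDeclaration(env, property, value) {
  const css = env && env.CSS;
  if (!css || typeof css.supports !== 'function') {
    return false;
  }
  return css.supports(property, value);
}

// In a browser, flag the root element when the declaration is accepted.
// "overflow-scroll" is a hypothetical class name for this sketch.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  if (supportsDeclaration(window, 'overflow', 'scroll')) {
    document.documentElement.className += ' overflow-scroll';
  }
}
```

As with @supports itself, this only reports whether the browser accepts the declaration. Whether that always lines up with working touch scrolling is something I haven't verified.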

I can see myself adding this pre-emptive pattern for a few different use cases:

Feel free to poke at the example code. Perhaps you can find a way to succeed where I have failed.

Publishing Paranormal Interactivity

I’ve published the transcript of a talk I gave at An Event Apart in 2010. It’s mostly about interaction design, with a couple of diversions into progressive enhancement and personality in products. It’s called Paranormal Interactivity.

I had a lot of fun with this talk. It’s interspersed with videos from The Hitchhiker’s Guide To The Galaxy, Alan Partridge, and Super Mario, with special guest appearances from the existentialist chalkboard and Poshy’s upper back torso.

If you don’t feel like reading it, you can always watch the video or listen to the audio.

Adactio: Articles—Paranormal Interactivity on Huffduffer

You could even look at the slides but, as I always say, they won’t make much sense without the context of the presentation.

Continuous partial annoyance

Twitter have been rolling out a new redesign. Thanks to Dustin, I got to try it out when the switch was flipped.

As with any redesign, the initial reaction tends to be It’s different! I fear change! Therefore I dislike this. See also: redesigns of The Guardian, Last.fm, Flickr, BBC…

With Twitter, that initial knee-jerk fades pretty quickly because the new site is undeniably beautiful. The visual design is top-notch.

There’s a nice little addition in the markup, too. The body element has a class name that you can hook into for user stylesheets. This is a very, very, very good thing. For example, my class name is .user-style-adactio so I can add some declarations to my user stylesheet.

The first rule simply hides the egregious Trending Topics and Who To Follow features (and I love that Who To Follow abbreviates to WTF):

.user-style-adactio .trends-inner,
.user-style-adactio .wtf-inner {
 display: none !important;
}

By the way, a user stylesheet is the only time it’s acceptable to use !important in your CSS.

My other rules adjust the layout a bit when the viewport gets smaller. It’s just a quick little hack and it’s not great but it’s handy if, like me and Norm!, you don’t like a site dictating how wide your browser window should be. Thanks to user stylesheets, you can fix this:

@media screen and (max-width: 995px) {
 .user-style-adactio #page-container,
 .user-style-adactio #page-outer {
  min-width: 590px !important;
 }
 .user-style-adactio .dashboard {
  float: none !important;
  clear: both !important;
  max-width: 0 !important;
 }
}

Handy tip: if you use Dropbox, store your user stylesheet there. That way, you can point multiple machines to the same stylesheet. I’ve got my laptop at home and my iMac at work pointing to the same CSS file.

There’s one aspect of the new Twitter redesign that I really don’t like, and I can’t fix it with a user stylesheet: infinite scrolling. As I said (on Twitter, of course):

I’m allergic to infinite scrolling

Notice that I didn’t say that infinite scrolling is wrong, it’s just wrong for me. There’s nothing wrong with peanuts unless you have a nut allergy.

The reason that I don’t like infinite scrolling is that I actually use the scrollbar to scroll. That is, I move my cursor over the scrollbar, click and drag. Infinite scrolling makes this unworkable: the scrollbar under my cursor jumps around as new content is loaded.

I figured that in this day of mouse wheels and trackpads, I must be in the minority with my old-fashioned scrollbar usage. I asked for data on Twitter, and sure enough, most people who responded said they used the mouse wheel, the trackpad, the space bar or arrow keys. Though some people still found the scrollbar useful as a visual indicator of how long the page is …which is also negated by infinite scrolling.

Interestingly, while most of the people who responded to my query on Twitter said they hardly ever use the scrollbar, the Firefox heatmap shows that it’s one of the most used interface features. That was a much larger sampling: 117,000 users.

Still, I can understand why Twitter have decided to go with infinite scrolling. If I’m in the minority in thinking it’s horrible, that’s my problem. I can’t even claim that it’s an accessibility problem: it requires more manual dexterity to use the scrollbar than to use other methods of scrolling.

Twitter could add a user setting to switch off infinite scrolling—perhaps replacing it with the old style “more” button, which I liked—but that’s a cop-out. Whenever something gets shunted off into a preference, it’s generally a sign of indecision in the design. The Twitter redesign isn’t indecisive: it has a very clear and consistent visual and interactive design vocabulary. It just happens that one aspect of the UI vocabulary doesn’t mesh well with my own usage pattern.

So, in this case, the solution may well be for me to change the way I use the site. It still irks me, though. I’m generally against any interactions that happen without an explicit request from the user, such as revealing data and functionality on hover, for example. Twitter avoids that particular anti-pattern but with infinite scrolling, the act of moving down the page is interpreted as a request to load more data. I would much prefer to request that data explicitly with a button or link. Of course, that requires that the user do more, so it could be argued that infinite scrolling actually reduces the number of interactions that the user is required to do …assuming that the inferred interaction is in fact the desired interaction. That’s a big assumption.

On the face of it, it would seem that Twitter are being somewhat dismissive of the scrollbar as a UI element. But that’s not true. While they are reducing the usefulness of browser-native scrollbars by using infinite scrolling, they are, at the same time, replicating the functionality of scrollbars but non-natively. If you reveal a side panel—by clicking on someone’s Twitter username, for example—and if the content doesn’t fit within the viewport, then a non-native scrollbar is generated.

scrollbars

As I said, the new redesign is wonderful. I’m just nit-picking …but it’s a big nit.

The Framework Age

Liz Danzico is talking at An Event Apart San Francisco about frameworks. Not CSS frameworks, not JavaScript frameworks, not Rails, not Django, but websites as frameworks. These days we’re designing frameworks for user interaction rather than static artefacts.

Liz tells a story about Miles Davis who showed up at the studio with six slips of paper listing the six musicians he wanted to play with on his record. Over the course of one day, these people who had never played this music together recorded a whole album. Davis wanted to capture something called creative instability. Kind of Blue came out of this framework that he created.

Liz wants to talk about frameworks that are uninscribed and detectable cues that loosely govern a set of actions. These are interaction frameworks, frameworks that shape how people behave.

Back to music. Classical music uses classical notation. If you can’t read notation, you can’t make sense of it so it’s kind of elitist. It also provides rules like tempo and key. If you step outside these boundaries, you are deviating from the notation. Also, every note is accounted for in the notation. You can’t improvise it. Jazz notation is different. It provides chord progressions. It’s up to the musician to improvise around this framework. Modal jazz is even more abstract. That’s what Miles Davis invented that day in the studio. Kind of Blue was created out of just a scale.

On the web, we’re making the same transition from classical to jazz. We’re improvising. We’ve moved from a hard-coded system of building pages to an open system of creating participatory environments.

But this kind of tension is nothing new. It’s been going on for years. There’s been a long-running tension between orality and literacy. The printing press destroyed a lot of oral tradition but we still use word of mouth to pass on urban legends and recipes. Liz mentions Alex Wright’s observation in Glut that we are seeing a resurgence in this kind of oral tradition online. Even though we’re writing in blogs and mailing lists, we’re not so much publishing as talking.

There’s evidence of improv online. Exquisite simplicity was how pianist Bill Evans described Miles Davis’s framework of six slips of paper.

Quoting from The Paradox of Choice, Liz shows how the default settings can make a big difference (in the number of organ donations, for example, which could be opt-in or opt-out). Geni has some smart default settings. Same with Tripit. All you need to do is forward an email and it will take care of the rest. Focus on creating smart defaults.

In improv, you need to involve the audience. It’s important to adapt to what your audience is doing. Here’s an example from architecture: there was a fountain that was built in Washington Square Park in New York but before they got ‘round to turning it on, people started using it as a seating area. When the city tried to turn on the fountain, people revolted. The fountain is dry to this day and is used for public theatre.

Referring to the redesign of the Wordpress admin, Liz points out that it’s really important to involve users in the design process. There’s a difference between asking your audience what they think of a system compared to looking at how they are actually using that system.

Listen and watch. That’s another lesson we can take from music and apply to the web. When you’re playing with other people, not only do you have to listen to what the other people are doing, you have to watch them too. It’s the same with architecture. Desire paths are created by people actually using a space. They show clearly where paths should be built. Eyetracking can reveal the desire paths of users interacting with an application. There are other tools like User Voice which can involve the audience. Observe. Listen. Pay attention.

A common technique in jazz is call and response, where musicians play off one another. You see this online in reviews where the reviews start reacting to each other rather than the original item being reviewed. Allow users to build on one another.

User-centred design and participatory design are great ways of involving the users in the design process but that’s still different to actual use. It’s time for a new way of working: designing for improvisation (but remember that no one single process will ever be successful). Our design process should reflect the trend towards user participation that we’re seeing on the web. People’s tolerance for improvisation is increasing and our role as framework providers should reflect that.

Spaces

It seems that small interface changes are rolled out to Twitter on a fairly regular basis. This morning, for example, I was greeted with a new “Everyone” tab. I was also disappointed to discover that an interface improvement that was introduced a few weeks ago has now been removed. As interaction tweaks go, this was a very small thing but it’s something I appreciated very much. Let me explain…

When I’m reading a long-ish page on the Web, rather than move my cursor over to the scrollbar, position it just so and click to scroll down to see the next screenful of content, I’ll just tap the spacebar. In just about every browser I know, this will scroll the content by one screenful. This flow is interrupted if a website “helpfully” puts the focus into a form element when the page loads. This isn’t an issue on, say, Google because Google doesn’t have more than one screenful of content. But it is an issue on Twitter. When Twitter loads, the What are you doing? input box is automatically given focus. If I want to scroll down below the fold, I must either use the scrollbar or click out of the form element and then use the spacebar. I can understand the rationale behind this. Chances are most people want to get into that form element and start typing …at least on the front page.

The interface improvement that Twitter introduced a while back was to take that automatic focus away if I was on any page other than the first. In other words, if I was clicking back through older pages to catch up on what my friends have been doing, the focus was no longer automatically given to the form element. Brilliant! This awareness of context reminds me of what Eric wrote when they were adding print stylesheets to A List Apart:

These print styles are only used on articles, which are the pages that are most likely to be printed.

Perhaps through oversight or maybe through deliberate choice, Twitter now places the focus in that form element on every page. What a pain! And what a shame that a great example of context-sensitive interaction has been removed.
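For what it's worth, the behaviour that went missing doesn't amount to much code. Here's a guess at its shape. The page number, the field argument and both function names are invented for illustration, and none of this reflects how Twitter actually built it.

```javascript
// Only the first page of the timeline should grab focus.
function shouldAutoFocus(pageNumber) {
  return pageNumber === 1;
}

// Focus the status field when appropriate; otherwise leave focus alone
// so that the spacebar keeps scrolling the page.
function focusStatusField(field, pageNumber) {
  if (field && shouldAutoFocus(pageNumber)) {
    field.focus();
    return true;
  }
  return false;
}
```

In a browser, the field would be whatever element holds the status input, and the page number would have to come from the URL or the markup.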

I hereby invoke my bitching ‘n’ moaning mojo: c’mon Twitter, do the right thing.

While I’m at it…

Oi! Flickr! What’s up with the automatic focus in the search form on search results pages? Explain that to me. ‘Cause from where I’m sitting, it’s just downright annoying.

Outgoing

As a web developer, I get annoyed by interaction design implementations all the time: Why is that a link instead of a form button?, Why doesn’t that scale when I bump up the font size?, Why am I being asked to enter this unnecessary information?… Usually I can brush off these annoyances and continue my journey along the threads of the World Wide Web but there’s one “feature” that has irked me to the point of distraction and it’s all the more irritating for being on a site I use habitually: Upcoming.

As an Upcoming user, I have a default location. In my case it’s Brighton. This location is important. My location determines what content gets served up to me on the front page of the site—a useful way of discovering local events of interest.

The site also has a search feature. The search form has two components: what I’m searching for and where I’m searching for it. The “where” field defaults to my location, which is a handy little touch. If I want to search for something outside my current location—say the Future of Web Design conference in London this April—I can enter “Future of Web Design” in the “what” field and delete “Brighton” from the “where” field, replacing it with “London”. That works: I have now narrowed down my search to the location “London.”

Here’s the problem: if I now return to the front page I will find that my location is London. That’s right: simply by searching in a place, the system assumes that I now want that to be my location. You know what they say about assumptions, right? In this case, not only has it made an ass out of me, it has, over time, instilled a fear of searching.

I’ll be in San Francisco at the end of this month so I’d like to see what’s going on while I’m there. But once I’ve finished my searching I must remember to reset my location back to Brighton. Knowing this makes me hesitant to use the search form. No doubt the justification for this unexpected behaviour in the search is to second-guess what people really want: do as I want, not as I say. But when I search, I really just want to search. I suspect the same is true of most people.

Normally I wouldn’t rant about an obviously-flawed feature but in this case it’s a feature that can be easily fixed by simply being removed. Here is the current flow:

  1. The user enters a search term in the “what” field, a location in the “where” field and submits the search form.
  2. The system returns a list of search results for the specified term in the specified place.
  3. The system changes the user’s location to the specified place.

That third step is completely unnecessary. Its omission would not harm the search functionality one whit and it would make the search interface more truthful and less duplicitous.
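To make the point concrete, here's a sketch of a search that only searches. The data shapes, the function name and the sample events are all hypothetical; this is not Upcoming's code. The design choice is simply that the search function is never handed the user's saved location, so it has no way of changing it.

```javascript
// Steps 1 and 2 only: take a "what" and a "where", return matching events.
function search(events, what, where) {
  const term = what.toLowerCase();
  const place = where.toLowerCase();
  return events.filter(function (event) {
    return event.title.toLowerCase().includes(term) &&
      event.place.toLowerCase() === place;
  });
}

// Made-up sample data for the sketch.
const user = { location: 'Brighton' };
const events = [
  { title: 'Future of Web Design', place: 'London' },
  { title: 'Geek dinner', place: 'Brighton' }
];

const results = search(events, 'Future of Web Design', 'London');
// results holds the London event; user.location has not been touched.
```

Changing the default location would then be a separate, explicit action rather than a side effect of looking something up.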

I’ve already mentioned this on the Upcoming suggestion board. If you can think of a good reason why the current behaviour should stay, please add your justification there. If, like me, you’d like to see a search feature that actually just searches, please let your voice be heard there too.

Please Leonard, Neil, I kvetch because I care. I use Upcoming all the time. It would be a butt-kicking service if it weren’t for this one glaring flaw… even without a liquid layout.

Update: Fixed!

The tyranny of mouseover

If I click on a link, I am initiating an action. If I fill in a form and press a submit button, I am initiating an action. But if I move my mouse over a page element, I am not initiating an action. Chances are I’m on my way to initiating an action (like clicking a link or pressing a button) but if I brush past a link on the way, that does not mean that I want something to happen in response.

Most browsers display the value of a title attribute as a tool-tip after a suitable pause. Generally this works pretty well as long as the tool-tip is relatively small and self-contained. Ever come across an instance of a title attribute with a large amount of text? It just feels wrong. There are economies of scale when it comes to displaying information triggered by a mouseover.

All of this is by way of introduction to the topic of those bloody annoying Snap previews that are quite literally popping up all over the place.

I’m not alone in my annoyance. Lorelle VanFossen has put together an excellent list of the problems caused by these rude and intrusive interlopers. As well as listing the accessibility issues for low-vision and motor-impaired users, she makes the very valid point that these pop-ups actively destroy the act of reading:

There’s a small author-part of me that hopes what I write resembles some action-packed-page-turning-thriller and that people are glued to their screens eagerly embracing every word I write. I’d hate to have that experience be interrupted by an annoying pop-up window of any kind. Destroys the interaction of the reader with the written word, doesn’t it?

The way that the developers at Snap view web pages reminds me of the Far Side cartoon:

Blah blah LINK blah blah blah blah blah blah blah blah LINK blah blah blah blah blah…

Lorelle’s frustration is particularly acute because the Snap previews showed up on her Wordpress.com blog because Matt thought it would be cool to roll out this “feature” to 10% of Wordpress.com users.

Luckily, Lorelle and other hijacked blogs can turn the feature off. As pointed out by John Gruber, Jason Kottke and Michael Heilemann, the rest of us can also deactivate these annoying things. I should also point out that you can deactivate them directly from a preview by clicking on the “options” link in the pop-up and setting either a local or a global cookie to switch off the previews.

But this is like opt-out spam. I shouldn’t be confronted by these intrusive and annoying pop-ups to begin with. Offering them as a feature to users who want them strikes me as a perfectly reasonable implementation. This is the perfect example of something that should have been implemented like a Greasemonkey script: give users the choice and the power to activate this flashy feature. But don’t foist it on us and then claim it’s our responsibility to disable it.

If you haven’t seen the Snap previews in action, you can find them on TechCrunch and Vitamin, to give just two examples. Their presence on TechCrunch isn’t really surprising given that the site is devoted to pointing out all that is flashy and pointless on the web. But the gang over at Vitamin really ought to know better.