Tags: language

311

sparkline

Thursday, March 23rd, 2023

Steam

Picture someone tediously going through a spreadsheet that someone else has filled in by hand and finding yet another error.

“I wish to God these calculations had been executed by steam!” they cry.

The year was 1821 and technically the spreadsheet was a book of logarithmic tables. The frustrated cry came from Charles Babbage, who channeled his frustration into a scheme to create the world’s first computer.

His difference engine didn’t work out. Neither did his analytical engine. He’d spend his later years taking his frustrations out on street musicians, which—as a former busker myself—earns him a hairy eyeball from me.

But we’ve all been there, right? Some tedious task that feels soul-destroying in its monotony. Surely this is exactly what machines should be doing?

I have a hunch that this is where machine learning and large language models might turn out to be most useful. Not in creating breathtaking works of creativity, but in menial tasks that nobody enjoys.

Someone was telling me earlier today about how they took a bunch of haphazard notes in a client meeting. When the meeting was done, they needed to organise those notes into a coherent summary. Boring! But ChatGPT handled it just fine.

I don’t think that use-case is going to appear on the cover of Wired magazine anytime soon but it might be a truer glimpse of the future than any of the breathless claims being eagerly bandied about in Silicon Valley.

You know the way we no longer remember phone numbers, because, well, why would we now that we have machines to remember them for us? I’d be quite happy if machines did that for the annoying little repetitive tasks that nobody enjoys.

I’ll give you an example based on my own experience.

Regular expressions are my kryptonite. I’m rubbish at them. Any time I have to figure one out, the knowledge seeps out of my brain before long. I think that’s because I kind of resent having to internalise that knowledge. It doesn’t feel like something a human should have to know. “I wish to God these regular expressions had been calculated by steam!”

Now I can get a chatbot with a large language model to write the regular expression for me. I still need to describe what I want, so I need to write the instructions clearly. But all the gobbledygook that I’m writing for a machine now gets written by a machine. That seems fair.

Mind you, I wouldn’t blindly trust the output. I’d take that regular expression and run it through a chatbot, maybe a different chatbot running on a different large language model. “Explain what this regular expression does,” would be my prompt. If my input into the first chatbot matches the output of the second, I’d have some confidence in using the regular expression.

A friend of mine told me about using a large language model to help write SQL statements. He described his database structure to the chatbot, and then described what he wanted to select.

Again, I wouldn’t use that output without checking it first. But again, I might use another chatbot to do that checking. “Explain what this SQL statement does.”

Playing chatbots off against each other like this is kinda how machine learning works under the hood: generative adverserial networks.

Of course, the task of having to validate the output of a chatbot by checking it with another chatbot could get quite tedious. “I wish to God these large language model outputs had been validated by steam!”

Sounds like a job for machines.

Wednesday, March 22nd, 2023

Disclosure

You know how when you’re on hold to any customer service line you hear a message that thanks you for calling and claims your call is important to them. The message always includes a disclaimer about calls possibly being recorded “for training purposes.”

Nobody expects that any training is ever actually going to happen—surely we would see some improvement if that kind of iterative feedback loop were actually in place. But we most certainly want to know that a call might be recorded. Recording a call without disclosure would be unethical and illegal.

Consider chatbots.

If you’re having a text-based (or maybe even voice-based) interaction with a customer service representative that doesn’t disclose its output is the result of large language models, that too would be unethical. But, at the present moment in time, it would be perfectly legal.

That needs to change.

I suspect the necessary legislation will pass in Europe first. We’ll see if the USA follows.

In a way, this goes back to my obsession with seamful design. With something as inherently varied as the output of large language models, it’s vital that people have some way of evaluating what they’re told. I believe we should be able to see as much of the plumbing as possible.

The bare minimum amount of transparency is revealing that a machine is in the loop.

This shouldn’t be a controversial take. But I guarantee we’ll see resistance from tech companies trying to sell their “AI” tools as seamless, indistinguishable drop-in replacements for human workers.

Monday, March 20th, 2023

The AI hype bubble is the new crypto hype bubble

A handy round-up of recent wrtings on artificial insemination.

Sunday, March 19th, 2023

ongoing by Tim Bray · The LLM Problem

It doesn’t bother me much that bleeding-edge ML technology sometimes gets things wrong. It bothers me a lot when it gives no warnings, cites no sources, and provides no confidence interval.

Yes! Like I said:

Expose the wires. Show the workings-out.

Wednesday, March 15th, 2023

Stochastic Parrots Day Tickets, Fri, Mar 17, 2023 at 8:00 AM | Eventbrite

This free event is running online from 3pm to 7pm UK time this Friday. The line-up features Emily Bender, Safiya Noble, Timnit Gebru and more.

Since the publication of On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?🦜 two years ago, many of the harms the paper has warned about and more, have unfortunately occurred. From exploited workers filtering hateful content, to an engineer claiming that chatbots are sentient, the harms are only accelerating.

Join the co-authors of the paper and various guests to reflect on what has happened in the last two years, what the large language model landscape currently look like, and where we are headed vs where we should be headed.

Monday, March 6th, 2023

Like

We use metaphors all the time. To quote George Lakoff, we live by them.

We use analogies some of the time. They’re particularly useful when we’re wrapping our heads around something new. By comparing something novel to something familiar, we can make a shortcut to comprehension, or at least, categorisation.

But we need a certain amount of vigilance when it comes to analogies. Just because something is like something else doesn’t mean it’s the same.

With that in mind, here are some ways that people are describing generative machine learning tools. Large language models are like…

Monday, February 13th, 2023

You can call me AI

I’ve mentioned before that I’m not a fan of initialisms and acronyms. They can be exclusionary.

It bothers me doubly when everyone is talking about AI.

First of all, the term is so vague as to be meaningless. Sometimes—though rarely—AI refers to general artificial intelligence. Sometimes AI refers to machine learning. Sometimes AI refers to large language models. Sometimes AI refers to a series of if/else statements. That’s quite a spectrum of meaning.

Secondly, there’s the assumption that everyone understands the abbreviation. I guess that’s generally a safe assumption, but sometimes AI could refer to something other than artificial intelligence.

In countries with plenty of pastoral agriculture, if someone works in AI, it usually means they’re going from farm to farm either extracting or injecting animal semen. AI stands for artificial insemination.

I think that abbreviation might work better for the kind of things currently described as using AI.

We were discussing this hot topic at work recently. Is AI coming for our jobs? The consensus was maybe, but only the parts of our jobs that we’re more than happy to have automated. Like summarising some some findings. Or perhaps as a kind of lorem ipsum generator. Or for just getting the ball rolling with a design direction. As Terence puts it:

Midjourney is great for a first draft. If, like me, you struggle to give shape to your ideas then it is nothing short of magic. It gets you through the first 90% of the hard work. It’s then up to you to refine things.

That’s pretty much the conclusion we came to in our discussion at Clearleft. There’s no way that we’d use this technology to generate outputs for clients, but we certainly might use it to generate inputs. It’s like how we’d do a quick round of sketching to get a bunch of different ideas out into the open. Terence is spot on when he says:

Midjourney lets me quickly be wrong in an interesting direction.

To put it another way, using a large language model could be a way of artificially injecting some seeds of ideas. Artificial insemination.

So now when I hear people talk about using AI to create images or articles, I don’t get frustrated. Instead I think, “Using artificial insemination to create images or articles? Yes, that sounds about right.”

Friday, February 10th, 2023

ChatGPT Is a Blurry JPEG of the Web | The New Yorker

A very astute framing by Ted Chiang—large language models as a form of lossy compression for text.

When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.

A lot of uses have been proposed for large language models. Thinking about them as blurry JPEGs offers a way to evaluate what they might or might not be well suited for.

Tuesday, January 24th, 2023

Efficiency over performance

I quite like this change of terminology when it comes to making fast websites. After all, performance can sound like a process of addition, whereas efficiency can be a process of subtraction.

The term ‘performance’ brings to mind exotic super-cars suitable only for impractical demonstrations (or ‘performances’). ‘Efficiency’ brings to mind an electric car (or even better, a bicycle), making effective use of limited resources.

Tuesday, January 17th, 2023

Pattern Wise, System Foolish

A library of UX components is one common part of a design system, but the system itself is something bigger. A good system is also a shared set of strategies for solving visual and interactive communication challenges, a playbook rather than a script.

I like this way of putting it:

The problem is that treating a design system as a pantry full of widgets is, in and of itself, a failure of both craft and imagination. Think of it like a language: if a writer’s only engagement with it is grabbing words from the dictionary and heaping them together until “message” is achieved, things are going to suck. Language is more than a bag of words.

Monday, December 19th, 2022

Empty Pointers and Constellations of AI

AI becomes a stand-in term for whatever technologies and techniques are new, shiny, and just beyond the grasp of our understanding. We use it to gesture at a future we cannot fully comprehend or currently realise. As soon as we do, it will no longer be “AI.”

Tuesday, November 15th, 2022

CSS Timeline

Here’s a remarkably in-depth timeline of the web’s finest programming language, from before it existed to today’s thriving ecosystem. And the timeline is repsonsive too—lovely!

Saturday, October 15th, 2022

Filtered for the miracle of writing (Interconnected)

You don’t need to write for anyone else. You don’t need to share, or even keep it. You just need the act of it. Writing is a particle collider for reality and the imagination. And new discoveries are the result.

(That’s why I write here, of course. It’s how I think.)

It me.

Monday, September 26th, 2022

Data Design Language

I like this approach to offering a design system. It seems less prescriptive than many:

Designed not as a rule set, but rather a toolbox, the Data Design Language includes a chart library, design guidelines, colour and typographic style specifications with usability guidance for internationalization (i18n) and accessibility (a11y), all reflecting our data design principles.

Thursday, September 8th, 2022

What it’s like working with an editor

This piece by Giles is a spot-on description of what I do in my role as content buddy at Clearleft. Especially this bit:

Your editor will explain why things need changing

As a writer, it’s really helpful to understand the why of each edit. It’s easier to re-write if you know precisely what the problem is. And often, it’s less bruising to the ego. It’s not that you’re a bad writer, but just that one particular thing could be expressed more simply, or more clearly, than your first effort.

Tuesday, August 23rd, 2022

rottytooth/Olympus: The language where computation happens through the will of the gods

A new programming language where you pray to Greek gods.

An invocation has three parts: the god’s name and adoration (praising of that god), supplication to show the humbleness of the asker, followed by a request to add one or several of what we ordinarily call “commands” to the program.

Here’s the source code for “99 bottles of beer” in Olympus and here it is transpiled into JavaScript.

Tuesday, August 16th, 2022

Wednesday, June 1st, 2022

What the Vai Script Reveals About the Evolution of Writing - SAPIENS

How a writing system went from being a dream (literally) to a reality, codified in unicode.

Friday, May 6th, 2022

wrong side of write

An opinionated blog about writing. I’ve subscribed in my feed reader.

Monday, April 25th, 2022

UI Pattern: Natural Language Form

I only just found this article about those “mad libs” style forms that I started with Huffduffer.