Journal tags: stats




I’ve been reading the excellent Design For Safety by Eva PenzeyMoog. There was a line that really stood out to me:

The idea that it’s alright to do whatever unethical thing is currently the industry norm is widespread in tech, and dangerous.

It stood out to me because I had been thinking about certain practices that are widespread, accepted, and yet strike me as deeply problematic. These practices involve tracking users.

The first problem is that even the terminology I’m using would be rejected. When you track users on your website, it’s called analytics. Or maybe it’s stats. If you track users on a large enough scale, I guess you get to just call it data.

Those words—“analytics”, “stats”, and “data”—are often used when the more accurate word would be “tracking.”

Or to put it another way; analytics, stats, data, numbers …these are all outputs. But what produced these outputs? Tracking.

Here’s a concrete example: email newsletters.

Do you have numbers on how many people opened a particular newsletter? Do you have numbers on how many people clicked a particular link?

You can call it data, or stats, or analytics, but make no mistake, that’s tracking.

Follow-on question: do you honestly think that everyone who opens a newsletter or clicks on a link in a newsletter has given their informed constent to be tracked by you?

You may well answer that this is a widespread—nay, universal—practice. Well yes, but a) that’s not what I asked, and b) see the above quote from Design For Safety.

You could quite correctly point out that this tracking is out of your hands. Your newsletter provider—probably Mailchimp—does this by default. So if the tracking is happening anyway, why not take a look at those numbers?

But that’s like saying it’s okay to eat battery-farmed chicken as long as you’re not breeding the chickens yourself.

When I try to argue against this kind of tracking from an ethical standpoint, I get a frosty reception. I might have better luck battling numbers with numbers. Increasing numbers of users are taking steps to prevent tracking. I had a plug-in installed in my mail client—Apple Mail—to prevent tracking. Now I don’t even need the plug-in. Apple have built it into the app. That should tell you something. It reminds me of when browsers had to introduce pop-up blocking.

If the outputs generated by tracking turn out to be inaccurate, then shouldn’t they lose their status?

But that line of reasoning shouldn’t even by necessary. We shouldn’t stop tracking users because it’s inaccurate. We should stop stop tracking users because it’s wrong.

Huffduff up and up

I had a nice Skype chat with Stan Alcorn yesterday all about Huffduffer, online sharing of audio, and all things podcasty and radioish. I’m sure I must have talked his ear off.

Stan was asking about numbers for Huffduffer’s user base and activity. I have to admit that I’ve got zero analytics running on the site. To be honest, I’m okay with that—one of the perks of having a personal project is that only metric that really matters is your own satisfaction. But I told Stan I’d run some quick database queries to get some feeling for Huffduffer’s usage patterns. Here’s what I found…

There are 5,862 people signed up to Huffduffer.

About 150,919 items have been huffduffed. But those aren’t unique files. The total number of distinct files that have been huffduffed is 5,972. That means that, on average, an audio file is huffduffed around 26 times. And the average user has huffduffed around 30 items. But neither of those distributions would be evenly distributed; they’d be power-law distributions rather than bell curves. For example, the most popular file was huffduffed 329 times.

Looking at the amount of items huffduffed each year, there’s a pleasing upward trend.

1st year 7,382
2nd year 19,080
3rd year 23,403
4th year 31,808
5th year 41,514

I was pleasantly surprised by this. I would’ve assumed that Huffduffer usage would be more of a steady-state affair, but it looks like the site is getting used a bit more with each passing year (the site is currently in its sixth(!) year).

Not that any of that really matters. I built Huffduffer to scratch my own itch. I huffduff an average of 411 audio files each year. So even if nobody else used Huffduffer, it would still provide plenty of value to me.

Like I was saying to Stan, the biggest strength and the biggest weakness of audio—as opposed to text or video—is that you can listen to it while your doing other things. For some people, car journeys are the perfect podcast time. For others, it might be doing the dishes or train journeys. For me, it’s the walk to and from work each day—it takes about 35 minutes each way, and I catch up on my Huffduffer feed during that time.

Jessica and I will often listen to some spoken word audio in the background during dinner—usually something quite radio-y like Radiolab, or NPR stories. Yesterday, we were catching up with Aleks’s BBC documentary series, The Digital Human. It was the episode about voice.

Imagine my surprise when I heard the voice of Stan Alcorn. What a co-inky-dink!