Tags: surveillance



GDPR and Google Analytics

Enforcement of the European Union’s General Data Protection Regulation is coming very, very soon. Look busy. This regulation is not limited to companies based in the EU—it applies to any service anywhere in the world that can be used by citizens of the EU.

It’s less about data protection and more like a user’s bill of rights. That’s good. Cennydd has written a techie’s rough guide to GDPR.

The Open Data Institute’s Jeni Tennison wrote down her thoughts on how it could change data portability in particular. While she welcomes GDPR, she has some misgivings.

Blaine—who really needs to get a blog—shared his concerns in the form of the online equivalent of interpretive dance …a twitter thread (it’s called a thread because it inevitably gets all tangled, and it’s easy to break.)

The interesting thing about the so-called “cookie law” is that it makes no mention of cookies whatsoever. It doesn’t list any specific technology. Instead it states that any means of tracking or identifying users across websites requires disclosure. So if you’re setting a cookie just to manage state—so that users can log in, or keep items in a shopping basket—the legislation doesn’t apply. But as soon as your site allows a third-party to set a cookie, it’s banner time.

Google Analytics is a classic example of a third-party service that uses cookies to track people across domains. That’s pretty much why it exists. We, as site owners, get to use this incredibly powerful tool, and all we have to do in return is add one little snippet of JavaScript to our pages. In doing so, we’re allowing a third party to read or write a cookie from their domain.

Before Google Analytics, Google—the search engine business—was able to identify and track what users were searching for, and which search results they clicked on. But as soon as the user left google.com, the trail went cold. By creating an enormously useful analytics product that only required site owners to add a single line of JavaScript, Google—the online advertising business—gained the ability to keep track of users across most of the web, whether they were on a site owned by Google or not.

Under the old “cookie law”, using a third-party cookie-setting service like that meant you had to inform any of your users who were citizens of the EU. With GDPR, that changes. Now you have to get consent. A dismissible little overlay isn’t going to cut it any more. Implied consent isn’t enough.

Now this situation raises an interesting question. Who’s responsible for getting consent? Is it the site owner or the third party whose script is the conduit for the tracking?

In the first scenario, you’d need to wait for an explicit agreement from a visitor to your site before triggering the Google Analytics functionality. Suddenly it’s not as simple as adding a single line of JavaScript to your site.

In the second scenario, you don’t do anything differently than before—you just add that single line of JavaScript. But now that script would need to launch the interface for getting consent before doing any tracking. Google Analytics would go from being something invisible to something that directly impacts the user experience of your site.

I’m just using Google Analytics as an example here because it’s so widespread. This also applies to third-party sharing buttons—Twitter, Facebook, etc.—and of course, advertising.

In the case of advertising, it gets even thornier because quite often, the site owner has no idea which third party is about to do the tracking. Many, many sites use intermediary services (y’know, ‘cause bloated ad scripts aren’t slowing down sites enough so let’s throw some just-in-time bidding into the mix too). You could get consent for the intermediary service, but not for the final advert—neither you nor your site’s user would have any idea what they were consenting to.

Interesting times. One way or another, a massive amount of the web—every website using Google Analytics, embedded YouTube videos, Facebook comments, embedded tweets, or third-party advertisements—will be liable under GDPR.

It’s almost as if the ubiquitous surveillance of people’s every move on the web wasn’t a very good idea in the first place.


I wrote about Google Analytics yesterday. As usual, I syndicated the post to Ev’s blog, and I got an interesting response over there. Kelly Burgett set me straight on some of the finer details of how goals work, and finished with this thought:

You mention “delivering a performant, accessible, responsive, scalable website isn’t enough” as if it should be, and I have to disagree. It’s not enough for a business to simply have a great website if you are unable to understand performance of channel marketing, track user demographics and behavior on-site, and optimize your site/brand based on that data. I’ve seen a lot of ugly sites who have done exceptionally well in terms of ROI, simply because they are getting the data they need from the site in order make better business decisions. If your site cannot do that (ie. through data collection, often third party scripts), then your beautifully-designed site can only take you so far.

That makes an excellent case for having analytics. But that’s not necessarily the same as having Google analytics, or even JavaScript-driven analytics at all.

By far the most useful information you get from analytics is around where people have come from, where did they go next, and what kind of device are they using. None of that information requires JavaScript. It’s all available from your server logs.

I don’t want to come across all old-man-yell-at-cloud here, but I’m trying to remember at what point self-hosted software for analysing your log traffic became not good enough.

Here’s the thing: logging on the server has no effect on the user experience. It’s basically free, in terms of performance. Logging via JavaScript, by its very nature, has some cost. Even if its negligible, that’s one more request, and that’s one more bit of processing for the CPU.

All of the data that you can only get via JavaScript (in-page actions, heat maps, etc.) are, in my experience, better handled by dedicated software. To me, that kind of more precise data feels different to analytics in the sense of funnels, conversions, goals and all that stuff.

So in order to get more fine-grained data to analyse, our analytics software has now doubled down on a technology—JavaScript—that has an impact on the end user, where previously the act of observation could be done at a distance.

There are also blind spots that come with JavaScript-based tracking. According to Google Analytics, 0% of your customers don’t have JavaScript. That’s not necessarily true, but there’s literally no way for Google Analytics—which relies on JavaScript—to even do its job in the absence of JavaScript. That can lead to a dangerous situation where you might be led to think that 100% of your potential customers are getting by, when actually a proportion might be struggling, but you’ll never find out about it.

Related: according to Google Analytics, 0% of your customers are using ad-blockers that block requests to Google’s servers. Again, that’s not necessarily a true fact.

So I completely agree than analytics are a good thing to have for your business. But it does not follow that Google Analytics is a good thing for your business. Other options are available.

I feel like the assumption that “analytics = Google Analytics” is like the slippery slope in reverse. If we’re all agreed that analytics are important, then aren’t we also all agreed that JavaScript-based tracking is important?

In a word, no.

This reminds me of the arguments made in favour of intrusive, bloated advertising scripts. All of the arguments focus on the need for advertising—to stay in business, to pay the writers—which are all great reasons for advertising, but have nothing to do with JavaScript, which is at the root of the problem. Everyone I know who uses an ad-blocker—including me—doesn’t use it to stop seeing adverts, but to stop the performance of the page being degraded (and to avoid being tracked across domains).

So let’s not confuse the means with the ends. If you need to have advertising, that doesn’t mean you need to have horribly bloated JavaScript-based advertising. If you need analytics, that doesn’t mean you need an analytics script on your front end.

Empire State

I’m in New York. Again. This time it’s for Google’s AMP Conf, where I’ll be giving ‘em a piece of my mind on a panel.

The conference starts tomorrow so I’ve had a day or two to acclimatise and explore. Seeing as Google are footing the bill for travel and accommodation, I’m staying at a rather nice hotel close to the conference venue in Tribeca. There’s live jazz in the lounge most evenings, a cinema downstairs, and should I request it, I can even have a goldfish in my room.

Today I realised that my hotel sits in the apex of a triangle of interesting buildings: carrier hotels.

32 Avenue Of The Americas.Telephone wires and radio unite to make neighbors of nations

Looming above my hotel is 32 Avenue of the Americas. On the outside the building looks like your classic Gozer the Gozerian style of New York building. Inside, the lobby features a mosaic on the ceiling, and another on the wall extolling the connective power of radio and telephone.

The same architects also designed 60 Hudson Street, which has a similar Art Deco feel to it. Inside, there’s a cavernous hallway running through the ground floor but I can’t show you a picture of it. A security guard told me I couldn’t take any photos inside …which is a little strange seeing as it’s splashed across the website of the building.

60 Hudson.HEADQUARTERS The Western Union Telegraph Co. and telegraph capitol of the world 1930-1973

I walked around the outside of 60 Hudson, taking more pictures. Another security guard asked me what I was doing. I told her I was interested in the history of the building, which is true; it was the headquarters of Western Union. For much of the twentieth century, it was a world hub of telegraphic communication, in much the same way that a beach hut in Porthcurno was the nexus of the nineteenth century.

For a 21st century hub, there’s the third and final corner of the triangle at 33 Thomas Street. It’s a breathtaking building. It looks like a spaceship from a Chris Foss painting. It was probably designed more like a spacecraft than a traditional building—it’s primary purpose was to withstand an atomic blast. Gone are niceties like windows. Instead there’s an impenetrable monolith that looks like something straight out of a dystopian sci-fi film.

33 Thomas Street.33 Thomas Street, New York

Brutalist on the outside, its interior is host to even more brutal acts of invasive surveillance. The Snowden papers revealed this AT&T building to be a centrepiece of the Titanpointe programme:

They called it Project X. It was an unusually audacious, highly sensitive assignment: to build a massive skyscraper, capable of withstanding an atomic blast, in the middle of New York City. It would have no windows, 29 floors with three basement levels, and enough food to last 1,500 people two weeks in the event of a catastrophe.

But the building’s primary purpose would not be to protect humans from toxic radiation amid nuclear war. Rather, the fortified skyscraper would safeguard powerful computers, cables, and switchboards. It would house one of the most important telecommunications hubs in the United States…

Looking at the building, it requires very little imagination to picture it as the lair of villainous activity. Laura Poitras’s short film Project X basically consists of a voiceover of someone reading an NSA manual, some ominous background music, and shots of 33 Thomas Street looming in its oh-so-loomy way.

A top-secret handbook takes viewers on an undercover journey to Titanpointe, the site of a hidden partnership. Narrated by Rami Malek and Michelle Williams, and based on classified NSA documents, Project X reveals the inner workings of a windowless skyscraper in downtown Manhattan.


On Jessica’s recommendation, I read a piece on the Guardian website called The eeriness of the English countryside:

Writers and artists have long been fascinated by the idea of an English eerie – ‘the skull beneath the skin of the countryside’. But for a new generation this has nothing to do with hokey supernaturalism – it’s a cultural and political response to contemporary crises and fears

I liked it a lot. One of the reasons I liked it was not just for the text of the writing, but the hypertext of the writing. Throughout the piece there are links off to other articles, books, and blogs. For me, this enriches the piece and it set me off down some rabbit holes of hyperlinks with fascinating follow-ups waiting at the other end.

Back in 2010, Scott Rosenberg wrote a series of three articles over the course of two months called In Defense of Hyperlinks:

  1. Nick Carr, hypertext and delinkification,
  2. Money changes everything, and
  3. In links we trust.

They’re all well worth reading. The whole thing was kicked off with a well-rounded debunking of Nicholas Carr’s claim that hyperlinks harm text. Instead, Rosenberg finds that hyperlinks within a text embiggen the writing …providing they’re done well:

I see links as primarily additive and creative. Even if it took me a little longer to read the text-with-links, even if I had to work a bit harder to get through it, I’d come out the other side with more meat and more juice.

Links, you see, do so much more than just whisk us from one Web page to another. They are not just textual tunnel-hops or narrative chutes-and-ladders. Links, properly used, don’t just pile one “And now this!” upon another. They tell us, “This relates to this, which relates to that.”

The difference between a piece of writing being part of the web and a piece of writing being merely on the web is something I talked about a few years back in a presentation called Paranormal Interactivity at ‘round about the 15 minute mark:

Imagine if you were to take away all the regular text and only left the hyperlinks on Wikipedia, you could still get the gist, right? Every single link there is like a wormhole to another part of this “choose your own adventure” game that we’re playing every day on the web. I love that. I love the way that Wikipedia uses links.

That ability of the humble hyperlink to join concepts together lies at the heart of Tim Berners Lee’s World Wide Web …and Ted Nelson’s Project Xanudu, and Douglas Engelbart’s Dynamic Knowledge Environments, and Vannevar Bush’s idea of the Memex. All of those previous visions of a hyperlinked world were—in many ways—superior to the web. But the web shipped. It shipped with brittle, one-way linking, but it shipped. And now today anyone can create a connection between two ideas by linking to resources that represent those ideas. All you need is an HTML document that contains some A elements with href attributes, and a URL to act as that document’s address.

Like the one you’re accessing now.

Not only can I link to that article on the Guardian’s website, I can also pair it up with other related links, like Warren Ellis’s talk from dConstruct 2014:

Inventing the next twenty years, strategic foresight, fictional futurism and English rural magic: Warren Ellis attempts to convince you that they are all pretty much the same thing, and why it was very important that some people used to stalk around village hedgerows at night wearing iron goggles.

There is definitely the same feeling of “the eeriness of the English countryside” in Warren’s talk. If you haven’t listened to it yet, set aside some time. It is enticing and disquieting in equal measure …like many of the works linked to from the piece on the Guardian.

There’s another link I’d like to make, and it happens to be to another dConstruct speaker.

From that Guardian piece:

Yet state surveillance is no longer testified to in the landscape by giant edifices. Instead it is mostly carried out in by software programs running on computers housed in ordinary-looking government buildings, its sources and effects – like all eerie phenomena – glimpsed but never confronted.

James Bridle has been confronting just that. His recent series The Nor took him on a tour of a parallel, obfuscated English countryside. He returned with three pieces of hypertext:

  1. All Cameras Are Police Cameras,
  2. Living in the Electromagnetic Spectrum, and
  3. Low Latency.

I love being able to do this. I love being able to add strands to this world-wide web of ours. Not only can I say “this idea reminds me of another idea”, but I can point to both ideas. It’s up to you whether you follow those links.


Ajax was a really big deal six, seven, eight years ago. My second book was all about Ajax. I spoke about Ajax at conferences and gave workshops all about using Ajax and progressive enhancement.

During those workshops, I would often point out that Ajax had the potential to be abused terribly. Until the advent of Ajax, it was very clear to a user when data was being submitted to a server: you’d have to click a link or submit a form. As soon as you introduce asynchronous communication, it’s possible for the server to get information from the client even without a full-page refresh.

Imagine, for example, that you’re typing a message into a textarea. You might begin by typing, “Why, you stuck up, half-witted, scruffy-looking nerf…” before calming down and thinking better of it. Before Ajax, there was no way that what you had typed could ever reach the server. But now, it’s entirely possible to send data via Ajax with every key press.

It was just a thought experiment. I wasn’t actually that worried that anyone would ever do something quite so creepy.

Then I came across this article by Jennifer Golbeck in Slate all about Facebook tracking what’s entered—but then erased—within its status update form:

Unfortunately, the code that powers Facebook still knows what you typed—even if you decide not to publish it. It turns out that the things you explicitly choose not to share aren’t entirely private.

Initially I thought there must have been some mistake. I erronously called out Jen Golbeck when I found the PDF of a paper called The Post that Wasn’t: Exploring Self-Censorship on Facebook. The methodology behind the sample group used for that paper was much more old-fashioned than using Ajax:

First, participants took part in a weeklong diary study during which they used SMS messaging to report all instances of unshared content on Facebook (i.e., content intentionally self-censored). Participants also filled out nightly surveys to further describe unshared content and any shared content they decided to post on Facebook. Next, qualified participants took part in in-lab interviews.

But the Slate article was referencing a different paper that does indeed use Ajax to track instances of deleted text:

This research was conducted at Facebook by Facebook researchers. We collected self-censorship data from a random sample of approximately 5 million English-speaking Facebook users who lived in the U.S. or U.K. over the course of 17 days (July 6-22, 2012).

So what I initially thought was a case of alarmism—conflating something as simple as simple as a client-side character count with actual server-side monitoring—turned out to be a pretty accurate reading of the situation. I originally intended to write a scoffing post about Slate’s linkbaiting alarmism (and call it “The shocking truth behind the latest Facebook revelation”), but it turns out that my scoffing was misplaced.

That said, the article has been updated to reflect that the Ajax requests are only sending information about deleted characters—not the actual content. Still, as we learned very clearly from the NSA revelations, there’s not much practical difference between logging data and logging metadata.

The nerds among us may start firing up our developer tools to keep track of unexpected Ajax requests to the server. But what about everyone else?

This isn’t the first time that the power of JavaScript has been abused. Every browser now ships with an option to block pop-up windows. That’s because the ability to spawn new windows was so horribly misused. Maybe we’re going to see similar preference options to avoid firing Ajax requests on keypress.

It would be depressingly reductionist to conclude that any technology that can be abused will be abused. But as long as there are web developers out there who are willing to spawn pop-up windows or force persistent cookies or use Ajax to track deleted content, the depressingly reductionist conclusion looks like self-fulfilling prophecy.