Journal tags: cookies

2

sparkline

Third party

The web turned 30 this year. When I was back at CERN to mark this anniversary, there was a lot of introspection and questioning the direction that the web has taken. Everyone I know that uses the web is in agreement that tracking and surveillance are out of control. It seems only right to question whether the web has lost its way.

But here’s the thing: the technologies that enable tracking and surveillance didn’t exist in the early years of the web—JavaScript and cookies.

Without cookies, the web was stateless. This was by design. Now, I totally understand why cookies—or something like cookies—were needed. Without some way of keeping track of state, there’s no good way for a website to “remember” what’s in your shopping cart, or whether you’ve authenticated yourself.

But why would cookies ever need to work across domains? Authentication, shopping carts and all that good stuff can happen on the same domain. Third-party cookies, on the other hand, seem custom made for tracking and frankly, not much else.

Browsers allow you to disable third-party cookies, though it’s not yet the default. If enough people do it—and complain about the sites that stop working when third-party cookies are disabled—then maybe it can become the default.

Firefox is taking steps in this direction, automatically disabling some third-party cookies—the ones that known trackers. Safari is also taking steps to prevent cross-site tracking. It’s not too late to change the tide of third-party cookies.

Then there’s third-party JavaScript.

In retrospect, it seems unbelievable that third-party JavaScript is even possible. I mean, putting arbitrary code—that can then inject even more arbitrary code—onto your website? That seems like a security nightmare!

I imagine if JavaScript were being specced today, it would almost certainly be restricted to the same origin by default. But I guess the precedent had been set with images and style sheets: they could be embedded regardless of whether their domain names matched yours. Still, this is executable code we’re talking about here: that’s quite a footgun that the web has given site owners. And boy, oh boy, has it been used by the worst people to do the most damage.

Again, as with cookies, if we were to imagine what the web would be like if JavaScript was restricted by a same-domain policy, there are certainly things that would be trickier to do.

  • Embedding video, audio, and maps would get a lot finickier.
  • Analytics would need to be self-hosted. I don’t think that would bother any site owners. An analytics platform like Google Analytics that tracks people across domains is doing it for its own benefit rather than that of site owners.
  • Advertising wouldn’t be creepy and annoying. Instead of what’s so euphemistically called “personalisation”, advertisers would have to rely on serving relevant ads based on the content of the site rather than an invasive psychological profile of the user. (I honestly think that advertisers would benefit from this kind of targetting.)

It’s harder to imagine putting the genie back in the bottle when it comes to third-party JavaScript than it is with third-party cookies. All the same, I wish that browsers made it easier to experiment with it. Just as I can choose to accept all cookies, reject all cookies, or only accept same-origin cookies, I wish I could accept all JavaScript, reject all JavaScript, or only accept same-origin JavaScript.

As it is, browsers are making it harder and harder to exercise any control over JavaScript at all. So we reach for third-party tools. We don’t call them JavaScript managers though. We call them ad blockers. But honestly, most of the ad-blocker users I know—myself included—are not bothered by the advertising; we’re bothered by the tracking. We should really call them surveillance blockers.

If third-party JavaScript weren’t the norm, not only would it make the web more secure, it would make it way more performant. Read the chapter on third parties in this year’s newly-released Web Almanac. The figures are staggering.

93% of pages include at least one third-party resource, 76% of pages issue a request to an analytics domain, the median page requests content from at least 9 unique third-party domains that represent 35% of their total network activity, and the most active 10% of pages issue a whopping 175 third-party requests or more.

I don’t think all the web’s performance ills are due to third-party scripts; developers are doing a bang-up job of making their sites big and bloated with their own self-hosted frameworks and code. But as long as third-party JavaScript is allowed onto a site, there’s a limit to how much good developers can do to improve the performance of their sites.

I go to performance-related conferences and you know who I’ve never seen at those events? The people who write the JavaScript for third-party tracking scripts. Those developers are wielding an outsized influence on the health of the web.

I’m very happy to see the work being done by Mozilla and Apple to normalise the idea of rejecting third-party cookies. I’d love to see the rejection of third-party JavaScript normalised in the same way. I know that it would make my life as a developer harder. But that’s of lesser importance. It would be better for the web.

GDPR and Google Analytics

Enforcement of the European Union’s General Data Protection Regulation is coming very, very soon. Look busy. This regulation is not limited to companies based in the EU—it applies to any service anywhere in the world that can be used by citizens of the EU.

It’s less about data protection and more like a user’s bill of rights. That’s good. Cennydd has written a techie’s rough guide to GDPR.

The Open Data Institute’s Jeni Tennison wrote down her thoughts on how it could change data portability in particular. While she welcomes GDPR, she has some misgivings.

Blaine—who really needs to get a blog—shared his concerns in the form of the online equivalent of interpretive dance …a twitter thread (it’s called a thread because it inevitably gets all tangled, and it’s easy to break.)

The interesting thing about the so-called “cookie law” is that it makes no mention of cookies whatsoever. It doesn’t list any specific technology. Instead it states that any means of tracking or identifying users across websites requires disclosure. So if you’re setting a cookie just to manage state—so that users can log in, or keep items in a shopping basket—the legislation doesn’t apply. But as soon as your site allows a third-party to set a cookie, it’s banner time.

Google Analytics is a classic example of a third-party service that uses cookies to track people across domains. That’s pretty much why it exists. We, as site owners, get to use this incredibly powerful tool, and all we have to do in return is add one little snippet of JavaScript to our pages. In doing so, we’re allowing a third party to read or write a cookie from their domain.

Before Google Analytics, Google—the search engine business—was able to identify and track what users were searching for, and which search results they clicked on. But as soon as the user left google.com, the trail went cold. By creating an enormously useful analytics product that only required site owners to add a single line of JavaScript, Google—the online advertising business—gained the ability to keep track of users across most of the web, whether they were on a site owned by Google or not.

Under the old “cookie law”, using a third-party cookie-setting service like that meant you had to inform any of your users who were citizens of the EU. With GDPR, that changes. Now you have to get consent. A dismissible little overlay isn’t going to cut it any more. Implied consent isn’t enough.

Now this situation raises an interesting question. Who’s responsible for getting consent? Is it the site owner or the third party whose script is the conduit for the tracking?

In the first scenario, you’d need to wait for an explicit agreement from a visitor to your site before triggering the Google Analytics functionality. Suddenly it’s not as simple as adding a single line of JavaScript to your site.

In the second scenario, you don’t do anything differently than before—you just add that single line of JavaScript. But now that script would need to launch the interface for getting consent before doing any tracking. Google Analytics would go from being something invisible to something that directly impacts the user experience of your site.

I’m just using Google Analytics as an example here because it’s so widespread. This also applies to third-party sharing buttons—Twitter, Facebook, etc.—and of course, advertising.

In the case of advertising, it gets even thornier because quite often, the site owner has no idea which third party is about to do the tracking. Many, many sites use intermediary services (y’know, ‘cause bloated ad scripts aren’t slowing down sites enough so let’s throw some just-in-time bidding into the mix too). You could get consent for the intermediary service, but not for the final advert—neither you nor your site’s user would have any idea what they were consenting to.

Interesting times. One way or another, a massive amount of the web—every website using Google Analytics, embedded YouTube videos, Facebook comments, embedded tweets, or third-party advertisements—will be liable under GDPR.

It’s almost as if the ubiquitous surveillance of people’s every move on the web wasn’t a very good idea in the first place.