Journal tags: surveillance

11

sparkline

Prediction

Arthur C. Clarke once said:

Trying to predict the future is a discouraging and hazardous occupation becaue the profit invariably falls into two stools. If his predictions sounded at all reasonable, you can be quite sure that in 20 or most 50 years, the progress of science and technology has made him seem ridiculously conservative. On the other hand, if by some miracle a prophet could describe the future exactly as it was going to take place, his predictions would sound so absurd, so far-fetched, that everybody would laugh him to scorn.

But I couldn’t resist responding to a recent request for augery. Eric asked An Event Apart speakers for their predictions for the coming year. The responses have been gathered together and published, although it’s in the form of a PDF for some reason.

Here’s what I wrote:

This is probably more of a hope than a prediction, but 2021 could be the year that the ponzi scheme of online tracking and surveillance begins to crumble. People are beginning to realize that it’s far too intrusive, that it just doesn’t work most of the time, and that good ol’-fashioned contextual advertising would be better. Right now, it feels similar to the moment before the sub-prime mortgage bubble collapsed (a comparison made in Tim Hwang’s recent book, Subprime Attention Crisis). Back then people thought “Well, these big banks must know what they’re doing,” just as people have thought, “Well, Facebook and Google must know what they’re doing”…but that confidence is crumbling, exposing the shaky stack of cards that props up behavioral advertising. This doesn’t mean that online advertising is coming to an end—far from it. I think we might see a golden age of relevant, content-driven advertising. Laws like Europe’s GDPR will play a part. Apple’s recent changes to highlight privacy-violating apps will play a part. Most of all, I think that people will play a part. They will be increasingly aware that there’s nothing inevitable about tracking and surveillance and that the web works better when it respects people’s right to privacy. The sea change might not happen in 2021 but it feels like the water is beginning to swell.

Still, predicting the future is a mug’s game with as much scientific rigour as astrology, reading tea leaves, or haruspicy.

Much like behavioural advertising.

Get safe

The verbs of the web are GET and POST. In theory there’s also PUT, DELETE, and PATCH but in practice POST often does those jobs.

I’m always surprised when front-end developers don’t think about these verbs (or request methods, to use the technical term). Knowing when to use GET and when to use POST is crucial to having a solid foundation for whatever you’re building on the web.

Luckily it’s not hard to know when to use each one. If the user is requesting something, use GET. If the user is changing something, use POST.

That’s why links are GET requests by default. A link “gets” a resource and delivers it to the user.

<a href="/items/id">

Most forms use the POST method becuase they’re changing something—creating, editing, deleting, updating.

<form method="post" action="/items/id/edit">

But not all forms should use POST. A search form should use GET.

<form method="get" action="/search">
<input type="search" name="term">

When a user performs a search, they’re still requesting a resource (a page of search results). It’s just that they need to provide some specific details for the GET request. Those details get translated into a query string appended to the URL specified in the action attribute.

/search?term=value

I sometimes see the GET method used incorrectly:

  • “Log out” links that should be forms with a “log out” button—you can always style it to look like a link if you want.
  • “Unsubscribe” links in emails that immediately trigger the action of unsubscribing instead of going to a form where the POST method does the unsubscribing. I realise that this turns unsubscribing into a two-step process, which is a bit annoying from a usability point of view, but a destructive action should never be baked into a GET request.

When the it was first created, the World Wide Web was stateless by design. If you requested one web page, and then subsequently requested another web page, the server had no way of knowing that the same user was making both requests. After serving up a page in response to a GET request, the server promptly forgot all about it.

That’s how web browsing should still work. In fact, it’s one of the Web Platform Design Principles: It should be safe to visit a web page:

The Web is named for its hyperlinked structure. In order for the web to remain vibrant, users need to be able to expect that merely visiting any given link won’t have implications for the security of their computer, or for any essential aspects of their privacy.

The expectation of safe stateless browsing has been eroded over time. Every time you click on a search result in Google, or you tap on a recommended video in YouTube, or—heaven help us—you actually click on an advertisement, you just know that you’re adding to a dossier of your online profile. That’s not how the web is supposed to work.

Don’t get me wrong: building a profile of someone based on their actions isn’t inherently wrong. If a user taps on “like” or “favourite” or “bookmark”, they are actively telling the server to perform an update (and so those actions should be POST requests). But do you see the difference in where the power lies? With POST actions—fave, rate, save—the user is in charge. With GET requests, no one is supposed to be in charge—it’s meant to be a neutral transaction. Alas, the reality of today’s web is that many GET requests give more power to the dossier-building servers at the expense of the user’s agency.

The very first of the Web Platform Design Principles is Put user needs first :

If a trade-off needs to be made, always put user needs above all.

The current abuse of GET requests is damage that the web needs to route around.

Browsers are helping to a certain extent. Most browsers have the concept of private browsing, allowing you some level of statelessness, or at least time-limited statefulness. But it’s kind of messed up that private browsing is the exception, while surveillance is the default. It should be the other way around.

Firefox and Safari are taking steps to reduce tracking and fingerprinting. Rejecting third-party coookies by default is a good move. I’d love it if third-party JavaScript were also rejected by default:

In retrospect, it seems unbelievable that third-party JavaScript is even possible. I mean, putting arbitrary code—that can then inject even more arbitrary code—onto your website? That seems like a security nightmare!

I imagine if JavaScript were being specced today, it would almost certainly be restricted to the same origin by default.

Chrome has different priorities, which is understandable given that it comes from a company with a business model that is currently tied to tracking and surveillance (though it needn’t remain that way). With anti-trust proceedings rumbling in the background, there’s talk of breaking up Google to avoid monopolistic abuses of power. I honestly think it would be the best thing that could happen to Chrome if it were an independent browser that could fully focus on user needs without having to consider the surveillance needs of an advertising broker.

But we needn’t wait for the browsers to make the web a safer place for users.

Developers write the code that updates those dossiers. Developers add those oh-so-harmless-looking third-party scripts to page templates.

What if we refused?

Front-end developers in particular should be the last line of defence for users. The entire field of front-end devlopment is supposed to be predicated on the prioritisation of user needs.

And if the moral argument isn’t enough, perhaps the technical argument can get through. Tracking users based on their GET requests violates the very bedrock of the web’s architecture. Stop doing that.

Clean advertising

Imagine if you were told that fossil fuels were the only way of extracting energy. It would be an absurd claim. Not only are other energy sources available—solar, wind, geothermal, nuclear—fossil fuels aren’t even the most effecient source of energy. To say that you can’t have energy without burning fossil fuels would be pitifully incorrect.

And yet when it comes to online advertising, we seem to have meekly accepted that you can’t have effective advertising without invasive tracking. But nothing could be further from the truth. Invasive tracking is to online advertising as fossil fuels are to energy production—an outmoded inefficient means of getting substandard results.

Before the onslaught of third party cookies and scripts, online advertising was contextual. If I searched for property insurance, I was likely to see an advertisement for property insurance. If I was reading an article about pet food, I was likely to be served an advertisement for pet food.

Simply put, contextual advertising ensured that the advertising that accompanied content could be relevant and timely. There was no big mystery about it: advertisers just needed to know what the content was about and they could serve up the appropriate advertisement. Nice and straightforward.

Too straightforward.

What if, instead of matching the advertisement to the content, we could match the advertisement to the person? Regardless of what they were searching for or reading, they’d be served advertisements that were relevant to them not just in that moment, but relevant to their lifestyles, thoughts and beliefs? Of course that would require building up dossiers of information about each person so that their profiles could be targeted and constantly updated. That’s where cross-site tracking comes in, with third-party cookies and scripts.

This is behavioural advertising. It has all but elimated contextual advertising. It has become so pervasive that online advertising and behavioural advertising have become synonymous. Contextual advertising is seen as laughably primitive compared with the clairvoyant powers of behavioural advertising.

But there’s a problem with behavioural advertising. A big problem.

It doesn’t work.

First of all, it relies on mind-reading powers by the advertising brokers—Facebook, Google, and the other middlemen of ad tech. For all the apocryphal folk tales of spooky second-guessing in online advertising, it mostly remains rubbish.

Forget privacy: you’re terrible at targeting anyway:

None of this works. They are still trying to sell me car insurance for my subway ride.

Have you actually paid attention to what advertisements you’re served? Maciej did:

I saw a lot of ads for GEICO, a brand of car insurance that I already own.

I saw multiple ads for Red Lobster, a seafood restaurant chain in America. Red Lobster doesn’t have any branches in San Francisco, where I live.

Finally, I saw a ton of ads for Zipcar, which is a car sharing service. These really pissed me off, not because I have a problem with Zipcar, but because they showed me the algorithm wasn’t even trying. It’s one thing to get the targeting wrong, but the ad engine can’t even decide if I have a car or not! You just showed me five ads for car insurance.

And yet in the twisted logic of ad tech, all of this would be seen as evidence that they need to gather even more data with even more invasive tracking and surveillance.

It turns out that bizarre logic is at the very heart of behavioural advertising. I highly recommend reading the in-depth report from The Correspondent called The new dot com bubble is here: it’s called online advertising:

It’s about a market of a quarter of a trillion dollars governed by irrationality.

The benchmarks that advertising companies use – intended to measure the number of clicks, sales and downloads that occur after an ad is viewed – are fundamentally misleading. None of these benchmarks distinguish between the selection effect (clicks, purchases and downloads that are happening anyway) and the advertising effect (clicks, purchases and downloads that would not have happened without ads).

Suppose someone told you that they keep tigers out of their garden by turning on their kitchen light every evening. You might think their logic is flawed, but they’ve been turning on the kitchen light every evening for years and there hasn’t been a single tiger in the garden the whole time. That’s the logic used by ad tech companies to justify trackers.

Tracker-driven behavioural advertising is bad for users. The advertisements are irrelevant most of the time, and on the few occasions where the advertising hits the mark, it just feels creepy.

Tracker-driven behavioural advertising is bad for advertisers. They spend their hard-earned money on invasive ad tech that results in no more sales or brand recognition than if they had relied on good ol’ contextual advertising.

Tracker-driven behavioural advertising is very bad for the web. Megabytes of third-party JavaScript are injected at exactly the wrong moment to make for the worst possible performance. And if that doesn’t ruin the user experience enough, there are still invasive overlays and consent forms to click through (which, ironically, gets people mad at the legislation—like GDPR—instead of the underlying reason for these annoying overlays: unnecessary surveillance and tracking by the site you’re visiting).

Tracker-driven behavioural advertising is good for the middlemen doing the tracking. Facebook and Google are two of the biggest players here. But that doesn’t mean that their business models need to be permanently anchored to surveillance. The very monopolies that make them kings of behavioural advertising—the biggest social network and the biggest search engine—would also make them titans of contextual advertising. They could pivot from an invasive behavioural model of advertising to a privacy-respecting contextual advertising model.

The incumbents will almost certainly resist changing something so fundamental. It would be like expecting an energy company to change their focus from fossil fuels to renewables. It won’t happen quickly. But I think that it may eventually happen …if we demand it.

In the meantime, we can all play our part. Just as we can do our bit for the environment at an individual level by sorting our recycling and making green choices in our day to day lives, we can all do our bit for the web too.

The least we can do is block third-party cookies. Some browsers are now doing this by default. That’s good.

Blocking third-party JavaScript is a bit trickier. That requires a browser extension. Most of these extensions to block third-party tracking are called ad blockers. That’s a shame. The issue is not with advertising. The issue is with tracking.

Alas, because this software is labelled under ad blocking, it has led to the ludicrous situation of an ethical argument being made to allow surveillance and tracking! It goes like this: websites need advertising to survive; if you block the ads, then you are denying these sites revenue. That argument would make sense if we were talking about contextual advertising. But it makes no sense when it comes to behavioural advertising …unless you genuinely believe that online advertising has to be behavioural, which means that online advertising has to track you to be effective. Such a belief would be completely wrong. But that doesn’t stop it being widely held.

To argue that there is a moral argument against blocking trackers is ridiculous. If anything, there’s a moral argument to be made for installing anti-tracking software for yourself, your friends, and your family. Otherwise we are collectively giving up our privacy for a business model that doesn’t even work.

It’s a shame that advertisers will lose out if tracking-blocking software prevents their ads from loading. But that’s only going to happen in the case of behavioural advertising. Contextual advertising won’t be blocked. Contextual advertising is also more lightweight than behavioural advertising. Contextual advertising is far less creepy than behavioural advertising. And crucially, contextual advertising works.

That shouldn’t be a controversial claim: the idea that people would be interested in adverts that are related to the content they’re currently looking at. The greatest trick the ad tech industry has pulled is convincing the world that contextual relevance is somehow less effective than some secret algorithm fed with all our data that’s supposed to be able to practically read our minds and know us better than we know ourselves.

Y’know, if this mind-control ray really could give me timely relevant adverts, I might possibly consider paying the price with my privacy. But as it is, YouTube still hasn’t figured out that I’m not interested in Top Gear or football.

The next time someone is talking about the necessity of advertising on the web as a business model, ask for details. Do they mean contextual or behavioural advertising? They’ll probably laugh at you and say that behavioural advertising is the only thing that works. They’ll be wrong.

I know it’s hard to imagine a future without tracker-driven behavioural advertising. But there are no good business reasons for it to continue. It was once hard to imagine a future without oil or coal. But through collective action, legislation, and smart business decisions, we can make a cleaner future.

Third party

The web turned 30 this year. When I was back at CERN to mark this anniversary, there was a lot of introspection and questioning the direction that the web has taken. Everyone I know that uses the web is in agreement that tracking and surveillance are out of control. It seems only right to question whether the web has lost its way.

But here’s the thing: the technologies that enable tracking and surveillance didn’t exist in the early years of the web—JavaScript and cookies.

Without cookies, the web was stateless. This was by design. Now, I totally understand why cookies—or something like cookies—were needed. Without some way of keeping track of state, there’s no good way for a website to “remember” what’s in your shopping cart, or whether you’ve authenticated yourself.

But why would cookies ever need to work across domains? Authentication, shopping carts and all that good stuff can happen on the same domain. Third-party cookies, on the other hand, seem custom made for tracking and frankly, not much else.

Browsers allow you to disable third-party cookies, though it’s not yet the default. If enough people do it—and complain about the sites that stop working when third-party cookies are disabled—then maybe it can become the default.

Firefox is taking steps in this direction, automatically disabling some third-party cookies—the ones that known trackers. Safari is also taking steps to prevent cross-site tracking. It’s not too late to change the tide of third-party cookies.

Then there’s third-party JavaScript.

In retrospect, it seems unbelievable that third-party JavaScript is even possible. I mean, putting arbitrary code—that can then inject even more arbitrary code—onto your website? That seems like a security nightmare!

I imagine if JavaScript were being specced today, it would almost certainly be restricted to the same origin by default. But I guess the precedent had been set with images and style sheets: they could be embedded regardless of whether their domain names matched yours. Still, this is executable code we’re talking about here: that’s quite a footgun that the web has given site owners. And boy, oh boy, has it been used by the worst people to do the most damage.

Again, as with cookies, if we were to imagine what the web would be like if JavaScript was restricted by a same-domain policy, there are certainly things that would be trickier to do.

  • Embedding video, audio, and maps would get a lot finickier.
  • Analytics would need to be self-hosted. I don’t think that would bother any site owners. An analytics platform like Google Analytics that tracks people across domains is doing it for its own benefit rather than that of site owners.
  • Advertising wouldn’t be creepy and annoying. Instead of what’s so euphemistically called “personalisation”, advertisers would have to rely on serving relevant ads based on the content of the site rather than an invasive psychological profile of the user. (I honestly think that advertisers would benefit from this kind of targetting.)

It’s harder to imagine putting the genie back in the bottle when it comes to third-party JavaScript than it is with third-party cookies. All the same, I wish that browsers made it easier to experiment with it. Just as I can choose to accept all cookies, reject all cookies, or only accept same-origin cookies, I wish I could accept all JavaScript, reject all JavaScript, or only accept same-origin JavaScript.

As it is, browsers are making it harder and harder to exercise any control over JavaScript at all. So we reach for third-party tools. We don’t call them JavaScript managers though. We call them ad blockers. But honestly, most of the ad-blocker users I know—myself included—are not bothered by the advertising; we’re bothered by the tracking. We should really call them surveillance blockers.

If third-party JavaScript weren’t the norm, not only would it make the web more secure, it would make it way more performant. Read the chapter on third parties in this year’s newly-released Web Almanac. The figures are staggering.

93% of pages include at least one third-party resource, 76% of pages issue a request to an analytics domain, the median page requests content from at least 9 unique third-party domains that represent 35% of their total network activity, and the most active 10% of pages issue a whopping 175 third-party requests or more.

I don’t think all the web’s performance ills are due to third-party scripts; developers are doing a bang-up job of making their sites big and bloated with their own self-hosted frameworks and code. But as long as third-party JavaScript is allowed onto a site, there’s a limit to how much good developers can do to improve the performance of their sites.

I go to performance-related conferences and you know who I’ve never seen at those events? The people who write the JavaScript for third-party tracking scripts. Those developers are wielding an outsized influence on the health of the web.

I’m very happy to see the work being done by Mozilla and Apple to normalise the idea of rejecting third-party cookies. I’d love to see the rejection of third-party JavaScript normalised in the same way. I know that it would make my life as a developer harder. But that’s of lesser importance. It would be better for the web.

Name That Script! by Trent Walton

Trent is about to pop his AEA cherry and give a talk at An Event Apart in Boston. I’m going to attempt to liveblog this:

How many third-party scripts are loading on our web pages these days? How can we objectively measure the value of these (advertising, a/b testing, analytics, etc.) scripts—considering their impact on web performance, user experience, and business goals? We’ve learned to scrutinize content hierarchy, browser support, and page speed as part of the design and development process. Similarly, Trent will share recent experiences and explore ways to evaluate and discuss the inclusion of 3rd-party scripts.

Trent is going to speak about third-party scripts, which is funny, because a year ago, he never would’ve thought he’d be talking about this. But he realised he needed to pay more attention to:

any request made to an external URL.

Or how about this:

A resource included with a web page that the site owner doesn’t explicitly control.

When you include a third-party script, the third party can change the contents of that script.

Here are some uses:

  • advertising,
  • A/B testing,
  • analytics,
  • social media,
  • content delivery networks,
  • customer interaction,
  • comments,
  • tag managers,
  • fonts.

You get data from things like analytics and A/B testing. You get income from ads. You get content from CDNs.

But Trent has concerns. First and foremost, the user experience effects of poor performance. Also, there are the privacy implications.

Why does Trent—a designer—care about third party scripts? Well, over the years, the areas that Trent pays attention to has expanded. He’s progressed from image comps to frontend to performance to accessibility to design systems to the command line and now to third parties. But Trent has no impact on those third-party scripts. That’s very different to all those other areas.

Trent mostly builds prototypes. Those then get handed over for integration. Sometimes that means hooking it up to a CMS. Sometimes it means adding in analytics and ads. It gets really complex when you throw in third-party comments, payment systems, and A/B testing tools. Oftentimes, those third-party scripts can outweigh all the gains made beforehand. It happens with no discussion. And yet we spent half a meeting discussing a border radius value.

Delivering a performant, accessible, responsive, scalable website isn’t enough: I also need to consider the impact of third-party scripts.

Trent has spent the last few months learning about third parties so he can be better equiped to discuss them.

UX, performance and privacy impact

We feel the UX impact every day we browse the web (if we turn off our content blockers). The Food Network site has an intersitial asking you to disable your ad blocker. They promise they won’t spawn any pop-up windows. Trent turned his ad blocker off—the page was now 15 megabytes in size. And to top it off …he got a pop up.

Privacy can harder to perceive. We brush aside cookie notifications. What if the wording was “accept trackers” instead of “accept cookies”?

Remarketing is that experience when you’re browsing for a spatula and then every website you visit serves you ads for spatula. That might seem harmless but allowing access to our browsing history has serious privacy implications.

Web builders are on the front lines. It’s up to us to advocate for data protection and privacy like we do for web standards. Don’t wait to be told.

Categories of third parties

Ghostery categories third-party providers: advertising, comments, customer interaction, essential, site analytics, social media. You can dive into each layer and see the specific third-party services on the page you’re viewing.

Analyse and itemise third-party scripts

We have “view source” for learning web development. For third parties, you need some tool to export the data. HAR files (HTTP ARchive) are JSON files that you can create from most browsers’ network request panel in dev tools. But what do you do with a .har file? The site har.tech has plenty of resources for you. That’s where Trent found the Mac app, Charles. It can open .har files. Best of all, you can export to CSV so you can share spreadsheets of the data.

You can visualise third-party requests with Simon Hearne’s excellent Request Map. It’s quite impactful for delivering a visceral reaction in a meeting—so much more effective than just saying “hey, we have a lot of third parties.” Request Map can also export to CSV.

Know industry averages

Trent wanted to know what was “normal.” He decided to analyse HAR files for Alexa’s top 50 US websites. The result was a massive spreadsheet of third-party providers. There were 213 third-party domains (which is not even the same as the number of requests). There was an average of 22 unique third-party domains per site. The usual suspects were everywhere—Google, Amazon, Facebook, Adobe—but there were many others. You can find an alphabetical index on better.fyi/trackers. Often the lesser-known domains turn out to be owned by the bigger domains.

News sites and shopping sites have the most third-party scripts, unsurprisingly.

Understand benefits

Trent realised he needed to listen and understand why third-party scripts are being included. He found out what tag managers do. They’re funnels that allow you to cram even more third-party scripts onto your website. Trent worried that this was a Pandora’s box. The tag manager interface is easy to access and use. But he was told that it’s more like a way of organising your third-party scripts under one dashboard. But still, if you get too focused on the dashboard, you could lose focus of the impact on load times. So don’t blame the tool: it’s all about how it’s used.

Take action

Establish a centre of excellence. Put standards in place—in a cross-discipline way—to define how third-party scripts are evaluated. For example:

  1. Determine the value to the business.
  2. Avoid redundant scripts and services.
  3. Fit within the established performance budget.
  4. Comply with the organistional privacy policy.

Document those decisions, maybe even in your design system.

Also, include third-party scripts within your prototypes to get a more accurate feel for the performance implications.

On a live site, you can regularly audit third-party scripts on a regular basis. Check to see if any are redundant or if they’re exceeding the performance budget. You can monitor performance with tools like Calibre and Speed Curve to cover the time in between audits.

Make your case

Do competitive analysis. Look at other sites in your sector. It’s a compelling way to make a case for change. WPO Stats is very handy for anecdata.

You can gather comparative data with Web Page Test: you can run a full test, and you can run a test with certain third parties blocked. Use the results to kick off a discussion about the impact of those third parties.

Talk it out

Work to maintain an ongoing discussion with the entire team. As Tim Kadlec says:

Everything should have a value, because everything has a cost.

Alternative analytics

Contrary to the current consensual hallucination, there are alternatives to Google Analytics.

I haven’t tried Open Web Analytics. It looks a bit geeky, but the nice thing about it is that you can set it up to work with JavaScript or PHP (sort of like Mint, which I miss).

Also on the geeky end, there’s GoAccess which provides an interface onto your server logs. You can view the data in a browser or on the command line. I gave this a go on adactio.com and it all worked just fine.

Matomo was previously called Piwik, and it’s the closest to Google Analytics. Chris Ruppel wrote about using it as a drop-in replacement. I gave it a go on adactio.com and it did indeed collect analytics very nicely …but then I deleted it, because it still felt creepy to have any kind of analytics script at all (neither Huffduffer or The Session have any analytics tracking either).

Fathom isn’t out yet, but it looks interesting:

It will track users on a website, the key actions they are taking, and give you a non-nerdy breakdown of their journey. It’ll do so with user-centric rights and privacy, and without selling, sharing or giving away the data you collect.

I don’t think any of these alternatives offer quite the same ease-of-use that you’d get from Google Analytics. But I also don’t think that should be your highest priority. There’s a fundamental difference between doing your own analytics (self-hosted), and outsourcing the job to Google who can then track your site’s visitors across domains.

I was hoping that GDPR would put the squeeze on third-party tracking, but it looks like Google have found a way out. By declaring themselves a data controller (but not a data processor), they pass can pass the buck to the data processors to obtain consent.

If you have Google Analytics on your site, that’s you, that is.

GDPR and Google Analytics

Enforcement of the European Union’s General Data Protection Regulation is coming very, very soon. Look busy. This regulation is not limited to companies based in the EU—it applies to any service anywhere in the world that can be used by citizens of the EU.

It’s less about data protection and more like a user’s bill of rights. That’s good. Cennydd has written a techie’s rough guide to GDPR.

The Open Data Institute’s Jeni Tennison wrote down her thoughts on how it could change data portability in particular. While she welcomes GDPR, she has some misgivings.

Blaine—who really needs to get a blog—shared his concerns in the form of the online equivalent of interpretive dance …a twitter thread (it’s called a thread because it inevitably gets all tangled, and it’s easy to break.)

The interesting thing about the so-called “cookie law” is that it makes no mention of cookies whatsoever. It doesn’t list any specific technology. Instead it states that any means of tracking or identifying users across websites requires disclosure. So if you’re setting a cookie just to manage state—so that users can log in, or keep items in a shopping basket—the legislation doesn’t apply. But as soon as your site allows a third-party to set a cookie, it’s banner time.

Google Analytics is a classic example of a third-party service that uses cookies to track people across domains. That’s pretty much why it exists. We, as site owners, get to use this incredibly powerful tool, and all we have to do in return is add one little snippet of JavaScript to our pages. In doing so, we’re allowing a third party to read or write a cookie from their domain.

Before Google Analytics, Google—the search engine business—was able to identify and track what users were searching for, and which search results they clicked on. But as soon as the user left google.com, the trail went cold. By creating an enormously useful analytics product that only required site owners to add a single line of JavaScript, Google—the online advertising business—gained the ability to keep track of users across most of the web, whether they were on a site owned by Google or not.

Under the old “cookie law”, using a third-party cookie-setting service like that meant you had to inform any of your users who were citizens of the EU. With GDPR, that changes. Now you have to get consent. A dismissible little overlay isn’t going to cut it any more. Implied consent isn’t enough.

Now this situation raises an interesting question. Who’s responsible for getting consent? Is it the site owner or the third party whose script is the conduit for the tracking?

In the first scenario, you’d need to wait for an explicit agreement from a visitor to your site before triggering the Google Analytics functionality. Suddenly it’s not as simple as adding a single line of JavaScript to your site.

In the second scenario, you don’t do anything differently than before—you just add that single line of JavaScript. But now that script would need to launch the interface for getting consent before doing any tracking. Google Analytics would go from being something invisible to something that directly impacts the user experience of your site.

I’m just using Google Analytics as an example here because it’s so widespread. This also applies to third-party sharing buttons—Twitter, Facebook, etc.—and of course, advertising.

In the case of advertising, it gets even thornier because quite often, the site owner has no idea which third party is about to do the tracking. Many, many sites use intermediary services (y’know, ‘cause bloated ad scripts aren’t slowing down sites enough so let’s throw some just-in-time bidding into the mix too). You could get consent for the intermediary service, but not for the final advert—neither you nor your site’s user would have any idea what they were consenting to.

Interesting times. One way or another, a massive amount of the web—every website using Google Analytics, embedded YouTube videos, Facebook comments, embedded tweets, or third-party advertisements—will be liable under GDPR.

It’s almost as if the ubiquitous surveillance of people’s every move on the web wasn’t a very good idea in the first place.

Heisenberg

I wrote about Google Analytics yesterday. As usual, I syndicated the post to Ev’s blog, and I got an interesting response over there. Kelly Burgett set me straight on some of the finer details of how goals work, and finished with this thought:

You mention “delivering a performant, accessible, responsive, scalable website isn’t enough” as if it should be, and I have to disagree. It’s not enough for a business to simply have a great website if you are unable to understand performance of channel marketing, track user demographics and behavior on-site, and optimize your site/brand based on that data. I’ve seen a lot of ugly sites who have done exceptionally well in terms of ROI, simply because they are getting the data they need from the site in order make better business decisions. If your site cannot do that (ie. through data collection, often third party scripts), then your beautifully-designed site can only take you so far.

That makes an excellent case for having analytics. But that’s not necessarily the same as having Google analytics, or even JavaScript-driven analytics at all.

By far the most useful information you get from analytics is around where people have come from, where did they go next, and what kind of device are they using. None of that information requires JavaScript. It’s all available from your server logs.

I don’t want to come across all old-man-yell-at-cloud here, but I’m trying to remember at what point self-hosted software for analysing your log traffic became not good enough.

Here’s the thing: logging on the server has no effect on the user experience. It’s basically free, in terms of performance. Logging via JavaScript, by its very nature, has some cost. Even if its negligible, that’s one more request, and that’s one more bit of processing for the CPU.

All of the data that you can only get via JavaScript (in-page actions, heat maps, etc.) are, in my experience, better handled by dedicated software. To me, that kind of more precise data feels different to analytics in the sense of funnels, conversions, goals and all that stuff.

So in order to get more fine-grained data to analyse, our analytics software has now doubled down on a technology—JavaScript—that has an impact on the end user, where previously the act of observation could be done at a distance.

There are also blind spots that come with JavaScript-based tracking. According to Google Analytics, 0% of your customers don’t have JavaScript. That’s not necessarily true, but there’s literally no way for Google Analytics—which relies on JavaScript—to even do its job in the absence of JavaScript. That can lead to a dangerous situation where you might be led to think that 100% of your potential customers are getting by, when actually a proportion might be struggling, but you’ll never find out about it.

Related: according to Google Analytics, 0% of your customers are using ad-blockers that block requests to Google’s servers. Again, that’s not necessarily a true fact.

So I completely agree than analytics are a good thing to have for your business. But it does not follow that Google Analytics is a good thing for your business. Other options are available.

I feel like the assumption that “analytics = Google Analytics” is like the slippery slope in reverse. If we’re all agreed that analytics are important, then aren’t we also all agreed that JavaScript-based tracking is important?

In a word, no.

This reminds me of the arguments made in favour of intrusive, bloated advertising scripts. All of the arguments focus on the need for advertising—to stay in business, to pay the writers—which are all great reasons for advertising, but have nothing to do with JavaScript, which is at the root of the problem. Everyone I know who uses an ad-blocker—including me—doesn’t use it to stop seeing adverts, but to stop the performance of the page being degraded (and to avoid being tracked across domains).

So let’s not confuse the means with the ends. If you need to have advertising, that doesn’t mean you need to have horribly bloated JavaScript-based advertising. If you need analytics, that doesn’t mean you need an analytics script on your front end.

Empire State

I’m in New York. Again. This time it’s for Google’s AMP Conf, where I’ll be giving ‘em a piece of my mind on a panel.

The conference starts tomorrow so I’ve had a day or two to acclimatise and explore. Seeing as Google are footing the bill for travel and accommodation, I’m staying at a rather nice hotel close to the conference venue in Tribeca. There’s live jazz in the lounge most evenings, a cinema downstairs, and should I request it, I can even have a goldfish in my room.

Today I realised that my hotel sits in the apex of a triangle of interesting buildings: carrier hotels.

32 Avenue Of The Americas.Telephone wires and radio unite to make neighbors of nations

Looming above my hotel is 32 Avenue of the Americas. On the outside the building looks like your classic Gozer the Gozerian style of New York building. Inside, the lobby features a mosaic on the ceiling, and another on the wall extolling the connective power of radio and telephone.

The same architects also designed 60 Hudson Street, which has a similar Art Deco feel to it. Inside, there’s a cavernous hallway running through the ground floor but I can’t show you a picture of it. A security guard told me I couldn’t take any photos inside …which is a little strange seeing as it’s splashed across the website of the building.

60 Hudson.HEADQUARTERS The Western Union Telegraph Co. and telegraph capitol of the world 1930-1973

I walked around the outside of 60 Hudson, taking more pictures. Another security guard asked me what I was doing. I told her I was interested in the history of the building, which is true; it was the headquarters of Western Union. For much of the twentieth century, it was a world hub of telegraphic communication, in much the same way that a beach hut in Porthcurno was the nexus of the nineteenth century.

For a 21st century hub, there’s the third and final corner of the triangle at 33 Thomas Street. It’s a breathtaking building. It looks like a spaceship from a Chris Foss painting. It was probably designed more like a spacecraft than a traditional building—it’s primary purpose was to withstand an atomic blast. Gone are niceties like windows. Instead there’s an impenetrable monolith that looks like something straight out of a dystopian sci-fi film.

33 Thomas Street.33 Thomas Street, New York

Brutalist on the outside, its interior is host to even more brutal acts of invasive surveillance. The Snowden papers revealed this AT&T building to be a centrepiece of the Titanpointe programme:

They called it Project X. It was an unusually audacious, highly sensitive assignment: to build a massive skyscraper, capable of withstanding an atomic blast, in the middle of New York City. It would have no windows, 29 floors with three basement levels, and enough food to last 1,500 people two weeks in the event of a catastrophe.

But the building’s primary purpose would not be to protect humans from toxic radiation amid nuclear war. Rather, the fortified skyscraper would safeguard powerful computers, cables, and switchboards. It would house one of the most important telecommunications hubs in the United States…

Looking at the building, it requires very little imagination to picture it as the lair of villainous activity. Laura Poitras’s short film Project X basically consists of a voiceover of someone reading an NSA manual, some ominous background music, and shots of 33 Thomas Street looming in its oh-so-loomy way.

A top-secret handbook takes viewers on an undercover journey to Titanpointe, the site of a hidden partnership. Narrated by Rami Malek and Michelle Williams, and based on classified NSA documents, Project X reveals the inner workings of a windowless skyscraper in downtown Manhattan.

Relinkification

On Jessica’s recommendation, I read a piece on the Guardian website called The eeriness of the English countryside:

Writers and artists have long been fascinated by the idea of an English eerie – ‘the skull beneath the skin of the countryside’. But for a new generation this has nothing to do with hokey supernaturalism – it’s a cultural and political response to contemporary crises and fears

I liked it a lot. One of the reasons I liked it was not just for the text of the writing, but the hypertext of the writing. Throughout the piece there are links off to other articles, books, and blogs. For me, this enriches the piece and it set me off down some rabbit holes of hyperlinks with fascinating follow-ups waiting at the other end.

Back in 2010, Scott Rosenberg wrote a series of three articles over the course of two months called In Defense of Hyperlinks:

  1. Nick Carr, hypertext and delinkification,
  2. Money changes everything, and
  3. In links we trust.

They’re all well worth reading. The whole thing was kicked off with a well-rounded debunking of Nicholas Carr’s claim that hyperlinks harm text. Instead, Rosenberg finds that hyperlinks within a text embiggen the writing …providing they’re done well:

I see links as primarily additive and creative. Even if it took me a little longer to read the text-with-links, even if I had to work a bit harder to get through it, I’d come out the other side with more meat and more juice.

Links, you see, do so much more than just whisk us from one Web page to another. They are not just textual tunnel-hops or narrative chutes-and-ladders. Links, properly used, don’t just pile one “And now this!” upon another. They tell us, “This relates to this, which relates to that.”

The difference between a piece of writing being part of the web and a piece of writing being merely on the web is something I talked about a few years back in a presentation called Paranormal Interactivity at ‘round about the 15 minute mark:

Imagine if you were to take away all the regular text and only left the hyperlinks on Wikipedia, you could still get the gist, right? Every single link there is like a wormhole to another part of this “choose your own adventure” game that we’re playing every day on the web. I love that. I love the way that Wikipedia uses links.

That ability of the humble hyperlink to join concepts together lies at the heart of Tim Berners Lee’s World Wide Web …and Ted Nelson’s Project Xanudu, and Douglas Engelbart’s Dynamic Knowledge Environments, and Vannevar Bush’s idea of the Memex. All of those previous visions of a hyperlinked world were—in many ways—superior to the web. But the web shipped. It shipped with brittle, one-way linking, but it shipped. And now today anyone can create a connection between two ideas by linking to resources that represent those ideas. All you need is an HTML document that contains some A elements with href attributes, and a URL to act as that document’s address.

Like the one you’re accessing now.

Not only can I link to that article on the Guardian’s website, I can also pair it up with other related links, like Warren Ellis’s talk from dConstruct 2014:

Inventing the next twenty years, strategic foresight, fictional futurism and English rural magic: Warren Ellis attempts to convince you that they are all pretty much the same thing, and why it was very important that some people used to stalk around village hedgerows at night wearing iron goggles.

There is definitely the same feeling of “the eeriness of the English countryside” in Warren’s talk. If you haven’t listened to it yet, set aside some time. It is enticing and disquieting in equal measure …like many of the works linked to from the piece on the Guardian.

There’s another link I’d like to make, and it happens to be to another dConstruct speaker.

From that Guardian piece:

Yet state surveillance is no longer testified to in the landscape by giant edifices. Instead it is mostly carried out in by software programs running on computers housed in ordinary-looking government buildings, its sources and effects – like all eerie phenomena – glimpsed but never confronted.

James Bridle has been confronting just that. His recent series The Nor took him on a tour of a parallel, obfuscated English countryside. He returned with three pieces of hypertext:

  1. All Cameras Are Police Cameras,
  2. Living in the Electromagnetic Spectrum, and
  3. Low Latency.

I love being able to do this. I love being able to add strands to this world-wide web of ours. Not only can I say “this idea reminds me of another idea”, but I can point to both ideas. It’s up to you whether you follow those links.

Tracking

Ajax was a really big deal six, seven, eight years ago. My second book was all about Ajax. I spoke about Ajax at conferences and gave workshops all about using Ajax and progressive enhancement.

During those workshops, I would often point out that Ajax had the potential to be abused terribly. Until the advent of Ajax, it was very clear to a user when data was being submitted to a server: you’d have to click a link or submit a form. As soon as you introduce asynchronous communication, it’s possible for the server to get information from the client even without a full-page refresh.

Imagine, for example, that you’re typing a message into a textarea. You might begin by typing, “Why, you stuck up, half-witted, scruffy-looking nerf…” before calming down and thinking better of it. Before Ajax, there was no way that what you had typed could ever reach the server. But now, it’s entirely possible to send data via Ajax with every key press.

It was just a thought experiment. I wasn’t actually that worried that anyone would ever do something quite so creepy.

Then I came across this article by Jennifer Golbeck in Slate all about Facebook tracking what’s entered—but then erased—within its status update form:

Unfortunately, the code that powers Facebook still knows what you typed—even if you decide not to publish it. It turns out that the things you explicitly choose not to share aren’t entirely private.

Initially I thought there must have been some mistake. I erronously called out Jen Golbeck when I found the PDF of a paper called The Post that Wasn’t: Exploring Self-Censorship on Facebook. The methodology behind the sample group used for that paper was much more old-fashioned than using Ajax:

First, participants took part in a weeklong diary study during which they used SMS messaging to report all instances of unshared content on Facebook (i.e., content intentionally self-censored). Participants also filled out nightly surveys to further describe unshared content and any shared content they decided to post on Facebook. Next, qualified participants took part in in-lab interviews.

But the Slate article was referencing a different paper that does indeed use Ajax to track instances of deleted text:

This research was conducted at Facebook by Facebook researchers. We collected self-censorship data from a random sample of approximately 5 million English-speaking Facebook users who lived in the U.S. or U.K. over the course of 17 days (July 6-22, 2012).

So what I initially thought was a case of alarmism—conflating something as simple as simple as a client-side character count with actual server-side monitoring—turned out to be a pretty accurate reading of the situation. I originally intended to write a scoffing post about Slate’s linkbaiting alarmism (and call it “The shocking truth behind the latest Facebook revelation”), but it turns out that my scoffing was misplaced.

That said, the article has been updated to reflect that the Ajax requests are only sending information about deleted characters—not the actual content. Still, as we learned very clearly from the NSA revelations, there’s not much practical difference between logging data and logging metadata.

The nerds among us may start firing up our developer tools to keep track of unexpected Ajax requests to the server. But what about everyone else?

This isn’t the first time that the power of JavaScript has been abused. Every browser now ships with an option to block pop-up windows. That’s because the ability to spawn new windows was so horribly misused. Maybe we’re going to see similar preference options to avoid firing Ajax requests on keypress.

It would be depressingly reductionist to conclude that any technology that can be abused will be abused. But as long as there are web developers out there who are willing to spawn pop-up windows or force persistent cookies or use Ajax to track deleted content, the depressingly reductionist conclusion looks like self-fulfilling prophecy.