Tags: privacy

76

sparkline

Saturday, November 10th, 2018

Webmentions at Indie Web Camp Berlin

I was in Berlin for most of last week, and every day was packed with activity:

By the time I got back to Brighton, my brain was full …just in time for FF Conf.

All of the events were very different, but equally enjoyable. It was also quite nice to just attend events without speaking at them.

Indie Web Camp Berlin was terrific. There was an excellent turnout, and once again, I found that the format was just right: a day of discussions (BarCamp style) followed by a day of doing (coding, designing, hacking). I got very inspired on the first day, so I was raring to go on the second.

What I like to do on the second day is try to complete two tasks; one that’s fairly straightforward, and one that’s a bit tougher. That way, when it comes time to demo at the end of the day, even if I haven’t managed to complete the tougher one, I’ll still be able to demo the simpler one.

In this case, the tougher one was also tricky to demo. It involved a lot of invisible behind-the-scenes plumbing. I was tweaking my webmention endpoint (stop sniggering—tweaking your endpoint is no laughing matter).

Up until now, I could handle straightforward webmentions, and I could handle updates (if I receive more than one webmention from the same link, I check it each time). But I needed to also handle deletions.

The spec is quite clear on this. A 404 isn’t enough to trigger a deletion—that might be a temporary state. But a status of 410 Gone indicates that a resource was once here but has since been deliberately removed. In that situation, any stored webmentions for that link should also be removed.

Anyway, I think I got it working, but it’s tricky to test and even trickier to demo. “Not to worry”, I thought, “I’ve always got my simpler task.”

For that, I chose to add a little map to my homepage showing the last location I published something from. I’ve been geotagging all my content for years (journal entries, notes, links, articles), but not really doing anything with that data. This is a first step to doing something interesting with many years of location data.

I’ve got it working now, but the demo gods really weren’t with me at Indie Web Camp. Both of my demos failed. The webmention demo failed quite embarrassingly.

As well as handling deletions, I also wanted to handle updates where a URL that once linked to a post of mine no longer does. Just to be clear, the URL still exists—it’s not 404 or 410—but it has been updated to remove the original link back to one of my posts. I know this sounds like another very theoretical situation, but I’ve actually got an example of it on my very first webmention test post from five years ago. Believe it or not, there’s an escort agency in Nottingham that’s using webmention as a vector for spam. They post something that does link to my test post, send a webmention, and then remove the link to my test post. I almost admire their dedication.

Still, I wanted to foil this particular situation so I thought I had updated my code to handle it. Alas, when it came time to demo this, I was using someone else’s computer, and in my attempt to right-click and copy the URL of the spam link …I accidentally triggered it. In front of a room full of people. It was midly NSFW, but more worryingly, a potential Code Of Conduct violation. I’m very sorry about that.

Apart from the humiliating demo, I thoroughly enjoyed Indie Web Camp, and I’m going to keep adjusting my webmention endpoint. There was a terrific discussion around the ethical implications of storing webmentions, led by Sebastian, based on his epic post from earlier this year.

We established early in the discussion that we weren’t going to try to solve legal questions—like GDPR “compliance”, which varies depending on which lawyer you talk to—but rather try to figure out what the right thing to do is.

Earlier that day, during the introductions, I quite happily showed webmentions in action on my site. I pointed out that my last blog post had received a response from another site, and because that response was marked up as an h-entry, I displayed it in full on my site. I thought this was all hunky-dory, but now this discussion around privacy made me question some inferences I was making:

  1. By receiving a webention in the first place, I was inferring a willingness for the link to be made public. That’s not necessarily true, as someone pointed out: a CMS could be automatically sending webmentions, which the author might be unaware of.
  2. If the linking post is marked up in h-entry, I was inferring a willingness for the content to be republished. Again, not necessarily true.

That second inferrence of mine—that publishing in a particular format somehow grants permissions—actually has an interesting precedent: Google AMP. Simply by including the Google AMP script on a web page, you are implicitly giving Google permission to store a complete copy of that page and serve it from their servers instead of sending people to your site. No terms and conditions. No checkbox ticked. No “I agree” button pressed.

Just sayin’.

Anyway, when it comes to my own processing of webmentions, I’m going to take some of the suggestions from the discussion on board. There are certain signals I could be looking for in the linking post:

  • Does it include a link to a licence?
  • Is there a restrictive robots.txt file?
  • Are there meta declarations that say noindex?

Each one of these could help to infer whether or not I should be publishing a webmention or not. I quickly realised that what we’re talking about here is an algorithm.

Despite its current usage to mean “magic”, an algorithm is a recipe. It’s a series of steps that contribute to a decision point. The problem is that, in the case of silos like Facebook or Instagram, the algorithms are secret (which probably contributes to their aura of magical thinking). If I’m going to write an algorithm that handles other people’s information, I don’t want to make that mistake. Whatever steps I end up codifying in my webmention endpoint, I’ll be sure to document them publicly.

Thursday, September 20th, 2018

The costs and benefits of tracking scripts – business vs. user // Sebastian Greger

I am having a hard time seeing the business benefits weighing in more than the user cost (at least for those many organisations out there who rarely ever put that data to proper use). After all, keeping the costs low for the user should be in the core interest of the business as well.

Friday, September 14th, 2018

On using tracking scripts | justmarkup

Weighing up the pros and cons of adding tracking scripts to a website, from a business perspective and from a user perspective.

When looking at the costs versus the benefits it is hard to believe that almost every website is using tracking scripts.

The next time, you implement a tracking script it would be great if you could rethink it and ask yourself if it is really worth it.

Wednesday, September 12th, 2018

Private by Default

Feedbin has removed third-party iframes and JavaScript (oEmbed provides a nice alternative), as well as stripping out Google Analytics, and even web fonts that aren’t self-hosted. This is excellent!

Saturday, September 1st, 2018

Changing Our Approach to Anti-tracking - Future Releases

This is excellent news from Mozilla. Firefox is going to make it easier to block vampiric privacy-leeching and performance-draining third-party scripts and trackers.

In the physical world, users wouldn’t expect hundreds of vendors to follow them from store to store, spying on the products they look at or purchase. Users have the same expectations of privacy on the web, and yet in reality, they are tracked wherever they go.

Friday, July 20th, 2018

Laura Kalbag – Insecure

The web can be used to find common connections with folks you find interesting, and who don’t make you feel like so much of a weirdo. It’d be nice to be able to do this in a safe space that is not being surveilled.

Owning your own content, and publishing to a space you own can break through some of these barriers. Sharing your own weird scraps on your own site makes you easier to find by like-minded folks. If you’ve got no tracking on your site (no Google Analytics etc), you are harder to profile. People can’t come to harass you on your own site if you do not offer them the means to do so

Thursday, July 19th, 2018

Fixing these webs - daverupert.com

I’m a fan of fast websites. Your website needs to be fast. Our collective excuses, hand-wringing, and inability to come to terms with the problem-set (There is too much script) and solutions (Use less script) of modern web development is getting tired.

I agree with every word of this.

Sadly, I think the one company with a browser that has marketshare dominance and could exert the kind of pressure required to stop ad tracking and surveillance capitalism is not incentivized to do so.

So the problem is approached from the other end. Blame is piled on authors for slow first-party code. We’re told to use certain mobile publishing frameworks that syndicate to proprietary CDNs to appease the gods of luck and fortune.

Sunday, July 8th, 2018

BBC News on HTTPS – BBC Design + Engineering – Medium

BBC News has switched to HTTPS—hurrah!

Here, one of the engineers writes on Ev’s blog about the challenges involved. Personally, I think this is far more valuable and inspiring to read than the unempathetic posts claiming that switching to HTTPS is easy.

Update: Paul found the original URL for this …weird that they don’t link to it from the syndicated version.

Tuesday, July 3rd, 2018

“I Was Devastated”: Tim Berners-Lee, the Man Who Created the World Wide Web, Has Some Regrets | Vanity Fair

Are we headed toward an Orwellian future where a handful of corporations monitor and control our lives? Or are we on the verge of creating a better version of society online, one where the free flow of ideas and information helps cure disease, expose corruption, reverse injustices?

It’s hard to believe that anyone—even Zuckerberg—wants the 1984 version. He didn’t found Facebook to manipulate elections; Jack Dorsey and the other Twitter founders didn’t intend to give Donald Trump a digital bullhorn. And this is what makes Berners-Lee believe that this battle over our digital future can be won. As public outrage grows over the centralization of the Web, and as enlarging numbers of coders join the effort to decentralize it, he has visions of the rest of us rising up and joining him.

Tuesday, June 26th, 2018

Name That Script! by Trent Walton

Trent is about to pop his AEA cherry and give a talk at An Event Apart in Boston. I’m going to attempt to liveblog this:

How many third-party scripts are loading on our web pages these days? How can we objectively measure the value of these (advertising, a/b testing, analytics, etc.) scripts—considering their impact on web performance, user experience, and business goals? We’ve learned to scrutinize content hierarchy, browser support, and page speed as part of the design and development process. Similarly, Trent will share recent experiences and explore ways to evaluate and discuss the inclusion of 3rd-party scripts.

Trent is going to speak about third-party scripts, which is funny, because a year ago, he never would’ve thought he’d be talking about this. But he realised he needed to pay more attention to:

any request made to an external URL.

Or how about this:

A resource included with a web page that the site owner doesn’t explicitly control.

When you include a third-party script, the third party can change the contents of that script.

Here are some uses:

  • advertising,
  • A/B testing,
  • analytics,
  • social media,
  • content delivery networks,
  • customer interaction,
  • comments,
  • tag managers,
  • fonts.

You get data from things like analytics and A/B testing. You get income from ads. You get content from CDNs.

But Trent has concerns. First and foremost, the user experience effects of poor performance. Also, there are the privacy implications.

Why does Trent—a designer—care about third party scripts? Well, over the years, the areas that Trent pays attention to has expanded. He’s progressed from image comps to frontend to performance to accessibility to design systems to the command line and now to third parties. But Trent has no impact on those third-party scripts. That’s very different to all those other areas.

Trent mostly builds prototypes. Those then get handed over for integration. Sometimes that means hooking it up to a CMS. Sometimes it means adding in analytics and ads. It gets really complex when you throw in third-party comments, payment systems, and A/B testing tools. Oftentimes, those third-party scripts can outweigh all the gains made beforehand. It happens with no discussion. And yet we spent half a meeting discussing a border radius value.

Delivering a performant, accessible, responsive, scalable website isn’t enough: I also need to consider the impact of third-party scripts.

Trent has spent the last few months learning about third parties so he can be better equiped to discuss them.

UX, performance and privacy impact

We feel the UX impact every day we browse the web (if we turn off our content blockers). The Food Network site has an intersitial asking you to disable your ad blocker. They promise they won’t spawn any pop-up windows. Trent turned his ad blocker off—the page was now 15 megabytes in size. And to top it off …he got a pop up.

Privacy can harder to perceive. We brush aside cookie notifications. What if the wording was “accept trackers” instead of “accept cookies”?

Remarketing is that experience when you’re browsing for a spatula and then every website you visit serves you ads for spatula. That might seem harmless but allowing access to our browsing history has serious privacy implications.

Web builders are on the front lines. It’s up to us to advocate for data protection and privacy like we do for web standards. Don’t wait to be told.

Categories of third parties

Ghostery categories third-party providers: advertising, comments, customer interaction, essential, site analytics, social media. You can dive into each layer and see the specific third-party services on the page you’re viewing.

Analyse and itemise third-party scripts

We have “view source” for learning web development. For third parties, you need some tool to export the data. HAR files (HTTP ARchive) are JSON files that you can create from most browsers’ network request panel in dev tools. But what do you do with a .har file? The site har.tech has plenty of resources for you. That’s where Trent found the Mac app, Charles. It can open .har files. Best of all, you can export to CSV so you can share spreadsheets of the data.

You can visualise third-party requests with Simon Hearne’s excellent Request Map. It’s quite impactful for delivering a visceral reaction in a meeting—so much more effective than just saying “hey, we have a lot of third parties.” Request Map can also export to CSV.

Know industry averages

Trent wanted to know what was “normal.” He decided to analyse HAR files for Alexa’s top 50 US websites. The result was a massive spreadsheet of third-party providers. There were 213 third-party domains (which is not even the same as the number of requests). There was an average of 22 unique third-party domains per site. The usual suspects were everywhere—Google, Amazon, Facebook, Adobe—but there were many others. You can find an alphabetical index on better.fyi/trackers. Often the lesser-known domains turn out to be owned by the bigger domains.

News sites and shopping sites have the most third-party scripts, unsurprisingly.

Understand benefits

Trent realised he needed to listen and understand why third-party scripts are being included. He found out what tag managers do. They’re funnels that allow you to cram even more third-party scripts onto your website. Trent worried that this was a Pandora’s box. The tag manager interface is easy to access and use. But he was told that it’s more like a way of organising your third-party scripts under one dashboard. But still, if you get too focused on the dashboard, you could lose focus of the impact on load times. So don’t blame the tool: it’s all about how it’s used.

Take action

Establish a centre of excellence. Put standards in place—in a cross-discipline way—to define how third-party scripts are evaluated. For example:

  1. Determine the value to the business.
  2. Avoid redundant scripts and services.
  3. Fit within the established performance budget.
  4. Comply with the organistional privacy policy.

Document those decisions, maybe even in your design system.

Also, include third-party scripts within your prototypes to get a more accurate feel for the performance implications.

On a live site, you can regularly audit third-party scripts on a regular basis. Check to see if any are redundant or if they’re exceeding the performance budget. You can monitor performance with tools like Calibre and Speed Curve to cover the time in between audits.

Make your case

Do competitive analysis. Look at other sites in your sector. It’s a compelling way to make a case for change. WPO Stats is very handy for anecdata.

You can gather comparative data with Web Page Test: you can run a full test, and you can run a test with certain third parties blocked. Use the results to kick off a discussion about the impact of those third parties.

Talk it out

Work to maintain an ongoing discussion with the entire team. As Tim Kadlec says:

Everything should have a value, because everything has a cost.

Saturday, June 16th, 2018

Artificial Intelligence for more human interfaces | Christian Heilmann

An even-handed assessment of the benefits and dangers of machine learning.

Thursday, May 17th, 2018

New Privacy Rules Could Make This Woman One of Tech’s Most Important Regulators - The New York Times

It’s kind of surreal to see a profile in the New York Times of my sister-in-law. Then again, she is Ireland’s data protection commissioner, and what with Facebook, Twitter, and Google all being based in Ireland, and with GDPR looming, her work is more important than ever.

By the way, this article has 26 tracking scripts. I don’t recall providing consent for any of them.

Tuesday, April 10th, 2018

Facebook Is Tracking Me Even Though I’m Not on Facebook | American Civil Liberties Union

But while I’ve never “opted in” to Facebook or any of the other big social networks, Facebook still has a detailed profile that can be used to target me. I’ve never consented to having Facebook collect my data, which can be used to draw very detailed inferences about my life, my habits, and my relationships. As we aim to take Facebook to task for its breach of user trust, we need to think about what its capabilities imply for society overall. After all, if you do #deleteFacebook, you’ll find yourself in my shoes: non-consenting, but still subject to Facebook’s globe-spanning surveillance and targeting network.

Facebook’s “shadow profiles” are truly egregious …and if you include social sharing buttons on a website, you’re contributing to the data harvest.

If you administer a website and you include a “Like” button on every page, you’re helping Facebook to build profiles of your visitors, even those who have opted out of the social network.

If you are responsible for running a website, try browsing it with a third-party-blocking extension turned on. Think about how much information you’re requiring your users to send to third parties as a condition for using your site. If you care about being a good steward of your visitors’ data, you can re-design your website to reduce this kind of leakage.

Thursday, April 5th, 2018

Doc Searls Weblog · Facebook’s Cambridge Analytica problems are nothing compared to what’s coming for all of online publishing

What will happen when the Times, the New Yorker and other pubs own up to the simple fact that they are just as guilty as Facebook of leaking its readers’ data to other parties, for—in many if not most cases—God knows what purposes besides “interest-based” advertising? And what happens when the EU comes down on them too? It’s game-on after 25 May, when the EU can start fining violators of the General Data Protection Regulation (GDPR). Key fact: the GDPR protects the data blood of EU citizens wherever they risk having it sucked in the digital world.

#davewentandroid - daverupert.com

Yeah. Fuck this. That’s creepy. Technically I opted into this feature because Google Maps asked “Google Maps would like to know your location, YES or NO?” Of course my answer was “YES” because, hey, it’s a fucking map. I didn’t realize I consented to having my information and location history stored indefinitely on Google’s servers.

I began all the work of disabling this “feature” but it seemed like a fruitless task. Also worth noting, Google Maps for iOS keeps Location History as well.

Friday, March 30th, 2018

Facebook Container Extension: Take control of how you’re being tracked | The Firefox Frontier

A Firefox plugin that ring-fences all Facebook activity to the facebook.com domain. Once you close that tab, this extension takes care of garbage collection, ensuring that Facebook tracking scripts don’t leak into any other browsing activities.

Wednesday, March 21st, 2018

Facebook and the end of the world

I’d love to see some change, and some introspection. A culture of first, do no harm. A recognition that there are huge dangers if you just do what’s possible, or build a macho “fail fast” culture that promotes endangerment. It’s about building teams that know they’ll make mistakes but also recognize the difference between great businesses opportunities and gigantic, universe-sized fuck ups.

Monday, February 26th, 2018

as days pass by — Collecting user data while protecting user privacy

Really smart thinking from Stuart on how the randomised response technique could be applied to analytics. My only question is who exactly does the implementation.

The key point here is that, if you’re collecting data about a load of users, you’re usually doing so in order to look at it in aggregate; to draw conclusions about the general trends and the general distribution of your user base. And it’s possible to do that data collection in ways that maintain the aggregate properties of it while making it hard or impossible for the company to use it to target individual users. That’s what we want here: some way that the company can still draw correct conclusions from all the data when collected together, while preventing them from targeting individuals or knowing what a specific person said.

Saturday, February 17th, 2018

Chrome’s default ad blocker strengthens Google’s data-driven advertising platforms

From a consumer’s point of view, less intrusive ad formats are of course desirable. Google’s approach is therefore basically heading in the right direction. From a privacy perspective, however, the “Better Ads” are no less aggressive than previous forms of advertising. Highly targeted ads based on detailed user profiles work subtle. They replace aggressive visuals with targeted manipulation.

Friday, February 2nd, 2018

Privacy could be the next big thing

A brilliant talk by Stuart on how privacy could be a genuinely disruptive angle for companies looking to gain competitive advantage over the businesses currently in the ascendent.

How do you end up shaping the world? By inventing a thing that the current incumbents can’t compete against. By making privacy your core goal. Because companies who have built their whole business model on monetising your personal information cannot compete against that. They’d have to give up on everything that they are, which they can’t do. Facebook altering itself to ensure privacy for its users… wouldn’t exist. Can’t exist. That’s how you win.

The beauty of this is that it’s a weapon which only hurts bad people. A company who are currently doing creepy things with your data but don’t actually have to can alter themselves to not be creepy, and then they’re OK! A company who is utterly reliant on doing creepy things with your data and that’s all they can do, well, they’ll fail. But, y’know, I’m kinda OK with that.