As it turns out, some sites are much harder to archive than others. This article goes through the process of archiving traditional web sites and shows how it falls short when confronted with the latest fashions in the single-page applications that are bloating the modern web.
…a full one-third of my window is covered by a pop-over trying to get me to sign in or sign up for Facebook. I will go out of my way to avoid linking to websites that are hostile to users with pop-overs. (For example, I’ve largely stopped linking to anything from Wired, because they have such an aggressive anti-ad-block detection scheme. Fuck them.)
Facebook forbids search engines from indexing Facebook posts. Content that isn’t indexable by search engines is not part of the open web.
And then there’s this:
And in the same way they block indexing by search engines, Facebook forbids The Internet Archive from saving copies of posts.
This move by Google to start executing some POST requests makes me very uneasy: the web is agreement and part of that agreement is that POST requests are initiated by the user.