Tags: storage

66

sparkline

Tuesday, July 27th, 2021

Safeguarding music | Global Music Vault | Svalbard

This sounds like an interesting long-term storage project, but colour me extremely sceptical of their hand-wavey vagueness around their supposedly flawless technical solution:

This technology will be revealed to the world in the near future.

Also, they keep hyping up the Svalbard location as though it were purpose-built for this project, rather than the global seed bank (which they don’t even mention).

This might be a good way to do marketing, but it’s a shitty way to go about digital preservation.

Saturday, July 24th, 2021

Reflections as the Internet Archive turns 25

Brewster Kahle:

The World Wide Web at its best is a mechanism for people to share what they know, almost always for free, and to find one’s community no matter where you are in the world.

Saturday, July 3rd, 2021

The Internet Is Rotting - The Atlantic

A terrific piece by Jonathan Zittrain on bitrot and online digital preservation:

Too much has been lost already. The glue that holds humanity’s knowledge together is coming undone.

Tuesday, February 23rd, 2021

Introducing State Partitioning - Mozilla Hacks - the Web developer blog

This is a terrific approach to tackling cross-site surveillance. I’d love it to be implemented in all browsers. I can imagine Safari implementing this. Chrome …we’ll see.

Friday, December 18th, 2020

Conditional JavaScript - JavaScript - Dev Tips

This is a good round-up of APIs you can use to decide if and how much JavaScript to load. I might look into using storage.estimate() in service workers to figure out how much gets pre-cached.

Thursday, November 12th, 2020

Caching and storing

When I was speaking at conferences last year about service workers, I’d introduce the Cache API. I wanted some way of explaining the difference between caching and other kinds of storage.

The way I explained was that, while you might store stuff for a long time, you’d only cache stuff that you knew you were going to need again. So according to that definition, when you make a backup of your hard drive, that’s not caching …becuase you hope you’ll never need to use the backup.

But that explanation never sat well with me. Then more recently, I was chatting with Amber about caching. Once again, we trying to define the difference between, say, the Cache API and things like LocalStorage and IndexedDB. At some point, we realised the fundamental difference: caches are for copies.

Think about it. If you store something in LocalStorage or IndexedDB, that’s the canonical home for that data. But anything you put into a cache must be a copy of something that exists elsewhere. That’s true of the Cache API, the browser cache, and caches on the server. An item in one of those caches is never the original—it’s always a copy of something that has a canonical home elsewhere.

By that definition, backing up your hard drive definitely is caching.

Anyway, I was glad to finally have a working definition to differentiate between caching and storing.

Wednesday, October 21st, 2020

Chrome exempts Google sites from user site data settings

Collusion between three separate services owned by the same company: the Google search engine, the YouTube website, and the Chrome web browser.

Gosh, this kind of information could be really damaging if there were, say, antitrust proceedings initiated.

In the meantime, use Firefox

Wednesday, October 14th, 2020

Saving forms

I added a long-overdue enhancement to The Session recently. Here’s the scenario…

You’re on a web page with a comment form. You type your well-considered thoughts into a textarea field. But then something happens. Maybe you accidentally navigate away from the page or maybe your network connection goes down right when you try to submit the form.

This is a textbook case for storing data locally on the user’s device …at least until it has safely been transmitted to the server. So that’s what I set about doing.

My first decision was choosing how to store the data locally. There are multiple APIs available: sessionStorage, IndexedDB, localStorage. It was clear that sessionStorage wasn’t right for this particular use case: I needed the data to be saved across browser sessions. So it was down to IndexedDB or localStorage. IndexedDB is the more versatile and powerful—because it’s asynchronous—but localStorage is nice and straightforward so I decided on that. I’m not sure if that was the right decision though.

Alright, so I’m going to store the contents of a form in localStorage. It accepts key/value pairs. I’ll make the key the current URL. The value will be the contents of that textarea. I can store other form fields too. Even though localStorage technically only stores one value, that value can be a JSON object so in reality you can store multiple values with one key (just remember to parse the JSON when you retrieve it).

Now I know what I’m going to store (the textarea contents) and how I’m going to store it (localStorage). The next question is when should I do it?

I could play it safe and store the comment whenever the user presses a key within the textarea. But that seems like overkill. It would be more efficient to only save when the user leaves the current page for any reason.

Alright then, I’ll use the unload event. No! Bad Jeremy! If I use that then the browser can’t reliably add the current page to the cache it uses for faster back-forwards navigations. The page life cycle is complicated.

So beforeunload then? Well, maybe. But modern browsers also support a pagehide event that looks like a better option.

In either case, just adding a listener for the event could screw up the caching of the page for back-forwards navigations. I should only listen for the event if I know that I need to store the contents of the textarea. And in order to know if the user has interacted with the textarea, I’m back to listening for key presses again.

But wait a minute! I don’t have to listen for every key press. If the user has typed anything, that’s enough for me. I only need to listen for the first key press in the textarea.

Handily, addEventListener accepts an object of options. One of those options is called “once”. If I set that to true, then the event listener is only fired once.

So I set up a cascade of event listeners. If the user types anything into the textarea, that fires an event listener (just once) that then adds the event listener for when the page is unloaded—and that’s when the textarea contents are put into localStorage.

I’ve abstracted my code into a gist. Here’s what it does:

  1. Cut the mustard. If this browser doesn’t support localStorage, bail out.
  2. Set the localStorage key to be the current URL.
  3. If there’s already an entry for the current URL, update the textarea with the value in localStorage.
  4. Write a function to store the contents of the textarea in localStorage but don’t call the function yet.
  5. The first time that a key is pressed inside the textarea, start listening for the page being unloaded.
  6. When the page is being unloaded, invoke that function that stores the contents of the textarea in localStorage.
  7. When the form is submitted, remove the entry in localStorage for the current URL.

That last step isn’t something I’m doing on The Session. Instead I’m relying on getting something back from the server to indicate that the form was successfully submitted. If you can do something like that, I’d recommend that instead of listening to the form submission event. After all, something could still go wrong between the form being submitted and the data being received by the server.

Still, this bit of code is better than nothing. Remember, it’s intended as an enhancement. You should be able to drop it into any project and improve the user experience a little bit. Ideally, no one will ever notice it’s there—it’s the kind of enhancement that only kicks in when something goes wrong. A little smidgen of resilient web design. A defensive enhancement.

Monday, April 6th, 2020

Local-first software: You own your data, in spite of the cloud

The cloud gives us collaboration, but old-fashioned apps give us ownership. Can’t we have the best of both worlds?

We would like both the convenient cross-device access and real-time collaboration provided by cloud apps, and also the personal ownership of your own data embodied by “old-fashioned” software.

This is a very in-depth look at the mindset and the challenges involved in building truly local-first software—something that Tantek has also been thinking about.

Thursday, March 26th, 2020

Apple’s attack on service workers

Apple aren’t the best at developer relations. But, bad as their communications can be, I’m willing to cut them some slack. After all, they’re not used to talking with the developer community.

John Wilander wrote a blog post that starts with some excellent news: Full Third-Party Cookie Blocking and More. Safari is catching up to Firefox and disabling third-party cookies by default. Wonderful! I’ve had third-party cookies disabled for a few years now, and while something occassionally breaks, it’s honestly a pretty great experience all around. Denying companies the ability to track users across sites is A Good Thing.

In the same blog post, John said that client-side cookies will be capped to a seven-day lifespan, as previously announced. Just to be clear, this only applies to client-side cookies. If you’re setting a cookie on the server, using PHP or some other server-side language, it won’t be affected. So persistent logins are still doable.

Then, in an audacious example of burying the lede, towards the end of the blog post, John announces that a whole bunch of other client-side storage technologies will also be capped to seven days. Most of the technologies are APIs that, like cookies, can be used to store data: Indexed DB, Local Storage, and Session Storage (though there’s no mention of the Cache API). At the bottom of the list is this:

Service Worker registrations

Okay, let’s clear up a few things here (because they have been so poorly communicated in the blog post)…

The seven day timer refers to seven days of Safari usage, not seven calendar days (although, given how often most people use their phones, the two are probably interchangable). So if someone returns to your site within a seven day period of using Safari, the timer resets to zero, and your service worker gets a stay of execution. Lucky you.

This only applies to Safari. So if your site has been added to the home screen and your web app manifest has a value for the “display” property like “standalone” or “full screen”, the seven day timer doesn’t apply.

That piece of information was missing from the initial blog post. Since the blog post was updated to include this clarification, some people have taken this to mean that progressive web apps aren’t affected by the upcoming change. Not true. Only progressive web apps that have been added to the home screen (and that have an appropriate “display” value) will be spared. That’s a vanishingly small percentage of progressive web apps, especially on iOS. To add a site to the home screen on iOS, you need to dig and scroll through the share menu to find the right option. And you need to do this unprompted. There is no ambient badging in Safari to indicate that a site is installable. Chrome’s install banner isn’t perfect, but it’s better than nothing.

Just a reminder: a progressive web app is a website that

  • runs on HTTPS,
  • has a service worker,
  • and a web manifest.

Adding to the home screen is something you can do with a progressive web app (or any other website). It is not what defines progressive web apps.

In any case, this move to delete service workers after seven days of using Safari is very odd, and I’m struggling to find the connection to the rest of the blog post, which is about technologies that can store data.

As I understand it, with the crackdown on setting third-party cookies, trackers are moving to first-party technologies. So whereas in the past, a tracking company could tell its customers “Add this script element to your pages”, now they have to say “Add this script element and this script file to your pages.” That JavaScript file can then store a unique idenitifer on the client. This could be done with a cookie, with Local Storage, or with Indexed DB, for example. But I’m struggling to understand how a service worker script could be used in this way. I’d really like to see some examples of this actually happening.

The best explanation I can come up with for this move by Apple is that it feels like the neatest solution. That’s neat as in tidy, not as in nifty. It is definitely not a nifty solution.

If some technologies set by a specific domain are being purged after seven days, then the tidy thing to do is purge all technologies from that domain. Service workers are getting included in that dragnet.

Now, to be fair, browsers and operating systems are free to clean up storage space as they see fit. Caches, Local Storage, Indexed DB—all of those are subject to eventually getting cleaned up.

So I was curious. Wanting to give Apple the benefit of the doubt, I set about trying to find out how long service worker registrations currently last before getting deleted. Maybe this announcement of a seven day time limit would turn out to be not such a big change from current behaviour. Maybe currently service workers last for 90 days, or 60, or just 30.

Nope:

There was no time limit previously.

This is not a minor change. This is a crippling attack on service workers, a technology specifically designed to improve the user experience for return visits, whether it’s through improved performance or offline access.

I wouldn’t be so stunned had this announcement come with an accompanying feature that would allow Safari users to know when a website is a progressive web app that can be added to the home screen. But Safari continues to ignore the existence of progressive web apps. And now it will actively discourage people from using service workers.

If you’d like to give feedback on this ludicrous development, you can file a bug (down in the cellar in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying “Beware of the Leopard”).

No doubt there will still be plenty of Apple apologists telling us why it’s good that Safari has wished service workers into the cornfield. But make no mistake. This is a terrible move by Apple.

I will say this though: given The Situation we’re all living in right now, some good ol’ fashioned Hot Drama by a browser vendor behaving badly feels almost comforting.

Friday, February 28th, 2020

The Permanent Legacy Foundation

A non-profit that offers digital preservation services for individuals.

Permanence means no subscriptions; a one-time payment for dedicated storage that preserves your most precious memories and an institution that will be there to protect the digital legacy of all people for all time.

Thursday, February 6th, 2020

Local First, Undo Redo, JS-Optional, Create Edit Publish - Tantek

Tantek documents the features he wants his posting interface to have.

Monday, December 2nd, 2019

Just enough Internet | doteveryone

The carbon cost of collecting and storing data no one can use is already a moral issue.

So before you add another field, let alone make a new service, can you be sure it will make enough of a difference to legitimise its impact on the planet?

Tuesday, November 19th, 2019

The GitHub Archive Program will safely store every public GitHub repo for 1,000 years in the Arctic World Archive in Svalbard, Norway.

This is a fascinating project from Github, the Long Now Foundation, the Internet Archive, the Bodleian Library and others. All of the public code on Github on February 2nd, 2020 will be archived for 1000 years in a vault in Svalbard.

Mind you, given the amount of dependencies that most “modern” code projects rely on, I can’t foresee the code working after 1000 days.

Friday, September 6th, 2019

Offline listings

This is brilliant technique by Remy!

If you’ve got a custom offline page that lists previously-visited pages (like I do on my site), you don’t have to choose between localStorage or IndexedDB—you can read the metadata straight from the HTML of the cached pages instead!

This seems forehead-smackingly obvious in hindsight. I’m totally stealing this.

Saturday, July 6th, 2019

The Hiding Place: Inside the World’s First Long-Term Storage Facility for Highly Radioactive Nuclear Waste - Pacific Standard

Robert McFarlane’s new book is an exploration of deep time. In this extract, he visits the Onkalo nuclear waste storage facility in Finland.

Sometimes we bury materials in order that they may be preserved for the future. Sometimes we bury materials in order to preserve the future from them.

Monday, January 28th, 2019

The 500-Year-Long Science Experiment - The Atlantic

Running an experiment for 500 years is hard enough. Then there’s the documentation…

The hard part is ensuring someone will continue doing this on schedule well into the future. The team left a USB stick with instructions, which Möller realizes is far from adequate, given how quickly digital technology becomes obsolete. They also left a hard copy, on paper. “But think about 500-year-old paper,” he says, how it would yellow and crumble. “Should we carve it in stone? Do we have to carve it in a metal plate?” But what if someone who cannot read the writing comes along and decides to take the metal plate as a cool, shiny relic, as tomb raiders once did when looting ancient tombs?

No strategy is likely to be completely foolproof 500 years later. So the team asks that researchers at each 25-year time point copy the instructions so that they remain linguistically and technologically up to date.

Friday, November 23rd, 2018

Home - Memory of Mankind

A time capsule for the long now. Laser-etched ceramic tablets in an Austrian salt mine carry memories of our civilisation in three categories: news editorials, scientific works, and personal stories.

You can contribute a personal story, your favorite poem, or newspaper articles which describe our problems, visions or our daily life.

Tokens that mark the location of the site are also being distributed across the planet.

Sunday, September 9th, 2018

Why the Future of Data Storage is (Still) Magnetic Tape - IEEE Spectrum

It turns out that a whole lot of The So-Called Cloud is relying on magnetic tape for its backups.

Friday, June 22nd, 2018

Offline-Friendly Forms by Max Böck

A clever use of localStorage to stop data from being lost when your visitors are offline.