The AI Incident Database is dedicated to indexing the collective history of harms or near harms realized in the real world by the deployment of artificial intelligence systems.
Friday, April 14th, 2023
Saturday, November 14th, 2020
I like the way that Simon is liberating his data from silos and making it work for him.
Sunday, September 27th, 2020
The title says it all, really. This is another great piece of writing from Paul Ford.
I’ve noticed that when software lets nonprogrammers do programmer things, it makes the programmers nervous. Suddenly they stop smiling indulgently and start talking about what “real programming” is. This has been the history of the World Wide Web, for example. Go ahead and tweet “HTML is real programming,” and watch programmers show up in your mentions to go, “As if.” Except when you write a web page in HTML, you are creating a data model that will be interpreted by the browser. This is what programming is.
Friday, May 8th, 2020
Trys describes the backend architecture of the excellent Sofa Conf website. In short, it’s a Jamstack dream: all of the convenience and familiarity of using a database-driven CMS (Craft), combined with all the speed and resilience of using a static site generator (Eleventy).
I love the fact that anyone on the Clearleft events team can push to production with a Slack message.
I also love that the site is Lighthousetastically fast.
Saturday, August 10th, 2019
Harry wrote a really good article all about the performance measurement Time To First Byte, called Time To First Byte: What It Is and Why It Matters:
While a good TTFB doesn’t necessarily mean you will have a fast website, a bad TTFB almost certainly guarantees a slow one.
Time To First Byte has been the chink in my armour over at thesession.org, especially on the home page. Every time I ran Lighthouse, or some other performance testing tool, I’d get a high score …with some points deducted for taking too long to get that first byte from the server.
Harry’s proposed solution is to set up some Server Timing headers:
With a little bit of extra work spent implementing the Server Timing API, we can begin to measure and surface intricate timings to the front-end, allowing web developers to identify and debug potential bottlenecks previously obscured from view.
I remembered that Drew wrote an excellent article on Smashing Magazine last year called Measuring Performance With Server Timing:
The job of Server Timing is not to help you actually time activity on your server. You’ll need to do the timing yourself using whatever toolset your backend platform makes available to you. Rather, the purpose of Server Timing is to specify how those measurements can be communicated to the browser.
He even provides some PHP code, which I was able to take wholesale and drop into the codebase for thesession.org. Then I was able to put start/stop points in my code for measuring how long some operations were taking. Then I could output the results of these measurements into Server Timing headers that I could inspect in the “Network” tab of a browser’s dev tools (Chrome is particularly good for displaying Server Timing, so I used that while I was conducting this experiment).
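The start/stop idea can be sketched in a few lines. This is not Drew's PHP code. It is a rough Python illustration, and the `ServerTimer` class, its method names, and the `db` and `render` metric names are all made up for the example. It only shows how measurements might be collected and then formatted into a `Server-Timing` header value of the form `name;dur=milliseconds`.

```python
import time


class ServerTimer:
    """Hypothetical helper: collects named durations for a Server-Timing header."""

    def __init__(self):
        self._started = {}    # metric name -> start time in seconds
        self._durations = {}  # metric name -> elapsed time in milliseconds

    def start(self, name):
        self._started[name] = time.perf_counter()

    def stop(self, name):
        elapsed = time.perf_counter() - self._started.pop(name)
        # Accumulate, so the same metric can be timed more than once per request.
        self._durations[name] = self._durations.get(name, 0.0) + elapsed * 1000

    def header_value(self):
        # Each metric becomes "name;dur=milliseconds"; metrics are comma-separated.
        return ", ".join(
            f"{name};dur={ms:.1f}" for name, ms in self._durations.items()
        )


timer = ServerTimer()

timer.start("db")
time.sleep(0.01)  # stand-in for a database call
timer.stop("db")

timer.start("render")
time.sleep(0.005)  # stand-in for building the page
timer.stop("render")

# How the header gets attached to the response depends on the backend in use.
header = ("Server-Timing", timer.header_value())
print(f"{header[0]}: {header[1]}")
```

Wrapping a narrower `start`/`stop` pair around a single query is the same technique, just with a more specific metric name.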
I started with overall database requests. Sure enough, that was where most of the time in time-to-first-byte was being spent.
Then I got more granular. I put start/stop points around specific database calls. By doing this, I was able to zero in on which operations were particularly costly. Once I had done that, I had to figure out how to make the database calls go faster.
Spoiler: I did it by adding an extra index on one particular table. It’s almost always indexes, in my experience, that make the biggest difference to database performance.
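To illustrate the kind of difference an index makes, here is a small sketch using Python's built-in `sqlite3` module as a stand-in database. The `comments` table, the `member_id` column, and the index name are hypothetical; the post doesn't say which table or column was involved. The sketch asks SQLite how it plans to run the same query before and after the index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, not the real schema behind thesession.org.
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, member_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (member_id, body) VALUES (?, ?)",
    [(i % 100, f"Comment {i}") for i in range(1000)],
)

query = "SELECT body FROM comments WHERE member_id = ?"


def query_plan(connection, sql, params):
    """Return SQLite's description of how it would run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, query, (42,))

# The extra index, on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_comments_member ON comments (member_id)")

plan_after = query_plan(conn, query, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the plan is a scan of the whole table; with it, the plan names `idx_comments_member`. Other databases have their own equivalent of this check, such as `EXPLAIN` in MySQL.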
I don’t know why it took me so long to get around to messing with Server Timing headers. It has paid off in spades. I wish I had done it sooner.
Tuesday, July 10th, 2018
This strikes me as a sensible way of thinking about machine learning: it’s like when we got relational databases—suddenly we could do more, quicker, and easier …but it doesn’t require us to treat the technology like it’s magic.
An important parallel here is that though relational databases had economy of scale effects, there were limited network or ‘winner takes all’ effects. The database being used by company A doesn’t get better if company B buys the same database software from the same vendor: Safeway’s database doesn’t get better if Caterpillar buys the same one. Much the same actually applies to machine learning: machine learning is all about data, but data is highly specific to particular applications. More handwriting data will make a handwriting recognizer better, and more gas turbine data will make a system that predicts failures in gas turbines better, but the one doesn’t help with the other. Data isn’t fungible.
Tuesday, February 27th, 2018
This post goes into specifics on Django, but the broader points apply no matter what your tech stack. I’m relieved to find out that The Session is using the tripartite identity pattern (although Huffduffer, alas, isn’t):
What we really want in terms of identifying users is some combination of:
- System-level identifier, suitable for use as a target of foreign keys in our database
- Login identifier, suitable for use in performing a credential check
- Public identity, suitable for displaying to other users
Many systems ask the username to fulfill all three of these roles, which is probably wrong.
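The three roles can be sketched as three separate columns. This is a minimal illustration using Python's built-in `sqlite3` module, not the Django approach from the post. The `users` and `posts` tables and all of the column names are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE users (
        id           INTEGER PRIMARY KEY,   -- system-level identifier
        login        TEXT NOT NULL UNIQUE,  -- login identifier for credential checks
        display_name TEXT NOT NULL          -- public identity shown to other users
    );
    CREATE TABLE posts (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id),
        body    TEXT NOT NULL
    );
    """
)

conn.execute(
    "INSERT INTO users (login, display_name) VALUES (?, ?)",
    ("alice@example.com", "Alice"),
)
user_id = conn.execute(
    "SELECT id FROM users WHERE login = ?", ("alice@example.com",)
).fetchone()[0]
conn.execute("INSERT INTO posts (user_id, body) VALUES (?, ?)", (user_id, "Hello"))

# Changing the login and the display name leaves the foreign key untouched.
conn.execute(
    "UPDATE users SET login = ?, display_name = ? WHERE id = ?",
    ("alice@example.org", "Alice B.", user_id),
)

author = conn.execute(
    "SELECT users.display_name FROM posts JOIN users ON users.id = posts.user_id"
).fetchone()[0]
print(author)
```

Because `posts` points at `users.id` rather than at a username, someone can change how they log in or what name they show without any other table needing an update.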
Wednesday, April 19th, 2017
Tuesday, June 7th, 2016
I need to wrap my head around the details of this approach, but it sounds like it might be something I could do here on my site (where I feel nervous about my current dependency on a database).
Thursday, December 3rd, 2015
A really nice piece by Paul Ford on the history of databases and the dream of the Semantic Web.
Sometimes I get a little wistful. The vision of a world of connected facts, one big, living library, remains beautiful, and unfulfilled.
Monday, July 30th, 2012
Some good database character-encoding advice from Mathias.
Thursday, February 10th, 2011
For once, I’m happy to see data being destroyed.
Tuesday, October 27th, 2009
You can now store (and scale) MySQL databases with Amazon. Handy.
Sunday, May 10th, 2009
Now *this* is how you explain technical concepts.
Wednesday, October 24th, 2007
I just learned from Kelly that Webkit is supporting local storage and database queries, as proposed in HTML5. Kinda like Google Gears. Potentially exciting for the iPhone/iPod Touch.
Thursday, October 11th, 2007
Yes, you have to be a bit of a database geek to find this funny but if you are, this is very funny indeed.
Monday, May 7th, 2007
This article is a life-saver for me. I'm constantly having trouble with special characters when I'm backing up databases for local copies of my sites.