I think, with the sheer volume of functionality available to us nowadays on the front-end, it can be easy to forget how powerful and strong the functionality is that we get right off shelf with HTML. Yes, you read that right, functionality.
Thursday, May 12th, 2022
Saturday, April 25th, 2020
At the beginning of the year, Remy wrote about extracting Goodreads metadata so he could create his end-of-year reading list. More recently, Mark Llobrera wrote about how he created a visualisation of his reading history. In his case, he’s using JSON to store the information.
This kind of JSON storage is exactly what Tom Critchlow proposes in his post, Library JSON - A Proposal for a Decentralized Goodreads:
Thinking through building some kind of “web of books” I realized that we could use something similar to RSS to build a kind of decentralized GoodReads powered by indie sites and an underlying easy to parse format.
His proposal looks kind of similar to what Mark came up with. There’s a title, an author, an image, and some kind of date for when you started and/or finished reading the book.
Matt then points out that RSS gets close to the data format being suggested and asks how about using RSS?:
Rather than inventing a new format, my suggestion is that this is RSS plus an extension to deal with books. This is analogous to how the podcast feeds are specified: they are RSS plus custom tags.
Like Matt, I’m in favour of re-using existing wheels rather than inventing new ones, mostly to avoid a 927 situation.
But all of these proposals—whether JSON or RSS—involve the creation of a separate file, and yet the information is originally published in HTML. Along the lines of Matt’s idea, I could imagine extending the
h-entry collection of class names to allow for books (or films, or other media). It already handles images (with
u-photo). I think the missing fields are the date-related ones: when you start and finish reading. Those fields are present in a different microformat,
h-event in the form of
dt-end. Maybe they could be combined:
<article class="h-entry h-event h-review"> <h1 class="p-name p-item">Book title</h1> <img class="u-photo" src="image.jpg" alt="Book cover."> <p class="p-summary h-card">Book author</p> <time class="dt-start" datetime="YYYY-MM-DD">Start date</time> <time class="dt-end" datetime="YYYY-MM-DD">End date</time> <div class="e-content">Remarks</div> <data class="p-rating" value="5">★★★★★</data> <time class="dt-published" datetime="YYYY-MM-DDThh:mm">Date of this post</time> </article>
That markup is simultaneously a post (
h-entry) and an event (
h-event) and you can even throw in
h-card for the book author (as well as
h-review if you like to rate the books you read). It can be converted to RSS and also converted to
.ics for calendars—those parsers are already out there. It’s ready for aggregation and it’s ready for visualisation.
I publish very minimal reading posts here on adactio.com. What little data is there isn’t very structured—I don’t even separate the book title from the author. But maybe I’ll have a little play around with turning these h-entries into combined h-entry/event posts.
Wednesday, May 22nd, 2019
Bruce wonders why Google seems to prefer separate chunks of JSON-LD in web pages instead of interwoven microdata attributes:
I strongly feel that metadata that is separated from the user-visible data associated with it highly susceptible to metadata partial copy-paste necrosis. User-visible text is also developer-visible text. When devs copy/ paste that, it’s very easy to forget to copy any associated metadata that’s not interleaved, leading to errors.