Tags: tagging




Speaking of URLs

We were having a discussion in the Clearleft office recently about that perennially-tricky navigation pivot: tags. Specifically, we were discussing how to represent the interface for combinatorial tags i.e. displaying results of items that have been tagged with tag A and tag B.

I realised that this was functionality that I wasn’t even offering on Huffduffer, so I set to work on implementing it. I decided to dodge the interface question completely by only offering this functionality through the browser address bar. As a fairly niche, power-user feature, I’m not sure it warrants valuable interface real estate—though I may revisit that challenge later.

I can’t use the + symbol as a tag separator because Huffduffer allows spaces in tags (and spaces are converted to pluses in URLs), so I’ve settled on commas instead.

For example, there are plenty of items tagged with “music” (/tags/music) and plenty of items tagged with “science” (/tags/science) but there’s only a handful of items tagged with both “music” and “science” (/tags/music,science).

This being Huffduffer, where just about every page has corresponding JSON, RSS and Atom representations, you can also subscribe to the podcast of everything tagged with both “music” and “science” (/tags/music,science/rss).

There’s an OR operator as well; the vertical pipe symbol. You can view the 60 items tagged with “html5”, the 14 items tagged with “css3”, or the 66 items tagged with either “html5” or “css3” (/tags/html5|css3).

Wait a minute …66 items? But 60 plus 14 equals 74, not 66!

The discrepancy can be explained by the 8 items tagged with both “css3” and “html5” (/tags/html5,css3).

The AND and OR operators can be combined, so you can find items tagged with either “science” or “religion” that are also tagged with “politics” (/tags/science|religion,politics).

While it’s fun to do this in the browser’s address bar, I think the real power is in the way that the corresponding podcast allows you to subscribe to precisely-tailored content. Find just the right combination of tags, click on the RSS link, and you’re basically telling iTunes to automatically download audio whenever there’s something new that matches criteria like:

I’m sure there are plenty of intriguing combinations out there. Now I can use Huffduffer’s URLs to go spelunking for audio gems at the most promising intersections of tags.

Machine tag browsing

After I started rewarding machine tagging on Huffduffer with API calls to Amazon and Last.fm, people started using them quite a bit. But when it came to displaying tag clouds, I wasn’t treating machine tags any differently to other tags. Everything was being displayed in one big cloud.

I decided it would be good to separate out machine tags and display them after displaying “regular” tags. That started me thinking about how best to display machine tags.

One of the best machine tag visualisations I’ve seen so far is Paul Mison’s Flickr machine tag browser, somewhat like the list view in OS X’s Finder. Initially, I tried doing something similar for Huffduffer: a table with three columns; namespace, predicate, and values.

That morphed into a two column layout (predicate and values) with the namespace spanning both columns. The values themselves are still displayed as a cloud to indicate usage.

Huffduffer machine tags

This is marked up as a table. The namespace is in a th inside the thead. In the tbody, each tr contains a th for the predicate and td for the values.

   <th colspan="2"><a href="/tags/book">book</a></th>
   <th><a href="/tags/book:author">author</a></th>
   <td><a href="/tags/book:author=arthur+c.+clarke">arthur c. clarke</a></td>

Table markup allows for some nice :hover styles (in browsers that allow :hover styles on more than links). Whenever you hover over a table cell, you are also hovering over a table row and a table. By setting :hover states on all three elements, wayfinding becomes a bit clearer.

table:hover thead th a
table tbody tr:hover th a
table tbody tr td a:hover

Huffduffer machine tags on hover

See for yourself. I think it’s a pretty sturdy markup and style pattern that I’ll probably use again.

Machine-tagging Huffduffer

Over the weekend I was looking at the latest additions to Huffduffer. I noticed that Xavier Roy was using to tag a reading by Richard Dawkins. What an excellent idea!

I set aside a little time to do a little hacking with Amazon’s API. Now you can tag stuff on Huffduffer with machine tags like book:author=steven johnson, book:title=the invention of air or music:artist=my morning jacket. Other namespaces are film and movie. Anything matching that pattern will trigger a search on Amazon and display a list of results.

Amazon’s API was one of the first I ever messed about with, first on The Session and later on Adactio Elsewhere. There are things I really like about it and things I really dislike.

I dislike the fact that there’s no option to receive JSON instead of XML. However, one of the things I like is the option to pass the URL of an file to transform the XML (I wish more APIs offered that service). So even though JSON isn’t officially offered, it’s perfectly feasible to generate JSON from the combination of XML + XSL. That’s what I did for the Huffduffer hacking—I find it a lot easier to deal with JSON than XML in PHP5. If you fancy doing something similar, help yourself to my XSL file. It’s very basic but it could make a decent starting point.

But the thing I dislike the most about the Amazon API is the documentation. It’s not that there’s a lack of documentation. Far from it. It’s just not organised very well. I find it very hard to get the information I need, even when I know that the information is there somewhere. Flickr still leads the pack when it comes to API documentation. Amazon would do well to take a leaf out of Flickr’s documentation book (hope you’re listening, Jeff).

Welcome to the machine tag

At the same time that Flickr are demonstrating idiocy in the human resources department, they continue to do so some very cool stuff behind the scenes.

Aaron has been walking through some new API methods over on the Flickr code blog, quoting something I said in a chat with Steve Ivy:


…which I don’t even remember saying but the shoe fits.

There’s something about the mix of rigidity and haphazardness in machine tags that appeals to me. While they all share the same structure, everyone is free to invent their own usage. If machine tags were required to go through a specification process we would have event:lastfm=... and event:upcoming=... instead of lastfm:event=... and upcoming:event=... but really, it simply doesn’t matter as long as people are actually doing the tagging.

With the introduction of these new API methods, it looks like there’s room to build more finely-tuned apps to pivot around namespaces, predicates and values.

Paul Mison has written an desktop-like machine tag browser which shows at a glance just how many different machine tag namespaces are out there. Quite a few pictures have been tagged with adactio:post=... since I first introduced the idea.

Common people

George just announced a wonderful new initiative. It’s a collaboration between Flickr and the Library of Congress called simply The Commons.

The library has a lot of wonderful historic images. Flickr has a lot of wonderful people who enjoy tagging pictures. Put the two together and let’s see what happens.

I think this is a great idea. They get access to the collective intelligence of our parallel-processing distributed mechanical Turk. We get access to wonderful collections of old pictures. And when I say access, I don’t just mean that we get to look at them. These pictures have an interesting new license: no known copyright restrictions. This covers the situation for photos that once had copyright that wasn’t renewed.

The naysayers might not approve of putting metadata in the hands of the masses but I think it will work out very well indeed. Sure, there might be some superfluous tags but they will be vastly outweighed by the valuable additions. The proportion will be at least which, let’s face it, is a lot better than 0/0. That’s something I’ve learned personally from opening up my own photos to be tagged by anyone: any inconvenience with deleting “bad” tags is massively outweighed by the benefits of all the valuable tags that my pictures have accrued. If you haven’t yet opened up your photos to tagging by any Flickr user, I strongly suggest you do so.

Now set aside some time to browse the cornucopia of . And if at any stage you feel compelled to annotate a picture with some appropriate tags, go for it.

I really hope that other institutions will see the value in this project. This could be just the start of a whole new chapter in collaborative culture.

Machine Tags of Loving Grace

One of the highlights of Refresh Edinburgh for me was listening to Dan Champion give a presentation on his new site, Revish. He talked through the motivation, planning and production of the site. This was an absolute joy to listen to and it was filled with very valuable practical advice.

Revish is a book review site with a heavy dollop of social interaction. Even in its not-quite-finished state, it’s pushing all the right buttons with me:

  • The markup is clean, semantic and valid.
  • The layout is uncluttered and flexible.
  • The URL structure is logical.
  • The data is available through microformats, RSS and an API.

There’s some really smart stuff going on with the sign-up process. If your chosen username matches a Flickr username, it automatically grabs the buddy icon. At the sign-up stage you also have the option of globally disabling any Ajax on the site—an accessibility option that I advocate in my book. Truth be told, there isn’t yet any Ajax on the site but the availability of this option shows a lot of forethought.

Also at the sign-up stage, there’s a quick’n’dirty auto-discovery of contacts wherever there’s overlap with Revish usernames and your Flickr contacts. This is very cool—one small step toward portable social networks.

One of the features dovetails nicely with Richard’s recent discussion about machine tags ISBNs. If you tag a picture of a book on Flickr with book:isbn=[ISBN number], that picture will then show up on the corresponding Revish page. You can see it in action on the page for Bulletproof Ajax.

Oh, and don’t worry about whether a book has any reviews on Revish yet: the site uses Amazon’s API to pull in the basic book info. As long as a book has an ISBN, it has a page on Revish. So the Revish page for a book can effectively become a mashup of Amazon details and Flickr pictures (just take a look at the page for John’s new microformats book).

I like this format for machine tagging information related to books. As pointed out in a comment on Richard’s post, this opens up the way for plenty of other tagging like book:title="[book title]" and book:author="[author name]".

I’ve started to implement this machine tag format here. If you look at my last post—which has a whole list of books—you’ll see that I’ve tagged the post with a bunch of machine tags in the book:isbn format. By making a quick call to Amazon, I can pull in some information on each book. For now I’m just displaying a small cover image with a link through to the Amazon page.

That last entry is a bit of an extreme example; I’m assuming that most of the time I’ll be just adding one book machine tag to a post at most, probably to accompany a review.

Machine tags (or triple tags) is still a relatively young idea. Most of the structures so far have been emergent, like Upcoming and Last.fm’s event tags and my own blog post machine tags. There’s now a site dedicated to standardising on some namespaces—MachineTags.org has a blog, a wiki and a mailing list. Right now, the wiki has pages for existing conventions like geo tagging and drafts for events and book tagging. This will be an interesting space to watch.

Ghost in the Machine Tags

Richard has some very nifty ideas up his sleeve for the next iteration of his site. Some of these are design-related and some are technical. He just gave a peek into the technical side of things by explaining how he’s using tags to tie content together. Not just any old tags, mind: machine tags.

You may remember that Flickr rolled out machine tags a while back. That’s their name for what’s basically tripletags; tags that take the form of namespace:predicate=value. There’s some tight integration between Upcoming and Flickr using the machine tag upcoming:event=[ID]. You can see a looser coupling (one way rather than bi-directional) in the recently-updated events section of Last.fm which uses lastfm:event=[ID]. As an example, take a look at the page for a Low Lows concert I went to and took pictures of.

Richard is making use of machine tagging to associate his Flickr pictures with his blog posts. He’s also planning to use Amazon’s API to associate ISBN numbers with blog posts, raising the question of which namespace to use:

We therefore need a triple-tag version of the ISBN tag, and here’s my suggestion: iso:isbn=0713998393. ISBN is a standard recognised by the International Organisation for Standardization (ISO) so I thought it made a certain sense for ISO to be the namespace. Other standardised entities could be tagged in a similar way, such as iso:issn=15340295.

Seems like a sound idea to me. I might experiment with machine tagging reviews here in that way and then pulling in complementary information from Amazon.

But that’s for another day. For now, I’ve gone ahead and integrated Flickr machine tagging here… but this works from the opposite direction. Instead of tagging my blog posts with flickr:photo=[ID], I’m pulling in any photos on Flickr tagged with adactio:post=[ID].

Now, I’ve already been integrating Flickr pictures with my blog posts using regular “human” tags, but this is a bit different. For a start, to see the associations using the regular tags, you need to click a link (then the Hijax-y goodness takes over and shows any of my tagged photos without a page refresh). Also, this searches specifically for any of my photos that share a tag with my blog post. If I were to run a search on everyone’s photos, the amount of false positives would get really high. That’s not a bug; it’s a feature of the gloriously emergent nature of human tagging.

For the machine tagging, I can be a bit more confident. If a picture is tagged with adactio:post=1245, I can be pretty confident that it should be associated with http://adactio.com/journal/1245. If any matches are found, thumbnails of the photos are shown right after the blog post: no click required.

I’m not restricting the search to just my photos, either. Any photos tagged with adactio:post=[ID] will show up on http://adactio.com/journal/[ID]. In a way, I’m enabling comments on all my posts. But instead of text comments, anyone now has the ability to add photos that they think are related to a blog post of mine. Remember, it doesn’t even need to be your Flickr picture that you’re machine tagging: you can also machine tag photos from your contacts or anyone else who is allowing their pictures to be tagged.

I realise that I’m opening myself up for a whole new kind of spam. But any kind of spam that requires namespaced tagging on a third-party site is pretty dedicated. If someone actually goes to that much effort to put a thumbnail of an inappropriate image at the end of one of my blog posts, I probably wrote something particularly inflammatory in that post—which would make the associated thumbnail a valid comment, I guess.

Here are some examples of posts I’ve been machine tagging on Flickr:

Once again, like Upcoming and Last.fm, these are event-based. But the machine tagging would work equally well for location-based posts. So when I go up to Scotland next week and blog about it, I (or you or anybody) can then go to Flickr, find some nice pictures of Edinburgh and using the adactio namespace, associate the pictures with the blog post.

It’s a strange mixture of RESTful URLs here and taggable objects there.

If nothing else, this will be an interesting experiment. Machine tags don’t have the low barrier to entry of regular tagging but they aren’t as complex as something like RDF. It might be that they hit the sweet spot between accuracy and ease of use.

Oh, and if you find any Flickr pictures related to this blog post, tag them with adactio:post=1274.

Museum piece

I had a thoroughly enjoyable time at the Semantic Web Think Tank this week. Most of the other attendees were there representing museums and—geek that I am—I found it thrilling to be able to chat with people from such venerable institutions as the V&A Museum, the Science Museum and the Natural History Museum.

Apart from one focused effort to explain microformats, my contributions were pretty rambling affairs. Still, I thought I’d try to put together a list of things I mentioned while they’re still fresh in my mind.

There was a lot of talk about tagging and folksonomies, including a great demo of the user-contributed content at the Powerhouse Museum in Sydney. I was chatting with Glenda later on and she pointed me to the steve.museum project, which is described as the first experiment in social tagging of art museum collections. That sounds like exactly what we were talking about at the think tank.

There was also plenty of talk about (uppercase) Semantic Web technologies. Museums have to deal with classifications far beyond the reach of microformats—with the glaring exception of events, which are crying out to be marked up in hCalendar. I don’t envy them the task of classifying and publishing all their data, but I remain convinced that user contributions (a la Wikipedia) can help enormously.

Pictorial Ajaxitagging

I talked a while back about how I was attempting to add some extra context to my posts by pulling in corresponding tag results from Del.icio.us and Technorati, and then displaying them together through the magic of Ajax.

It struck me that there was another tag space that I had completely forgotten about: Flickr. Now at the end of any post that’s been tagged, you’ll find links entreating you to pull in any of my Flickr pics that have been likewise tagged.

This is all possible thanks to a single method of Flickr’s API. I’m reusing the same method to search for other pictures too…

A had a little epiphany in the pub the other night, chatting after the WSG meetup. I was talking about geotagging and I mentioned that it probably won’t be too long before just about every file will be geotagged in the same way that just about every file already has a time stamp. Then I realised, “hey, all my blog posts have time stamps and so do all my Flickr pics!”

So I added an extra link. You can search for any pictures of mine that were taken on the same day as a journal entry. I like the extra context that provides.

While I was testing this new functionality, I couldn’t figure out why some pictures weren’t being pulled in. Looking at the post from the Opera event written on Tuesday, I expected to be able to view the pictures I took on the same night. They weren’t showing up and I couldn’t understand why not. I assumed I was doing something wrong in the code. As it turned out, the problem was with my camera. I never reset the date and time when I came back from Australia, so all the pictures I’ve taken in the last couple of weeks have been off by a few hours.

Keep your camera’s clock updated, kids. It’s valuable metadata.

Hmmm… I guess I should take a picture today to illustrate the new functionality. In the meantime, check out this older post from BarCamp to see the Ajaxitagging in action.


Ever since I switched over to a new CMS back in February, I’ve been tagging all my journal entries. Until now, I haven’t been doing anything with those tags apart from exposing them in category elements in my RSS feed. Now that I’ve got a good head of steam going with my tags, I’ve decided to play around with them a bit.

Each journal entry page now shows the tags at the end of the post. These are linked (using rel-tag of course) to an aggregate tag page that shows any other posts with the same tag. Pretty standard stuff.

But then I thought it would be fun to tie the post in with other things I’ve tagged, not on this site but on Del.icio.us. Under the heading “Related”, you’ll find links to the same tags for my del.icio.us links.

Rather then sending you off to Del.icio.us, I’m using the Del.icio.us API to bring the results back to this site. Using a bit of Ajax, these results are displayed without a page refresh. I’m using Hijax so if JavaScript is disabled, the links will still work.

I’ve got a nice little progress bar going while the request is being sent, and a bit of a colour fade happening when the response comes back. The results themselves could probably do with some more styling. Right now I’m just displaying them in a regular unordered list of x-folk entries but I think they might look nice if they were more comment-like in appearance.

After the Del.icio.us links, I’ve got the same tags pointing off to Technorati. Again, instead of sending you away, I’m pulling in the results with the Technorati API. In some ways, these results are more interesting than the del.icio.us links because, instead of just showing things that I have tagged, this shows results from everywhere. The results are constantly changing. Right now I’m using the search query, but I must look into the experimental tag query.

I’m also using the Technorati API to find any blogs that are linking to the current post. This works like Trackback. If you want to respond to a post I’ve written, just blog about it. As long as you include a link back to the post, your entry will now show up in the results. It won’t be instantaneous, but if your blogging software is set up to ping Technorati when you post, it should show up pretty fast. I’d be interested in finding out just how long it takes for the API to reflect recent pings. If you blog about this post (with a link), try coming back to it and using the Technorati link to see how long your post takes to show up.

The Technorati API isn’t the most full-featured and sometimes it just seems to not respond. The Del.icio.us API allows me to do quite a bit with my own links, but doesn’t offer any access to other peoples’. Still, by combining the two with the tags for any particular journal entry, an interesting picture emerges.

I have some other ideas for making individual journal entry pages more interesting. None of them involve the addition of buttons that invite the reader to add the page to Digg, Newsvine, Del.icio.us, Reddit, Furl, Magnolia, Blinklist, or any other others I may be forgetting.

Mashing up with microformats

Back in March, during South by Southwest, Tantek asked me if I’d like to sit in on his microformats panel alongside Chris Messina and Norm! The audio recording of the panel is now available through the conference podcast.

I’ve taken the liberty of having the recording transcribed (using castingWords.com) and I’ve posted a tidied up version of the transcript to the articles section: Microformats: Evolving the Web. You can listen along through the articles RSS feed which doubles up as a podcast.

I’ve also posted the transcript on the microformats wiki so that others can edit it if they catch any glaring mistakes in the transcription.

During the panel I talked about Adactio Austin, a fairly trivial use of microformats but one that I’ve been building upon. I’d like to provide some cut’n’paste JavaScript that would allow people to get some added value from using microformats. Supposing you have a bunch of locations marked up in hCard with geotags, you could drop in a script and have a map appear showing those locations.

Perhaps the geotagging won’t even be necessary. Google added a geocoder to their mapping API two weeks ago. The UK, alas, is not yet supported (probably because the Post Office won’t let go of its monopoly that easily… Postman Pat, your money-grabbing days are numbered).

Unfortunately, Google Maps isn’t very suited to the cut’n’paste idea: you have to register a different API key for each domain where you want to use the mapping API.

The Yahoo maps API is less draconian about registration but its lack of detailed UK maps makes it a non-starter for me.

Maybe I should step away from maps and concentrate on events instead. It probably wouldn’t be too hard too write a script to create a calendar based on any hCalendar data found in a document. Perhaps I’ll investigate the calendar widget from Yahoo.

Ultimately I’d like to create something like Chris’s Mapendar idea. If only there were enough hours in the day.

Upcoming webolution

At the risk of becoming API-watch Central, I feel I must point out some nifty new features that have been added to Upcoming.org.

Andy and the gang have been diligently geotagging events using Yahoo’s geocoder API. Best of all, these latitude and longitude co-ordinates are now also being exposed through the API. Methinks Adactio Austin won’t be the last mashing up of event and map data I’ll be doing.

On the Upcoming site itself, you can now limit the number of attendees for an event, edit any venues you’ve added and edit your comments. This comes just a few days after Brian Suda mentioned in a chat that he would like to have the option to edit this comment later (right now he’s looking for somewhere to stay during XTech).

Feature wished for; feature added. This is exactly the kind of iterative, evolutionary growth that goes a long way towards what Kathy Sierra calls creating passionate users. By all accounts, her panel at South by Southwest was nothing short of outstanding. Everyone I spoke to who attended was raving about it for days. Muggins here missed it but I have a good excuse. I was busy signing freshly-purchased books, so I can’t complain.