HTML5 and me

I can never pinpoint the exact moment at which I “get into” a particular technology. CSS, DOM Scripting, microformats …there was never any Damascene conversion to any of them. Instead, I’d just notice one day, after gradually using the technology more and more, that I was immersed in it.

That’s how I feel about HTML5 now.

There’s another feeling that accompanies this realisation. I remember feeling it about CSS in the late 90s and about DOM Scripting half a decade ago. At the same time as I look up from my immersion, I cast a glance around the web development landscape and ask: why aren’t more people paying attention to this?

In the case of HTML5, this puzzling state of affairs can, to a large extent, be explained by the toxic 2022 meme. Working web developers with an idle interest in HTML5 would google the term, find a blog post telling them that it won’t be “ready” until 2022, and then happily return to their work, comforted by the knowledge that HTML5 was some distant dream on the horizon, one that doesn’t affect them in any way today.

Nothing could be further from the truth. The Last Call Working Draft status is (optimistically) planned for October; that’s one month away.

And what rough beast, its hour come round at last, slouches towards Bethlehem to be born?

If you want to have a say in the formation of the most important web standard in existence, don’t put off getting involved. As Bruce says, “If you don’t vote, you can’t bitch.”

Still, I think the attitude of most web developers towards HTML5 right now is, at the very least, “interested, if a little sceptical”—that’s certainly how I felt when I started dabbling in it.

A little while back, I got together with some of my interested (if a little sceptical) colleagues in New York, thanks to a generous invitation from Zeldman.

Dan Cederholm, Jeremy Keith, Eric Meyer, Ethan Marcotte, Tantek Çelik, Nicole Sullivan, Wendy Chisholm

After a fairly intense two days of poring over the spec, I think it’s fair to say that, on balance, the interest increased and the scepticism decreased. That’s not to say that everything looks rosy in the current incarnation of HTML5. When you’ve got some of the smartest front-end web developers I know of in the same room together and they all agree that some parts of the spec are confusing or downright wrong, that’s quite worrying.

On the plus side, most of the issues are pretty minor in the grand scheme of things. It’s fair to say that most of the stuff that interests web authors—the semantic side of things—only accounts for a small part of HTML5. Most of the HTML5 specification is about error handling, APIs and shiny new interactive content. There are plenty of programmers and browser makers forging those powerful new tools. But as qualified as they are to hammer out those complex constructs, they are not necessarily the most qualified to make decisions on creating new structural elements. For that, you need the input of authors. And authors have been decidedly slow to get involved with HTML5.

It’s time for authors to get involved. I believe our voices will be welcomed. According to the HTML design principles:

…consider users over authors over implementors over specifiers over theoretical purity.

I’ll get the ball rolling with my own little list of things that are troubling me…

small

I’m with Bruce and Remy. If the small element is being redefined for disclaimers, caveats, legal restrictions, or copyrights, it needs to handle how that kind of content is published in the wild. That means it needs to be able to wrap paragraphs, lists and other flow content.

Alternatively, it should go the way of its evil twin, the big element, and simply be deprecated …sorry, I mean obsolete and non-conforming.

time

I’ll join in the chorus of people who think that the restrictions on the information that the new time element can contain are unnecessarily draconian. You can encode a date and time, you can encode a date, but you can’t encode just a month and a year. So you can’t make a piece of information like “April 1912” machine-readable. The spec says the time element:

…is intended as a way to encode modern dates and times in a machine-readable way

Which is great. But the sentence doesn’t finish there. It goes on:

so that user agents can offer to add them to the user’s calendar.

That’s one use case! I don’t think it’s wise to rain on the parade of anyone wanting to build, say, timeline mashups. Trying to mandate use cases ahead of time is not just counter-productive, it’s probably impossible. Can you imagine if Flickr had launched their API with strict instructions that it could only be used for one particular purpose?
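To make the restriction concrete, here is a rough sketch of the three cases described above. The datetime values and surrounding sentences are made up for illustration, and the comments reflect the draft as characterised in this post, not any later revision.

```html
<!-- A date and a time together: allowed -->
<p>Posted at <time datetime="2009-08-31T04:50:00Z">4:50am on August 31st, 2009</time>.</p>

<!-- A date on its own: allowed -->
<p>It happened on <time datetime="1912-04-15">April 15th, 1912</time>.</p>

<!-- Just a month and a year: not allowed by the draft discussed here -->
<p>It happened in <time datetime="1912-04">April 1912</time>.</p>
```

The third case is exactly the kind of value a timeline mashup would want to read.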

figure

I have nothing against the figure element itself, although it does seem uncomfortably close to aside, but the insistence on recycling the legend element to handle the caption is problematic.

Don’t get me wrong: I’m all for re-using existing elements rather than creating new ones, and I know that Hixie looked at all the options. But the way that browsers currently treat the legend element makes it unusable outside of a form.

I think that the label element could work instead.
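For comparison, here is roughly what the two options look like in markup. The file name, alt text and caption are placeholders, and the second form is only the suggestion made above, not something the spec defines.

```html
<!-- The draft as discussed here: legend supplies the caption -->
<figure>
  <img src="chart.png" alt="Bar chart of the most common class names">
  <legend>Class names found in the crawl</legend>
</figure>

<!-- The suggestion above (hypothetical): label supplies the caption -->
<figure>
  <img src="chart.png" alt="Bar chart of the most common class names">
  <label>Class names found in the crawl</label>
</figure>
```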

details

Just like figure, the details element reuses legend. In this case, label won’t do the trick. details is an interactive element and it doesn’t look like the label element can be made keyboard accessible.

In this case, as undesirable as it is, a new element may be called for.

article

I’ve got two issues with the article element.

  1. Firstly, its definition sounds awfully similar to section. I’m not convinced that there need to be two different elements. Having two elements that look like a duck, walk like a duck and quack like a duck is just going to lead to confusion amongst authors wondering which duck to use.

  2. The article element, unlike the section element, can take an optional pubdate attribute to encode the publication date. I’m all in favour of having this information be machine-readable, but the pubdate attribute smells like dark data, subject to metacrap rottage. In most cases, the publication date will be repeated in the content of the article anyway, so I’m in favour of adding a flag there rather than duplicating data. A Boolean pubdate attribute on a time element within an article header or footer should do the trick.

    Update: Belay that last gripe, ensign. As proof of just how fast this spec moves, less than 24 hours after I published this, Hixie has implemented what I was suggesting.
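A minimal sketch of the pattern described in the second point, assuming the Boolean pubdate attribute on time works as proposed. The heading text and date are placeholders.

```html
<article>
  <header>
    <h1>An example post</h1>
    <!-- The visible date doubles as the machine-readable publication date -->
    <p>Published on <time datetime="2009-08-31" pubdate>August 31st, 2009</time></p>
  </header>
  <p>The body of the post goes here.</p>
</article>
```

The date lives in one place, in the visible content, so there is no hidden copy to drift out of sync.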

Speaking of footer, this one is the biggie…

footer

There is a big disconnect between what the HTML5 spec calls a footer and what authors on the web call a footer.

According to the spec, you’re only supposed to put some kinds of content inside a footer:

Flow content, but with no heading content descendants, no sectioning content descendants, and no header or footer element descendants.

That means no nav or headings in footer. The way that the footer element is defined in the spec, it’s a slightly more expanded version of address.
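Here is a rough illustration of the difference, using made-up content. The first footer stays within the content model quoted above; the second is the kind of footer authors actually build, and its heading and nav fall outside that content model.

```html
<!-- Fits the quoted content model: flow content only -->
<footer>
  <p>Posted by Jane Example. <small>Some rights reserved.</small></p>
</footer>

<!-- What authors tend to call a footer: heading and nav are not allowed here -->
<footer>
  <h3>Elsewhere</h3>
  <nav>
    <ul>
      <li><a href="/about">About</a></li>
      <li><a href="/contact">Contact</a></li>
    </ul>
  </nav>
</footer>
```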

Ah, address! One of the most problematic elements in HTML 4. It is often incorrectly used to mark up street addresses. But is it any wonder? When an element has a name address, it’s hardly surprising that authors are going to use it for marking up addresses. The same thing is going to happen with footer.

The term “footer” was not invented for HTML5. It’s been in use on the web for years and in print for even longer. But if you ask any author to define what they mean by the term “footer”, you’ll get a very different definition to the one in the HTML5 spec. They may even point to specific examples of footers on sites like Flickr or on blogs, where they contain headings and navigation.

To be fair, when the new structural elements were being forged back in 2005, what Derek Powazek termed fat footers were much less prevalent. So when Hixie ran his analytics on a shitload of web pages crawled by Google and found that “footer” was by far the most common class name, most footer content was pretty meagre. But usage changes (see also: time).

The way that the element named footer is defined in HTML5—to be used multiple times in a single document in sections and articles as well as at the document level—is very different from the convention named footer in common usage on the web today. Most of the instances of what authors call a footer are more like what the HTML5 spec defines as aside.

I don’t want to spend the next decade telling authors not to mark up their footers as footers. It was bad enough telling people not to mark up addresses as addresses. In any case, authors aren’t going to listen. If they see there’s an element called footer, they will assume it refers to the device known as a footer, and mark up their content accordingly. At that point, the HTML5 spec will have become a work of fiction instead of documenting what’s actually on the web.

One of two things needs to happen. Either:

  1. The content model of footer is updated to match that of header, which is much more liberal in what it accepts, or:
  2. The name of the element currently called footer should be changed to match the current, restrictive definition. I suggest using contentinfo, which is the name of an existing ARIA role for exactly this kind of content.

ARIA roles, by the way, are an excellent addition to HTML5. ARIA integration is a win for ARIA and a win for HTML5, in my opinion. Most of all, it’s a win for authors who now have a whole swathe of extra semantics they can sprinkle into their documents (and use as styling hooks with attribute selectors).
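As a small sketch of the styling-hook idea, the role attribute can be targeted directly with an attribute selector. The div, its content and the style rules are placeholders; contentinfo is the ARIA role mentioned above.

```html
<style>
  /* Attribute selector keyed on the ARIA role instead of a class or id */
  [role="contentinfo"] {
    font-size: 0.85em;
    border-top: 1px solid #ccc;
  }
</style>

<div role="contentinfo">
  <p>Some rights reserved.</p>
</div>
```

The same attribute carries the extra semantics and serves as the styling hook, so no additional class name is needed.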

Thus endeth my list of things I want to see fixed in HTML5. I’m leaving out the massive issue of canvas accessibility because:

  1. that’s beyond my area of expertise,
  2. smarter people than me are working on it, and
  3. I think that canvas would probably benefit from being spun off into a separate spec.

There are other little things that bother me in HTML5—hgroup smells funny, cite shouldn’t be restricted to titles of works, and I miss the rev attribute on links—but those are all personal foibles; opinions unsupported by data. I’d rather concede than argue without data.

Because, make no mistake, data is what’s needed if you want to effect change in HTML5. Despite the attempts to paint Hixie as a stubborn, opinionated dictator, he is himself a slave to data. He shows an almost robot-like ability to remove his own ego from a debate and follow where the data leads.

If you are an author of HTML documents, I strongly encourage you to get involved in the HTML5 process.

  1. Read the spec.
  2. Join the mailing list.
  3. Hang out in the IRC channel.

Like I said, most of the spec and discussion is about APIs rather than semantics, but it’s precisely because the spec isn’t directly aimed at authors that authors need to get involved.


Comments

What? I get to comment on Jeremy Keith’s blog? Holy shit I better make it a good one…

I’ve used the Doctype for a couple of sites and was a little confused when I started using the new structural elements. Mainly the footer element is strange to use and looks funny when you see it littered all over the document (as is the case for some new blogs using them to mark up the end of a post). I’d like to see the spec changed to allow for headings as you suggested and just keep the normal naming convention that we’ve been using for so long now. I really like the ability to wrap a elements around block content. Makes for some really cool interface interactivity.

Thanks for the encouragement to actually read the spec. For the most part, we authors seem to need a lot of it. There are others who’ve been usefully encouraging the same thing: http://webdesignernotebook.com/rants/rtfm/

# Posted by Simon on Monday, August 31st, 2009 at 4:50am

Have to say I was waiting to hear a bit more about it. Thanks to @zeldman now I have. Commencing to immerse myself in, 3..2..1 Thanks Jeremy

A well-rounded contribution and a mouthful-speech to the world of web designers.

Let me see if I could humbly add-up a little more - killing the readers with boredom :)

Reading the drafts and specifications for several years, from what I see, the good old HTML is apparently up to the task of modern times once and for all.

With the advent of mobile connectivity at full pace along with its own limitations and new horizons, the designers are now facing the problem of which standards and norms to stick to, if there is indeed "one" everybody is agreed on.

Inside the terminology garbage heap, one can pick from full-fledged CSS design to the "people’s standard" of table-based designs, from WAP cards to stand-alone XML’s, and other HTML sub-classes - for each and every display variety, along this path of front-end web development. Also possible to add for greater chaos are rich media streams, Flash interfaces, Java applications, widgets for gadgets here and there - and still many more! into that drop-down list.

Actually there is one, and there should be: The good-old HTML, or perhaps, a superior brand-new HTML to honor what Tim Berners-Lee had stated 16 years before: one standard to present documents on a world-wide stage. One standard HTML revised to cover all modern-day issues (such as natural support for rich media), and modern needs (such as improved semantics, enriched DOM elasticity) and spotting modernized goals (inter-connectivity of the data, possibly that of the people).

Even I, myself, trying to optimize my own coding standards, designing standards, along with the mark-up and scripting standards at a single work-frame for maximizing my efficiency and mental/financial profits since the years when it had begun - all those good days back to the dearest Veronica Gopher and Netscape. Even then, most of us would remember we had Java thumblers, dazzling mouse-over applets, even I could have watched and added movie contents - like a gigantic(!) video-clip of one song from top40, so we had to choose what sort of standards we were to follow on our overly-noisy Pentium-60/90/120 or Power Mac desktop monsters (lol - I was bestowed to have an SGI work station).

Today, although we have minor to pseudo-major incompatibilities in every area of web development (among the browsers, among the responsible people of world wide web - even including our very selves), the conflicts are no-match to what we had in past. Now, I believe all things have been said, and tried and virtually countless things have been achieved during these long, but somehow, quickly-spent years. As the industry leaders of this sector, as professional and academic researchers, the market entrepreneurs and visionaries, it seems to me that we have finally settled regarding to what we are expecting from 1. machines to do -> since performance-wise we’re over the top, 2. people on streets and companies to do -> the internet awareness is now "hand-held" :p, and 0. ourselves, the developers to do -> obeying and sticking to STRICT STANDARDS so our creativity do not tackle our -combined- performances as the hümans of this blue planet in one of its struggling times.

best regards;

hey jeremy, all this written after reading yours and no intentions to role-stealing - I swear - this was a spontaneous (del)role-stealing(/del) :)

kunter.

Thanks Jeremy. I love reading about the evolving HTML 5 spec and I have started using it in small doses.

Speaking of small, I’ve always used the small tag for fine print — copyright, disclaimers, et al — so the redefinition is somewhat in line with how I currently use the tag. I think that it still has its place, but I’m with you in that it feels like it should be a block-level container. Something akin to a div or an HTML 5 aside. In either case, I still think it has merit and I would hate to see it removed from the spec.

Footer on the other hand…confused the heck out of me at first. I was expecting a block-level element that functioned just like its twin, header. I’m hoping that the spec will redefine this tag to look and act similar to header. It makes more sense to me when used in this fashion.

Thanks again for the reading!

# Posted by Billee D. on Monday, August 31st, 2009 at 4:04pm

Nice write up, thanks. I’m looking forward to more ARIA support

i’ve had a problem with "article" as well. it seems like it should be called "Content" i.e. a video of a kid getting kicked in the nuts on YouTube is hardly an "Article". the spec is WAY too blog-centric.

# Posted by dave rupert on Monday, August 31st, 2009 at 5:43pm

Please make sure to send your comments to the list!

Interesting post; thanks for the HTML 5 update… it’s interesting to hear your opinions on the elements, from a semantic point of view.

Is there any way besides the IRC channel to get involved? I don’t mind chat, but i tend to have less time these days.

Another element that I believe needs more consideration is the aside element which, if you follow the current definition of the spec, shouldn’t really be used for what is generally considered sidebar content. If we’re moving towards every other major area of the page having its own element - header, footer, nav, section etc. - then I believe there should be some sort of containing element for sidebar content too.

I’ve also been using ARIA roles for a little while now but it never occurred to me to use attribute selectors to target them for styling. Good point!

I totally started reading this article expecting to find something that I didn’t agree with but I actually agree 100% with everything that you’ve had to say. HTML5 seems to have a small number of rather large kinks like these that need to be ironed-out pronto.

Well said. The footer thing has been bugging me since I first read the spec a couple months ago.

Many of the tags discussed in the post seem redundant. Why would the HTML 5 spec attempt to replicate things that RDFa and Dublin Core Metadata Initiative already handle? Namely standards based, machine readable, general purpose meta data. Take for example the time and footer elements. The purpose of these elements can be filled through RDFa and DC. The Web development profession (and the search geeks like me) may be better served by guidelines and best practices on using RDFa to wrap authorship, copyright, licensing, time and other general information. Considering that RDFa is still a relatively new format, and developers are still wrapping their heads around how it might or might not fit within the HTML5 framework, that guidance would be valuable.

An IRC channel? I might join then. Most Web authors like myself don’t care for being on mailing lists. Maybe this is the issue with the W3C. It needs a more modern way to get people involved. Maybe a forum or public comments area like this if they don’t already.