Tags: whatwg

Wednesday, July 26th, 2017

Bruce Lawson’s personal site: Eulogy for Flash

Web developers aren’t going to shed many tears for Flash, but as Bruce rightly points out, it led the way for many standards that followed. Flash was the kick up the arse that the web needed.

He also brings up this very important question:

I’m also nervous; one of the central tenets of HTML is to be backwards-compatible and not to break the web. It would be a huge loss if millions of Flash movies become unplayable. How can we preserve this part of our digital heritage?

This is true of the extinction of any format. Perhaps this is an opportunity for us to tackle this problem head on.

Friday, October 31st, 2014

Hitler reacts to the HTML5 URL normative reference controversy

This is hilarious …for about two dozen people.

For everyone else, it’s as opaque as the rest of the standardisation process.

Tuesday, January 8th, 2013

Interview with Ian Hickson, HTML editor on HTML5 Doctor

Bruce sits down for a chat with Hixie. This is a good insight into the past and present process behind HTML.

Friday, December 14th, 2012

Main element - WHATWG Wiki

Tantek has put together a wiki page to document the arguments for and against adding a new “main” element to HTML.

Friday, May 18th, 2012

Shallow Thoughts » srcset vs. picture

A well thought-out evaluation on responsive images from Bridget.

Wednesday, May 16th, 2012

Secret src

There’s been quite a brouhaha over the past couple of days around the subject of standardising responsive images. There are two different matters here: the process and the technical details. I’d like to address both of them.

Ill communication

First of all, there’s a number of very smart developers who feel that they’ve been sidelined by the WHATWG. Tim has put together a timeline of what happened:

  1. Developers got involved in trying to standardize a solution to a common and important problem.
  2. The WHATWG told them to move the discussion to a community group.
  3. The discussion was moved (back in February), a general consensus (not unanimous, but a majority) was reached about the picture element.
  4. Another (partial) solution was proposed directly on the WHATWG list by an Apple employee.
  5. A discussion ensued regarding the two methods, where they overlapped, and how the general opinions of each. The majority of developers favored the picture element and the majority of implementors favored the srcset attribute.
  6. While the discussion was still taking place, and only 5 days after it was originally proposed, the srcset attribute (but not the picture element) was added to the draft.

A few points in that timeline have since been clarified. That second step—“The WHATWG told them to move the discussion to a community group”—turns out to be untrue. Some random person on the WHATWG mailing list (which is open to everyone) suggested forming a Community Group at the W3C. Alas, nobody else on the WHATWG mailing list corrected that suggestion.

Then there’s the apparent causality between steps 4 and 6. Initially, I also assumed that this was what happened: that Ted had proposed the srcset solution without even being aware of the picture solution that the Community Group had independently come up with. It turns out that’s not the case. Ted had another email about the picture proposal but he never ended up sending it. In fact, his email about srcset had been sitting in draft for quite a while and he only sent it out when he saw that Hixie was finally collating feedback on responsive images.

So from the outside it looked like there was preferential treatment being given to Ted’s proposal because it came from within the WHATWG. That’s not the case, but it must be said: the fact that srcset was so quickly added to the spec (albeit in a different form) doesn’t look good. It’s easy to understand why the smart folks in the Responsive Images Community Group felt miffed.

But let’s be clear: this is exactly how the WHATWG is supposed to work. Use-cases are evaluated and whatever Hixie thinks is the best solution gets put in the spec, regardless of how popular or unpopular it is.

Now, if that sounds abhorrent to you, I completely understand. A dictatorship should cause us to recoil.

That’s where the W3C come in. Their model is completely different. Everything is done by committee there.

Steve Faulkner chimed in on Tim’s post with his take on the two groups:

It seems like the development of HTML has turned full circle, the WHATWG was formed to overthrow the hegemony of the W3C, now the W3C acts as a counter to the hegemony of the WHATWG.

I think he’s right. The W3C keeps the rapid, sometimes anarchic approach of the WHATWG in check. But the opposite is also true. Without the impetus provided by the WHATWG, I’m not sure that the W3C HTML Working Group would ever get anything done. There’s a balance that actually works quite well in practice.

Back to the situation with responsive images…

Unfortunately, it appears to people within the Responsive Images Community Group that all their effort was wasted because their proposed solution was summarily rejected. In actuality all the use-cases they gathered were immensely valuable. But it’s certainly true that the WHATWG didn’t make it clear how and where developers could best contribute.

Community Groups are a W3C creation. They don’t have anything to do with the WHATWG, who do all their work on their own mailing list, their own wiki and their own IRC channel.

I do think that the W3C Community Groups offer a good place to go bike-shedding on problems. That’s a term that’s usually used derisively but sometimes it’s good to have a good ol’ bike-shedding without clogging up the mailing list for everyone. But it needs to be clear that there’s a big difference between a Community Group and a Working Group.

I wish the WHATWG had done a better job of communicating to newcomers how best to contribute. It would have avoided a lot of the frustrations articulated by Wilto:

Unfortunately, we were laboring under the impression that Community Groups shared a deeper inherent connection with the standards bodies than it actually does.

But in any case, as Doctor Bruce writes, at least now there’s a proposed solution for responsive images in HTML: The Living Standard:

I don’t really care which syntax makes the spec, as long as it addresses the majority of use cases and it is usable by authors. I’m just glad we’re discussing the adaptive image problem at all.

So let’s take a look at the technical details.

src code

The Responsive Images Community Group came up with a proposal based off the idea of minting a new element, called, say, picture, that mimics the behaviour of video:

<picture alt="image description">
  <source src="/path/to/image.png" media="(min-width: 600px)">
  <source src="/path/to/otherimage.png" media="(min-width: 800px)">
  <img src="/path/to/image.png" alt="image description">
</picture>

One of the reasons why a new element was chosen rather than extending the existing img element was due to a misunderstanding. The WHATWG had explained that the parsing of img couldn’t be easily altered. That means that img must remain a self-closing element—any solution that requires a closing /img tag wouldn’t work. Alas, that was taken to mean that extending the img element in any way was off the cards.

The picture proposal has a number of things going for it. Its syntax is easily understandable for authors: if you know media queries, then you know how to use picture. It also has a good fallback for older browsers: a regular img element. This fallback mechanism (and the idea of multiple source elements with media queries) is exactly how the video element is specced.

Unfortunately using media queries on the sources of videos has proven to be very tricky for implementors, so they don’t want to see that pattern repeated.

Another issue with multiple source elements is that parsers must wait until the closing /picture tag before they can even begin to evaluate which image to show. That’s not good for performance.

So the alternate solution, based on Ted’s proposal, extends the img element using a new srcset attribute that takes a comma-separated list of values:

<img alt="image description"
src="/path/to/fallbackimage.png"
srcset="/path/to/image.png 800w, /path/to/otherimage.png 600w">

Not nearly as pretty, I think you’ll agree. But it is actually nice and compact for the “retina display” use-case:

<img alt="image description" src="/path/to/image.png" srcset="/path/to/otherimage.png 2x">

Just to be clear, that does not mean that otherimage.png is twice the size of image.png (though it could be). What you’re actually declaring is “Use image.png unless the device supports double-pixel density, in which case, use otherimage.png.”

Likewise, when I declare:

srcset="/path/to/image.png 600w 400h"

…it does not mean that image.png is 600 pixels wide by 400 pixels tall. Instead, it means that an action should be taken if the viewport matches those dimensions.

It took me a while to wrap my head around that distinction: I’m used to attributes describing the element they’re attached to, not the viewport.

Now for the really tricky bit: what do those numbers—600w and 400h—mean? Currently the spec is giving conflicting information.

Each image that’s listed in the srcset comma-separated list can have up to three values associated with it: w, h, and x. The x is pretty clear: that’s the pixel density of the device. The w and h values refer to the width and height of the viewport …but it’s not clear if they mean min-width/height or max-width/height.
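To make that structure a bit more concrete, here’s a rough sketch of how a srcset value could be broken down into its entries. This isn’t the parsing algorithm from the spec: the SrcsetCandidate type and the parseSrcset function are names I’ve made up for illustration, and the sketch assumes that image URLs contain no commas or spaces.

```typescript
// Hypothetical model of one entry in a srcset list. The field names are
// illustrative; they are not taken from the spec.
interface SrcsetCandidate {
  url: string;
  w?: number; // viewport width descriptor, e.g. "600w"
  h?: number; // viewport height descriptor, e.g. "400h"
  x?: number; // pixel density descriptor, e.g. "2x"
}

// Split a srcset value on commas, then split each entry on whitespace into
// a URL followed by zero or more descriptors.
// Simplifying assumption: URLs contain no commas or spaces.
function parseSrcset(srcset: string): SrcsetCandidate[] {
  return srcset
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry): SrcsetCandidate => {
      const parts = entry.split(/\s+/);
      const candidate: SrcsetCandidate = { url: parts[0] ?? "" };
      for (const descriptor of parts.slice(1)) {
        // A descriptor is a number immediately followed by w, h, or x.
        const match = /^(\d+(?:\.\d+)?)([whx])$/.exec(descriptor);
        if (match === null) continue; // skip anything unrecognised
        const unit = match[2] as "w" | "h" | "x";
        candidate[unit] = Number(match[1]);
      }
      return candidate;
    });
}

// parseSrcset("/path/to/image.png 600w 400h")
// returns [{ url: "/path/to/image.png", w: 600, h: 400 }]
```

Note that nothing here says what the numbers mean. The parser only records them; deciding which image wins is a separate step.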

If I’m taking a “Mobile First” approach to development, then srcset will meet my needs if w and h refer to min-width and min-height.

In this example, I’ll just use w to keep things simple:

<img src="small.png" srcset="medium.png 600w, large.png 800w">

(Expected behaviour: use small.png unless the viewport is wider than 600 pixels, in which case use medium.png unless the viewport is wider than 800 pixels, in which case use large.png).

If, on the other hand, w and h refer to max-width and max-height, I have to take a “Desktop First” approach:

<img src="large.png" srcset="medium.png 800w, small.png 600w">

(Expected behaviour: use large.png unless the viewport is narrower than 800 pixels, in which case use medium.png unless the viewport is narrower than 600 pixels, in which case use small.png).
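Here’s a small sketch of those two readings side by side. It’s a model of the expected behaviours I’ve just described, not the selection algorithm from the spec. The function names (pickAsMinWidth, pickAsMaxWidth) and the WidthCandidate type are invented for this example, and I’m assuming the boundaries are inclusive, as they are with min-width and max-width in media queries.

```typescript
// One srcset entry, reduced to a URL and a viewport-width descriptor.
// Hypothetical type, for illustration only.
interface WidthCandidate {
  url: string;
  w: number;
}

// "Mobile First" reading: w behaves like min-width. Use the candidate with
// the largest w that the viewport has reached; otherwise the fallback src.
function pickAsMinWidth(
  fallback: string,
  candidates: WidthCandidate[],
  viewportWidth: number
): string {
  let best: WidthCandidate | undefined;
  for (const candidate of candidates) {
    if (viewportWidth >= candidate.w && (best === undefined || candidate.w > best.w)) {
      best = candidate;
    }
  }
  return best === undefined ? fallback : best.url;
}

// "Desktop First" reading: w behaves like max-width. Use the candidate with
// the smallest w that the viewport still fits within; otherwise the fallback.
function pickAsMaxWidth(
  fallback: string,
  candidates: WidthCandidate[],
  viewportWidth: number
): string {
  let best: WidthCandidate | undefined;
  for (const candidate of candidates) {
    if (viewportWidth <= candidate.w && (best === undefined || candidate.w < best.w)) {
      best = candidate;
    }
  }
  return best === undefined ? fallback : best.url;
}

// The two examples from the text, as data.
const mobileFirst: WidthCandidate[] = [
  { url: "medium.png", w: 600 },
  { url: "large.png", w: 800 },
];
const desktopFirst: WidthCandidate[] = [
  { url: "medium.png", w: 800 },
  { url: "small.png", w: 600 },
];

// At a viewport width of 700 pixels, both readings settle on medium.png:
// pickAsMinWidth("small.png", mobileFirst, 700)  returns "medium.png"
// pickAsMaxWidth("large.png", desktopFirst, 700) returns "medium.png"
```

The two functions are mirror images of one another, which is rather the point: the markup I write depends entirely on which of them the browser ends up running.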

One of the advantages of media queries is that, because they support both min- and max- width, they can be used in either use-case: “Mobile First” or “Desktop First”.

Because the srcset syntax will support either min- or max- width (but not both), it will therefore favour one case at the expense of the other.

Both use-cases are valid. Personally, I happen to use the “Mobile First” approach, but that doesn’t mean that other developers shouldn’t be able to take a “Desktop First” approach if they want. By the same logic, I don’t much like the idea of srcset forcing me to take a “Desktop First” approach.

My only alternative, if I want to take a “Mobile First” approach, is to duplicate image paths and declare ludicrous breakpoints:

<img src="small.png" srcset="small.png 600w, medium.png 800w, large.png 99999w">

I hope that this part of the spec offers a way out:

for the purposes of this requirement, omitted width descriptors and height descriptors are considered to have the value “Infinity”

I think that means I should be able to write this:

<img src="small.png" srcset="small.png 600w, medium.png 800w, large.png">
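If my reading of that sentence is right, the behaviour would be something like this sketch. It assumes that w is treated as a max-width and that an entry with no width descriptor counts as infinitely wide. The pickWithInfinityDefault function and the OptionalWidthCandidate type are hypothetical names, not anything defined in the spec.

```typescript
// A srcset entry whose width descriptor may be omitted. Hypothetical type.
interface OptionalWidthCandidate {
  url: string;
  w?: number;
}

// Assumed behaviour: w acts as a max-width, and an omitted w counts as
// Infinity. Returns the URL of the entry with the smallest w that the
// viewport still fits within, or undefined if no entry qualifies.
function pickWithInfinityDefault(
  candidates: OptionalWidthCandidate[],
  viewportWidth: number
): string | undefined {
  let best: { url: string; w: number } | undefined;
  for (const candidate of candidates) {
    const w = candidate.w ?? Infinity;
    if (viewportWidth <= w && (best === undefined || w < best.w)) {
      best = { url: candidate.url, w };
    }
  }
  return best?.url;
}

// The srcset value from the example above, as data.
const withOmittedWidth: OptionalWidthCandidate[] = [
  { url: "small.png", w: 600 },
  { url: "medium.png", w: 800 },
  { url: "large.png" }, // no descriptor: treated as Infinity
];

// pickWithInfinityDefault(withOmittedWidth, 320)  returns "small.png"
// pickWithInfinityDefault(withOmittedWidth, 700)  returns "medium.png"
// pickWithInfinityDefault(withOmittedWidth, 1440) returns "large.png"
```

Under those assumptions the bare large.png entry does the same job as the ludicrous 99999w breakpoint, without the magic number.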

It’s all quite confusing and srcset doesn’t have anything approaching the extensibility of media queries, but I hope we can get it to work somehow.

Wednesday, October 26th, 2011

Aryeh Gregor on being an editor and the W3C process — Anne’s Blog

This encapsulates the difference between the WHATWG and the W3C: a concern for interoperability matched against a concern for procedure.

Friday, July 8th, 2011

[whatwg] The blockquote element spec vs common quoting practices

I’m getting behind Oli’s proposal to allow non-quoted footers within blockquotes in HTML. Here’s where I quote the design principles to support his case.

Monday, February 21st, 2011

HTML5 — Edition for Web Developers

A beautifully readable subset of the HTML spec, with an emphasis on writing web apps (and with information intended for browser makers removed). Very handy indeed!

Tuesday, January 25th, 2011

Three questions

Craig Grannell from .Net magazine got in touch to ask me a few short questions about last week’s events around HTML5. I thought I’d share my answers here rather than wait for the tortuously long print release cycle.

What are your thoughts on the logo?

The logo is nice. Looks pretty sharp to me.

Why were you unhappy with W3C’s original stance (“general purpose visual identity”)? What do you think now they’ve changed this?

I was unhappy with the W3C’s original definition of HTML5 in the logo’s accompanying FAQ, where they lumped CSS, SVG and WOFF under the “HTML5” banner. I’m happy they changed that.

What’s your thinking on the current state of the HTML5 situation, given that WHATWG is dropping the 5 and just going with HTML?

I think the current situation makes things much clearer. The WHATWG are working on a continuous, iterative document called simply HTML. The W3C use that as a starting point for nailing down an official specification which will be the fifth official iteration of the HTML language called, sensibly enough, HTML5.

The WHATWG spec is the place to look for what’s new and evolving. The W3C spec, once it goes into Last Call, is the place to look for the official milestone that is HTML5. In practice, the two specs will be pretty much identical for quite a while yet.

But the truth is that authors shouldn’t be looking at specs to decide what to use. Look at what browsers support in order to decide if you should use a particular feature; look at the spec to understand how to use features of HTML5.

For authors, it probably makes more sense to talk about HTML rather than HTML5. Remember that most of HTML5 is the same as HTML 4.01, HTML 3.2, etc. A question like “When can I use HTML5?” is a lot easier to answer if you rephrase it as “When can I use HTML?”

Most of the time, it makes a lot more sense to talk about specific features rather than referring to an entire specification. For example, asking “Does this browser support HTML5?” is fairly pointless, but asking “Does this browser support canvas?” is much more sensible.
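Asking that more sensible question can even be done in code. Here’s a minimal sketch of the usual feature-detection trick for canvas: create the element and see whether it has a getContext method. The supportsCanvas function and the DocumentLike type are names invented for this example; the document is passed in as a parameter so that the sketch also runs outside a browser.

```typescript
// The one piece of the document object that this sketch relies on.
// Hypothetical type, so the function can be exercised without a real DOM.
interface DocumentLike {
  createElement(tagName: string): object;
}

// Browsers that implement canvas expose a getContext method on the element.
// Browsers that don't will hand back a generic element without one.
function supportsCanvas(doc: DocumentLike): boolean {
  const element = doc.createElement("canvas") as { getContext?: unknown };
  return typeof element.getContext === "function";
}
```

In a browser, that would be called as supportsCanvas(document). There’s no equivalent one-liner for “Does this browser support HTML5?”, which says a lot about which of the two questions is the useful one.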

Friday, January 21st, 2011

Clarity

Two good things have happened.

WHATWG

Firstly, as I hoped, the WHATWG have updated the name of their work to simply be HTML. This is something they tried to do a year ago, and I kicked up a stink. I was wrong. Having a version number attached to an always-evolving standard just doesn’t make sense. As Hixie put it:

…the technology is not versioned and instead we just have a living document that defines the technology as it evolves.

This change means that the whole confusing 2022 business that was misunderstood by so many people is now history:

Now that we’ve moved to a more incremental model without macro-level milestones, the 2022 date is no longer relevant.

The spec is currently labeled as a “Living Standard”. Personally, I find the “Living Standard” strapline a bit cheesy. I’d much rather the document title was simply “HTML” followed by the date of the last update.

Note that this change only applies to the WHATWG. The W3C will continue to pursue the “snapshot” model of development so the spec there definitely retains the number 5 and will follow the usual path of Working Draft, Last Call, Proposed Recommendation, and so on.

I think this difference makes it clearer what each group is doing. It was a pretty confusing situation to have two groups working on two specs, both called HTML5. Now it’s clear that the WHATWG is working more like how browsers do: constantly evolving and implementing features rather than entire specifications. Meanwhile the W3C are working on having a development milestone of those features set in stone and labelled as the fifth revision to the HTML language …and I think that is also an important and worthy goal.

W3C

The second piece of good news is that the W3C have backtracked on their “embrace and extend” attitude towards buzzwordism. When they launched the new HTML5 logo a few days ago, the W3C Communications Team initially said that HTML5 included CSS, SVG and WOFF‽ As Tantek said:

Fire the W3C Communications person that led this new messaging around HTML5 because it is one of the worst messages (if not the worst) about a technology to ever come out of W3C.

Following the unsurprising outbreak of confusion and disappointment that this falsehood caused, the W3C have now backtracked. HTML5 means HTML5. The updated FAQ makes it very clear that CSS3 is not part of HTML5.

Hallelujah!

I really wasn’t looking forward to starting every HTML5 workshop, presentation or article with the words, “Despite what the W3C says, HTML5 is not a meaningless buzzword…” Now I can safely say that the term “HTML5” refers to a technical specification, published at the W3C, called HTML5. Also, I can use that nice logo with a clear conscience.

Over time though, I’ll probably follow the WHATWG’s lead and simply talk about “what’s new in HTML.” As Remy points out, there are pedagogical advantages to untethering version numbers:

I don’t have to answer “Is HTML ready to be used?” ‘cos that’s a really daft question!

Wednesday, January 19th, 2011

The WHATWG Blog — HTML is the new HTML5

The spec previously known as HTML5 is now simply HTML. Good. The W3C are now free to abuse the term “HTML5” to mean everything under the sun.

Tuesday, January 18th, 2011

Bye, bye 5

One year ago, I objected strenuously when the WHATWG temporarily changed the name of their spec from “HTML5” to plain ol’ “HTML”:

Accurate as that designation may be, I became very concerned about the potential confusion it would cause.

I understand why the WHATWG need to transition from using the term HTML5 to simply using the term HTML to describe their all-encompassing ongoing work, but flipping that switch too soon could cause a lot of pain and confusion.

Now that the term “HTML5” has become completely meaningless, even according to the W3C, I think it’s time to rip off the bandaid and flip that switch.

I was wrong. Hixie was right. The spec should be called HTML.

If you need an all-encompassing term for every front-end technology under the sun, go ahead and use the term “HTML5.” Although personally, I quite like “the Web.”

Update: The WHATWG have duly updated the name of the spec.

Sunday, January 16th, 2011

A Brief History of Markup

The first chapter of HTML5 For Web Designers, originally published in issue 305 of A List Apart.

HTML is the unifying language of the World Wide Web. Using just the simple tags it contains, the human race has created an astoundingly diverse network of hyperlinked documents, from Amazon, eBay, and Wikipedia, to personal blogs and websites dedicated to cats that look like Hitler.

HTML5 is the latest iteration of this lingua franca. While it is the most ambitious change to our common tongue, this isn’t the first time that HTML has been updated. The language has been evolving from the start.

As with the web itself, the HyperText Markup Language was the brainchild of Sir Tim Berners-Lee. In 1991 he wrote a document called “HTML Tags” in which he proposed fewer than two dozen elements that could be used for writing web pages.

Sir Tim didn’t come up with the idea of using tags consisting of words between angle brackets; those kinds of tags already existed in the SGML (Standard Generalized Markup Language) format. Rather than inventing a new standard, Sir Tim saw the benefit of building on top of what already existed—a trend that can still be seen in the development of HTML5.

From IETF to W3C: The road to HTML 4

There was never any such thing as HTML 1. The first official specification was HTML 2.0, published by the IETF, the Internet Engineering Task Force. Many of the features in this specification were driven by existing implementations. For example, the market-leading Mosaic web browser of 1994 already provided a way for authors to embed images in their documents using an <img> tag. The img element later appeared in the HTML 2.0 specification.

The role of the IETF was superseded by the W3C, the World Wide Web Consortium, where subsequent iterations of the HTML standard have been published at www.w3.org. The latter half of the nineties saw a flurry of revisions to the specification until HTML 4.01 was published in 1999.

At that time, HTML faced its first major turning point.

XHTML 1: HTML as XML

After HTML 4.01, the next revision to the language was called XHTML 1.0. The X stood for “eXtreme” and web developers were required to cross their arms in an X shape when speaking the letter.

No, not really. The X stood for “eXtensible” and arm crossing was entirely optional.

The content of the XHTML 1.0 specification was identical to that of HTML 4.01. No new elements or attributes were added. The only difference was in the syntax of the language. Whereas HTML allowed authors plenty of freedom in how they wrote their elements and attributes, XHTML required authors to follow the rules of XML, a stricter markup language upon which the W3C was basing most of their technologies.

Having stricter rules wasn’t such a bad thing. It encouraged authors to use a single writing style. Whereas previously tags and attributes could be written in uppercase, lowercase, or any combination thereof, a valid XHTML 1.0 document required all tags and attributes to be lowercase.

The publication of XHTML 1.0 coincided with the rise of browser support for CSS. As web designers embraced the emergence of web standards, led by The Web Standards Project, the stricter syntax of XHTML was viewed as a “best practice” way of writing markup.

Then the W3C published XHTML 1.1.

While XHTML 1.0 was simply HTML reformulated as XML, XHTML 1.1 was real, honest-to-goodness XML. That meant it couldn’t be served with a mime-type of text/html. But if authors published a document with an XML mime-type, then the most popular web browser in the world at the time—Internet Explorer—couldn’t render the document.

It seemed as if the W3C were losing touch with the day-to-day reality of publishing on the web.

XHTML 2: Oh, we’re not gonna take it!

If Dustin Hoffman’s character in The Graduate had been a web designer, the W3C would have said one word to him, just one word: XML.

As far as the W3C was concerned, HTML was finished as of version 4. They began working on XHTML 2, designed to lead the web to a bright new XML-based future.

Although the name XHTML 2 sounded very similar to XHTML 1, they couldn’t have been more different. Unlike XHTML 1, XHTML 2 wasn’t going to be backwards compatible with existing web content or even previous versions of HTML. Instead, it was going to be a pure language, unburdened by the sloppy history of previous specifications.

It was a disaster.

The schism: WHATWG TF?

A rebellion formed within the W3C. The consortium seemed to be formulating theoretically pure standards unrelated to the needs of web designers. Representatives from Opera, Apple, and Mozilla were unhappy with this direction. They wanted to see more emphasis placed on formats that allowed the creation of web applications.

Things came to a head in a workshop meeting in 2004. Ian Hickson, who was working for Opera Software at the time, proposed the idea of extending HTML to allow the creation of web applications. The proposal was rejected.

The disaffected rebels formed their own group: the Web Hypertext Application Technology Working Group, or WHATWG for short.

From Web Apps 1.0 to HTML5

From the start, the WHATWG operated quite differently than the W3C. The W3C uses a consensus-based approach: issues are raised, discussed, and voted on. At the WHATWG, issues are also raised and discussed, but the final decision on what goes into a specification rests with the editor. The editor is Ian Hickson.

On the face of it, the W3C process sounds more democratic and fair. In practice, politics and internal bickering can bog down progress. At the WHATWG, where anyone is free to contribute but the editor has the last word, things move at a faster pace. But the editor doesn’t quite have absolute power: an invitation-only steering committee can impeach him in the unlikely event of a Strangelove scenario.

Initially, the bulk of the work at the WHATWG was split into two specifications: Web Forms 2.0 and Web Apps 1.0. Both specifications were intended to extend HTML. Over time, they were merged into a single specification called simply HTML5.

Reunification

While HTML5 was being developed at the WHATWG, the W3C continued working on XHTML 2. It would be inaccurate to say that it was going nowhere fast. It was going nowhere very, very slowly.

In October 2006, Sir Tim Berners-Lee wrote a blog post in which he admitted that the attempt to move the web from HTML to XML just wasn’t working. A few months later, the W3C issued a new charter for an HTML Working Group. Rather than start from scratch, they wisely decided that the work of the WHATWG should be used as the basis for any future version of HTML.

All of this stopping and starting led to a somewhat confusing situation. The W3C was simultaneously working on two different, incompatible types of markup: XHTML 2 and HTML 5 (note the space before the number five). Meanwhile a separate organization, the WHATWG, was working on a specification called HTML5 (with no space) that would be used as a basis for one of the W3C specifications!

Any web designers trying to make sense of this situation would have had an easier time deciphering a movie marathon of Memento, Primer, and the complete works of David Lynch.

XHTML is dead: long live XHTML syntax

The fog of confusion began to clear in 2009. The W3C announced that the charter for XHTML 2 would not be renewed. The format had been as good as dead for several years; this announcement was little more than a death certificate.

Strangely, rather than passing unnoticed, the death of XHTML 2 was greeted with some mean-spirited gloating. XML naysayers used the announcement as an opportunity to deride anyone who had ever used XHTML 1—despite the fact that XHTML 1 and XHTML 2 have almost nothing in common.

Meanwhile, authors who had been writing XHTML 1 in order to enforce a stricter writing style became worried that HTML5 would herald a return to sloppy markup.

As you’ll soon see, that’s not necessarily the case. HTML5 is as sloppy or as strict as you want to make it.

The timeline of HTML5

The current state of HTML5 isn’t as confusing as it once was, but it still isn’t straightforward.

There are two groups working on HTML5. The WHATWG is creating an HTML5 specification using its process of “commit then review.” The W3C HTML Working Group is taking that specification and putting it through its process of “review then commit.” As you can imagine, it’s an uneasy alliance. Still, there seems to finally be some consensus about that pesky “space or no space?” question (it’s HTML5 with no space, just in case you were interested).

Perhaps the most confusing issue for web designers dipping their toes into the waters of HTML5 is getting an answer to the question, “when will it be ready?”

In an interview, Ian Hickson mentioned 2022 as the year he expected HTML5 to become a proposed recommendation. What followed was a wave of public outrage from some web designers. They didn’t understand what “proposed recommendation” meant, but they knew they didn’t have enough fingers to count off the years until 2022.

The outrage was unwarranted. In this case, reaching a status of “proposed recommendation” requires two complete implementations of HTML5. Considering the scope of the specification, this date is incredibly ambitious. After all, browsers don’t have the best track record of implementing existing standards. It took Internet Explorer over a decade just to add support for the abbr element.

The date that really matters for HTML5 is 2012. That’s when the specification is due to become a “candidate recommendation.” That’s standards-speak for “done and dusted.”

But even that date isn’t particularly relevant to web designers. What really matters is when browsers start supporting features. We began using parts of CSS 2.1 as soon as browsers started shipping with support for those parts. If we had waited for every browser to completely support CSS 2.1 before we started using any of it, we would still be waiting.

It’s no different with HTML5. There won’t be a single point in time at which we can declare that the language is ready to use. Instead, we can start using parts of the specification as web browsers support those features.

Remember, HTML5 isn’t a completely new language created from scratch. It’s an evolutionary rather than revolutionary change in the ongoing story of markup. If you are currently creating websites with any version of HTML, you’re already using HTML5.

Wednesday, December 8th, 2010

Bobbie Johnson dot org : Ian Hickson on HTML5: “The W3C lost sight of the fact that they have no power”

Bobbie is publishing the interviews he conducted with various HTML5 bods when he was researching his Technology Review article. First up: Hixie.

Monday, October 18th, 2010

Position Paper for the W3C Workshop on Web Applications and Compound Documents

Here's a little piece of web history: the proposal that was presented and rejected at the 2004 W3C workshop that led to the formation of the WHATWG.

Friday, April 16th, 2010

Timed tracks - WHATWG Wiki

Hixie needs your help. Document examples of augmented video (or audio) such as captioned or subtitled media.

Wednesday, January 13th, 2010

HTML5 business as usual

It’s been a strange week in HTML5. The web—and Twitter in particular—has been awash with wailing and gnashing of teeth as various people weigh in with their opinions on either the W3C or the WHATWG—depending on which camp they’re in—being irreversibly broken …exactly the kind of ludicrous over-reaction at which the internet excels.

This particular round of chicken-littling was caused by the shuffling of some spec components. The W3C HTML Working Group recently decided to split microdata into a separate specification (which I think is fair enough given RDFa’s similar status). Hixie then removed some other parts of HTML5; a move which was seen as a somewhat petulant reaction to the microdata splittage. Cue outfreakage. Before too long, most of the changes were rolled back.

So all of the shouting and arguing was more about politics and procedure than about features or semantics. That’s par for the course when it comes to the HTML Working Group at the W3C; the technical discussions are outweighed by the political and procedural wranglings. But that’s the nature of the beast. Hammering out a standard is hard. Building consensus is really hard. The chairs of the working group face an uphill struggle every single day. Still, that’s a far cry from declaring the whole thing a waste of time.

As Tantek points out, if the HTML5 shenanigans seem particularly crazy, that’s only because they are that much more public than most other processes:

The previous several revisions of HTML (including XHTML) were largely developed in W3C Members-Only mailing lists (and face-to-face meetings) which contained a lot of similar “corporate politics, egotism, squabbles and petty disagreements” - however such tussles were invisible to search engines, the general public, and of course all the professional web developers and designers (like yourself) - you never saw how the sausage was made as it were.

Tantek was responding to a post by Malarkey who advises us to keep calm and carry on. That’s sensible advice, although he gets some push-back in the comments from people concerned about a market-led approach to web standards, wherein we only care about what browsers are implementing, not what’s enshrined in a standard.

It’s easy to polarise this issue into a black and white dichotomy: implementation first vs. specifications first. The truth, as always, is much more nuanced than that, as beautifully summed up by Rob O’Callahan:

Implementations and specifications have to do a delicate dance together. You don’t want implementations to happen before the specification is finished, because people start depending on the details of implementations and that constrains the specification. However, you also don’t want the specification to be finished before there are implementations and author experience with those implementations, because you need the feedback. There is unavoidable tension here, but we just have to muddle on through … I think we’re doing OK.

I think we’re doing OK too.

Not that I’m immune to HTML5-related temper loss. Most recently, I was miffed with the WHATWG rather than the W3C but once again, it was entirely to do with specification organisation rather than specification contents.

The WHATWG have never been comfortable with the term HTML5 to describe the work they’re doing, which began life as Web Apps 1.0. The very idea of version numbers is anathema to their philosophy so they’re quite happy for the W3C to own the term HTML5 to describe a particular set-in-stone markup spec. But they still need a word to describe the monolithic ongoing WHATWG spec.

Historically, the term HTML5 was a pretty good fit for the WHATWG spec and it corresponded exactly with the W3C spec (in fact, the W3C spec is generated from the WHATWG spec). But Hixie declared Last Call for the WHATWG HTML5 spec a while back. That means that the specification at the WHATWG and the specification at the W3C can now diverge—the WHATWG spec contains everything in HTML5 and then some. To continue to label this WHATWG spec as simply HTML5 would be misleading. So a few weeks ago, the name of the spec changed from HTML5 to WHATWG HTML (including HTML5).

Accurate as that designation may be, I became very concerned about the potential confusion it would cause. Any front-end developer reading a document titled WHATWG HTML (including HTML5) might reasonably ask Oh, which bits are HTML5? …a question to which there’s no easy answer because at the WHATWG, the term HTML5 is seen as little more than a buzzword. In that sense, they share PPK’s assertion that HTML5 means whatever you want it to mean.

I was particularly concerned that the short URL http://whatwg.org/html5 would redirect to a document that wasn’t called HTML5. I could point developers at this diagram but I’m not sure that it would make things any clearer.

Things got fairly heated in the IRC channel as I argued for either a different redirect or a better document title. I understand why the WHATWG need to transition from using the term HTML5 to simply using the term HTML to describe their all-encompassing ongoing work, but flipping that switch too soon could cause a lot of pain and confusion. A gradual evolution of titles reflecting the evolution of the contents seems better to me:

  1. HTML5
  2. HTML5 (including next generation additions still in development)
  3. WHATWG HTML (including HTML5)
  4. WHATWG HTML

Hixie made the change. The title of the WHATWG specification is currently at step 2. I think this will make things a lot clearer for authors.

  • Anyone looking for the specification that will become a W3C candidate recommendation called HTML5 should look at the W3C site: http://dev.w3.org/html5/spec/.
  • Anyone looking for the ongoing evolving specification that HTML5 is a part of should look at the WHATWG site: http://whatwg.org/html5.

I’m happy that this has been cleared up and yet I hope that smart, savvy front-end developers who have read this far will think that I’ve just wasted their time. That would be a healthy reaction to reading a bunch of irrelevant guff about what specifications are called and how they are organised. Your time would be far better spent implementing the specifications and providing feedback.

That’s certainly a far better use of your time than simply shouting FAIL!

Update: And, right on cue, Mark Pilgrim updates the WHATWG blog to explain the spec name change.

Sunday, December 13th, 2009

HTML5 watch

Keeping up with HTML5 can seem like a full-time job if you’re subscribed to both the W3C public-html list and the WHATWG mailing list.

If you have to choose just one, the WHATWG list is definitely the red pill. The W3C list has a very high volume of traffic, mostly about politics and procedure. Sam Ruby deserves a medal for keeping the whole thing on an even keel.

The WHATWG list, on the other hand, can get pretty nitty-gritty in its discussions of Web Workers, Offline Storage and other technologies that are completely over my head.

The specification itself is shaping up nicely. My list of bugbears is getting shorter and shorter:

  1. I’m still not convinced that the article element is necessary, given that it is almost indistinguishable from section. Having two very similar elements is potentially very confusing for authors. It’s hard enough deciding the difference between a section and a div.
  2. The time element is still unnecessarily restrictive. I don’t just mean that it’s restrictive in the sense that you can’t mark up a month; the very definition is too narrow. I hoisted the HTML5 spec by its own petard recently, pointing out that a different portion of the spec violates the definition of time.
  3. The cite element is also too restrictively defined, and in a backwards-incompatible way to boot. I’ve written more about that over on 24 Ways.

There are much bigger issues than these still outstanding—mostly related to the accessibility of audio, video and canvas—but I’ll leave it to smarter people than me to tackle those. My issues all revolve around semantics and, let’s face it, they’re kind of piddling little problems in the grand scheme of things.

On the whole, I’d say the spec is looking mighty fine. Most of it is ready for use today.

I think the next big challenge for HTML5 lies with the tools. It’s great that we’ve got a validator but what we really need is a lint tool, something like JSlint but for checking markup writing style: case sensitivity of tags, quotes around attributes, that kind of thing. Robert Nyman concurs.

Let me be clear: I’m not talking about a validator that checks for polyglot documents, i.e. HTML that can also be parsed as XML. I’m talking purely about writing style and personal preference; a tool that will help enforce an in-house style guide of arbitrary “best” practices.

I’ve impressed this upon Henri in IRC on a few occasions. He has explained to me that it’s not so easy to build a true syntax checker …and no, you can’t just use regular expressions.

Still, I think that there would be enormous value in having even an imperfect tool to help authors who want to write HTML5 right now but also want to enforce a strict syntax on themselves. A working rough’n’ready lint tool that catches 80% of the most common gotchas is better than a theoretical perfect tool that will work 100% of the time but that currently works 0% of the time because it doesn’t exist yet.
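To give a rough idea of what I mean by rough’n’ready, here is a minimal sketch. It assumes Python and its standard html.parser module, and it only looks at start tags: it flags tag names that aren’t lowercase and attribute values that aren’t quoted. The names MarkupStyleLinter, unquoted_attributes and lint are made up for this illustration, and the two rules are just one possible house style. It leans on an existing parser to find the tags and then walks each one by hand, so it is nowhere near a true syntax checker.

```python
from html.parser import HTMLParser


class MarkupStyleLinter(HTMLParser):
    """Hypothetical style checker: flags style choices, not validity errors."""

    def __init__(self):
        super().__init__()
        self.warnings = []  # (line number, message) pairs

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()  # the start tag exactly as written
        line, _ = self.getpos()
        # The parser passes a lowercased tag name; compare it with the source.
        written = raw[1:1 + len(tag)]
        if written != tag:
            self.warnings.append((line, f"<{written}> should be lowercase"))
        for name in unquoted_attributes(raw, 1 + len(tag)):
            self.warnings.append((line, f"attribute {name} needs quotes"))


def unquoted_attributes(raw, start):
    """Walk one start tag by hand and return the names of unquoted attributes."""
    found = []
    i, end = start, len(raw)
    while i < end:
        # Skip whitespace and the "/" of a self-closing tag.
        while i < end and (raw[i].isspace() or raw[i] == "/"):
            i += 1
        if i >= end or raw[i] == ">":
            break
        # Read the attribute name.
        name_start = i
        while i < end and not raw[i].isspace() and raw[i] not in "=/>":
            i += 1
        name = raw[name_start:i]
        while i < end and raw[i].isspace():
            i += 1
        if i >= end or raw[i] != "=":
            continue  # attribute without a value, e.g. disabled
        i += 1
        while i < end and raw[i].isspace():
            i += 1
        if i < end and raw[i] in "\"'":
            # Quoted value: jump past the matching closing quote.
            close = raw.find(raw[i], i + 1)
            i = end if close == -1 else close + 1
        else:
            # Unquoted value: record it and skip to the next gap.
            found.append(name)
            while i < end and not raw[i].isspace() and raw[i] != ">":
                i += 1
    return found


def lint(markup):
    """Return a list of (line, message) style warnings for a markup string."""
    linter = MarkupStyleLinter()
    linter.feed(markup)
    linter.close()
    return linter.warnings
```

Calling lint() on a string of markup returns a list of line numbers and messages. End tags, attribute name casing and everything else a real house style would care about are left out, which is rather the point: imperfect, but it exists.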

Anybody want to step up to the challenge?

Tuesday, September 15th, 2009

The devil in the details

Looking through the list of hiccups highlighted by the HTML5 Super Friends and my own personal tally, things are progressing at a nice clip with HTML5.

That still leaves a few issues:

  • The confusion between section and article that I’ve been researching.
  • The restrictive content model of the small element not matching that element’s updated semantics.
  • The time element not allowing month or year dates i.e. YYYY-MM or YYYY.

Then there’s the issue with details and figure. The insistence on recycling the legend element leads to all sorts of problems with browsers today, as described by Remy.

This has been an ongoing discussion in the HTML5 IRC channel and on the HTML5 mailing lists. It flared up again recently and I fired off an email to the HTML Working Group yesterday:

I understand the aversion to introducing a new element … but I don’t understand why legend is being treated as the only possible existing element to recycle.

For example, dt and dd are being recycled in the new context of dialog so they no longer mean “definition title” and “definition description”. Now they can also mean (presumably) “dialog title” and “dialog description”.

If those elements are already being recycled, why not apply the same thinking to details so that dt and dd could also mean “details title” and “details description”?

To be honest, I was just spouting that out without really thinking. How about… something like—not this, obviously, not this, but what if…

Ok… Not This…

So imagine my surprise when Hixie responded:

That’s not a bad idea actually. Ok, done.

Wow. Okay.

While I was at it, I also did this for figure

Alright. Makes sense, I suppose …although the names of the elements dt and dd aren’t quite as intuitive in the context of figure as they are in details.

and removed dialog from the spec altogether.

Er …okay. Seems somewhat unrelated to the issue at hand but I guess that the dialog element has been a point of contention. It’s mentioned by the HTML5 Super Friends so that’s another one that we can tick off the list.

So how should we mark up conversations now? Here’s what the spec now suggests:

Authors who need to mark the speaker for styling purposes are encouraged to use span or b.

Um… what? The b element? Really? You have got to be kidding.

Excuse me while I channel Carrie’s mother, but I can predict exactly how authors are going to react to this advice: They’re all going to laugh at you!

@gcarothers: Jaw, floor, WHAT?! http://bit.ly/48Bhta Why in the world would #html5 suggest using <b> tags to markup names?

@akamike: There are more appropriate tags than <b> for marking up names in conversations. “The b element should be used as a last resort…” #html5

@cssquirrel: Well, if the <p><b> recommendations for dialog in #html5 persist for a week, I know what I’m drawing.

Perhaps now would be a good time to mention that the cite element is also listed by the HTML5 Super Friends. Call me crazy but I think it might just be the right element for marking up the person being cited.

<p> <cite>Costello</cite>: <q>Who's playing first?</q> </p>
<p> <cite>Abbott</cite>: <q>That's right.</q> </p>