Tags: parsing



Thursday, September 14th, 2017

Deploying ES2015+ Code in Production Today — Philip Walton

The reality is transpiling and including polyfills is quickly becoming the new norm. What’s unfortunate is this means billions of users are getting trillions of bytes sent over the wire unnecessarily to browsers that would have been perfectly capable of running the untranspiled code natively.

Phil has a solution: serve up your modern JavaScript using script type="module" and put your transpiled fallback in script nomodule.

Most developers think of <script type="module"> as way to load ES modules (and of course this is true), but <script type="module"> also has a more immediate and practical use-case—loading regular JavaScript files with ES2015+ features and knowing the browser can handle it!

Tuesday, August 22nd, 2017

Inside a super fast CSS engine: Quantum CSS (aka Stylo) ★ Mozilla Hacks – the Web developer blog

Lin gives a deep dive into Firefox’s new CSS engine specifically, but this is also an excellent primer on how browsers handle CSS in general: parsing, styling, layout, painting, compositing, and rendering.

Tuesday, April 18th, 2017

PhD Thesis: Cascading Style Sheets

Håkon wrote his doctoral thesis on CSS …which is kinda like Einstein writing a thesis on relativity. There’s some fascinating historical insight into the creation of the standards we use today.

Sunday, April 16th, 2017


The Internet Archive is now hosting early Macintosh software emulated right in your browser. That means you can play Adventure: the source of subsequent text adventures, natural language parsing, and chatbots.

Colossal Cave Adventure (also known as ADVENT, Colossal Cave, or Adventure) is a text adventure game, developed originally in 1976, by Will Crowther for the PDP-10 mainframe. The game was expanded upon in 1977, with help from Don Woods, and other programmers created variations on the game and ports to other systems in the following years.

In the game, the player controls a character through simple text commands to explore a cave rumored to be filled with wealth.

Monday, February 6th, 2017


Here’s a nice little service from Remy that works sorta like Readability. Pass it a URL in a query string and it will generate a version without all the cruft around the content.

Tuesday, January 3rd, 2017

Does Google execute JavaScript? | Stephan Boyer

Google may or may not decide to run your JavaScript, and you don’t want your business to depend on its particular inclination of the day. Do server-side/universal/isomorphic rendering just to be safe.

Saturday, October 4th, 2014

JS Parse and Execution Time - TimKadlec.com

Tim’s been running the numbers on how long it takes various browsers on various devices to parse JavaScript—in this case, jQuery. The time varies enormously depending on the device hardware.

Wednesday, September 25th, 2013

The ghost of browsers past

Even before a line of code was written for the line-mode browser simulator when we gathered together at CERN, there was a gleeful period of digital spelunking.

Brian goes browsing Demonstration data sources

We poked at the markup of the first ever website

  • What’s that NEXTID element? Turns it out it’s something specific to the NeXT operating system.
  • Why does the first iteration of HTML already contain H1 through to H6? It’s because they were lifted wholesale from a flavour of SGMLStandard Generalized Markup Language—that was already in use at CERN.

Oh, and Brian asked Robert Cailliau why they went with the term World Wide Web. “Well,” he said, “we had to call it something. And we thought we could always change it later.”

Then there was the story of the line-mode browser. It was created by Nicola Pellow, who was a student at CERN in 1990. She later worked on the Mac browser but her involvement with kickstarting the world wide web ended around 1993. She never showed up to any of the reunions.

We poked around in the (surprisingly short) source code of the line-mode browser. We found the lines that described how elements should be styled—the term “style sheet” appeared in a comment!

Proto-stylesheet Parsing the parser

If you’ve fired up the line-mode browser simulator and run some websites through it, you’ll probably see occasions where a whole bunch of JavaScript—nestled between script tags in the head of the document—gets rendered to the screen.


We could’ve hidden that JavaScript, but we made a deliberate decision to display it. That’s what the line-mode browser would have done. The script element didn’t exist back then. Heck, JavaScript didn’t exist back then. So browsers would have handled the unknown element in the standard HTML way: ignore the opening and closing tags and just render what’s in-between them. That’s still the error-handling model for unrecognised elements in HTML.

This is why we used to write our JavaScript like this:

<script language="JavaScript" type="text/javascript">
(JavaScript goes here)

The HTML comments stopped the JavaScript from being rendered to the screen in older browsers (like the line-mode browser). Using the opening HTML comment <!-- is functionally equivalent to // single-line comments in JavaScript …although you still need to prefix the closing --> comment with a //.

I remember doing this when I first started making websites in the 90s. You can see it if you view source on the first version of this website.

Later on, we all switched to XHTML so we updated the syntax to make it valid XML.

<script type="text/javascript">
(JavaScript goes here)

The <![CDATA[ part stops an XML parser from trying to parse the JavaScript. But HTML parsers would choke on that because it starts with an angle bracket. Hence the JavaScript-style // comment.

Anyway, we don’t bother with HTML or XHTML comments at the start of our script blocks anymore. And that’s why the line-mode browser simulator renders the JavaScript to the screen.

Note that the JavaScript isn’t executed. That’s thanks to a clever little hack by Remy: the line-mode browser simulator changes the type attribute of every script element to text/plain, effectively defusing them. Smart!

Sunday, September 15th, 2013

Parsing webmentions

Thanks to everyone who helped me test webmentions that I hacked together at Indie Web Camp last weekend.

Let me explain what web mentions are all about…

Basically, it’s an equivalent to pingback. Let’s say I write something here on adactio.com. Suppose that prompts you to write something in response on your own site. A web mention is a way for you to let me know that your response exists.

If you look in the head of any of my journal posts, you’ll see this link element:

<link rel="webmention" href="http://adactio.com/webmention.php" />

That’s my web mention endpoint: http://adactio.com/webmention.php …it’s kind of like a webhook: a URL that’s intended to be hit by machines rather than people. So when you publish your response to my post, you ping that URL with a POST request that sends two parameters:

  1. target: the URL of my post and
  2. source: the URL of your response.

Ideally your own CMS or blogging system would take care of doing the pinging, but until that’s more widely implemented, I’m providing this form at the end of each of my posts:

Either way, once you ping my web mention endpoint—discoverable through that link rel="webmention"—with those two parameters, I just need to confirm that your post does indeed contain a link to my post—by making a cURL request and parsing your source—and then I return a server response of 202 (Accepted).

Here’s the code for a minimum viable web mention in PHP.

That’s as far as I got at Indie Web Camp but it was enough for me to start collecting responses to posts.

Webmentions as links

The next step is to do something with the responses. After all, I’ve already got the source of each response from those cURL requests.

Barnaby has a written a nice straightforward microformats parser in PHP. I’m using that to check the cURLed source for any responses that have been marked up using h-entry. That’s one of the microformats 2 vocabularies—a much simpler way of writing structured content with microformats.

Aaron, Amber, and Barnaby all sent responses that were marked up with h-entry so now their responses appear in full.

Webmentions as comments

So there you have it. Comments are now open on every journal post on adactio.com …the only catch is that you have to write the comment on your own site. And if you want the content of your post to appear here (instead of just a link) then update your blog post template to include a handful of h-entry classes.

Feel free to use this post as a test. Mark up your blog with h-entry, write a post that links to this URL, and enter the URL of your post in the form below.

Monday, June 10th, 2013

Request Quest

A terrific quiz about browser performance from Jake. I had the pleasure of watching him present this in a bar in Amsterdam—he was like a circus carny hoodwinking the assembled geeks.

I guarantee you won’t get all of this right, and that’s a good thing: you’ll learn something. If you do get them all right, either you are Jake or you are very, very sad.

Thursday, June 6th, 2013

Deep dive into the murky waters of script loading

Jake casts a scrutinising eye over the way that browsers load and parse scripts …and looks at what we can do about it.

Sunday, July 8th, 2012


Here’s a brainbuster for ya: a single file that renders both as HTML and as a JPEG. As an HTML page, it even contains an img element with a src of …itself!

Compare the “view source” output with the generated source output to see it’s being interpreted.

Wednesday, August 17th, 2011

HTML5 Rocks - How Browsers Work: Behind the Scenes of Modern Web Browsers

Insanely in-depth look at how browsers work, right down to the nitty gritty. You’d think there’d be a lot of engineering talk, but actually a lot of it is more about linguistics and language parsing.

Monday, August 9th, 2010

Surfin’ Safari - Blog Archive » The HTML5 Parsing Algorithm

The latest Webkit nightly includes the HTML5 parsing algorithm. Now it's a race between Firefox, Safari and Chrome to see which will be first (non-beta) browser to ship with the new parser.

Friday, May 14th, 2010

Notes on HTML5 Parser History — Anne’s Weblog

Anne explains exactly why the HTML parser defined in HTML5 is A Very Good Thing for everyone.

Tuesday, May 11th, 2010

Firefox 4: the HTML5 parser – inline SVG, speed and more ✩ Mozilla Hacks – the Web developer blog

Henri Sivonen gives the lowdown on the HTML5 parser that will ship with the next version of Firefox. This is a huge development ...and yet users won't even notice it (by design).

Saturday, August 9th, 2008

Hueniverse: Beginner’s Guide to Discovery – Part II: People vs. Machines

A great explanation of how open technologies like microformats and OpenID enable greater discovery of data.

Thursday, February 21st, 2008

XFN encoding, extraction, and visualizations - Opera Developer Community

Brian has written an excellent article that not only explains how to write XFN but also how to parse it.

Wednesday, October 24th, 2007

Main Page - AskWiki

A natural language interface onto Wikipedia. More of this kind of thing, please.