Tags: parsing

24

Sunday, December 3rd, 2017

Monday, November 27th, 2017

Exploring the Linguistics Behind Regular Expressions

Before reading this article, I didn’t understand regular expressions. But now, having read this article, I don’t understand regular expressions and I don’t understand linguistics. Progress!

Tuesday, October 31st, 2017

Netflix functions without client-side React, and it’s a good thing - JakeArchibald.com

A great bucketload of common sense from Jake:

Rather than copying bad examples from the history of native apps, where everything is delivered in one big lump, we should be doing a little with a little, then getting a little more and doing a little more, repeating until complete. Think about the things users are going to do when they first arrive, and deliver that. Especially consider those most-likely to arrive with empty caches.

And here’s a good way of thinking about that:

I’m a fan of progressive enhancement as it puts you in this mindset. Continually do as much as you can with what you’ve got.

All too often, saying “use the right tool for the job” is interpreted as “don’t use that tool!” but as Jake reminds us, the sign of a really good tool is its ability to adapt instead of demanding rigid usage:

Netflix uses React on the client and server, but they identified that the client-side portion wasn’t needed for the first interaction, so they leaned on what the browser can already do, and deferred client-side React. The story isn’t that they’re abandoning React, it’s that they’re able to defer it on the client until it was needed. React folks should be championing this as a feature.

Sunday, October 29th, 2017

Can You Afford It?: Real-world Web Performance Budgets – Infrequently Noted

Alex looks at the mindset and approaches you need to adopt to make a performant site. There’s some great advice in here for setting performance budgets for JavaScript.

JavaScript is the single most expensive part of any page in ways that are a function of both network capacity and device speed. For developers and decision makers with fast phones on fast networks this is a double-whammy of hidden costs.

Friday, October 27th, 2017

Transpiled for-of Loops are Bad for the Client - daverupert.com

This story is just a personal reminder for me to repeatedly question what our tools spit out. I don’t want to be the neophobe in the room but I sometimes wonder if we’re living in a collective delusion that the current toolchain is great when it’s really just morbidly complex. More JavaScript to fix JavaScript concerns the hell out of me.

Yes! Even if you’re not interested in the details of Dave’s story of JavaScript optimisation, be sure to read his conclusion.

I am responsible for the code that goes into the machine, I do not want to shirk the responsibility of what comes out. Blind faith in tools to fix our problems is a risky choice. Maybe “risky” is the wrong word, but it certainly seems that we move the cost of our compromises to the client and we, speaking from personal experience, rarely inspect the results.

Thursday, September 14th, 2017

Deploying ES2015+ Code in Production Today — Philip Walton

The reality is transpiling and including polyfills is quickly becoming the new norm. What’s unfortunate is this means billions of users are getting trillions of bytes sent over the wire unnecessarily to browsers that would have been perfectly capable of running the untranspiled code natively.

Phil has a solution: serve up your modern JavaScript using script type="module" and put your transpiled fallback in script nomodule.

Most developers think of <script type="module"> as a way to load ES modules (and of course this is true), but <script type="module"> also has a more immediate and practical use-case—loading regular JavaScript files with ES2015+ features and knowing the browser can handle it!
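To make the pattern concrete, here’s a minimal sketch in JavaScript that writes out the two script tags. The helper names (escapeAttribute, moduleNomoduleTags) and the file paths in the usage note are made up for illustration; they’re not from Phil’s article.

```javascript
// Escape characters that would break out of a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Hypothetical helper: returns the markup for the module/nomodule pair.
// Browsers that support modules load the first script and skip the second;
// older browsers ignore the unknown type="module" and run the nomodule one.
function moduleNomoduleTags(modernUrl, legacyUrl) {
  return [
    `<script type="module" src="${escapeAttribute(modernUrl)}"></script>`,
    `<script nomodule src="${escapeAttribute(legacyUrl)}"></script>`,
  ].join("\n");
}
```

Calling moduleNomoduleTags("/js/main.mjs", "/js/main.es5.js") gives you one tag for the untranspiled bundle and one for the transpiled fallback.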

Tuesday, August 22nd, 2017

Inside a super fast CSS engine: Quantum CSS (aka Stylo) ★ Mozilla Hacks – the Web developer blog

Lin gives a deep dive into Firefox’s new CSS engine specifically, but this is also an excellent primer on how browsers handle CSS in general: parsing, styling, layout, painting, compositing, and rendering.

Tuesday, April 18th, 2017

PhD Thesis: Cascading Style Sheets

Håkon wrote his doctoral thesis on CSS …which is kinda like Einstein writing a thesis on relativity. There’s some fascinating historical insight into the creation of the standards we use today.

Sunday, April 16th, 2017

Adventure

The Internet Archive is now hosting early Macintosh software emulated right in your browser. That means you can play Adventure: the source of subsequent text adventures, natural language parsing, and chatbots.

Colossal Cave Adventure (also known as ADVENT, Colossal Cave, or Adventure) is a text adventure game, developed originally in 1976, by Will Crowther for the PDP-10 mainframe. The game was expanded upon in 1977, with help from Don Woods, and other programmers created variations on the game and ports to other systems in the following years.

In the game, the player controls a character through simple text commands to explore a cave rumored to be filled with wealth.

Monday, February 6th, 2017

read.isthe.link

Here’s a nice little service from Remy that works sorta like Readability. Pass it a URL in a query string and it will generate a version without all the cruft around the content.

Tuesday, January 3rd, 2017

Does Google execute JavaScript? | Stephan Boyer

Google may or may not decide to run your JavaScript, and you don’t want your business to depend on its particular inclination of the day. Do server-side/universal/isomorphic rendering just to be safe.

Saturday, October 4th, 2014

JS Parse and Execution Time - TimKadlec.com

Tim’s been running the numbers on how long it takes various browsers on various devices to parse JavaScript—in this case, jQuery. The time varies enormously depending on the device hardware.

Wednesday, September 25th, 2013

The ghost of browsers past

Even before a line of code was written for the line-mode browser simulator when we gathered together at CERN, there was a gleeful period of digital spelunking.

We poked at the markup of the first ever website

  • What’s that NEXTID element? Turns out it’s something specific to the NeXT operating system.
  • Why does the first iteration of HTML already contain H1 through to H6? It’s because they were lifted wholesale from a flavour of SGML (Standard Generalized Markup Language) that was already in use at CERN.

Oh, and Brian asked Robert Cailliau why they went with the term World Wide Web. “Well,” he said, “we had to call it something. And we thought we could always change it later.”

Then there was the story of the line-mode browser. It was created by Nicola Pellow, who was a student at CERN in 1990. She later worked on the Mac browser but her involvement with kickstarting the world wide web ended around 1993. She never showed up to any of the reunions.

We poked around in the (surprisingly short) source code of the line-mode browser. We found the lines that described how elements should be styled—the term “style sheet” appeared in a comment!

If you’ve fired up the line-mode browser simulator and run some websites through it, you’ll probably see occasions where a whole bunch of JavaScript—nestled between script tags in the head of the document—gets rendered to the screen.

We could’ve hidden that JavaScript, but we made a deliberate decision to display it. That’s what the line-mode browser would have done. The script element didn’t exist back then. Heck, JavaScript didn’t exist back then. So browsers would have handled the unknown element in the standard HTML way: ignore the opening and closing tags and just render what’s in-between them. That’s still the error-handling model for unrecognised elements in HTML.

This is why we used to write our JavaScript like this:

<script language="JavaScript" type="text/javascript">
<!--
(JavaScript goes here)
//-->
</script>

The HTML comments stopped the JavaScript from being rendered to the screen in older browsers (like the line-mode browser). Using the opening HTML comment <!-- is functionally equivalent to // single-line comments in JavaScript …although you still need to prefix the closing --> comment with a //.

I remember doing this when I first started making websites in the 90s. You can see it if you view source on the first version of this website.

Later on, we all switched to XHTML so we updated the syntax to make it valid XML.

<script type="text/javascript">
//<![CDATA[
(JavaScript goes here)
//]]>
</script>

The <![CDATA[ part stops an XML parser from trying to parse the JavaScript. But when the same page is served as HTML, the JavaScript engine would choke on that line because it starts with an angle bracket. Hence the JavaScript-style // comment.

Anyway, we don’t bother with HTML or XHTML comments at the start of our script blocks anymore. And that’s why the line-mode browser simulator renders the JavaScript to the screen.

Note that the JavaScript isn’t executed. That’s thanks to a clever little hack by Remy: the line-mode browser simulator changes the type attribute of every script element to text/plain, effectively defusing them. Smart!
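I don’t have Remy’s code to hand, so treat the following as a rough sketch of the idea rather than what the simulator actually does. It assumes the page source is available as a string and uses a regular expression, which will trip over attribute values that contain a > character. The function name defuseScripts is made up.

```javascript
// Rewrite every opening <script ...> tag so that its type is text/plain.
// Browsers don't execute script elements with a type they don't treat as script.
function defuseScripts(html) {
  return html.replace(/<script\b([^>]*)>/gi, (match, attrs) => {
    // Drop any existing type attribute, then add the inert one.
    const withoutType = attrs.replace(
      /\s+type\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi,
      ""
    );
    return `<script type="text/plain"${withoutType}>`;
  });
}
```

The contents of the script element are left alone, so they still show up as text, but nothing runs.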

Sunday, September 15th, 2013

Parsing webmentions

Thanks to everyone who helped me test webmentions that I hacked together at Indie Web Camp last weekend.

Let me explain what web mentions are all about…

Basically, it’s an equivalent to pingback. Let’s say I write something here on adactio.com. Suppose that prompts you to write something in response on your own site. A web mention is a way for you to let me know that your response exists.

If you look in the head of any of my journal posts, you’ll see this link element:

<link rel="webmention" href="http://adactio.com/webmention.php" />

That’s my web mention endpoint: http://adactio.com/webmention.php …it’s kind of like a webhook: a URL that’s intended to be hit by machines rather than people. So when you publish your response to my post, you ping that URL with a POST request that sends two parameters:

  1. target: the URL of my post and
  2. source: the URL of your response.
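If you’re sending the ping yourself, the request might look something like this sketch. The function name buildWebmentionRequest is made up, and the form-encoded body is my assumption about how the two parameters travel. It only builds the request; you’d hand the result to fetch (or whatever HTTP client you like) to actually send it.

```javascript
// Hypothetical helper: describes the POST request for a web mention ping.
// endpoint is the URL advertised in the link rel="webmention" element,
// source is the URL of the response, target is the URL being responded to.
function buildWebmentionRequest(endpoint, source, target) {
  return {
    url: endpoint,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      // URLSearchParams takes care of percent-encoding both URLs.
      body: new URLSearchParams({ source, target }).toString(),
    },
  };
}

// Usage (not run here, because it makes a network request):
// const { url, options } = buildWebmentionRequest(endpoint, source, target);
// fetch(url, options);
```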

Ideally your own CMS or blogging system would take care of doing the pinging, but until that’s more widely implemented, I’m providing this form at the end of each of my posts:

Either way, once you ping my web mention endpoint—discoverable through that link rel="webmention"—with those two parameters, I just need to confirm that your post does indeed contain a link to my post—by making a cURL request and parsing your source—and then I return a server response of 202 (Accepted).
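The verification step can be sketched like this in JavaScript. This isn’t my PHP code: the function names are made up, the source is assumed to have been fetched already, the check is a naive exact match on quoted href values, and the 400 for a failed check is an assumption (only the 202 is described above).

```javascript
// Does the fetched source contain a link whose href is exactly the target URL?
function sourceLinksToTarget(sourceHtml, targetUrl) {
  const hrefPattern = /href\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  let match;
  while ((match = hrefPattern.exec(sourceHtml)) !== null) {
    // Group 1 is a double-quoted value, group 2 a single-quoted one.
    const href = match[1] !== undefined ? match[1] : match[2];
    if (href === targetUrl) return true;
  }
  return false;
}

// 202 (Accepted) when the link is there; 400 otherwise (assumed, see above).
function webmentionStatus(sourceHtml, targetUrl) {
  return sourceLinksToTarget(sourceHtml, targetUrl) ? 202 : 400;
}
```

A real implementation would want to resolve relative URLs and normalise things like trailing slashes before comparing.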

Here’s the code for a minimum viable web mention in PHP.

That’s as far as I got at Indie Web Camp but it was enough for me to start collecting responses to posts.

Webmentions as links

The next step is to do something with the responses. After all, I’ve already got the source of each response from those cURL requests.

Barnaby has written a nice straightforward microformats parser in PHP. I’m using that to check the cURLed source for any responses that have been marked up using h-entry. That’s one of the microformats 2 vocabularies: a much simpler way of writing structured content with microformats.

Aaron, Amber, and Barnaby all sent responses that were marked up with h-entry so now their responses appear in full.

Webmentions as comments

So there you have it. Comments are now open on every journal post on adactio.com …the only catch is that you have to write the comment on your own site. And if you want the content of your post to appear here (instead of just a link) then update your blog post template to include a handful of h-entry classes.

Feel free to use this post as a test. Mark up your blog with h-entry, write a post that links to this URL, and enter the URL of your post in the form below.

Monday, June 10th, 2013

Request Quest

A terrific quiz about browser performance from Jake. I had the pleasure of watching him present this in a bar in Amsterdam—he was like a circus carny hoodwinking the assembled geeks.

I guarantee you won’t get all of this right, and that’s a good thing: you’ll learn something. If you do get them all right, either you are Jake or you are very, very sad.

Thursday, June 6th, 2013

Deep dive into the murky waters of script loading

Jake casts a scrutinising eye over the way that browsers load and parse scripts …and looks at what we can do about it.

Sunday, July 8th, 2012

Squirrel!

Here’s a brainbuster for ya: a single file that renders both as HTML and as a JPEG. As an HTML page, it even contains an img element with a src of …itself!

Compare the “view source” output with the generated source output to see how it’s being interpreted.

Wednesday, August 17th, 2011

HTML5 Rocks - How Browsers Work: Behind the Scenes of Modern Web Browsers

Insanely in-depth look at how browsers work, right down to the nitty gritty. You’d think there’d be a lot of engineering talk, but actually a lot of it is more about linguistics and language parsing.

Monday, August 9th, 2010

Surfin’ Safari - Blog Archive » The HTML5 Parsing Algorithm

The latest WebKit nightly includes the HTML5 parsing algorithm. Now it's a race between Firefox, Safari and Chrome to see which will be the first (non-beta) browser to ship with the new parser.

Friday, May 14th, 2010

Notes on HTML5 Parser History — Anne’s Weblog

Anne explains exactly why the HTML parser defined in HTML5 is A Very Good Thing for everyone.