Acorn is a JavaScript parser written in JavaScript.
Another one.
Just like:
Acorn produces a well-documented, widely used AST format. The same as the last two parsers in that list.
Acorn is really fast. Just like the last one in the list: Esprima.
Acorn is tiny. About half as big as Esprima, in lines of code.
Still, there's no good reason for Acorn to exist. Esprima is an excellent project, well-documented, and small enough for any practical use. It exposes an interface very similar to Acorn.
The only reason I wrote Acorn is that small, well-defined systems are
so much fun to work with, and that Esprima's web page very
triumphantly declared it was faster than parse_js
, the
implementation in UglifyJS version 1, which is a port of my own
parse-js Common Lisp library.
I just had to see if I could do better.
Turns out I can. Acorn beats Esprima, at least on Chrome, Firefox, and Opera (I didn't test other browsers) by a narrow margin—in the 5-20% range—when not storing source location data, and by a wide one—about five times faster—when storing source location data. See here for a reference. That second number is mostly due to the very unoptimized way in which Esprima manages the flow of this data—it could probably easily do better.
To even get to the point where my parser had this small speed
advantage over Esprima, I had to steal some of its tricks. Most
notably, in order to test whether a string is part of a set (of
keyword, reserved words, etcetera), Esprima uses hand-rolled
predicates that use switch
statements over the string values, with
an outer switch
over the length of the string. Something like this:
function isKeyword(word) {
switch (word.length) {
case 2:
switch (word):
case "if": case "in": case "do": return true;
}
return false;
case 3:
switch (word):
case "var": case "for": case "new": case "try": return true;
}
return false;
case 4:
/* et cetera */
}
}
I initially expected regular expressions to be faster (using the
test
method), but it turns out that they only are faster on Chrome,
and there only by a tiny margin.
But I wasn't about to write out all these boring predicates myself, so
I defined a function that, given a list of words, builds up the text
for such a predicate automatically, and then eval
s it to produce a
function.
Another big size (and probably speed) saving comes from the fact that
Acorn uses an operator precedence parser, whereas Esprima writes out
all the intermediate forms of binary operators in a long list of
functions named parseMultiplicativeExpression
,
parseAdditiveExpression
, parseShiftExpression
, and so on for all
of the ten precedence levels. Each of these has to be passed through
for each expression parsed.
I've set up a project page for Acorn, which is simply the output of docco run on the source code, and a github page for browsing the code and reporting bugs.