parse-js is a Common Lisp package for parsing JavaScript (ECMAScript 3, to be more precise). It is released under a zlib-style licence. For any feedback, contact me: Marijn Haverbeke.
The library can be downloaded, checked out from the git repository, or installed with asdf-install.
07-02-2013: New release. More corner-case bugs fixed. Representation of for-in nodes changed to accommodate things like for (x.y in z). Array element expressions may now be nil, when parsing a literal like [1,,3].
03-01-2011: New release. Lots of conformance fixes, driven by CL-JavaScript and UglifyJS work. parse-js-string is deprecated now (parse-js accepts strings), and basic support for ECMAScript 5 has been added.
11-06-2010: Move from darcs to git for version control, update release tarball.
function parse-js (input &key ecma-version strict-semicolons reserved-words) → syntax-tree
Reads a program from a string or a stream, and produces an abstract syntax tree, which is a nested structure consisting of lists starting with keywords. The exact format of this structure is not very well documented, but the file as.txt gives a basic description.
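As a minimal sketch of the call described above, the following parses the same program once from a string and once from a stream. It assumes the parse-js system is installed where ASDF can find it; the variable names are only illustrative.

```lisp
(require :asdf)
;; Assumption: the parse-js system is installed and visible to ASDF.
(asdf:load-system :parse-js)

;; Parse a program given as a string.
(defparameter *tree* (parse-js:parse-js "var x = 1 + 2;"))

;; Parse the same program from a character stream.
(defparameter *tree-from-stream*
  (with-input-from-string (in "var x = 1 + 2;")
    (parse-js:parse-js in)))

;; The result is a nested list; each node starts with a keyword naming its type.
(print (first *tree*))
(print *tree*)
```

Printing the tree at the REPL and comparing it with as.txt is probably the quickest way to learn the node shapes.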
The keyword arguments can be used to influence the parsing mode. ecma-version can be 3 or 5, and selects the standard that is followed. The default is 3. Support for version 5 is incomplete at this time. When strict-semicolons is true, the parser will complain about missing semicolons, even when they would have been inserted by the 'automatic semicolon insertion' rules. Finally, if reserved-words is true, the parser will complain about 'future reserved words', such as class, being used.
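The effect of the keyword arguments can be explored with a small experiment like the one below. This is a sketch, not part of the library: rejects-p is a hypothetical helper that only reports whether parsing signalled an error, and the code assumes the parse-js system can be loaded through ASDF.

```lisp
(require :asdf)
;; Assumption: the parse-js system is installed and visible to ASDF.
(asdf:load-system :parse-js)

(defun rejects-p (source &rest options)
  "Hypothetical helper: true when PARSE-JS signals an error for SOURCE."
  (handler-case (progn (apply #'parse-js:parse-js source options) nil)
    (error () t)))

;; Two statements separated only by a newline, with no semicolons.
(defparameter *no-semicolons* (format nil "var a = 1~%var b = 2"))

;; Default mode: automatic semicolon insertion applies.
(print (rejects-p *no-semicolons*))

;; Strict mode: the missing semicolons should be reported.
(print (rejects-p *no-semicolons* :strict-semicolons t))

;; 'class' as an identifier, with and without the reserved-words check.
(print (rejects-p "var class = 1;"))
(print (rejects-p "var class = 1;" :reserved-words t))

;; Selecting ECMAScript 5 mode (support is incomplete, per the text above).
(print (rejects-p "var x = 1;" :ecma-version 5))
```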
class js-parse-error
The type of errors raised when invalid input is encountered. Inherits from simple-error, and has js-parse-error-line and js-parse-error-char accessors that can be used to read the location at which the error occurred.
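One way to use these accessors is to wrap the parser in a handler that reports the error location. The sketch below assumes the parse-js system is loadable through ASDF; try-parse is a hypothetical helper name, not something the library provides.

```lisp
(require :asdf)
;; Assumption: the parse-js system is installed and visible to ASDF.
(asdf:load-system :parse-js)

(defun try-parse (source)
  "Hypothetical helper: return the syntax tree for SOURCE, or report
where parsing failed and return NIL."
  (handler-case (parse-js:parse-js source)
    (parse-js:js-parse-error (err)
      (format t "Parse error at line ~a, character ~a: ~a~%"
              (parse-js:js-parse-error-line err)
              (parse-js:js-parse-error-char err)
              err)
      nil)))

;; Invalid input: an expression is missing after the '='.
(try-parse "var x = ;")

;; Valid input returns the tree as usual.
(try-parse "var x = 1;")
```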
function lex-js (stream) → function
A JavaScript tokeniser. The function returned can be called repeatedly to read the next token object. See below for a description of these objects. When the end of the stream is reached, tokens with type :eof are returned.
function token-type (token) → keyword
Reader for the type of token objects. Types are keywords (one of :num :punc :string :operator :name :atom :keyword :eof).
function token-value (token) → value
Reader for the content of token objects. The type of this value depends on the type of the token: it holds strings for names, for example, and numbers for number tokens.
function token-line (token) → number
The line on which a token was read.
function token-char (token) → number
The character at which a token starts.
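The tokeniser and the four token readers can be combined into a simple loop that runs until an :eof token appears. The following is a sketch under the assumption that the parse-js system is loadable through ASDF; list-tokens is a hypothetical helper written for this example.

```lisp
(require :asdf)
;; Assumption: the parse-js system is installed and visible to ASDF.
(asdf:load-system :parse-js)

(defun list-tokens (source)
  "Hypothetical helper: collect (type value line char) for each token
in SOURCE, stopping at the first :eof token."
  (with-input-from-string (in source)
    (let ((next-token (parse-js:lex-js in)))
      (loop for token = (funcall next-token)
            until (eq (parse-js:token-type token) :eof)
            collect (list (parse-js:token-type token)
                          (parse-js:token-value token)
                          (parse-js:token-line token)
                          (parse-js:token-char token))))))

;; Print one entry per token.
(dolist (entry (list-tokens "x = 42;"))
  (print entry))
```

Collecting plain lists keeps the example independent of how token objects are represented internally.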