Note!
The author has run out of time for now. If anyone wants to help out "fixing" a small part of my private code so it can be ported to Escaya, this parser can be completed. Just ping me in the 'TODO's issue on this repo if interested.
A blazing fast, 100% spec-compliant, incremental JavaScript parser written in TypeScript
Work in progress
Escaya generates its own AST that is close to the ECMAScript® 2021 specification and can be used to perform syntactic analysis (parsing) of a JavaScript program. With ES2015 and later, a JavaScript program can be either a script or a module.
Example usage:
import { parseScript, parseModule } from './escaya';
parseScript('({x: [y] = 0} = 1)', { impliedStrict: true });
parseModule('({x: [y] = 0} = 1)');
These are the available options:

| Option | Description |
| --- | --- |
| next | Enable stage 3 support (ESNext) |
| disableWebCompat | Disable web compatibility |
| loc | Attach line/column location information and start/end offsets to each node |
| cst | Attach additional concrete syntax information to each node |
| impliedStrict | Enable initial strict mode enforcement |
| module | Enable parsing in module goal in error recovery mode |
Escaya lets you extract leading and trailing comments from a given position with either extractCommentsScript or extractCommentsModule. Each takes the source code as its first argument and the position within the source code from which to extract comments as its second argument. The last argument lets you decide whether to extract leading or trailing comments:
extractCommentsScript(source, start, boolean);
Here is an example of how to get all trailing comments belonging to bar:
import { extractCommentsScript } from './escaya';
extractCommentsScript('/* MultieLine */ bar /* trailing */', 20, true);
Outputs:
[{
comment: ' trailing ',
end: 35,
newLine: false,
start: 21,
type: 'MultiLine'
}]
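The start and end values in this output are offsets into the original source string, so the raw comment text (delimiters included) can be recovered by slicing. The sketch below assumes only the record shape shown above; the ExtractedComment type and the rawCommentText helper are illustrative names and not part of Escaya's API.

```typescript
// Hypothetical type mirroring the record shown in the output above.
interface ExtractedComment {
  comment: string;
  start: number;
  end: number;
  newLine: boolean;
  type: string;
}

// Returns the comment exactly as it appears in the source, delimiters included.
// Assumes `end` is exclusive, which matches end: 35 for a 35-character source.
function rawCommentText(source: string, c: ExtractedComment): string {
  return source.slice(c.start, c.end);
}

const sampleSource = '/* MultieLine */ bar /* trailing */';
const trailingComment: ExtractedComment = {
  comment: ' trailing ',
  end: 35,
  newLine: false,
  start: 21,
  type: 'MultiLine'
};

console.log(rawCommentText(sampleSource, trailingComment)); // prints: /* trailing */
```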
The AST used by Escaya represents the structure of an ECMAScript program as a tree and is designed to stay true to the ECMAScript® 2021 specification. The AST has been designed for performance, and it nearly eliminates the chance of accidentally creating an AST that does not represent an ECMAScript program. It also requires fewer bytes than the ESTree AST that Babel and Acorn produce, and fewer than the Babel parser's own AST.
The Escaya AST doesn't try to follow the SpiderMonkey-compatible standard that ESTree strictly follows. For example, it distinguishes Identifier from IdentifierPattern. That makes it easier to calculate the free variables of a program.
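To illustrate why separating references from bindings helps, here is a minimal sketch of a free-variable walk over a toy tree. Only the Identifier / IdentifierPattern split comes from the text above; the ToyFunction and ToyCall shapes, and the freeVariables helper, are assumptions made up for this example and are not Escaya's real node definitions. Hoisting and block scoping are deliberately ignored.

```typescript
// Toy node shapes for illustration only (NOT Escaya's real AST).
interface Identifier { type: 'Identifier'; name: string }               // a reference
interface IdentifierPattern { type: 'IdentifierPattern'; name: string } // a binding
interface ToyFunction { type: 'Function'; params: IdentifierPattern[]; body: ToyNode[] }
interface ToyCall { type: 'Call'; callee: ToyNode; args: ToyNode[] }
type ToyNode = Identifier | IdentifierPattern | ToyFunction | ToyCall;

// Collects every referenced name that is not bound by an enclosing function.
function freeVariables(
  node: ToyNode,
  bound: Set<string> = new Set<string>(),
  free: Set<string> = new Set<string>()
): Set<string> {
  switch (node.type) {
    case 'Identifier':
      // A reference: free unless some enclosing binding introduced the name.
      if (!bound.has(node.name)) free.add(node.name);
      break;
    case 'IdentifierPattern':
      // A binding on its own never counts as a use.
      break;
    case 'Function': {
      // Parameters extend the bound set for the function body only.
      const inner = new Set<string>(bound);
      for (const param of node.params) inner.add(param.name);
      for (const child of node.body) freeVariables(child, inner, free);
      break;
    }
    case 'Call':
      freeVariables(node.callee, bound, free);
      for (const arg of node.args) freeVariables(arg, bound, free);
      break;
  }
  return free;
}

// Toy equivalent of: function (x) { f(x, y) }
const sampleProgram: ToyNode = {
  type: 'Function',
  params: [{ type: 'IdentifierPattern', name: 'x' }],
  body: [{
    type: 'Call',
    callee: { type: 'Identifier', name: 'f' },
    args: [{ type: 'Identifier', name: 'x' }, { type: 'Identifier', name: 'y' }]
  }]
};

console.log(Array.from(freeVariables(sampleProgram))); // f and y are free, x is bound
```

Because the node type alone says whether a name is a use or a binding, the walk never has to inspect the parent node to decide how to treat an identifier.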
Escaya supports a simplified definition of "concrete syntax" that follows the ECMAScript® 2021 specification.
A ParenthesisExpression has been added to represent the ( ) and everything in between. See Primary Expression - Supplemental Syntax.
An Elison node has been added to represent elided array elements (holes) in 12.2.5 Array Initializer and 13.3.3 Destructuring Binding Patterns - ArrayBindingPattern.
A Semicolon node is used in ClassElement to represent the ; token.
parseCustomScript and parseCustomModule let you use whatever AST format you want.
Here is an example of how to produce a Babel AST:
import { parseCustomScript } from './escaya';
parseCustomScript('a = b', {
  // Called with the parsed pieces; returns the root node in Babel's shape.
  Script: function (source, directives, statements) {
    return {
      type: 'File',
      errors: [],
      program: {
        type: 'Program',
        sourceType: 'script',
        body: statements
      },
      directives,
      comments: [],
      start: 0,
      end: source.length
    };
  }
});
When the Escaya parser is given an input that does not represent a valid JavaScript program, it throws an exception. When parsing in recovery mode, the parser continues parsing and produces a syntax tree that conforms to the ECMAScript® 2021 specification.
However, Escaya will still do a full parse on every keystroke. To avoid this, you can enable incremental parsing. This is best demonstrated with an example:
import { recovery, update } from './escaya';
const rootNode = recovery('(foo);', 'filename.js', { module: true });
const ast = update(rootNode, '=> bar;', 'filename.js', { span: { start: 6, length: 0 }, newLength: 7 });
Now that incremental parsing has been enabled, Escaya will reuse nodes from the old tree where possible.
The options for recovery mode are about the same as for parseScript and parseModule, except that you have to set {module: true} when parsing in module goal.
No options can be set during an incremental update, because it's only possible to reuse a node if it was parsed in the same context that the parser is currently in.
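The reuse rule can be pictured with a small sketch. This is the general idea behind incremental parsers and not Escaya's actual implementation: the OldNode shape, the numeric context field, and the reuseShift helper are all hypothetical. Only the change-range shape mirrors the object passed to update in the example above.

```typescript
// Same shape as the last argument to update() in the example above.
interface TextChangeRange {
  span: { start: number; length: number };
  newLength: number;
}

// Hypothetical view of a node from the old tree.
interface OldNode {
  start: number;
  end: number;
  context: number; // assumed: a bit mask describing how the node was parsed
}

// Returns the offset shift to apply when the old node can be reused,
// or undefined when the node has to be parsed again.
function reuseShift(
  node: OldNode,
  change: TextChangeRange,
  currentContext: number
): number | undefined {
  // A node parsed under a different context (for example sloppy vs strict) is never reused.
  if (node.context !== currentContext) return undefined;

  const changeEnd = change.span.start + change.span.length;

  // Entirely before the edit: reusable as-is.
  if (node.end < change.span.start) return 0;

  // Entirely after the edit: reusable, but its offsets move by the size delta.
  if (node.start > changeEnd) return change.newLength - change.span.length;

  // Overlapping or touching the edit: be conservative and reparse.
  return undefined;
}

const sampleChange: TextChangeRange = { span: { start: 6, length: 0 }, newLength: 7 };

console.log(reuseShift({ start: 0, end: 5, context: 0 }, sampleChange, 0));   // 0
console.log(reuseShift({ start: 0, end: 6, context: 0 }, sampleChange, 0));   // undefined
console.log(reuseShift({ start: 10, end: 12, context: 0 }, sampleChange, 0)); // 7
```

Nodes that merely touch the edit are rejected in this sketch because text inserted right next to a node can change how that node should be parsed.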
One of the design goals for Escaya has been that the abstract syntax tree (AST) shouldn't change. It should be the same whether you are parsing in normal mode or recovery mode, but there are a couple of exceptions.
For example, in recovery mode you create a RootNode instead of either a Module or a Script. This RootNode carries additional information such as diagnostics, context masks and mutual parser flags. These are carried over from recovery mode to incremental parsing and let you continue to parse in the same context that you are currently in, unless you set a strict directive on the RootNode. If you do this, Escaya will parse in strict mode, and you will not be able to recover any nodes from the old tree if you were first parsing in sloppy mode, because it's only possible to reuse a node if it was parsed with the same context that the parser used before.
The main difference is that Escaya's recovery mode conforms to the ECMAScript® 2021 specification, while Acorn Loose does not. Acorn Loose is not even a JavaScript parser. You can play with Acorn Loose on ASTExplorer and you will notice the differences.
As an example, you will get a BlockStatement if you try to parse something like try.