Date: 2018-11-19

Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.

Usage

    const Lexer = require('snapdragon-lexer');
    const lexer = new Lexer();
    lexer.capture('slash', /^\//);
    lexer.capture('text', /^\w+/);
    lexer.capture('star', /^\*/);
    console.log(lexer.tokenize('foo/*'));

API

Create a new Lexer with the given options.

Params

input {string|Object}: (optional) Input string or options. You can also set input directly on lexer.input after initializing.

options {object}

Example

    const Lexer = require('snapdragon-lexer');
    const lexer = new Lexer('foo/bar');

Returns true if we are still at the beginning-of-string, and no part of the string has been consumed.

returns {boolean}

Returns true if lexer.string and lexer.queue are empty.

returns {boolean}

Register a handler function.

Params

type {string}

fn {function}: The handler function to register.

Example

    lexer.set('star', function() {});

Get a registered handler function.

Params

type {string}

returns {function}: The registered handler function.

Example

    lexer.set('star', function() {});
    const star = lexer.get('star');

Returns true if the lexer has a registered handler of the given type.

Params

type {string}

returns {boolean}

Example

    lexer.set('star', function() {});
    console.log(lexer.has('star'));

Create a new Token with the given type and value.

Params

type {string|Object}: (required) The type of token to create.

value {string}: (optional) The captured string.

match {array}: (optional) Match results from String.match() or RegExp.exec().

returns {Object}: Returns an instance of snapdragon-token.

Events

emits : token

Example

    console.log(lexer.token({ type: 'star', value: '*' }));
    console.log(lexer.token('star', '*'));
    console.log(lexer.token('star'));

Returns true if the given value is a snapdragon-token instance.

Params

token {object}

returns {boolean}

Example

    const Token = require('snapdragon-token');
    lexer.isToken({});
    lexer.isToken(new Token({ type: 'star', value: '*' }));

Consume the given length from lexer.string. The consumed value is used to update lexer.state.consumed, as well as the current position.

Params

len {number}

value {string}: Optionally pass the value being consumed.

returns {String}: Returns the consumed value.

Example

    lexer.consume(1);
    lexer.consume(1, '*');

Returns a function for updating a token with lexer location information.

returns {function}

Use the given regex to match a substring from lexer.string. Also validates the regex to ensure that it starts with ^, since matching should always be against the beginning of the string, and throws if the regex matches an empty string, which can cause catastrophic backtracking.

Params

regex {regExp}: (required)

returns {Array|null}: Returns the match array from RegExp.exec or null.

Example

    const lexer = new Lexer('foo/bar');
    const match = lexer.match(/^\w+/);
    console.log(match);
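The two validation rules described for .match() can be sketched in plain JavaScript, without the library. This is only an illustration under stated assumptions: matchAtStart is a hypothetical helper, the error messages are made up, and the library's own checks may differ in detail.

```javascript
// Hypothetical helper illustrating the rules described for .match();
// not the library's actual implementation.
function matchAtStart(input, regex) {
  // Require an anchored regex so matching always starts at index 0.
  if (!regex.source.startsWith('^')) {
    throw new Error('expected regex to start with "^"');
  }
  const match = regex.exec(input);
  if (match === null) return null;
  // An empty match would never advance the position in the string.
  if (match[0] === '') {
    throw new Error('regex should not match an empty string');
  }
  return match;
}

const result = matchAtStart('foo/bar', /^\w+/);
console.log(result[0]); // prints: foo
```

Rejecting empty matches up front matters because a lexer that accepts a zero-length match makes no progress on the remaining string.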

Scan for a matching substring by calling .match() with the given regex. If a match is found, 1) a token of the specified type is created, 2) match[0] is used as token.value, and 3) the length of match[0] is sliced from lexer.string (by calling .consume()).

Params

type {string}

regex {regExp}

returns {Object}: Returns a token if a match is found, otherwise undefined.

Events

emits : scan

Example

    lexer.string = '/foo/';
    console.log(lexer.scan(/^\//, 'slash'));
    console.log(lexer.scan(/^\w+/, 'text'));
    console.log(lexer.scan(/^\//, 'slash'));
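The three steps of a scan (match, create a token, consume) can be sketched without the library. The state object and both functions below are hypothetical stand-ins that reuse the property names from this document; the real methods also track position and emit events, which this sketch leaves out.

```javascript
// Minimal stand-in for a lexer's state; the property names mirror the
// ones used in this document, but the object itself is hypothetical.
const state = { string: '/foo/', consumed: '' };

// Slice `len` characters off the front of state.string and record them.
function consume(len) {
  const value = state.string.slice(0, len);
  state.consumed += value;
  state.string = state.string.slice(len);
  return value;
}

// Match at the start of the remaining string; on success, consume the
// matched text and return a plain token object.
function scan(regex, type) {
  const match = regex.exec(state.string);
  if (match === null) return undefined;
  consume(match[0].length);
  return { type, value: match[0] };
}

console.log(scan(/^\//, 'slash')); // prints: { type: 'slash', value: '/' }
console.log(scan(/^\w+/, 'text')); // prints: { type: 'text', value: 'foo' }
console.log(state.string);         // prints: /
```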

Capture a token of the specified type using the provided regex for scanning and matching substrings. Automatically registers a handler when a function is passed as the last argument.

Params

type {string}: (required) The type of token being captured.

regex {regExp}: (required) The regex for matching substrings.

fn {function}: (optional) If supplied, the function will be called on the token before pushing it onto lexer.tokens.

returns {Object}

Example

    lexer.capture('text', /^\w+/);
    lexer.capture('text', /^\w+/, token => {
      if (token.value === 'foo') {
        // do stuff
      }
      return token;
    });

Calls handler type on lexer.string.

Params

type {string}: The handler type to call on lexer.string.

returns {Object}: Returns a token of the given type or undefined.

Events

emits : handle

Example

    const lexer = new Lexer('/a/b');
    lexer.capture('slash', /^\//);
    lexer.capture('text', /^\w+/);
    console.log(lexer.handle('text'));
    //=> undefined
    console.log(lexer.handle('slash'));
    //=> { type: 'slash', value: '/' }
    console.log(lexer.handle('text'));
    //=> { type: 'text', value: 'a' }

Get the next token by iterating over lexer.handlers and calling each handler on lexer.string until a handler returns a token. If no handlers return a token, an error is thrown with the substring that couldn't be lexed.

returns {Object}: Returns the first token returned by a handler, or the first character in the remaining string if options.mode is set to character.

Example

const token = lexer.advance();
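The "try each handler until one returns a token" loop can be sketched as follows. The handler table and the advance function are hypothetical and take the input as an argument for simplicity; the library keeps its own handler registry and remaining string.

```javascript
// Hypothetical handler table: each handler returns a token or undefined.
const handlers = new Map([
  ['slash', input => (input[0] === '/' ? { type: 'slash', value: '/' } : undefined)],
  ['text', input => {
    const m = /^\w+/.exec(input);
    return m ? { type: 'text', value: m[0] } : undefined;
  }]
]);

// Try each handler in registration order; the first token wins.
// If nothing matches, report the substring that could not be lexed.
function advance(input) {
  for (const handler of handlers.values()) {
    const token = handler(input);
    if (token) return token;
  }
  throw new Error(`unmatched input: "${input}"`);
}

console.log(advance('a/b')); // prints: { type: 'text', value: 'a' }
```

Because the first matching handler wins, registration order decides which token type is produced when two handlers could both match.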

Tokenizes a string and returns an array of tokens.

Params

input {string}: The string to lex.

returns {Array}: Returns an array of tokens.

Example

    let lexer = new Lexer({ handlers: otherLexer.handlers });
    lexer.capture('slash', /^\//);
    lexer.capture('text', /^\w+/);
    const tokens = lexer.lex('a/b/c');
    console.log(tokens);
    // Results in:
    // [ Token { type: 'text', value: 'a' },
    //   Token { type: 'slash', value: '/' },
    //   Token { type: 'text', value: 'b' },
    //   Token { type: 'slash', value: '/' },
    //   Token { type: 'text', value: 'c' } ]

Push a token onto the lexer.queue array.

Params

token {object}

returns {Object}: Returns the given token with updated token.index .

Example

    console.log(lexer.queue.length);
    lexer.enqueue(new Token('star', '*'));
    console.log(lexer.queue.length);

Shift a token from lexer.queue.

returns {Object}: Returns the given token with updated token.index .

Example

    console.log(lexer.queue.length);
    lexer.dequeue();
    console.log(lexer.queue.length);

Lookbehind n tokens.

Params

n {number}

returns {Object}

Example

    const token = lexer.lookbehind(2);

Get the previously lexed token.

returns {Object|undefined}: Returns a token or undefined.

Example

const token = lexer.prev();

Lookahead n tokens and return the last token. Pushes any intermediate tokens onto lexer.tokens. To lookahead a single token, use .peek().

Params

n {number}

returns {Object}

Example

    const token = lexer.lookahead(2);

Lookahead a single token.

returns {Object}: Returns a token.

Example

const token = lexer.peek();

Get the next token, either from the queue or by advancing.

returns {Object|String}: Returns a token, or (when options.mode is set to character) either gets the next character from lexer.queue, or consumes the next character in the string.

Example

const token = lexer.next();

Skip n tokens or characters in the string. Skipped values are not enqueued.

Params

n {number}

returns {Array}: Returns an array of skipped tokens.

Example

    const skipped = lexer.skip(1);

Skip tokens while the given fn returns true.

Params

fn {function}: Return true if a token should be skipped.

returns {Array}: Returns an array of skipped tokens.

Example

    lexer.skipWhile(tok => tok.type !== 'space');
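The predicate-driven skipping can be sketched over a plain array of tokens. This skipWhile is a hypothetical free function, and it assumes that the first token failing the predicate is left in place; whether the library treats that token the same way is not stated here.

```javascript
// Hypothetical sketch: remove tokens from the front of an array while
// the predicate holds, returning the removed tokens.
function skipWhile(tokens, fn) {
  const skipped = [];
  while (tokens.length > 0 && fn(tokens[0])) {
    skipped.push(tokens.shift());
  }
  return skipped;
}

const tokens = [
  { type: 'text', value: 'a' },
  { type: 'text', value: 'b' },
  { type: 'space', value: ' ' },
  { type: 'text', value: 'c' }
];
const skipped = skipWhile(tokens, tok => tok.type !== 'space');
console.log(skipped.length); // prints: 2
console.log(tokens[0].type); // prints: space
```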

Skip the given token types.

Params

types {string|Array}: One or more token types to skip.

returns {Array}: Returns an array of skipped tokens.

Example

    lexer.skipType('space');
    lexer.skipType(['newline', 'space']);

Pushes the given token onto lexer.tokens and calls .append() to push token.value onto lexer.stash. Disable pushing onto the stash by setting lexer.options.append or token.append to false.

Params

token {object|String}

returns {Object}: Returns the given token .

Events

emits : push

Example

    console.log(lexer.tokens.length);
    lexer.push(new Token('star', '*'));
    console.log(lexer.tokens.length);
    console.log(lexer.stash);

Append a string to the last element on lexer.stash , or push the string onto the stash if no elements exist.

Params

value {String}

returns {String}: Returns the last value in the array.

Example

    const stack = new Stack();
    stack.push('a');
    stack.push('b');
    stack.push('c');
    stack.append('_foo');
    stack.append('_bar');
    console.log(stack);
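The append rule (extend the last element, or push when the array is empty) can be sketched with a plain array. The append function below is a hypothetical stand-in and does not use the Stack class from the example above.

```javascript
// Hypothetical stand-in for the stash: append to the last string,
// or push the value when the array is empty.
function append(stash, value) {
  if (stash.length === 0) {
    stash.push(value);
  } else {
    stash[stash.length - 1] += value;
  }
  return stash[stash.length - 1];
}

const stash = [];
append(stash, 'a');
append(stash, '_foo');
console.log(stash); // prints: [ 'a_foo' ]
```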

Returns true if a token with the given type is on the stack.

Params

type {string}: The type to check for.

returns {boolean}

Example

    if (lexer.isInside('bracket') || lexer.isInside('brace')) {}

Throw a formatted error message with details including the cursor position.

Params

msg {string}: Message to use in the Error.

node {object}

returns {undefined}

Example

    lexer.set('foo', function(tok) {
      if (tok.value !== 'foo') {
        throw this.state.error('expected token.value to be "foo"', tok);
      }
    });

Call a plugin function on the lexer instance.

Params

fn {function}

returns {object}: Returns the lexer instance.

Example

    lexer.use(function(lexer) {});

Static method that returns true if the given value is an instance of snapdragon-lexer.

Params

lexer {object}

returns {Boolean}

Example

    const Lexer = require('snapdragon-lexer');
    const lexer = new Lexer();
    console.log(Lexer.isLexer(lexer));
    console.log(Lexer.isLexer({}));

Static method that returns true if the given value is an instance of snapdragon-token. This is a proxy to Token#isToken.

Params

token {object}

returns {Boolean}

Example

    const Token = require('snapdragon-token');
    const Lexer = require('snapdragon-lexer');
    console.log(Lexer.isToken(new Token({ type: 'foo' })));
    console.log(Lexer.isToken({}));

The State class, exposed as a static property.

The Token class, exposed as a static property.

Register a handler function.

Params

type {String}

fn {Function}: The handler function to register.

Example

    lexer.set('star', function(token) {});

As an alternative to .set, the .capture method will automatically register a handler when a function is passed as the last argument.

Get a registered handler function.

Params

type {String}

returns {Function}: The registered handler function.

Example

    lexer.set('star', function() {});
    const star = lexer.get('star');

Properties

Type: {boolean}

Default: true (constant)

This property is defined as a convenience, to make it easy for plugins to check for an instance of Lexer.

Type: {string}

Default: ''

The unmodified source string provided by the user.

Type: {string}

Default: ''

The source string minus the part of the string that has already been consumed.

Type: {string}

Default: ''

The part of the source string that has been consumed.

Type: {array}

Default: []

Array of lexed tokens.

Type: {array}

Default: [''] (instance of snapdragon-stack)

Array of captured strings. Similar to the lexer.tokens array, but stores strings instead of token objects.

Type: {array}

Default: []

LIFO (last in, first out) array. A token is pushed onto the stack when an "opening" character or character sequence needs to be tracked. When the (matching) "closing" character or character sequence is encountered, the (opening) token is popped off of the stack.

The stack is not used by any lexer methods; it's reserved for the user. Stacks are necessary for creating Abstract Syntax Trees (ASTs), but if you require this functionality it would be better to use a parser such as snapdragon-parser, which has methods and other conveniences for creating an AST.
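The push-on-open, pop-on-close pattern can be sketched with a plain array as the LIFO stack. Everything below (the pairs table, the checkBalanced function, the token shape) is hypothetical and only illustrates how a user might track nesting.

```javascript
// Hypothetical bracket tracker using a plain array as a LIFO stack.
const pairs = { '(': ')', '[': ']', '{': '}' };
const closers = Object.values(pairs);

function checkBalanced(input) {
  const stack = [];
  for (const ch of input) {
    if (pairs[ch]) {
      // Opening character: track it until its closer shows up.
      stack.push({ type: 'open', value: ch });
    } else if (closers.includes(ch)) {
      // Closing character: the most recent opener must match it.
      const open = stack.pop();
      if (!open || pairs[open.value] !== ch) return false;
    }
  }
  // Anything left on the stack was never closed.
  return stack.length === 0;
}

console.log(checkBalanced('a[b{c}]')); // prints: true
console.log(checkBalanced('a[b{c]}')); // prints: false
```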

Type: {array}

Default: []

FIFO (first in, first out) array, for temporarily storing tokens that are created when .lookahead() is called (or a method that calls .lookahead(), such as .peek()).

Tokens are dequeued when .next() is called.
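How lookahead and next can share a FIFO queue is sketched below. The createLexer function is hypothetical and produces one-character tokens purely to keep the example short; only the queueing behaviour is meant to mirror the description above.

```javascript
// Hypothetical token source: yields one-character tokens from a string.
function createLexer(input) {
  let index = 0;
  const queue = []; // FIFO: tokens produced by lookahead, not yet consumed

  const advance = () =>
    index < input.length ? { type: 'char', value: input[index++] } : undefined;

  return {
    queue,
    // Fill the queue until it holds n tokens, then return the nth.
    lookahead(n) {
      while (queue.length < n) {
        const token = advance();
        if (!token) break;
        queue.push(token);
      }
      return queue[n - 1];
    },
    peek() {
      return this.lookahead(1);
    },
    // Prefer queued tokens so nothing seen by lookahead is lost.
    next() {
      return queue.length > 0 ? queue.shift() : advance();
    }
  };
}

const lexer = createLexer('abc');
console.log(lexer.lookahead(2).value); // prints: b
console.log(lexer.next().value);       // prints: a
```

Draining the queue before advancing keeps tokens in source order even after an arbitrary amount of lookahead.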

Type: {Object}

Default: { index: 0, column: 0, line: 1 }

The updated source string location with the following properties.

index - 0-index

column - 0-index

line - 1-index
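The indexing conventions above (0-based index and column, 1-based line) can be illustrated with a small helper that advances a location object past a consumed string. updateLocation is a hypothetical function written for this sketch, and it assumes "\n" is the only line terminator.

```javascript
// Hypothetical helper: advance a location object past a consumed string.
function updateLocation(loc, consumed) {
  for (let i = 0; i < consumed.length; i++) {
    loc.index++;
    if (consumed[i] === '\n') {
      loc.line++;
      loc.column = 0; // columns restart after a newline
    } else {
      loc.column++;
    }
  }
  return loc;
}

const loc = { index: 0, column: 0, line: 1 };
updateLocation(loc, 'ab\nc');
console.log(loc); // prints: { index: 4, column: 1, line: 2 }
```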

The following plugins are available for automatically updating tokens with the location:

Options

Type: {string}

Default: undefined

The source of the input string. This is typically a filename or file path, but can also be 'string' if a string or buffer is provided directly.

If lexer.input is undefined, and options.source is a string, the lexer will attempt to set lexer.input by calling fs.readFileSync() on the value provided on options.source.

Type: {string}

Default: undefined

If options.mode is character, instead of calling handlers (which match using regex) the .advance() method will consume and return one character at a time.
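Character-at-a-time consumption can be sketched with a generator. This is a hypothetical illustration of the idea only; it does not use the library or its options object.

```javascript
// Hypothetical sketch of character mode: no handlers are involved,
// and each step consumes exactly one character from the input.
function* characters(input) {
  let rest = input;
  while (rest.length > 0) {
    const ch = rest[0];
    rest = rest.slice(1);
    yield ch;
  }
}

console.log([...characters('a/b')]); // prints: [ 'a', '/', 'b' ]
```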

Type: {string}

Default: undefined

Specify the token property to use when the .push method pushes a value onto lexer.stash. The logic works something like this:

    lexer.append(token[lexer.options.value || 'value']);
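The property lookup can be tried in isolation. In this sketch stashValue is a hypothetical helper, and the output property on the token is made up to stand in for any alternative property a user might choose.

```javascript
// Hypothetical helper: pick the token property that would be appended,
// falling back to `value` when no option is given.
function stashValue(token, options = {}) {
  return token[options.value || 'value'];
}

// `output` is an invented property used only for this illustration.
const token = { type: 'star', value: '*', output: '\\*' };
console.log(stashValue(token));                      // prints: *
console.log(stashValue(token, { value: 'output' })); // prints: \*
```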

Tokens

See the snapdragon-token documentation for more details.

Plugins

Plugins are registered with the lexer.use() method and use the following conventions.

Plugin Conventions

Plugins are functions that take an instance of snapdragon-lexer.

However, it's recommended that you always wrap your plugin function in another function that takes an options object. This allows users to pass options when using the plugin. Even if your plugin doesn't take options, it's a best practice for users to always be able to use the same signature.

Example

    function plugin(options) {
      return function(lexer) {};
    }
    lexer.use(plugin());
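A slightly fuller version of the wrapper convention is sketched below so that it can be run without the library. fakeLexer is a hypothetical stand-in whose use() calls the plugin and returns the instance, and the upperCaseTypes plugin, its prefix option, and the typeName method it adds are all invented for illustration.

```javascript
// Minimal stand-in for a lexer with a .use() method; hypothetical,
// only here so the plugin can be run without the library.
const fakeLexer = {
  use(fn) {
    fn(this);
    return this; // returning the instance allows chaining
  }
};

// Outer function takes options; inner function receives the lexer.
function upperCaseTypes(options = {}) {
  const prefix = options.prefix || '';
  return function(lexer) {
    lexer.typeName = type => prefix + type.toUpperCase();
  };
}

fakeLexer.use(upperCaseTypes({ prefix: 'T_' }));
console.log(fakeLexer.typeName('star')); // prints: T_STAR
```

Keeping the outer options function even when no options are needed means every plugin is registered the same way, as use(plugin()).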

