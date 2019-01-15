Stream-based full-text search for Node.js and browsers, using LevelDB as the storage backend.
npm install levi
Full-text search using TF-IDF and cosine similarity, plus query-time field boost options. It comes with a configurable text processing pipeline: tokenizer, Porter stemmer and stopword filter.
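To make the scoring idea concrete, here is a minimal, self-contained sketch of TF-IDF weighting with cosine similarity. It is an illustration only and does not reproduce Levi's internals: the function names, the smoothed IDF formula and the sample token lists are all assumptions made for this example.

```javascript
// Illustrative TF-IDF + cosine similarity (assumed formulas, not Levi's code).
// Each document is an array of already-processed tokens.
function termFrequencies (tokens) {
  const tf = new Map()
  for (const t of tokens) tf.set(t, (tf.get(t) || 0) + 1)
  return tf
}

// Smoothed inverse document frequency: ln(1 + N / df).
function inverseDocumentFrequencies (docs) {
  const df = new Map()
  for (const tokens of docs) {
    for (const t of new Set(tokens)) df.set(t, (df.get(t) || 0) + 1)
  }
  const idf = new Map()
  for (const [t, n] of df) idf.set(t, Math.log(1 + docs.length / n))
  return idf
}

// Weight vector for one token list; terms unknown to the corpus are ignored.
function tfidfVector (tokens, idf) {
  const vec = new Map()
  for (const [t, f] of termFrequencies(tokens)) {
    if (idf.has(t)) vec.set(t, f * idf.get(t))
  }
  return vec
}

// Cosine of the angle between two sparse weight vectors.
function cosineSimilarity (a, b) {
  let dot = 0
  for (const [t, w] of a) dot += w * (b.get(t) || 0)
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, w) => s + w * w, 0))
  const denom = norm(a) * norm(b)
  return denom === 0 ? 0 : dot / denom
}

const docs = [
  ['lorem', 'ipsum', 'dummy', 'text'],
  ['dummy', 'text', 'print', 'typeset', 'industri'],
  ['hello', 'world']
]
const idf = inverseDocumentFrequencies(docs)
const query = tfidfVector(['lorem', 'ipsum'], idf)
const scores = docs.map((d) => cosineSimilarity(query, tfidfVector(d, idf)))
```

Only the first document shares terms with the query, so it is the only one with a non-zero score.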
Levi is built on LevelUP, a fast, asynchronous, transactional storage interface. By default, it uses LevelDB on Node.js and IndexedDB in the browser. It also works with a variety of LevelDOWN-compatible backends.
Using a stream-based query mechanism built on Highland, Levi is designed to be memory efficient, and extensible by combining multiple scoring mechanisms.
Create a new Levi instance with a LevelUP database path or instance, or with a SublevelUP section.
var levi = require('levi')
// levi instance of database path `db`
var lv = levi('db')
  .use(levi.tokenizer())
  .use(levi.stemmer())
  .use(levi.stopword())
The text processing pipeline plugins levi.tokenizer(), levi.stemmer() and levi.stopword() are required for indexing.
They are exposed as ginga plugins so that they can be swapped for different language configurations.
Index a document identified by `key`. `value` can be an object or a string.
Use object fields for `value` if you want field boost options for search.
All fields are indexed by default. Set the `options.fields` object to specify which fields to index.
Accepts an optional callback function, or returns a promise if none is given.
// string as value
lv.put('a', 'Lorem Ipsum is simply dummy text.', function (err) { ... })
// object fields as value
lv.put('b', {
  id: 'b',
  title: 'Lorem Ipsum',
  body: 'Dummy text of the printing and typesetting industry.'
}, function (err) { ... })
// options.fields
lv.put('c', {
  id: 'c',
  title: 'Hello World',
  body: 'Bla bla bla'
}, {
  fields: { title: true } // index title only
}).then(...).catch(...) // returns promise if no callback function
Delete the document identified by `key` from the index.
Accepts an optional callback function, or returns a promise if none is given.
Perform atomic bulk-write operations, put and del, similar to the array form of LevelUP's batch().
Accepts an optional callback function, or returns a promise if none is given.
lv.batch([
  { type: 'put', key: 'a', value: 'Lorem Ipsum is simply dummy text.' },
  { type: 'del', key: 'b' }
], function (err) { ... })
Fetch a value from the store. Works exactly like LevelUP's get().
Accepts an optional callback function, or returns a promise if none is given.
Obtain a ReadStream of documents, lexicographically sorted by key.
Works exactly like LevelUP's readStream().
The main search interface of Levi is a Node-compatible Highland object stream.
`query` can be a string or object fields.
Accepts the following options:
fields: controls field boosts. By default, every field is weighted equally.
gt (greater than), gte (greater than or equal): define the lower bound of the key range to be searched.
lt (less than), lte (less than or equal): define the upper bound of the key range to be searched.
offset: number of results to skip. Default 0.
limit: maximum number of results to return. Default infinity.
expansions: maximum number of prefix-matching expansions, for "search as you type" behaviour. Default 0.
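The options above can be read as a few small rules. The sketch below mimics them on plain arrays and is not Levi's implementation. The helper names (`boostFor`, `inRange`, `page`) are hypothetical, and treating an unlisted field as weight 0 when no `'*'` entry is given is an assumption based on the "title only" usage. Key comparison here uses JavaScript string ordering, which matches bytewise ordering for ASCII keys.

```javascript
// Boost for a field: explicit entry, else the '*' wildcard, else 0 (assumed: field ignored).
// With no `fields` option every field weighs 1.
function boostFor (field, fields) {
  if (!fields) return 1
  if (field in fields) return fields[field]
  return '*' in fields ? fields['*'] : 0
}

// Keep only keys inside the range described by gt/gte/lt/lte.
function inRange (key, { gt, gte, lt, lte } = {}) {
  if (gt !== undefined && !(key > gt)) return false
  if (gte !== undefined && !(key >= gte)) return false
  if (lt !== undefined && !(key < lt)) return false
  if (lte !== undefined && !(key <= lte)) return false
  return true
}

// offset skips the first matches; limit caps how many are returned.
function page (items, { offset = 0, limit = Infinity } = {}) {
  return items.slice(offset, limit === Infinity ? undefined : offset + limit)
}

const keys = ['!pages!about', '!posts!a', '!posts!b', '!users!x']
// '~' sorts after letters and digits, so '!posts!~' closes the '!posts!' prefix range.
const posts = keys.filter((k) => inRange(k, { gt: '!posts!', lt: '!posts!~' }))
const secondPost = page(posts, { offset: 1, limit: 1 })
```

Here `posts` holds the two `!posts!` keys and `secondPost` holds only `'!posts!b'`.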
A "more like this" query can be done by searching with the document itself.
lv.searchStream('lorem ipsum').toArray(function (results) { ... }) // highland method
lv.searchStream('lorem ipsum', {
  fields: { title: 10, '*': 1 } // title field boost. '*' means any field
}).pipe(...)
lv.searchStream('lorem ipsum', {
  fields: { title: 1 } // title only
}).pipe(...)
// key range (ltgt)
lv.searchStream('lorem ipsum', {
  gt: '!posts!',
  lt: '!posts!~'
}).pipe(...)
// document as query
lv.searchStream({
  title: 'Lorem Ipsum',
  body: 'Dummy text of the printing and typesetting industry.'
}).pipe(...)
// maximum 10 expansions. 'ips' may also match 'ipso', 'ipsum' etc.
lv.searchStream('lorem ips', {
  expansions: 10
}).pipe(...)
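As a rough picture of what the expansions option bounds, the sketch below expands a partial token into at most `max` indexed terms that share its prefix. It is a guess at the general technique, not Levi's code: `expandPrefix` and the sample term list are hypothetical, and the terms are assumed to be available in sorted order.

```javascript
// Hypothetical helper: find up to `max` indexed terms that extend `prefix`.
// `sortedTerms` is assumed to be a lexicographically sorted list of indexed terms.
function expandPrefix (sortedTerms, prefix, max) {
  const out = []
  for (const term of sortedTerms) {
    if (out.length >= max) break
    if (term !== prefix && term.startsWith(prefix)) out.push(term)
  }
  return out
}

const indexedTerms = ['dummi', 'ipso', 'ipsum', 'lorem', 'text']
const expanded = expandPrefix(indexedTerms, 'ips', 10)
const none = expandPrefix(indexedTerms, 'ips', 0) // expansions: 0 disables prefix matching
```

A larger limit lets a partial token match more terms, at the cost of scoring more candidates per keystroke.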
Each result has the form:
{
  key: 'b',
  score: 0.5972843431749838,
  value: {
    id: 'b',
    title: 'Lorem Ipsum',
    body: 'Dummy text of the printing and typesetting industry.'
  }
}
The underlying scoring mechanism of searchStream(). Calculates the relevancy score of documents against `query`, lexicographically sorted by key.
Accepts the options fields, gt, gte, lt, lte and expansions.
Useful for combining multiple criteria or scoring mechanisms to build more advanced search functionality.
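Because scores arrive sorted by key, two scoring passes can be combined with a single merge join rather than by loading everything into a lookup table. The sketch below shows this on plain arrays. The `{ key, score }` item shape, the `combineScores` helper, the weights and the sample "recency" scores are all assumptions for illustration, not part of Levi's API.

```javascript
// Merge-join two lists of { key, score } that are both sorted by key,
// combining scores with hypothetical weights wa and wb.
function combineScores (a, b, wa = 1, wb = 1) {
  const out = []
  let i = 0
  let j = 0
  while (i < a.length || j < b.length) {
    const ka = i < a.length ? a[i].key : undefined
    const kb = j < b.length ? b[j].key : undefined
    if (kb === undefined || (ka !== undefined && ka < kb)) {
      out.push({ key: ka, score: wa * a[i].score }) // key only in a
      i++
    } else if (ka === undefined || kb < ka) {
      out.push({ key: kb, score: wb * b[j].score }) // key only in b
      j++
    } else {
      out.push({ key: ka, score: wa * a[i].score + wb * b[j].score }) // key in both
      i++
      j++
    }
  }
  return out
}

const textScores = [{ key: 'a', score: 0.2 }, { key: 'b', score: 0.6 }]
const recencyScores = [{ key: 'b', score: 0.5 }, { key: 'c', score: 0.9 }]
// Rank by combined score once both inputs have been merged.
const combined = combineScores(textScores, recencyScores, 1, 0.5)
  .sort((x, y) => y.score - x.score)
```

The same walk works on streams, since each step only needs the current head of each input.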
The underlying text processing pipeline for indexing and querying, which extracts text tokens from a serializable `obj` object.
Accepts an optional callback function, or returns a promise if none is given.
lv.pipeline({
  a: 'foo bar is a placeholder name',
  b: ['foo', 'bar'],
  c: 167,
  d: null,
  e: { ghjk: ['printing'] }
}, function (err, tokens) {
  // tokens
  [ 'foo', 'bar', 'placehold', 'name', 'foo', 'bar', 'print' ]
})
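The example above suggests three steps: collect every string in the object, split it into tokens, then filter and normalise them. Here is a simplified, self-contained sketch of the first two steps plus a stopword filter. It is not Levi's pipeline: the tiny stopword list, the ASCII-only tokenizer and the function names are assumptions, and stemming is left out, so it yields 'placeholder' and 'printing' where the real pipeline yields 'placehold' and 'print'.

```javascript
// Illustrative stopword list; a real filter would use a much larger one.
const STOPWORDS = new Set(['a', 'is', 'the', 'of', 'and'])

// Collect every string found in a JSON-like value, depth first.
// Numbers, booleans and null contribute nothing.
function collectText (value, out = []) {
  if (typeof value === 'string') {
    out.push(value)
  } else if (Array.isArray(value)) {
    value.forEach((v) => collectText(v, out))
  } else if (value !== null && typeof value === 'object') {
    Object.keys(value).forEach((k) => collectText(value[k], out))
  }
  return out
}

// Lowercase and split on anything that is not an ASCII letter or digit.
function tokenize (text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)
}

function sketchPipeline (obj) {
  return collectText(obj)
    .flatMap(tokenize)
    .filter((token) => !STOPWORDS.has(token))
}

const sketchTokens = sketchPipeline({
  a: 'foo bar is a placeholder name',
  b: ['foo', 'bar'],
  c: 167,
  d: null,
  e: { ghjk: ['printing'] }
})
```

Swapping `tokenize` or `STOPWORDS` for another language's rules mirrors how the pluggable pipeline stages are meant to be replaced.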
Completely remove an existing database at `path`, which deletes the database directory on Node.js or the IndexedDB database in the browser.
If you are using a custom Level backend, you need to invoke its corresponding destroy() function to remove the database properly.
Accepts an optional callback function, or returns a promise if none is given.
MIT