compression of key-value data

npm install efrt

if your data looks like this:

var data = {
  bedfordshire: 'England',
  aberdeenshire: 'Scotland',
  buckinghamshire: 'England',
  argyllshire: 'Scotland',
  bambridgeshire: 'England',
  cheshire: 'England',
  ayrshire: 'Scotland',
  banffshire: 'Scotland'
}

you can compress it like this:

import { pack } from 'efrt'
var str = pack(data)

then _very!_ quickly flip it back into:

import { unpack } from 'efrt'
var obj = unpack(str)
obj['bedfordshire']

efrt packs category-type data into a very compressed prefix trie format, so that redundancies in the data are shared, and nothing is repeated.

By doing this clever-stuff ahead-of-time, efrt lets you ship much more data to the client-side, without hassle or overhead.
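To make the prefix-sharing idea concrete, here is a minimal sketch of a plain nested-object trie. It is an illustration only and is not efrt's packed string format: `buildTrie`, `countNodes`, the `$` end-of-word marker, and the word list are all hypothetical names chosen for this example.

```javascript
// Illustrative only: a nested-object prefix trie, NOT efrt's packed format.
function buildTrie(words) {
  const root = {}
  for (const word of words) {
    let node = root
    for (const ch of word) {
      if (!node[ch]) node[ch] = {} // create a branch only when the prefix is new
      node = node[ch]
    }
    node.$ = true // hypothetical end-of-word marker
  }
  return root
}

// Count the character nodes stored in the trie (end markers excluded).
function countNodes(node) {
  let total = 0
  for (const key of Object.keys(node)) {
    if (key === '$') continue
    total += 1 + countNodes(node[key])
  }
  return total
}

const words = ['black', 'blackberry', 'blackbird', 'blue', 'blueberry']
const trie = buildTrie(words)
const rawChars = words.join('').length // 37 characters written out in full
const storedChars = countNodes(trie) // 20 character nodes once prefixes are shared
```

Each shared prefix ('bl', 'black', 'blue') is stored once, which is where the saving comes from. A real packed format has to serialize this structure into a string as well, which this sketch does not attempt.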

The whole library is 8kb, the unpack half is barely 2kb.

it is based on:

Benchmarks!

get a js object into very compact form

reduce filesize/bandwidth a bunch

ensure the unpacking time is negligible

keep word-lookups on critical-path

import { pack, unpack } from 'efrt'

var foods = {
  strawberry: 'fruit',
  blueberry: 'fruit',
  blackberry: 'fruit',
  tomato: ['fruit', 'vegetable'],
  cucumber: 'vegetable',
  pepper: 'vegetable'
}
var str = pack(foods)
var obj = unpack(str)
console.log(obj.tomato)

or, an Array:

if you pass it an array of strings, it just creates an object with true values:

const data = [
  'january',
  'february',
  'april',
  'june',
  'july',
  'august',
  'september',
  'october',
  'november',
  'december'
]
const packd = pack(data)
const sameArray = Object.keys(unpack(packd))
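As a rough picture of the shape described above, this sketch builds the "object with true values" by hand, without calling efrt. The variable names are made up for the example, and the key order of a real unpacked object is not something this sketch can promise.

```javascript
// Illustrative shape only, built by hand rather than via pack/unpack.
const months = ['january', 'february', 'april']

// Every string becomes a key whose value is true.
const asObject = {}
for (const month of months) {
  asObject[month] = true
}

// Reading the keys back recovers the list of strings.
const backToArray = Object.keys(asObject)
```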

Reserved characters

the keys of the object are normalized. Spaces/unicode are good, but numbers, case-sensitivity, and some punctuation (semicolon, comma, exclamation-mark) are not (yet) supported.

specialChars = new RegExp('[0-9A-Z,;!:|¦]')
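One way to catch unsupported keys before packing is to test them against the same character class. The sketch below assumes the pattern applies to keys, as the text above describes. `checkKeys` is a hypothetical helper, not part of efrt, and the sample data is invented.

```javascript
// Same character class as the specialChars pattern quoted above.
const specialChars = new RegExp('[0-9A-Z,;!:|¦]')

// Hypothetical helper (not part of efrt): split keys into safe and unsupported.
function checkKeys(obj) {
  const ok = []
  const unsupported = []
  for (const key of Object.keys(obj)) {
    if (specialChars.test(key)) {
      unsupported.push(key)
    } else {
      ok.push(key)
    }
  }
  return { ok, unsupported }
}

const report = checkKeys({
  'tony hawk': 'skater', // lowercase with a space: fine
  'Tony Hawk': 'skater', // uppercase letters: flagged
  'route 66': 'road', // digits: flagged
  'são paulo': 'city' // unicode: fine
})
// report.ok          -> ['tony hawk', 'são paulo']
// report.unsupported -> ['Tony Hawk', 'route 66']
```

Lowercasing keys up front would resolve the case issue, but it changes the keys you look up later, so that choice is left to the caller.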

efrt is built-for, and used heavily in compromise, to expand the amount of data it can ship onto the client-side. If you find another use for efrt, please drop us a line🎈

Performance

efrt is tuned to be very quick to unzip. It is O(1) to look up. Packing up the data is the slowest part, which is usually fine:

var compressed = pack(skateboarders)
var trie = unpack(compressed)
trie.hasOwnProperty('tony hawk')

Size

efrt will pack filesize down as much as possible, depending upon the redundancy of the prefixes/suffixes in the words, and the size of the list.

list of countries - 1.5k -> 0.8k (46% compressed)

all adverbs in wordnet - 58k -> 24k (58% compressed)

all adjectives in wordnet - 265k -> 99k (62% compressed)

all nouns in wordnet - 1,775k -> 692k (61% compressed)

but there are some things to consider:

bigger files compress further (see 🎈 birthday problem)

using efrt will reduce gains from gzip compression, which most webservers quietly use

english is more suffix-redundant than prefix-redundant, so non-english words may benefit from other styles

Assuming your data has a low category-to-data ratio, you will hit break-even at about 250 keys. If your data is in the thousands, you can be very confident about saving your users some considerable bandwidth.
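The percentages in the size list read as the share of bytes removed. The sketch below recomputes them from the rounded before/after sizes. `percentSaved` is a hypothetical helper, and truncating with `Math.floor` is an assumption that happens to reproduce the listed figures; the original numbers were presumably computed from exact byte counts.

```javascript
// Hypothetical helper: share of the original size that was removed, truncated to a whole percent.
function percentSaved(beforeKb, afterKb) {
  return Math.floor((1 - afterKb / beforeKb) * 100)
}

const savings = {
  countries: percentSaved(1.5, 0.8), // 46
  adverbs: percentSaved(58, 24), // 58
  adjectives: percentSaved(265, 99), // 62
  nouns: percentSaved(1775, 692) // 61
}
```

To judge your own data the same way, compare the length of `JSON.stringify(data)` with the length of the packed string, ideally after gzip, since gzip already removes some of the same redundancy.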

Use

IE9+

<script src="https://unpkg.com/efrt@latest/builds/efrt.min.cjs"></script>
<script>
  var smaller = efrt.pack(['larry', 'curly', 'moe'])
  var trie = efrt.unpack(smaller)
  console.log(trie['moe'])
</script>

if you're doing the second step in the client, you can load just the CJS unpack-half of the library (~3k):

const unpack = require('efrt/unpack')

<script src="https://unpkg.com/efrt@latest/builds/efrt-unpack.min.cjs"></script>
<script>
  var trie = unpack(compressedStuff)
  trie.hasOwnProperty('miles davis')
</script>

Thanks to John Resig for his fun trie-compression post on his blog, and Wiktor Jakubczyc for his performance analysis work

MIT