Date: 2022-01-17
CBOR is "Concise Binary Object Representation", defined by RFC 8949. Like JSON, but binary, more compact, and supporting a much broader range of data types.
cborg focuses on strictness and deterministic data representations. CBOR's flexibility leads to problems where determinism matters, such as in content-addressed data where your data encoding should converge on same-bytes for same-data. cborg helps alleviate these challenges.
cborg is also fast, and is suitable for the browser (it is Uint8Array native) and Node.js.
cborg supports CBOR tags, but does not ship with them enabled by default. If you want tags, you need to plug them in to the encoder and decoder.
import { encode, decode } from 'cborg'
const decoded = decode(Buffer.from('a16474686973a26269736543424f522163796179f5', 'hex'))
console.log('decoded:', decoded)
console.log('encoded:', encode(decoded))
decoded: { this: { is: 'CBOR!', yay: true } }
encoded: Uint8Array(21) [
161, 100, 116, 104, 105, 115,
162, 98, 105, 115, 101, 67,
66, 79, 82, 33, 99, 121,
97, 121, 245
]
When installed globally via npm (with npm install cborg --global), the cborg command becomes available and provides some handy CBOR CLI utilities. Run cborg help for additional details.
cborg json2hex '<json string>'
Convert a JSON object into CBOR bytes in hexadecimal format.
$ cborg json2hex '["a", "b", 1, "😀"]'
84616161620164f09f9880
cborg hex2json [--pretty] <hex string>
Convert a hexadecimal string to a JSON format.
$ cborg hex2json 84616161620164f09f9880
["a","b",1,"😀"]
$ cborg hex2json --pretty 84616161620164f09f9880
[
"a",
"b",
1,
"😀"
]
cborg hex2diag <hex string>
Convert a hexadecimal string to a CBOR diagnostic output format which explains the byte contents.
$ cborg hex2diag 84616161620164f09f9880
84 # array(4)
61 # string(1)
61 # "a"
61 # string(1)
62 # "b"
01 # uint(1)
64 f09f # string(2)
f09f9880 # "😀"
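The byte layout in the diagnostic output above can be reproduced by hand. The following is a minimal sketch, not cborg's encoder: `head`, `encodeItem` and `toHex` are hypothetical helpers, and the sketch only handles arrays, strings and unsigned integers whose length or value is below 24 (so that the CBOR head fits in one byte).

```javascript
// Minimal sketch: one-byte CBOR heads only (values and lengths below 24).
const utf8 = new TextEncoder()

// A CBOR head packs the major type into the top 3 bits and a small value into the low 5 bits.
function head (major, value) {
  if (value > 23) throw new Error('this sketch only supports values below 24')
  return [(major << 5) | value]
}

function encodeItem (item) {
  if (Array.isArray(item)) {
    // major type 4: array, followed by each encoded element
    return [...head(4, item.length), ...item.flatMap((entry) => encodeItem(entry))]
  }
  if (typeof item === 'string') {
    // major type 3: text string, length counted in UTF-8 bytes
    const bytes = utf8.encode(item)
    return [...head(3, bytes.length), ...bytes]
  }
  if (Number.isInteger(item) && item >= 0) {
    // major type 0: unsigned integer
    return head(0, item)
  }
  throw new Error('unsupported type in this sketch')
}

const toHex = (bytes) => bytes.map((b) => b.toString(16).padStart(2, '0')).join('')

console.log(toHex(encodeItem(['a', 'b', 1, '😀']))) // 84616161620164f09f9880
```

The result matches the json2hex output shown earlier: 84 is an array of four items, 61 is a one-byte string, 01 is the integer 1, and 64 introduces the four UTF-8 bytes of the emoji.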
encode(object[, options])
import { encode } from 'cborg'
const { encode } = require('cborg')
Encode a JavaScript object and return a Uint8Array with the CBOR byte representation.
If you need to encode a Date or a RegExp or another exotic type, you should either convert it to an intermediate form before encoding or enable a tag encoder (see Type encoders).
Natively supported types are: null, undefined, number, bigint, string, boolean, Array, Object, Map, Buffer, ArrayBuffer, DataView, Uint8Array and all other TypedArrays (the underlying byte array of TypedArrays is encoded, so they will all round-trip as a Uint8Array since the type information is lost).
Numbers will be encoded as integers if they don't have a fractional part (1 and 1.0 are both considered integers, they are identical in JavaScript). Otherwise they will be encoded as floats.
Integers larger than Number.MAX_SAFE_INTEGER or less than Number.MIN_SAFE_INTEGER will be encoded as floats. There is no way to safely determine whether a number has a fractional part outside of this range.
BigInts are supported by default within the 64-bit unsigned range but will also be encoded to their smallest possible representation (so will not round-trip as a BigInt if they are smaller than Number.MAX_SAFE_INTEGER). Larger BigInts require a tag (officially tags 2 and 3).
Floats are encoded in their smallest possible form unless the float64 option is supplied.
A limited set of "simple values" is supported: true, false, undefined and null. "Simple values" outside of this range are intentionally not supported (pull requests welcome to enable them with an option).
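The integer-versus-float rule described in the notes above can be sketched in plain JavaScript. `wouldEncodeAs` is a hypothetical helper that mirrors the stated rule for the number type; it is not cborg's implementation.

```javascript
// Hypothetical helper: mirrors the documented rule, not cborg's internals.
function wouldEncodeAs (n) {
  // Number.isSafeInteger is true only for values with no fractional part
  // that lie between Number.MIN_SAFE_INTEGER and Number.MAX_SAFE_INTEGER.
  return Number.isSafeInteger(n) ? 'integer' : 'float'
}

console.log(wouldEncodeAs(1)) // integer
console.log(wouldEncodeAs(1.0)) // integer (identical to 1 in JavaScript)
console.log(wouldEncodeAs(1.5)) // float
console.log(wouldEncodeAs(2 ** 53)) // float (outside the safe integer range)

// Why the safe range matters: neighbouring integers collapse to the same value.
console.log(2 ** 53 === 2 ** 53 + 1) // true
```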
float64 (boolean, default false): do not attempt to store floats as their smallest possible form, store all floats as 64-bit.
typeEncoders (object): a mapping of type name to function that can encode that type into cborg tokens. This may also be used to reject or transform types as objects are dissected for encoding. See the Type encoders section below for more information.
mapSorter (function): a function taking two arguments, where each argument is a Token, or an array of Tokens representing the keys of a map being encoded. As with other JavaScript compare functions, it should return -1, 1 or 0 (which shouldn't be possible) depending on the sorting order of the keys. See the source code for the default sorting order, which uses the length-first rule recommended by RFC 7049.
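To illustrate the length-first rule mentioned for mapSorter, here is a simplified sketch. It assumes keys are compared as their encoded bytes (Uint8Arrays); a real mapSorter receives Tokens instead, so `lengthFirstCompare` is a hypothetical stand-in for the ordering idea rather than a drop-in option value.

```javascript
// Simplified sketch: compares raw key bytes, whereas a real mapSorter receives Tokens.
function lengthFirstCompare (a, b) {
  // Shorter keys always sort first
  if (a.length !== b.length) return a.length < b.length ? -1 : 1
  // Same length: fall back to byte-wise comparison
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1
  }
  return 0 // identical keys, which a valid map should not contain
}

const encoder = new TextEncoder()
const decoder = new TextDecoder()

const keys = ['yay', 'b', 'is', 'a'].map((key) => encoder.encode(key))
const sorted = keys.sort(lengthFirstCompare).map((key) => decoder.decode(key))

console.log(sorted.join(',')) // a,b,is,yay
```

Under this ordering the keys of the introductory example, is and yay, keep the order in which they appear in its encoded bytes.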
decode(data[, options])
import { decode } from 'cborg'
const { decode } = require('cborg')
Decode valid CBOR bytes from a Uint8Array (or Buffer) and return a JavaScript object.
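Integers outside of the safe integer range will be decoded as a BigInt.
A limited set of "simple values" is supported: true, false, undefined and null. "Simple values" outside of this range are intentionally not supported (pull requests welcome to enable them with an option).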
allowIndefinite (boolean, default true): when set to false, an error will be thrown if the indefinite length additional information (31) is encountered for any type (arrays, maps, strings, bytes) or a "break" is encountered.
allowUndefined (boolean, default true): when set to false, an error will be thrown if major 7, minor 23 (undefined) is encountered. To disallow undefined on encode, a custom type encoder for 'undefined' will need to be supplied.
coerceUndefinedToNull (boolean, default false): when both allowUndefined and coerceUndefinedToNull are set to true, all undefined tokens (major 7 minor 23: 0xf7) will be coerced to null tokens, such that undefined is an allowed token but will not appear in decoded values.
allowInfinity (boolean, default true): when set to false, an error will be thrown if an IEEE 754 Infinity or -Infinity value is encountered when decoding a major 7. To disallow Infinity and -Infinity on encode, a custom type encoder for 'number' will need to be supplied.
allowNaN (boolean, default true): when set to false, an error will be thrown if an IEEE 754 NaN value is encountered when decoding a major 7. To disallow NaN on encode, a custom type encoder for 'number' will need to be supplied.
allowBigInt (boolean, default true): when set to false, an error will be thrown if an integer outside of the safe integer range is encountered. To disallow BigInts on encode, a custom type encoder for 'bigint' will need to be supplied.
strict (boolean, default false): when decoding integers, including for lengths (arrays, maps, strings, bytes), values will be checked to see whether they were encoded in their smallest possible form. If not, an error will be thrown.
useMaps (boolean, default false): when decoding major 5 (map) entries, use a Map rather than a plain Object. This will nest for any encountered map. During encode, a Map will be interpreted as an Object and will round-trip as such unless useMaps is supplied, in which case all Maps and Objects will round-trip as Maps. There is no way to retain the distinction during round-trip without using a custom tag.
tags (array): a mapping of tag number to tag decoder function. By default no tags are supported. See Tag decoders.
tokenizer (object): an object with two methods, next() which returns a Token and done() which returns a boolean. Can be used to implement custom input decoding. See the source code for examples.
The typeEncoders property of the options argument to encode() allows you to add additional functionality to cborg, or override existing functionality.
When converting JavaScript objects, types are differentiated using the method and naming used by @sindresorhus/is (a custom implementation is used internally for performance reasons) and an internal set of type encoders are used to convert objects to their appropriate CBOR form. Supported types are:
null, undefined, number, bigint, string, boolean, Array, Object, Map, Buffer, ArrayBuffer, DataView, Uint8Array and all other TypedArrays (their underlying byte array is encoded, so they will all round-trip as a Uint8Array since the type information is lost). Any object that doesn't match a type in this list will cause an error to be thrown during encode, e.g. encode(new Date()) will throw an error because there is no internal Date type encoder.
The typeEncoders option is an object whose property names match @sindresorhus/is type names. When this option is provided and a property exists for any given object's type, the function provided as the value to that property is called with the object as an argument.
If a type encoder function returns null, the default encoder, if any, is used instead.
If a type encoder function returns an array, cborg will expect it to contain zero or more Token objects that will be encoded to binary form.
Tokens map directly to CBOR entities. Each one has a Type and a value. A type encoder is responsible for turning a JavaScript object into a set of tokens.
This example is available from the cborg taglib as bigIntEncoder (import { bigIntEncoder } from 'cborg/taglib') and implements CBOR tags 2 and 3 (bigint and negative bigint). This function would be registered using an options parameter { typeEncoders: { bigint: bigIntEncoder } }. All objects that have a type bigint will pass through this function.
import { Token, Type } from './cborg.js'

function bigIntEncoder (obj) {
  // check whether this BigInt could fit within a standard CBOR 64-bit int or less
  if (obj >= -1n * (2n ** 64n) && obj <= (2n ** 64n) - 1n) {
    return null // handle this as a standard int or negint
  }
  // it's larger than a 64-bit int, encode as tag 2 (positive) or 3 (negative)
  return [
    new Token(Type.tag, obj >= 0n ? 2 : 3),
    new Token(Type.bytes, fromBigInt(obj >= 0n ? obj : obj * -1n - 1n))
  ]
}

function fromBigInt (i) { /* returns a Uint8Array, omitted from example */ }
This example encoder demonstrates the ability to pass through to the default encoder, or convert to a series of custom tokens. In this case we can put any arbitrarily large BigInt into a byte array using the standard CBOR tag 2 and 3 types.
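The body of fromBigInt is omitted from the example above. One possible way to write it is sketched below, assuming the helper should return the big-endian bytes of a non-negative BigInt with no leading zero bytes; this is an illustration, not necessarily how the cborg taglib implements it.

```javascript
// One possible fromBigInt: big-endian bytes of a non-negative BigInt.
// Assumption: the caller has already mapped negative values to (-1 - value),
// as the encoder above does for tag 3.
function fromBigInt (i) {
  if (i < 0n) throw new RangeError('expected a non-negative BigInt')
  const bytes = []
  while (i > 0n) {
    bytes.unshift(Number(i & 0xffn)) // lowest byte goes to the front as we shift right
    i >>= 8n
  }
  return Uint8Array.from(bytes)
}

// 2^64 is the first positive value that no longer fits a 64-bit CBOR integer
console.log(Array.from(fromBigInt(2n ** 64n)).join(' ')) // 1 0 0 0 0 0 0 0 0
```

Reading the bytes back by shifting eight bits at a time, as a tag 2 decoder would, returns the original value.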
Valid Token types for the first argument to Token() are:
Type.uint
Type.negint
Type.bytes
Type.string
Type.array
Type.map
Type.tag
Type.float
Type.false
Type.true
Type.null
Type.undefined
Type.break
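Using type encoders we can:
Override the default encoder entirely (always return an array of Tokens)
Override the default encoder for a subset of values (use null as a pass-through)
Convert an object to something else entirely (such as a tag, or turn all numbers into floats)
Throw an error for a type that would otherwise be supported (e.g. undefined)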
By default cborg does not support decoding of any tags. Where a tag is encountered during decode, an error will be thrown. If tag support is needed, tag decoders will need to be supplied as options to the decode() function. The tags property should contain an array where the indexes correspond to the tag numbers that are encountered during decode, and the values are functions that are able to turn the following token(s) into a JavaScript object. Each tag token in CBOR is followed by a data item, often a byte array of arbitrary length, but it can be a more complex series of tokens that form a nested data item. This data item is supplied to the tag decoder function.
This example is available from the cborg taglib as bigIntDecoder and bigNegIntDecoder (import { bigIntDecoder, bigNegIntDecoder } from 'cborg/taglib') and implements CBOR tags 2 and 3 (bigint and negative bigint). These functions would be registered using an options parameter:
const tags = []
tags[2] = bigIntDecoder
tags[3] = bigNegIntDecoder
decode(bytes, { tags })
Implementation:
function bigIntDecoder (bytes) {
  let bi = 0n
  for (let ii = 0; ii < bytes.length; ii++) {
    bi = (bi << 8n) + BigInt(bytes[ii])
  }
  return bi
}

function bigNegIntDecoder (bytes) {
  return -1n - bigIntDecoder(bytes)
}
cborg is designed with deterministic encoding forms as a primary feature. It is suitable for use with content addressed systems or other systems where convergence of binary forms is important. The ideal is to have strictly one way of mapping a set of data into a binary form. Unfortunately CBOR has many opportunities for flexibility, including:
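Varying integer sizes with no strict requirement for their encoding, e.g. a 1 may be encoded as 0x01, 0x1801, 0x190001, 0x1a00000001 or 0x1b0000000000000001.
Varying integer sizes used as lengths for lengthed objects (arrays, maps, strings, bytes), e.g. a single-entry array could specify its length using any of the above forms for 1. Tags can also vary in size and still represent the same number.
IEEE 754 allows NaN, Infinity and -Infinity to be represented in many different ways, meaning it is possible to represent the same data using many different byte forms.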
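The integer-size flexibility listed above can be checked directly. The sketch below reads a CBOR unsigned integer (major type 0) from raw bytes; `readUint` and `fromHex` are hypothetical helpers written for this illustration and are not part of cborg.

```javascript
// Sketch: read a CBOR unsigned integer (major type 0) from the start of `bytes`.
function readUint (bytes) {
  const major = bytes[0] >> 5 // top 3 bits
  const info = bytes[0] & 0x1f // low 5 bits: "additional information"
  if (major !== 0) throw new Error('not an unsigned integer')
  if (info < 24) return { value: BigInt(info), length: 1 } // value stored in the head itself
  if (info > 27) throw new Error('unsupported additional information')
  const size = 1 << (info - 24) // 24, 25, 26, 27 -> 1, 2, 4 or 8 following bytes
  let value = 0n
  for (let i = 1; i <= size; i++) value = (value << 8n) | BigInt(bytes[i])
  return { value, length: 1 + size }
}

const fromHex = (hex) => Uint8Array.from(hex.match(/../g), (pair) => parseInt(pair, 16))

const forms = ['01', '1801', '190001', '1a00000001', '1b0000000000000001']
for (const hex of forms) {
  const { value, length } = readUint(fromHex(hex))
  console.log(`${hex} -> ${value} (${length} bytes)`)
}
```

All five inputs yield the value 1, using 1, 2, 3, 5 and 9 bytes respectively; only the first is the smallest possible form.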
By default, cborg will always encode objects to the same bytes by applying some strictness rules:
Deterministic number differentiation: if a fractional part is missing and the value is within the safe integer boundary, it's encoded as an integer, otherwise it's encoded as a float.
By default, cborg allows for some flexibility on decode of objects, which will present some challenges if users wish to impose strictness requirements at both serialization and deserialization. Options that can be provided to decode() to impose some strictness requirements are:
strict: true to impose strict sizing rules for ints, negative ints and lengths of lengthed objects
allowNaN: false and allowInfinity: false to prevent decoding of any value that would resolve to NaN, Infinity or -Infinity, using CBOR tokens or IEEE 754 representation, as long as your application can do without these symbols
allowIndefinite: false to disallow indefinite lengthed objects and the "break" token
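As background for why NaN may be worth disallowing where byte-level convergence matters, the sketch below shows that several distinct 64-bit patterns all read as NaN, while Infinity has a single pattern per sign. `doubleFromHex` is a hypothetical helper using only standard DataView calls.

```javascript
// Interpret eight big-endian bytes (given as hex) as an IEEE 754 double.
function doubleFromHex (hex) {
  const view = new DataView(new ArrayBuffer(8))
  hex.match(/../g).forEach((pair, i) => view.setUint8(i, parseInt(pair, 16)))
  return view.getFloat64(0) // DataView reads big-endian by default
}

// Three different bit patterns, one JavaScript value
console.log(Number.isNaN(doubleFromHex('7ff8000000000000'))) // true
console.log(Number.isNaN(doubleFromHex('7ff8000000000001'))) // true
console.log(Number.isNaN(doubleFromHex('fff8000000000000'))) // true

console.log(doubleFromHex('7ff0000000000000')) // Infinity
console.log(doubleFromHex('fff0000000000000')) // -Infinity
```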
Currently, there are two areas where cborg cannot impose strictness requirements (pull requests welcome!):
There are a number of forms where an object will not round-trip precisely. If this matters for an application, care should be taken, or certain types should be disallowed entirely during encode.
All TypedArrays will decode as Uint8Arrays, unless a custom tag is used.
Both Map and Object will be encoded as a CBOR map, as will any other object that inherits from Object that can't be differentiated by the @sindresorhus/is algorithm. They will all decode as Object by default, or Map if useMaps is set to true, e.g. { foo: new Map() } will round-trip to { foo: {} } by default.
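To make the TypedArray case concrete, the sketch below uses only standard JavaScript (no cborg calls) to show what "underlying byte array" means and why the element type has to be supplied by the consumer afterwards. The variable names are illustrative only.

```javascript
const floats = new Float32Array([1.5, -2])

// The same memory viewed as raw bytes: this is all that a byte-string encoding can carry
const bytes = new Uint8Array(floats.buffer, floats.byteOffset, floats.byteLength)
console.log(bytes.length) // 8 (two 4-byte floats)

// Nothing in `bytes` says it was a Float32Array; the reader has to know that.
// slice() copies into a fresh, aligned buffer before reinterpreting.
const restored = new Float32Array(bytes.slice().buffer)
console.log(Array.from(restored)) // [ 1.5, -2 ]
```

If the original element type matters, an application would need to carry it separately, for example through a custom tag as noted above.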
cborg can also encode and decode JSON using the same pipeline and many of the same settings. For most (but not all) cases it will be faster to use JSON.parse() and JSON.stringify(). However, cborg provides much more control over the process to handle determinism and be more restrictive in allowable forms. It also operates natively with Uint8Arrays rather than strings, which may offer some minor efficiency or usability gains in some circumstances.
Use import { encode, decode } from 'cborg/json' or const { encode, decode } = require('cborg/json') to access the JSON handling encoder and decoder.
Many of the same encode and decode options available for CBOR can be used to manage JSON handling. These include strictness requirements for decode and custom type encoders for encode. Type encoders can't create new tags as there are no tags in JSON, but they can replace JavaScript object forms with custom JSON forms (e.g. convert a Uint8Array to a valid JSON form rather than having the encoder throw an error). The inverse is also possible, turning specific JSON forms into JavaScript forms, by using a custom tokenizer on decode.
Special notes on options specific to JSON:
allowBigInt option: is repurposed for the JSON decoder and defaults to false. When false, all numbers are decoded as Number, possibly losing precision when encountering numbers outside of the JavaScript safe integer range. When true, numbers that have a decimal point (., even if just .0) are returned as a Number, but numbers without a decimal point that are outside of the JavaScript safe integer range are returned as BigInts. This behaviour differs from CBOR decoding, which will error when decoding integer and negative integer tokens that are outside of the JavaScript safe integer range if allowBigInt is false.
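The allowBigInt rule for JSON numbers can be sketched as follows. `parseJsonNumber` is a hypothetical helper that mirrors the description above; it is not cborg's tokenizer, and it assumes that only plain digit sequences (no decimal point, no exponent) are candidates for BigInt.

```javascript
// Hypothetical helper mirroring the described rule; not cborg's implementation.
function parseJsonNumber (text, allowBigInt = false) {
  const asNumber = Number(text)
  // Assumption: only optional sign plus digits counts as "without a decimal point"
  const looksIntegral = /^-?\d+$/.test(text)
  if (allowBigInt && looksIntegral && !Number.isSafeInteger(asNumber)) {
    return BigInt(text) // keep full precision
  }
  return asNumber
}

console.log(parseJsonNumber('9007199254740993')) // 9007199254740992 (precision lost)
console.log(parseJsonNumber('9007199254740993', true)) // 9007199254740993n
console.log(parseJsonNumber('9007199254740993.0', true)) // 9007199254740992
console.log(parseJsonNumber('42', true)) // 42
```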
See @ipld/dag-json for an advanced use of the cborg JSON encoder and decoder including round-tripping of Uint8Arrays and custom JavaScript classes (IPLD CID objects in this case).
Similar to the CBOR example above, using JSON:
import { encode, decode } from 'cborg/json'
const decoded = decode(Buffer.from('7b2274686973223a7b226973223a224a534f4e21222c22796179223a747275657d7d', 'hex'))
console.log('decoded:', decoded)
console.log('encoded:', encode(decoded))
console.log('encoded (string):', Buffer.from(encode(decoded)).toString())
decoded: { this: { is: 'JSON!', yay: true } }
encoded: Uint8Array(34) [
123, 34, 116, 104, 105, 115, 34, 58,
123, 34, 105, 115, 34, 58, 34, 74,
83, 79, 78, 33, 34, 44, 34, 121,
97, 121, 34, 58, 116, 114, 117, 101,
125, 125
]
encoded (string): {"this":{"is":"JSON!","yay":true}}
