Date: 2022-01-29
unified is an interface for processing text using syntax trees. It’s what powers remark (Markdown), retext (natural language), and rehype (HTML), and allows for processing between formats.
unified enables new exciting projects like Gatsby to pull in Markdown, MDX to embed JSX, and Prettier to format it. It’s used in about 500k projects on GitHub and has about 25m downloads each month on npm: you’re probably using it. Some notable users are Node.js, Vercel, Netlify, GitHub, Mozilla, WordPress, Adobe, Facebook, Google, and many more.
For an introduction to unified, visit unifiedjs.com and peruse its Learn section.
This package is ESM only: Node 12+ is needed to use it and it must be imported instead of required.
npm:
npm install unified
import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeDocument from 'rehype-document'
import rehypeFormat from 'rehype-format'
import rehypeStringify from 'rehype-stringify'
import {reporter} from 'vfile-reporter'
unified()
  .use(remarkParse)
  .use(remarkRehype)
  .use(rehypeDocument, {title: '👋🌍'})
  .use(rehypeFormat)
  .use(rehypeStringify)
  .process('# Hello world!')
  .then(
    (file) => {
      console.error(reporter(file))
      console.log(String(file))
    },
    (error) => {
      // Handle your error here!
      throw error
    }
  )
Yields:
no issues found
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>👋🌍</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
<h1>Hello world!</h1>
</body>
</html>
unified is an interface for processing text using syntax trees. Syntax trees are a representation of text understandable to programs. Those programs, called plugins, take these trees and inspect and modify them. To get to the syntax tree from text, there is a parser. To get from that back to text, there is a compiler. This is the process of a processor.
| ........................ process ........................... |
| .......... parse ... | ... run ... | ... stringify ..........|

          +--------+                     +----------+
Input ->- | Parser | ->- Syntax Tree ->- | Compiler | ->- Output
          +--------+          |          +----------+
                              X
                              |
                       +--------------+
                       | Transformers |
                       +--------------+
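As a rough illustration of the three phases in the diagram, the sketch below wires a toy parser, one transformer, and a toy compiler together in plain JavaScript. It does not use unified: the names `parse`, `run`, `stringify`, and `shout` are hypothetical, and the "syntax" (one paragraph per line) is an assumption made for brevity.

```javascript
// Toy parser: text -> syntax tree (one `paragraph` node per line).
function parse(text) {
  return {
    type: 'root',
    children: text.split('\n').map((line) => ({type: 'paragraph', value: line}))
  }
}

// Toy run phase: pass the tree through each transformer in order.
// A transformer may return a new tree; otherwise the same tree is kept.
function run(tree, transformers) {
  return transformers.reduce(
    (current, transformer) => transformer(current) || current,
    tree
  )
}

// Toy compiler: syntax tree -> text.
function stringify(tree) {
  return tree.children.map((node) => node.value).join('\n')
}

// Hypothetical transformer: upper-cases every paragraph in place.
function shout(tree) {
  for (const node of tree.children) node.value = node.value.toUpperCase()
}

const output = stringify(run(parse('hello\nworld'), [shout]))
console.log(output) // Prints "HELLO" and "WORLD" on two lines.
```

A real processor also threads a file through each phase; that is left out here to keep the three steps visible.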
Every processor implements another processor. To create a processor, call another processor. The new processor is configured to work the same as its ancestor. But when the descendant processor is configured in the future it does not affect the ancestral processor.
When processors are exposed from a module (for example, unified itself) they should not be configured directly, as that would change their behavior for all module users. Those processors are frozen and they should be called to create a new processor before they are used.
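The ancestor and descendant relationship can be pictured with a small stand-in. This is a sketch only, not how unified is implemented: `createProcessor` and its `plugins` accessor are hypothetical names, and plugins are represented by plain strings.

```javascript
// Hypothetical factory: each processor is a function that, when called,
// creates a descendant configured with a copy of its own plugin list.
function createProcessor(inherited = []) {
  const plugins = [...inherited]

  function processor() {
    return createProcessor(plugins)
  }

  // Configure this processor only; ancestors keep their own list.
  processor.use = (plugin) => {
    plugins.push(plugin)
    return processor
  }

  // Expose a copy so callers cannot mutate the internal list.
  processor.plugins = () => [...plugins]

  return processor
}

const base = createProcessor().use('parser')
const child = base().use('formatter')

console.log(base.plugins()) // The ancestor still has only 'parser'.
console.log(child.plugins()) // The descendant has 'parser' and 'formatter'.
```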
The syntax trees used in unified are unist nodes. A node is a plain JavaScript object with a type field. The semantics of nodes and the format of syntax trees are defined by other projects.
There are several utilities for working with nodes.
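To make "plain object with a type field" concrete, here is a hand-written tree and a tiny recursive walk over it. The node types used (`root`, `heading`, `paragraph`, `text`) are chosen for illustration, and `collectTypes` is a hypothetical helper; in practice one of the existing utilities would likely be used instead.

```javascript
// A hand-written tree: every node is a plain object with a `type` field.
const tree = {
  type: 'root',
  children: [
    {type: 'heading', depth: 1, children: [{type: 'text', value: 'Hello'}]},
    {type: 'paragraph', children: [{type: 'text', value: 'world'}]}
  ]
}

// Hypothetical helper: collect the `type` of every node, depth first.
function collectTypes(node, types = []) {
  types.push(node.type)
  for (const child of node.children || []) collectTypes(child, types)
  return types
}

const types = collectTypes(tree)
console.log(types.join(', ')) // root, heading, text, paragraph, text
```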
The following projects process different syntax tree formats. They parse text to a syntax tree and compile that back to text. These processors can be used as is, or their parser and compiler can be mixed and matched with unified and plugins to process between different syntaxes.
The below plugins work with unified, on all syntax tree formats:
unified-diff: ignore messages for unchanged lines in Travis
unified-message-control: enable, disable, and ignore messages
See remark, rehype, and retext for their lists of plugins.
When processing a document, metadata is often gathered about that document. vfile is a virtual file format that stores data, metadata, and messages about files for unified and its plugins.
There are several utilities for working with these files.
Processors are configured with plugins or with the data method.
unified can integrate with the file system with unified-engine. CLI apps can be created with unified-args, Gulp plugins with unified-engine-gulp, and Atom Linters with unified-engine-atom.
unified-stream provides a streaming interface.
The API provided by unified allows multiple files to be processed and gives access to metadata (such as lint messages):
import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkPresetLintMarkdownStyleGuide from 'remark-preset-lint-markdown-style-guide'
import remarkRetext from 'remark-retext'
import retextEnglish from 'retext-english'
import retextEquality from 'retext-equality'
import remarkRehype from 'remark-rehype'
import rehypeStringify from 'rehype-stringify'
import {reporter} from 'vfile-reporter'
unified()
  .use(remarkParse)
  .use(remarkPresetLintMarkdownStyleGuide)
  .use(remarkRetext, unified().use(retextEnglish).use(retextEquality))
  .use(remarkRehype)
  .use(rehypeStringify)
  .process('*Emphasis* and _stress_, you guys!')
  .then(
    (file) => {
      console.error(reporter(file))
      console.log(String(file))
    },
    (error) => {
      // Handle your error here!
      throw error
    }
  )
Yields:
1:16-1:24 warning Emphasis should use `*` as a marker emphasis-marker remark-lint
1:30-1:34 warning `guys` may be insensitive, use `people`, `persons`, `folks` instead gals-man retext-equality
⚠ 2 warnings
<p><em>Emphasis</em> and <em>stress</em>, you guys!</p>
Processors can be combined in two modes.
Bridge mode transforms the syntax tree from one format (origin) to another (destination). Another processor runs on the destination tree. Finally, the original processor continues transforming the origin tree.
Mutate mode also transforms the syntax tree from one format to another. But the original processor continues transforming the destination tree.
In the previous example (“Programming interface”), remark-retext is used in bridge mode: the origin syntax tree is kept after retext is done; whereas remark-rehype is used in mutate mode: it sets a new syntax tree and discards the origin tree.
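The difference between the two modes comes down to whether the transformer hands back a new tree. The sketch below shows that with stand-ins: `toDestination`, `bridge`, `mutate`, and `run` are hypothetical names, and the runner assumes only that a returned tree replaces the current one.

```javascript
// Hypothetical converter from an "origin" tree to a "destination" tree.
function toDestination(origin) {
  return {
    type: 'destinationRoot',
    children: origin.children.map((node) => ({...node}))
  }
}

// Bridge mode: run something on the destination tree and return nothing,
// so the next transformer still receives the origin tree.
function bridge(inspect) {
  return (origin) => {
    inspect(toDestination(origin))
  }
}

// Mutate mode: return the destination tree, so it replaces the origin tree.
function mutate() {
  return (origin) => toDestination(origin)
}

// Minimal runner: a returned tree replaces the current one.
function run(tree, transformers) {
  return transformers.reduce(
    (current, transformer) => transformer(current) || current,
    tree
  )
}

const origin = {type: 'originRoot', children: [{type: 'text', value: 'hi'}]}
const seen = []

const afterBridge = run(origin, [bridge((tree) => seen.push(tree.type))])
const afterMutate = run(origin, [mutate()])

console.log(seen[0]) // destinationRoot
console.log(afterBridge.type) // originRoot
console.log(afterMutate.type) // destinationRoot
```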
This package exports the following identifiers: unified. There is no default export.
processor()
Processor describing how to process text.
Function: New unfrozen processor that is configured to work the same as its ancestor. When the descendant processor is configured in the future it does not affect the ancestral processor.
The following example shows how a new processor can be created (from the remark processor) and linked to stdin(4) and stdout(4).
import {remark} from 'remark'
import concatStream from 'concat-stream'
process.stdin.pipe(
  concatStream((buf) => {
    process.stdout.write(remark().processSync(buf).toString())
  })
)
processor.use(plugin[, options])
Configure the processor to use a plugin and optionally configure that plugin with options.
If the processor is already using this plugin, the previous plugin configuration is changed based on the options that are passed in. The plugin is not added a second time.
processor.use(plugin[, options])
processor.use(preset)
processor.use(list)
plugin (Attacher)
options (*, optional): Configuration for plugin
preset (Object): Object with an optional plugins (set to list), and/or an optional settings object
list (Array): List of plugins, presets, and pairs (plugin and options in an array)
processor: The processor that use was called on.
use cannot be called on frozen processors. Call the processor first to create a new unfrozen processor.
There are many ways to pass plugins to .use(). The below example gives an overview.
import {unified} from 'unified'
unified()
  // Plugin with options:
  .use(pluginA, {x: true, y: true})
  // Passing the same plugin again merges configuration (to `{x: true, y: false, z: true}`):
  .use(pluginA, {y: false, z: true})
  // Plugins:
  .use([pluginB, pluginC])
  // Two plugins, the second with options:
  .use([pluginD, [pluginE, {}]])
  // Preset with plugins and settings:
  .use({plugins: [pluginF, [pluginG, {}]], settings: {position: false}})
  // Settings only:
  .use({settings: {position: false}})
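For flat option objects like the two passed to `pluginA` above, the merged result matches what an object spread produces. This is only a sketch of the outcome stated in the comment; it makes no claim about how unified merges nested options internally, and the variable names are illustrative.

```javascript
// Options from the first and second `.use(pluginA, …)` calls above.
const first = {x: true, y: true}
const second = {y: false, z: true}

// Later values win; keys that are only in `first` are kept.
const merged = {...first, ...second}

console.log(merged.x, merged.y, merged.z) // true false true
```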
processor.parse(file)
Parse text to a syntax tree.
Node: Parsed syntax tree representing file.
parse freezes the processor if not already frozen. parse performs the parse phase, not the run phase or other phases.
The below example shows how parse can be used to create a syntax tree from a file.
import {unified} from 'unified'
import remarkParse from 'remark-parse'
const tree = unified().use(remarkParse).parse('# Hello world!')
console.log(tree)
Yields:
{
  type: 'root',
  children: [
    {type: 'heading', depth: 1, children: [Array], position: [Position]}
  ],
  position: {
    start: {line: 1, column: 1, offset: 0},
    end: {line: 1, column: 15, offset: 14}
  }
}
processor.Parser
A parser handles the parsing of text to a syntax tree. Used in the parse phase and called with a string and VFile representation of the text to parse.
Parser can be a function, in which case it must return a Node: the syntax tree representation of the given file.
Parser can also be a constructor function (a function with a parse field, or other fields, in its prototype), in which case it’s constructed with new. Instances must have a parse method that is called without arguments and must return a Node.
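The two parser shapes described above can be sketched side by side. Everything here is illustrative: `functionParser`, `ClassParser`, and `callParser` are hypothetical names, the file is a plain object stand-in, and passing the same two arguments to the constructor is an assumption.

```javascript
// Shape 1: a plain function, called with the text and a file, returning a node.
function functionParser(doc, file) {
  return {type: 'root', children: [{type: 'text', value: doc}]}
}

// Shape 2: a constructor with `parse` on its prototype. It is constructed
// with `new`, and `parse` is then called without arguments.
class ClassParser {
  constructor(doc, file) {
    this.doc = doc
    this.file = file
  }

  parse() {
    return {type: 'root', children: [{type: 'text', value: this.doc}]}
  }
}

// Hypothetical dispatcher that mimics the two calling conventions.
function callParser(Parser, doc, file) {
  const isConstructor = Parser.prototype && 'parse' in Parser.prototype
  return isConstructor ? new Parser(doc, file).parse() : Parser(doc, file)
}

const fromFunction = callParser(functionParser, 'hi', {})
const fromClass = callParser(ClassParser, 'hi', {})

console.log(fromFunction.children[0].value) // hi
console.log(fromClass.children[0].value) // hi
```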
processor.stringify(node[, file])
Compile a syntax tree.
node (Node): Syntax tree to compile
file (VFile, optional): File, any value accepted by vfile()
string or Buffer (see notes): Textual representation of the syntax tree
stringify freezes the processor if not already frozen. stringify performs the stringify phase, not the run phase or other phases.
unified typically compiles by serializing: most compilers return string (or Buffer). Some compilers, such as the one configured with rehype-react, return other values (in this case, a React tree). If you’re using a compiler that doesn’t serialize, expect different result values. When using TypeScript, cast the type on your side.
The below example shows how stringify can be used to serialize a syntax tree.
import {unified} from 'unified'
import rehypeStringify from 'rehype-stringify'
import {h} from 'hastscript'
const tree = h('h1', 'Hello world!')
const doc = unified().use(rehypeStringify).stringify(tree)
console.log(doc)
Yields:
<h1>Hello world!</h1>
processor.Compiler
A compiler handles the compiling of a syntax tree to text. Used in the stringify phase and called with a Node and VFile representation of syntax tree to compile.
Compiler can be a function, in which case it should return a string: the textual representation of the syntax tree.
Compiler can also be a constructor function (a function with a compile field, or other fields, in its prototype), in which case it’s constructed with new. Instances must have a compile method that is called without arguments and should return a string.
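A compiler in its function form might look like the sketch below. It is a toy under stated assumptions: the `element` and `text` node shapes are simplified, `compiler` is a hypothetical name, the file argument is an unused stand-in, and no escaping is done, so it is not suitable for real HTML.

```javascript
// Hypothetical compiler function: receives a node (and a file, unused here)
// and returns the textual representation of the tree.
function compiler(node, file) {
  if (node.type === 'text') return node.value

  const content = (node.children || [])
    .map((child) => compiler(child, file))
    .join('')

  // Wrap elements in tags; other parents (such as `root`) only emit content.
  return node.type === 'element'
    ? `<${node.tagName}>${content}</${node.tagName}>`
    : content
}

const tree = {
  type: 'root',
  children: [
    {type: 'element', tagName: 'h1', children: [{type: 'text', value: 'Hello world!'}]}
  ]
}

const html = compiler(tree, {})
console.log(html) // <h1>Hello world!</h1>
```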
processor.run(node[, file][, done])
Run transformers on a syntax tree.
node (Node): Syntax tree to run on
file (VFile, optional): File, any value accepted by vfile()
done (Function, optional): Callback
Promise if done is not given. The returned promise is rejected with a fatal error, or resolved with the transformed syntax tree.
run freezes the processor if not already frozen. run performs the run phase, not other phases.
function done(err[, node, file])
Callback called when transformers are done. Called with either an error or results.
err (Error, optional): Fatal error
node (Node, optional): Transformed syntax tree
file (VFile, optional): File
The below example shows how run can be used to transform a syntax tree.
import {unified} from 'unified'
import remarkReferenceLinks from 'remark-reference-links'
import {u} from 'unist-builder'
const tree = u('root', [
  u('paragraph', [
    u('link', {href: 'https://example.com'}, [u('text', 'Example Domain')])
  ])
])
unified()
  .use(remarkReferenceLinks)
  .run(tree)
  .then(
    (changedTree) => console.log(changedTree),
    (error) => {
      // Handle your error here!
      throw error
    }
  )
Yields:
{
  type: 'root',
  children: [
    {type: 'paragraph', children: [Array]},
    {type: 'definition', identifier: '1', title: undefined, url: undefined}
  ]
}
processor.runSync(node[, file])
Run transformers on a syntax tree.
An error is thrown if asynchronous plugins are configured.
node (Node): Syntax tree to run on
file (VFile, optional): File, any value accepted by vfile()
Node: Transformed syntax tree.
runSync freezes the processor if not already frozen. runSync performs the run phase, not other phases.
processor.process(file[, done])
Process the given file as configured on the processor.
Promise if done is not given. The returned promise is rejected with a fatal error, or resolved with the processed file.
The parsed, transformed, and compiled value is exposed on file.value or file.result (see notes).
process freezes the processor if not already frozen. process performs the parse, run, and stringify phases.
unified typically compiles by serializing: most compilers return string (or Buffer). Some compilers, such as the one configured with rehype-react, return other values (in this case, a React tree). If you’re using a compiler that serializes, the result is available at file.value. Otherwise, the result is available at file.result.
The below example shows how process can be used to process a file, whether transformers are asynchronous or not, with promises.
import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeDocument from 'rehype-document'
import rehypeFormat from 'rehype-format'
import rehypeStringify from 'rehype-stringify'
unified()
  .use(remarkParse)
  .use(remarkRehype)
  .use(rehypeDocument, {title: '👋🌍'})
  .use(rehypeFormat)
  .use(rehypeStringify)
  .process('# Hello world!')
  .then(
    (file) => console.log(String(file)),
    (error) => {
      // Handle your error here!
      throw error
    }
  )
Yields:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>👋🌍</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
<h1>Hello world!</h1>
</body>
</html>
function done(err, file)
Callback called when the process is done. Called with a fatal error, if any, and a file.
The below example shows how process can be used to process a file, whether transformers are asynchronous or not, with a callback.
import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkGithub from 'remark-github'
import remarkStringify from 'remark-stringify'
import {reporter} from 'vfile-reporter'
unified()
  .use(remarkParse)
  .use(remarkGithub)
  .use(remarkStringify)
  .process('@unifiedjs', (error, file) => {
    // Handle your error here!
    if (error) throw error
    console.error(reporter(file))
    console.log(String(file))
  })
Yields:
no issues found
[**@unifiedjs**](https://github.com/unifiedjs)
processor.processSync(file|value)
Process the given file as configured on the processor.
An error is thrown if asynchronous plugins are configured.
The parsed, transformed, and compiled value is exposed on file.value or file.result (see notes).
processSync freezes the processor if not already frozen. processSync performs the parse, run, and stringify phases.
unified typically compiles by serializing: most compilers return string (or Buffer). Some compilers, such as the one configured with rehype-react, return other values (in this case, a React tree). If you’re using a compiler that serializes, the result is available at file.value. Otherwise, the result is available at file.result.
The below example shows how processSync can be used to process a file, if all transformers are synchronous.
import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeDocument from 'rehype-document'
import rehypeFormat from 'rehype-format'
import rehypeStringify from 'rehype-stringify'
const processor = unified()
  .use(remarkParse)
  .use(remarkRehype)
  .use(rehypeDocument, {title: '👋🌍'})
  .use(rehypeFormat)
  .use(rehypeStringify)
console.log(processor.processSync('# Hello world!').toString())
Yields:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>👋🌍</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
<h1>Hello world!</h1>
</body>
</html>
processor.data([key[, value]])
Configure the processor with information available to all plugins. Information is stored in an in-memory key-value store.
Typically, options can be given to a specific plugin, but sometimes it makes sense to have information shared with several plugins. For example, a list of HTML elements that are self-closing, which is needed during all phases of the process.
processor = processor.data(key, value)
processor = processor.data(values)
value = processor.data(key)
info = processor.data()
key (string, optional): Identifier
value (*, optional): Value to set
values (Object, optional): Values to set
processor: If setting, the processor that data is called on
value (*): If getting, the value at key
info (Object): Without arguments, the key-value store
Setting information cannot occur on frozen processors. Call the processor first to create a new unfrozen processor.
The following example shows how to get and set information:
import {unified} from 'unified'
const processor = unified().data('alpha', 'bravo')
processor.data('alpha') // => 'bravo'
processor.data() // => {alpha: 'bravo'}
processor.data({charlie: 'delta'})
processor.data() // => {charlie: 'delta'}
processor.freeze()
Freeze a processor. Frozen processors are meant to be extended and not to be configured directly.
Once a processor is frozen it cannot be unfrozen. New processors working the same way can be created by calling the processor.
It’s possible to freeze processors explicitly by calling .freeze(). Processors freeze implicitly when .parse(), .run(), .runSync(), .stringify(), .process(), or .processSync() are called.
processor: The processor that freeze was called on.
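The guard that freezing implies can be sketched with a flag. This is a stand-in, not unified's code: `createProcessor` is a hypothetical name, plugins are plain strings, and only the error message is borrowed from the output shown later in this section.

```javascript
// Hypothetical processor with a `frozen` flag guarding configuration.
function createProcessor() {
  const plugins = []
  let frozen = false

  return {
    use(plugin) {
      if (frozen) throw new Error('Cannot call `use` on a frozen processor.')
      plugins.push(plugin)
      return this
    },
    // Once set, the flag is never cleared: a frozen processor stays frozen.
    freeze() {
      frozen = true
      return this
    }
  }
}

const processor = createProcessor().use('a').freeze()

let message = 'no error'
try {
  processor.use('b')
} catch (error) {
  message = error.message
}

console.log(message) // Cannot call `use` on a frozen processor.
```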
The following example, index.js, shows how rehype prevents extensions to itself:
import {unified} from 'unified'
import rehypeParse from 'rehype-parse'
import rehypeStringify from 'rehype-stringify'
export const rehype = unified().use(rehypeParse).use(rehypeStringify).freeze()
The below example, a.js, shows how that processor can be used and configured.
import {rehype} from 'rehype'
import rehypeFormat from 'rehype-format'
// …
rehype()
.use(rehypeFormat)
// …
The below example, b.js, shows a similar looking example that operates on the frozen rehype interface because it does not call rehype. If this behavior were allowed it would result in unexpected behavior, so an error is thrown. This is invalid:
import {rehype} from 'rehype'
import rehypeFormat from 'rehype-format'
// …
rehype
.use(rehypeFormat)
// …
Yields:
~/node_modules/unified/index.js:426
throw new Error(
^
Error: Cannot call `use` on a frozen processor.
Create a new processor first, by calling it: use `processor()` instead of `processor`.
at assertUnfrozen (~/node_modules/unified/index.js:426:11)
at Function.use (~/node_modules/unified/index.js:165:5)
at ~/b.js:6:4
Plugin
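Plugins configure the processors they are applied on in the following ways: they change the processor (such as the parser, the compiler, or by configuring data), and they specify how the syntax tree or file are handled.
Plugins are a concept. They materialize as attachers.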
move.js:
export function move(options = {}) {
  const {extname} = options
  if (!extname) {
    throw new Error('Missing `extname` in options')
  }
  return transformer
  function transformer(tree, file) {
    if (file.extname && file.extname !== extname) {
      file.extname = extname
    }
  }
}
index.md:
# Hello, world!
index.js:
import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeStringify from 'rehype-stringify'
import {toVFile} from 'to-vfile'
import {reporter} from 'vfile-reporter'
import {move} from './move.js'
unified()
  .use(remarkParse)
  .use(remarkRehype)
  .use(move, {extname: '.html'})
  .use(rehypeStringify)
  .process(toVFile.readSync('index.md'))
  .then(
    (file) => {
      console.error(reporter(file))
      toVFile.writeSync(file) // Written to `index.html`.
    },
    (error) => {
      // Handle your error here!
      throw error
    }
  )
Yields:
index.md: no issues found
…and in index.html:
<h1>Hello, world!</h1>
function attacher([options])
Attachers are materialized plugins. An attacher is a function that can receive options and configures the processor.
Attachers change the processor, such as the parser, the compiler, configuring data, or by specifying how the syntax tree or file are handled.
The context object (this) is set to the processor the attacher is applied on.
options (*, optional): Configuration
transformer: Optional.
Attachers are called when the processor is frozen, not when they are applied.
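A minimal sketch of the attacher shape described above, runnable without unified. The plugin `prefixText`, its `prefix` option, and the `attached` marker are all hypothetical; a plain object stands in for the processor, and `Function.prototype.call` stands in for the processor invoking the attacher with itself as `this`.

```javascript
// Hypothetical attacher: reads options, reaches the processor through
// `this`, and returns a transformer.
function prefixText(options = {}) {
  const prefix = options.prefix || '> '
  const processor = this // The processor the attacher is applied on.
  processor.attached = true // Stand-in for configuring the processor.

  return function transformer(tree) {
    for (const node of tree.children) {
      if (node.type === 'text') node.value = prefix + node.value
    }
  }
}

// Stand-in for a processor calling the attacher with itself as context.
const fakeProcessor = {}
const transformer = prefixText.call(fakeProcessor, {prefix: '# '})

const tree = {type: 'root', children: [{type: 'text', value: 'title'}]}
transformer(tree)

console.log(fakeProcessor.attached) // true
console.log(tree.children[0].value) // # title
```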
function transformer(node, file[, next])
Transformers handle syntax trees and files.
A transformer is a function that is called each time a syntax tree and file are passed through the run phase. If an error occurs (either because it’s thrown, returned, rejected, or passed to next), the process stops.
The run phase is handled by trough, see its documentation for the exact semantics of these functions.
node (Node): Syntax tree to handle
file (VFile): File to handle
next (Function, optional)
void: If nothing is returned, the next transformer keeps using the same tree.
Error: Fatal error to stop the process
node (Node): New syntax tree. If returned, the next transformer is given this new tree
Promise: Returned to perform an asynchronous operation. The promise must be resolved (optionally with a Node) or rejected (optionally with an Error)
function next(err[, tree[, file]])
If the signature of a transformer includes next (the third argument), the transformer may perform asynchronous operations, and must call next().
err (Error, optional): Fatal error to stop the process
node (Node, optional): New syntax tree. If given, the next transformer is given this new tree
file (VFile, optional): New file. If given, the next transformer is given this new file
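The three ways a transformer can hand back control (a return value, a promise, or `next`) are sketched below as three separate functions. The names are hypothetical, the file is a plain object stand-in with a `data` field, and the functions are called directly rather than through a processor, so this shows the shapes only, not the scheduling.

```javascript
// 1. Synchronous: return a new tree to replace the current one.
function replaceTree(tree, file) {
  return {type: 'root', children: []}
}

// 2. Promise-based: resolve (here without a node) once the work is done.
function markAsync(tree, file) {
  return Promise.resolve().then(() => {
    file.data.checked = true
  })
}

// 3. Callback-based: because `next` is in the signature, it must be called.
function failWithNext(tree, file, next) {
  next(new Error('Something went wrong'))
}

const tree = {type: 'root', children: [{type: 'text', value: 'a'}]}
const file = {data: {}}

const replaced = replaceTree(tree, file)
const pending = markAsync(tree, file)

let received
failWithNext(tree, file, (error) => {
  received = error
})

console.log(replaced.children.length) // 0
console.log(pending instanceof Promise) // true
console.log(received.message) // Something went wrong
```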
Preset
Presets are sharable configuration. They can contain plugins and settings.
preset.js:
import remarkPresetLintRecommended from 'remark-preset-lint-recommended'
import remarkPresetLintConsistent from 'remark-preset-lint-consistent'
import remarkCommentConfig from 'remark-comment-config'
import remarkToc from 'remark-toc'
import remarkLicense from 'remark-license'
export const preset = {
  settings: {bullet: '*', emphasis: '*', fences: true},
  plugins: [
    remarkPresetLintRecommended,
    remarkPresetLintConsistent,
    remarkCommentConfig,
    [remarkToc, {maxDepth: 3, tight: true}],
    remarkLicense
  ]
}
example.md:
# Hello, world!
_Emphasis_ and **importance**.
## Table of contents
## API
## License
index.js:
import {remark} from 'remark'
import {toVFile} from 'to-vfile'
import {reporter} from 'vfile-reporter'
import {preset} from './preset.js'
remark()
  .use(preset)
  .process(toVFile.readSync('example.md'))
  .then(
    (file) => {
      console.error(reporter(file))
      toVFile.writeSync(file)
    },
    (error) => {
      // Handle your error here!
      throw error
    }
  )
Yields:
example.md: no issues found
example.md now contains:
# Hello, world!
*Emphasis* and **importance**.
## Table of contents
* [API](#api)
* [License](#license)
## API
## License
[MIT](license) © [Titus Wormer](https://wooorm.com)
See contributing.md in unifiedjs/.github for ways to get started. See support.md for ways to get help. Ideas for new plugins and tools can be posted in unifiedjs/ideas.
A curated list of awesome unified resources can be found in awesome unified.
This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.
Preliminary work for unified was done in 2014 for retext and inspired by ware. Further incubation happened in remark. The project was finally externalised in 2015 and published as unified. The project was authored by @wooorm.
Although unified since moved its plugin architecture to trough, thanks to @calvinfo, @ianstormtaylor, and others for their work on ware, as it was a huge initial inspiration.
I have used unified for inspecting and serializing the content of my memo app. It was easy to set up thanks to the straightforward documentation, but I found it difficult to use because of unknown glitches, and it slows down the system a little.
I used unifiedjs to implement a markdown parser in my note taking app. The app is deployed at https://noty.surge.sh/. It was a little hard to work with it and as the text size increases the parsing took a lot of time which was unexpected. So it hangs the browser if the data is too large. The library is well documented and serves much more purposes than markdown parsing.
I used unified to compile content to syntax trees and vice versa. It also provides hundreds of packages to work on the trees in between. It is easy to use and flexible, and it has concise documentation. I recommend it to everyone. It supports TypeScript as well. The community around this package is very helpful when running into issues.