hast utility to transform to mdast.
Note: You probably want to use rehype-remark.
This package is ESM only:
Node 12+ is needed to use it and it must be
imported instead of
required.
npm:
npm install hast-util-to-mdast
Say we have the following
example.html:
<h2>Hello <strong>world!</strong></h2>
…and next to it,
example.js:
import {unified} from 'unified'
import rehypeParse from 'rehype-parse'
import remarkStringify from 'remark-stringify'
import {readSync} from 'to-vfile'
import {toMdast} from 'hast-util-to-mdast'
const file = readSync('example.html')
const hast = unified().use(rehypeParse).parse(file)
const mdast = toMdast(hast)
const doc = unified().use(remarkStringify).stringify(mdast)
console.log(doc)
Now, running
node example.js yields:
## Hello **world!**
This package exports the following identifiers:
toMdast,
one,
all.
There is no default export.
toMdast(tree[, options])
Transform the given hast tree to mdast.
options.handlers
Object mapping tag names or types to functions handling those
elements or nodes.
See
handlers/ for examples.
In a handler, you have access to
h, which should be used to create mdast nodes
from hast nodes.
On
h, there are fields that may be of interest.
Most interesting of them is
h.wrapText, which is
true if the mdast content
can include newlines, and
false if not (such as in headings or table cells).
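As a rough illustration of the handler shape described above, the sketch below maps a kbd element to an mdast inlineCode node. The kbd handler and its text-gathering logic are assumptions made for this example and are not part of the package. It relies only on the h(node, type, value) call form.

```javascript
// Hypothetical handler: turn <kbd>…</kbd> into an mdast `inlineCode` node.
// `h` is the helper passed to every handler; `node` is the hast element.
function kbd(h, node) {
  // Collect direct text children only (nested elements are skipped here
  // to keep the sketch short).
  const value = node.children
    .filter((child) => child.type === 'text')
    .map((child) => child.value)
    .join('')

  return h(node, 'inlineCode', value)
}

// Usage (assumes `tree` is a hast tree and `toMdast` is imported):
// const mdast = toMdast(tree, {handlers: {kbd}})
```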
options.document
Whether the given tree is a complete document.
Applies if the given
tree is a
root.
First its children are transformed to mdast.
By default, if one or more of the new mdast children are phrasing
nodes, and one or more are not, the phrasing nodes are wrapped in
paragraphs.
If
document: true, all mdast phrasing children are wrapped in paragraphs.
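The wrapping rule can be sketched in plain JavaScript. This is a simplified model, not the package's implementation: the wrapChildren name, the phrasingTypes set, and the handling of whitespace-only text are all assumptions made for illustration.

```javascript
// Assumed subset of mdast phrasing node types, for illustration only.
const phrasingTypes = new Set([
  'text', 'emphasis', 'strong', 'inlineCode', 'link', 'image', 'break', 'delete'
])

// Hypothetical helper modelling the rule: wrap runs of phrasing nodes in
// paragraphs when the children are mixed, or always when `document` is true.
function wrapChildren(children, document) {
  const isPhrasing = (node) => phrasingTypes.has(node.type)
  const hasPhrasing = children.some(isPhrasing)
  const hasOther = children.some((node) => !isPhrasing(node))
  const needsWrap = document ? hasPhrasing : hasPhrasing && hasOther

  if (!needsWrap) return children

  const result = []
  let run = []

  // Close the current run of phrasing nodes as one paragraph.
  const flush = () => {
    if (run.length > 0) {
      result.push({type: 'paragraph', children: run})
      run = []
    }
  }

  for (const node of children) {
    if (isPhrasing(node)) {
      run.push(node)
    } else {
      flush()
      result.push(node)
    }
  }

  flush()
  return result
}
```

With only phrasing children and document left off, the children come back untouched. With document: true, the same children come back inside a single paragraph.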
options.newlines
Whether to collapse to a line feed (
\n) instead of a single space (default) if
a streak of white-space in a text node contains a newline.
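To make the difference concrete, here is a minimal sketch of the collapsing rule. The collapseWhitespace name and the exact set of characters treated as white-space are assumptions for this example. The package's own implementation may differ in details.

```javascript
// Hypothetical helper: collapse each streak of white-space to one character.
// With `newlines` on, a streak that contains a line ending becomes `\n`;
// otherwise every streak becomes a single space.
function collapseWhitespace(value, newlines) {
  return value.replace(/[ \t\n\f\r]+/g, (streak) =>
    newlines && /[\n\r]/.test(streak) ? '\n' : ' '
  )
}
```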
options.checked
Value to use when serializing a checked checkbox or radio input (
string,
default:
[x]).
options.unchecked
Value to use when serializing an unchecked checkbox or radio input (
string,
default:
[ ]).
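The effect of these two options can be modelled as below. This is a sketch under assumptions: serializeInput is a hypothetical name, and it only looks at the checked property of a hast input element.

```javascript
// Hypothetical helper: pick the text used for a checkbox or radio input,
// falling back to the documented defaults `[x]` and `[ ]`.
function serializeInput(node, options = {}) {
  const {checked = '[x]', unchecked = '[ ]'} = options
  const properties = node.properties || {}
  return properties.checked ? checked : unchecked
}
```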
options.quotes
List of quotes to use (
string[], default:
['"']).
Each value can be one or two characters.
When two, the first character determines the opening quote and the second the
closing quote at that level.
When one, both the opening and closing quote are that character.
The order in which the preferred quotes appear determines which quotes to use at
which level of nesting.
So, to prefer
‘’ at the first level of nesting, and
“” at the second, pass:
['‘’', '“”'].
If
<q>s are nested deeper than the given amount of quotes, the markers wrap
around: a third level of nesting when using
['«»', '‹›'] should have double
guillemets, a fourth single, a fifth double again, etc.
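The wrap-around rule can be sketched as follows. The pickQuotes name and the zero-based depth argument are assumptions for this example. It models the rule described above rather than reproducing the package's code.

```javascript
// Hypothetical helper: choose the opening and closing quote for a nesting
// depth (0 for the outermost <q>), wrapping around the list of quotes.
function pickQuotes(quotes, depth) {
  const value = quotes[depth % quotes.length]
  // One character: use it on both sides. Two: opening, then closing.
  const [open, close = open] = Array.from(value)
  return {open, close}
}
```

For ['«»', '‹›'], depths 0 and 2 both select the double guillemets, and depths 1 and 3 select the single ones.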
The algorithm supports implicit and explicit paragraphs (see HTML Standard, A. van Kesteren et al., WHATWG, § 3.2.5.4 Paragraphs), such as:
<article>
An implicit paragraph.
<h1>An explicit paragraph.</h1>
</article>
Yields:
An implicit paragraph.
# An explicit paragraph.
Some nodes are ignored and their content will not be present in
the mdast tree.
To ignore nodes, configure a handler for their tag name or type
that returns nothing.
For example, to ignore
em elements, pass
handlers: {'em': function () {}}:
<p><strong>Importance</strong> and <em>emphasis</em>.</p>
Yields:
**Importance** and .
To ignore a specific element from the HTML source, set
data-mdast to
ignore:
<p><strong>Importance</strong> and <em data-mdast="ignore">emphasis</em>.</p>
Yields:
**Importance** and .
We try our best to map any HTML (hast) to Markdown (mdast) and keep it readable.
Readability is one of Markdown’s greatest features: it’s terser than HTML, for
example allowing
# Alpha instead of
<h1>Alpha</h1>.
Another awesome feature of Markdown is that you can author HTML inside it. Because we focus on readability, we don’t do that by default, but you can by passing a handler.
Say, for example, we have this HTML and want to embed the SVG inside Markdown as well:
<p>
Some text with
<svg viewBox="0 0 1 1" width="1" height="1"><rect fill="black" x="0" y="0" width="1" height="1" /></svg>
a graphic… Wait is that a dead pixel?
</p>
This can be achieved with
example.js like so:
import {unified} from 'unified'
import rehypeParse from 'rehype-parse'
import remarkStringify from 'remark-stringify'
import {readSync} from 'to-vfile'
import {toHtml} from 'hast-util-to-html'
import {toMdast} from 'hast-util-to-mdast'
const file = readSync('example.html')
const hast = unified().use(rehypeParse).parse(file)
const mdast = toMdast(hast, {handlers: {svg}})
const doc = unified().use(remarkStringify).stringify(mdast)
console.log(doc)
function svg(h, node) {
return h(node, 'html', toHtml(node, {space: 'svg'}))
}
Yields:
Some text with <svg viewBox="0 0 1 1" width="1" height="1"><rect fill="black" x="0" y="0" width="1" height="1"></rect></svg> a graphic… Wait is that a dead pixel?
all(h, parent)
Helper function for writing custom handlers passed to
options.handlers.
Pass it
h and a parent node (hast) and it will turn the node’s children into
an array of transformed nodes (mdast).
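A handler that uses all might look like the sketch below, which turns a details element into a block quote containing its transformed children. The createDetailsHandler factory is an assumption made so the example runs without the package installed. In real use you would import all from hast-util-to-mdast and call it directly.

```javascript
// Hypothetical factory: `all` is passed in so the sketch is self-contained.
// In real code: import {all} from 'hast-util-to-mdast'
function createDetailsHandler(all) {
  // The returned function has the usual handler shape `(h, node)`.
  return function details(h, node) {
    // Transform the hast children to mdast, then wrap them in a blockquote.
    return h(node, 'blockquote', all(h, node))
  }
}

// Usage (assumes `tree` is a hast tree, and `toMdast` and `all` are imported):
// const mdast = toMdast(tree, {handlers: {details: createDetailsHandler(all)}})
```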
one(h, node, parent)
Helper function for writing custom handlers passed to
options.handlers.
Pass it
h, a
node, and its
parent (hast) and it will turn
node into
mdast content.
Use of
hast-util-to-mdast can open you up to a
cross-site scripting (XSS) attack if the hast tree is unsafe.
Use
hast-util-sanitize to make the hast tree safe.
