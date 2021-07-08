This is a general RTF parser. It takes a text stream and produces a document object representing the parsed document. In and of itself, this isn't super useful but it's the building block for other tools to convert RTF into other formats.
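const parseRTF = require('rtf-parser')
const fs = require('fs')

// Parse RTF held in a string; the callback gets an error or the parsed document.
parseRTF.string('{\\rtf1\\ansi\\b hi there\\b0}', (err, doc) => {
…
})

// Parse RTF from a readable stream.
parseRTF.stream(fs.createReadStream('example.rtf'), (err, doc) => {
…
})

// Create a parser you can pipe a stream into yourself.
const parser = parseRTF((err, doc) => {
…
})
fs.createReadStream('example.rtf').pipe(parser)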
RTF, unlike HTML, is NOT declarative; it is a series of commands that mutate document state. To convert it accurately, you have to load it into something that tracks that state, then emit each chunk of text together with whatever the state was when that chunk was emitted.
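To make the state-tracking idea concrete, here is a minimal sketch of a toy interpreter for a tiny RTF subset (groups, \b and \i only). It is not how rtf-parser is implemented; the function name toySpans and the span shape { value, style } are assumptions made for illustration.

```javascript
// Toy interpreter for a tiny RTF subset: groups, \b, \b0, \i, \i0.
// Illustrative only; names and span shape are hypothetical.
function toySpans (rtf) {
  const spans = []
  const stack = [] // saved states of the enclosing groups
  let state = { bold: false, italic: false }
  let text = ''

  // Emit pending text tagged with a snapshot of the current state.
  const flush = () => {
    if (text !== '') spans.push({ value: text, style: { ...state } })
    text = ''
  }

  let i = 0
  while (i < rtf.length) {
    const ch = rtf[i]
    if (ch === '{') { // group start: remember the state
      flush()
      stack.push({ ...state })
      i++
    } else if (ch === '}') { // group end: restore the saved state
      flush()
      state = stack.pop() || state
      i++
    } else if (ch === '\\') {
      // Control word: letters, optional numeric parameter, optional delimiter space.
      const m = /^\\([a-z]+)(-?\d+)?( ?)/.exec(rtf.slice(i))
      if (!m) { i += 2; continue } // control symbols such as \{ are skipped in this toy
      flush()
      const on = m[2] !== '0' // "\b0" turns the flag off, "\b" turns it on
      if (m[1] === 'b') state.bold = on
      else if (m[1] === 'i') state.italic = on
      // Any other control word is ignored, so unknown input degrades quietly.
      i += m[0].length
    } else {
      text += ch
      i++
    }
  }
  flush()
  return spans
}

const spans = toySpans('{\\rtf1\\ansi\\b hi there\\b0  friend}')
console.log(spans)
```

Each span carries a copy of the state rather than a reference to it, so later commands cannot retroactively change text that has already been emitted.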
RTF, like HTML, allows (mostly) seamless degrading when you don't understand an element. As such, while this parser is still quite incomplete, it is already useful.
RTF fragments are supported. For example, the fragment "\b hi there\b0" will generate a document with "hi there" flagged as bold text.
The document returned is of the RTFDocument class; see below for details.
Most notably, stylesheets, list styling and tables are not supported. List styling degrades cleanly but tables do not. There are certainly other required bits from the spec that are currently ignored.
This is the class you get back from the parse functions. It holds some document-wide options and the paragraph objects that make up the document.
style — An object with paragraph level styling information.
content — An array of RTFSpan objects
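As a rough sketch of how a consumer might walk this structure, the example below renders a hand-built stand-in object to plain text. It assumes the document exposes its paragraphs through a content array, that each paragraph has the style and content properties listed above, and that each span has a value string and a style object with a bold flag. Those last names are assumptions for illustration, not confirmed API.

```javascript
// Hand-built stand-in for a parsed document. The property names on the
// document and on spans (content, value, style.bold) are assumptions.
const doc = {
  content: [
    {
      style: {},
      content: [
        { value: 'hi there', style: { bold: true } },
        { value: ' and more', style: {} }
      ]
    }
  ]
}

// Render each paragraph as one line, wrapping bold spans in asterisks.
function toPlainText (doc) {
  return doc.content
    .map(para => para.content
      .map(span => span.style.bold ? `*${span.value}*` : span.value)
      .join(''))
    .join('\n')
}

const rendered = toPlainText(doc)
console.log(rendered)
```

With a real parse result you would pass the doc argument from the callback in place of the stand-in, after checking which properties it actually provides.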