pdf2table

pdf2table is a Node.js library that attempts to extract tables from a PDF.

The 'tables' are extracted as an array of rows.
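As an illustration of working with that shape, the sketch below converts rows into objects keyed by a header row. It assumes each row is an array of cell strings and that the first row holds column names. The sample data and the `rowsToObjects` helper are hypothetical and not part of pdf2table.

```javascript
// Hypothetical sample of parser output: one inner array per row,
// one string per cell (an assumption about the shape of the rows).
const sampleRows = [
  ['Name', 'Qty', 'Price'],
  ['Widget', '2', '9.99'],
  ['Gadget', '1', '24.50']
];

// Hypothetical helper: turn rows into objects keyed by the first (header) row.
// Cells missing from a short row become empty strings.
function rowsToObjects(rows) {
  if (rows.length === 0) return [];
  const header = rows[0];
  return rows.slice(1).map(function (row) {
    const record = {};
    header.forEach(function (key, i) {
      record[key] = row[i] !== undefined ? row[i] : '';
    });
    return record;
  });
}

console.log(rowsToObjects(sampleRows));
```

Real PDFs often lack a clean header row, so treat this as a starting point rather than a general solution.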

It uses pdf2json to extract the PDF data.

Install

You can install pdf2table using the Node Package Manager (npm):

npm install pdf2table

Simple example

var pdf2table = require('pdf2table');
var fs = require('fs');

// Read the PDF into a buffer, then hand it to the parser.
fs.readFile('./test.pdf', function (err, buffer) {
  if (err) return console.log(err);

  pdf2table.parse(buffer, function (err, rows, rowsdebug) {
    if (err) return console.log(err);

    // rows is the extracted table data, as an array of rows.
    console.log(rows);
  });
});
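If you prefer promises to nested callbacks, the callback style shown above can be wrapped. This is a minimal sketch: `parseAsync` is a hypothetical helper, not part of pdf2table, and it only assumes the `(buffer, callback(err, rows, rowsdebug))` signature used in the example above.

```javascript
// Hypothetical helper: wrap a callback-style parse function in a Promise.
// parseFn is assumed to take (buffer, function (err, rows, rowsdebug) {...}).
function parseAsync(parseFn, buffer) {
  return new Promise(function (resolve, reject) {
    parseFn(buffer, function (err, rows, rowsdebug) {
      if (err) return reject(err);
      resolve({ rows: rows, rowsdebug: rowsdebug });
    });
  });
}

// Possible usage with pdf2table (not executed here):
//   parseAsync(function (buf, cb) { pdf2table.parse(buf, cb); }, buffer)
//     .then(function (result) { console.log(result.rows); })
//     .catch(function (err) { console.log(err); });
```

Passing the parse function as an argument keeps the helper independent of the library, which also makes it easy to exercise with a stub.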

Note

This is a simplistic implementation of table extraction. If your PDF contains content that is not part of a table, such as headings or paragraphs, pdf2table will still attempt to shape that content into rows. Improvements and pull requests are welcome.
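One way to cope with stray non-table rows is to filter the output afterwards. The sketch below keeps only rows whose cell count matches the most common cell count. It assumes each row is an array of cells; `keepDominantWidth` and the sample data are hypothetical, and the heuristic is an illustration rather than something pdf2table does itself.

```javascript
// Hypothetical helper: keep only rows whose length equals the most
// frequent row length, on the guess that those rows form the table.
function keepDominantWidth(rows) {
  const counts = {};
  rows.forEach(function (row) {
    counts[row.length] = (counts[row.length] || 0) + 1;
  });

  let bestWidth = 0;
  let bestCount = 0;
  Object.keys(counts).forEach(function (width) {
    if (counts[width] > bestCount) {
      bestCount = counts[width];
      bestWidth = Number(width);
    }
  });

  return rows.filter(function (row) {
    return row.length === bestWidth;
  });
}

// Hypothetical parser output mixing a title, a table, and a footer.
const mixedRows = [
  ['Quarterly report'],
  ['Item', 'Qty', 'Price'],
  ['Widget', '2', '9.99'],
  ['Gadget', '1', '24.50'],
  ['Page 1 of 1']
];

console.log(keepDominantWidth(mixedRows));
```

This heuristic fails when a document holds several tables of different widths, or when non-table lines outnumber table rows.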