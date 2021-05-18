
csv-reader

by Daniel Cohen Gindi
1.0.8

A CSV stream reader with many features and the ability to work with very large datasets

Included features: (can be turned on and off)

  • Support for excel-style multiline cells wrapped in quotes
  • Choosing a different delimiter instead of the comma
  • Automatic skipping of empty lines
  • Automatic skipping of the first header row
  • Automatic parsing of numbers and booleans
  • Automatic trimming
  • Because it is a stream transformer, you can .pause() it if you need some time to process a row, and .resume() it when you are ready to receive and process more rows.
  • Consumes and emits rows one by one, allowing you to process datasets of any size.
  • Automatically strips the BOM if one exists (not handled automatically by node.js stream readers)
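
To illustrate the BOM point above, here is a minimal sketch using only Node's built-in Buffer. The helper name stripBom is hypothetical and is not part of csv-reader's API; it only shows what "stripping the BOM" means for a decoded string, not how the library does it internally.

```javascript
// Hypothetical helper (not csv-reader's API): drop a leading U+FEFF if present.
function stripBom(text) {
    return text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
}

// The UTF-8 BOM bytes (EF BB BF) followed by the text "a,b".
const raw = Buffer.from([0xEF, 0xBB, 0xBF, 0x61, 0x2C, 0x62]).toString('utf8');

console.log(raw.length);           // 4: the BOM survives decoding as one extra character
console.log(stripBom(raw));        // a,b
```

Without this step, the invisible BOM character would end up glued to the first cell of the first row.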

Installation:

npm install --save csv-reader

The options you can pass are:

| Name | Type | Default | Explanation |
| --- | --- | --- | --- |
| delimiter | String | , | The character that separates cells |
| multiline | Boolean | true | Allow multiline cells, when the cell is wrapped with quotes ("...\n...") |
| allowQuotes | Boolean | true | Should quotes be treated as a special character that wraps cells etc. |
| skipEmptyLines | Boolean | false | Should empty lines be automatically skipped? |
| skipHeader | Boolean | false | Should the first header row be skipped? |
| asObject | Boolean | false | If true, each row will be converted automatically to an object based on the header. This implies skipHeader=true. |
| parseNumbers | Boolean | false | Should numbers be automatically parsed? This will parse any format supported by parseFloat including scientific notation, Infinity and NaN. |
| parseBooleans | Boolean | false | Automatically parse booleans (strictly lowercase true and false) |
| ltrim | Boolean | false | Automatically left-trims columns |
| rtrim | Boolean | false | Automatically right-trims columns |
| trim | Boolean | false | If true, then both 'ltrim' and 'rtrim' are set to true |
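
As a rough illustration of how the trimming and parsing options described above affect a single cell, here is a sketch in plain JavaScript. The function convertCell and the pattern NUMBER_PATTERN are hypothetical and are not taken from csv-reader's source; the order of the steps and the "whole cell must look like a number" check are assumptions made for the sake of the example.

```javascript
// Assumption: a cell is converted only if the whole value looks like a number.
const NUMBER_PATTERN = /^(?:[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?|[-+]?Infinity|NaN)$/;

// Hypothetical helper mimicking the documented effect of the options on one cell.
function convertCell(cell, options = {}) {
    let value = cell;

    // 'trim' is shorthand for both 'ltrim' and 'rtrim'.
    if (options.trim || options.ltrim) value = value.trimStart();
    if (options.trim || options.rtrim) value = value.trimEnd();

    // Booleans are matched strictly in lowercase.
    if (options.parseBooleans) {
        if (value === 'true') return true;
        if (value === 'false') return false;
    }

    // Numbers go through parseFloat, so scientific notation also works.
    if (options.parseNumbers && NUMBER_PATTERN.test(value)) {
        return parseFloat(value);
    }

    return value;
}

console.log(convertCell(' 1.5e3 ', { trim: true, parseNumbers: true })); // 1500
console.log(convertCell('TRUE', { parseBooleans: true }));               // TRUE (still a string)
console.log(convertCell('false', { parseBooleans: true }));              // false
```

With all options left at their defaults, every cell stays a string exactly as it appeared in the file.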

Events:

A 'data' event will be emitted with each row, either in an array format ((string|number|boolean)[]) or an Object format (Object<string, (string|number|boolean)>), depending on the asObject option.
A preliminary 'header' event will be emitted with the first row, only in an array format, and without any conversion to other types (string[]).
The usual stream events, 'end' and 'error', are emitted as well.
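
The events above can be wired up as in the following sketch. The helper collectCsv is hypothetical, and a plain EventEmitter stands in for the CSV stream so the snippet runs without a file or the package installed; in real use you would pass the piped CsvReadableStream instead.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: attaches handlers for the four events to any emitter.
function collectCsv(stream, done) {
    const result = { header: null, rows: [] };
    stream
        .on('header', (header) => { result.header = header; }) // string[]
        .on('data', (row) => { result.rows.push(row); })       // array or object
        .on('error', (err) => done(err))
        .on('end', () => done(null, result));
}

// Stand-in emitter (an assumption for this sketch, not a real CSV stream).
const fakeStream = new EventEmitter();
let summary = null;
collectCsv(fakeStream, (err, result) => {
    if (err) throw err;
    summary = result;
});

fakeStream.emit('header', ['name', 'age']);
fakeStream.emit('data', ['Ada', 36]);
fakeStream.emit('data', ['Linus', 28]);
fakeStream.emit('end');

console.log(summary.header, summary.rows.length);
```

Handling 'error' explicitly matters on real streams, because an unhandled 'error' event makes Node throw.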

Usage example:


const Fs = require('fs');
const CsvReadableStream = require('csv-reader');

let inputStream = Fs.createReadStream('my_data.csv', 'utf8');

inputStream
    .pipe(new CsvReadableStream({ parseNumbers: true, parseBooleans: true, trim: true }))
    .on('data', function (row) {
        console.log('A row arrived: ', row);
    })
    .on('end', function () {
        console.log('No more rows!');
    });

A common issue with CSVs is that Microsoft Excel, for some reason, often does not save files as UTF-8. Microsoft never liked standards. To automagically handle such files, which may arrive from user input in ANSI encodings, you can use the autodetect-decoder-stream like this:


const Fs = require('fs');
const CsvReadableStream = require('csv-reader');
const AutoDetectDecoderStream = require('autodetect-decoder-stream');

let inputStream = Fs.createReadStream('my_data.csv')
    .pipe(new AutoDetectDecoderStream({ defaultEncoding: '1255' })); // If failed to guess encoding, default to 1255

// The AutoDetectDecoderStream will know if the stream is UTF8, windows-1255, windows-1252 etc.
// It will pass a properly decoded data to the CsvReader.
 
inputStream
    .pipe(new CsvReadableStream({ parseNumbers: true, parseBooleans: true, trim: true }))
    .on('data', function (row) {
        console.log('A row arrived: ', row);
    }).on('end', function () {
        console.log('No more rows!');
    });
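
To see why the decoding step matters, here is a small sketch using only Node's built-in Buffer. It decodes the same bytes twice, once as UTF-8 and once as latin1 (used here as a stand-in for an ANSI code page; the sample bytes are made up for illustration).

```javascript
// The bytes for "café" as a Latin-1 / Windows-1252 encoder would write them.
const ansiBytes = Buffer.from([0x63, 0x61, 0x66, 0xE9]);

// Decoding as UTF-8 fails on 0xE9 and yields the replacement character U+FFFD.
const asUtf8 = ansiBytes.toString('utf8');

// Decoding with the matching single-byte encoding recovers the text.
const asLatin1 = ansiBytes.toString('latin1');

console.log(asUtf8 === 'caf\uFFFD'); // true
console.log(asLatin1 === 'café');    // true
```

Once the wrong decoder has produced U+FFFD the original character is lost, which is why the encoding has to be detected before the text reaches the CSV parser.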

Contributing

If you have anything to contribute, or functionality that you lack - you are more than welcome to participate in this! If anyone wishes to contribute unit tests - that also would be great :-)

Me

  • Hi! I am Daniel Cohen Gindi. Or in short- Daniel.
  • danielgindi@gmail.com is my email address.
  • That's all you need to know.

Help

If you want to buy me a beer, you are very welcome to donate. Thanks :-)

License

All the code here is under the MIT license, which means you can do virtually anything with the code. I will appreciate it very much if you keep an attribution where appropriate.

The MIT License (MIT)

Copyright (c) 2013 Daniel Cohen Gindi (danielgindi@gmail.com)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

