junk-parser

Automatically handles formatting errors in delimited text files. Supports CSV, TSV, and PSV (pipe-separated). Runs on Node.js.

License: MIT

junk-parser - Best-fit, low-RAM CSV Parser

I know, you're thinking "not another CSV/?SV parser!?"

"None of them even work exactly like Excel 20xx anyway!?!?"

Well, this is a different kind of parser. And a fun experiment. #DealWithIt.

It's optimized around a few assumptions - based on observed common errors.

It makes localized adjustments to best-fit each row to the expected column count. It can also adjust columns intelligently based on detected data types.
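To make the column-count idea concrete, here is a minimal sketch of one way to best-fit broken lines back into rows. It is not junk-parser's actual code or API: `splitFields` and `bestFitRows` are hypothetical names, quoting is ignored, and the expected column count is assumed to come from the header row.

```javascript
// Hypothetical sketch, NOT junk-parser's implementation or API.
// Naive field split: ignores quoting, which a real parser must handle.
function splitFields(line, delimiter = ',') {
  return line.split(delimiter);
}

// Merge physical lines into logical rows until each row reaches the
// expected column count (assumed here to be the header's field count).
function bestFitRows(text, delimiter = ',') {
  const lines = text.split(/\r?\n/);
  const header = splitFields(lines[0], delimiter);
  const expected = header.length;
  const rows = [];
  let pending = null; // fields of a row that is still too short

  for (const line of lines.slice(1)) {
    const fields = splitFields(line, delimiter);
    if (pending === null) {
      if (line.trim() === '') continue; // blank line between rows
      pending = fields;
    } else {
      // Continuation line: its first field is the tail of the pending
      // row's last field, which a stray line break split in two.
      const last = pending.length - 1;
      pending[last] = [pending[last].trim(), fields[0].trim()]
        .filter(Boolean)
        .join(' ');
      pending.push(...fields.slice(1));
    }
    if (pending.length >= expected) {
      rows.push(pending.slice(0, expected));
      pending = null;
    }
  }
  if (pending !== null) rows.push(pending); // keep a short trailing row as-is
  return { header, rows };
}

const sample = [
  'id,first,last,addr,job',
  '100,John,Doe,666 Heck Hwy,Cat Herder',
  '101,John,Doe,123 Main St.',
  'Denver CO 80123,Cat Whisperer',
  '102,John,Doe,Attn: Delivery',
  '    123 Main St.',
  '    Denver CO 80123,Cat Whisperer',
].join('\n');

const { rows } = bestFitRows(sample);
console.log(rows.length); // 3
console.log(rows[2][3]);  // Attn: Delivery 123 Main St. Denver CO 80123
```

This only covers rows that come up short. A continuation line that itself contains delimiters (such as `123 Main St., Denver, Co`) overshoots the column count, and counting alone cannot say which fields to re-join. That is where type detection would have to help.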

This technique is biased towards data with more columns & more column types. Even better if the types are in a mix. Errors in Tuple- or Key-Value-Pair-shaped files (with only 2-3 columns) will probably not be handled desirably.
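A rough sketch of why a mix of column types helps, under stated assumptions: the helpers `detectType`, `typeScore`, and `fitByTypes`, the type names, and the sample row are all hypothetical and not taken from junk-parser. The idea is that when a row has too many fields, each way of re-joining adjacent fields can be scored by how well the result matches the expected column types.

```javascript
// Hypothetical sketch, NOT junk-parser's implementation or API.
// Classify a raw field into a coarse type.
function detectType(value) {
  const v = value.trim();
  if (v === '') return 'empty';
  if (/^-?\d+$/.test(v)) return 'integer';
  if (/^-?\d*\.\d+$/.test(v)) return 'number';
  return 'string';
}

// Count how many fields match the expected type of their column.
function typeScore(fields, columnTypes) {
  return fields.reduce(
    (score, field, i) => score + (detectType(field) === columnTypes[i] ? 1 : 0),
    0
  );
}

// For a row with too many fields, re-join one run of adjacent fields at
// every possible position and keep the candidate with the best type score.
function fitByTypes(fields, columnTypes, delimiter = ',') {
  const extra = fields.length - columnTypes.length;
  if (extra <= 0) return fields; // nothing to collapse
  let best = null;
  let bestScore = -1;
  for (let start = 0; start + extra < fields.length; start++) {
    const candidate = [
      ...fields.slice(0, start),
      fields.slice(start, start + extra + 1).join(delimiter),
      ...fields.slice(start + extra + 1),
    ];
    const score = typeScore(candidate, columnTypes);
    if (score > bestScore) { // ties keep the earliest candidate
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}

// Made-up row: an unquoted comma in the name field adds a fifth field.
const columnTypes = ['integer', 'string', 'number', 'integer'];
const fixed = fitByTypes(['7', 'Widget', ' large', '9.99', '3'], columnTypes);
console.log(fixed); // [ '7', 'Widget, large', '9.99', '3' ]
```

With only string columns, every candidate scores the same and the choice falls back to the first one, which fits the caveat above about files with few columns or few distinct types.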

Example data: a header row plus 5 rows spread over 14 lines (IDs 100-104):

id,first,last,addr,job
100,John,Doe,666 Heck Hwy,Cat Herder
101,John,Doe,123 Main St.
Denver CO 80123,Cat Whisperer
102,John,Doe,Attn: Delivery
    123 Main St.
    Denver CO 80123,Cat Whisperer
103,John,Doe,Attn: Delivery
123 Main St., Denver, Co
80122
,Cat Whisperer
104,John,Doe,Attn: Delivery
123 Main St., Denver, Co
80122

,Cat Whisperer

Broken Input It Currently Parses

  • handles a 2-line row with an extra line-break
  • handles a 2-line quoted row with an extra line-break
  • handles 2 extra line-breaks
  • handles 1 row on 4 lines, w/ an "empty" line
  • handles 2 rows on 4 lines, quoted, w/ a trailing delimiter
