parsz

The language engine and tool for web parsing

License: MIT

pársz

- A tool for parsing the web

Usage

Install globally with npm (or yarn):

$ npm install -g parsz

View the available options in the help menu:

$ parsz --help

Use a "parselet" as a recipe/filter to parse a website.

The structure of the parselet is JSON.

Here is an example of a parselet for grabbing business data from a Yelp page:

{
  "name": "h1|trim",
  "phone": ".biz-phone|trim",
  "address": "address|trim",
  "reviews(.review)": [{
    "date": "meta[itemprop=datePublished] @content",
    "name": ".user-name a",
    "comment": ".review-content p"
  }]
}

As a module

You can also use parsz as a module:

import parsz from 'parsz';

// `parselet` is the parselet JSON and `url` is the address of the page to parse
parsz(parselet, url).then(data => {
  // Do something with the data
});

Tips

parsz is a general-purpose, flexible tool. Here are some tips for getting started.

Grabbing a list of data

Put a reference selector in parentheses after the key name and use an Array as the value.

{
  "users(.user)": [{
    "name": ".name",
    "age": ".age"
  }]
}
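
The key syntax above can be illustrated with a short sketch. The helper `splitScopedKey` is a hypothetical name, not part of parsz. It only shows how a key such as `users(.user)` could be separated into the output field name and the selector that scopes each list item, assuming the parenthesized selector always sits at the end of the key.

```javascript
// Hypothetical helper (not parsz's own parser): split a parselet key
// like "users(.user)" into the field name and the scoping selector.
function splitScopedKey(key) {
  const match = /^([^()]+)\((.+)\)$/.exec(key);
  if (!match) {
    // A plain key such as "name" has no scoping selector.
    return { name: key, scope: null };
  }
  return { name: match[1], scope: match[2] };
}

console.log(splitScopedKey('users(.user)')); // { name: 'users', scope: '.user' }
console.log(splitScopedKey('name'));         // { name: 'name', scope: null }
```

Under this reading, each element matched by the scope selector would yield one object in the resulting array, with the inner selectors evaluated relative to that element.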

Use transformation functions on data

Add a pipe (|) and the transformation name after the data selector.

{
  "user": {
    "name": ".name|trim",
    "age": ".age|parseInt",
    "worth": ".age|parseFloat",
    "someNumber": ".age|floor"
  }
}
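
As a rough illustration of the pipe syntax, the sketch below splits an expression into its selector and transformation names, then applies them to text that has already been extracted. The `transforms` table and `applyPipes` function are hypothetical, and the behavior given to each transformation is an assumption based on its name. Chaining several pipes is also an assumption, since the README only shows one per selector.

```javascript
// Hypothetical table covering only the transformation names shown above.
// The behavior of each entry is an assumption, not parsz's implementation.
const transforms = new Map([
  ['trim', (value) => String(value).trim()],
  ['parseInt', (value) => Number.parseInt(value, 10)],
  ['parseFloat', (value) => Number.parseFloat(value)],
  ['floor', (value) => Math.floor(Number(value))],
]);

// Split "selector|name" and run the named transformations, left to right,
// on text that has already been extracted from the page.
function applyPipes(expression, rawText) {
  const [selector, ...names] = expression.split('|').map((part) => part.trim());
  const value = names.reduce((current, name) => {
    const transform = transforms.get(name);
    if (!transform) {
      throw new Error(`Unknown transformation: ${name}`);
    }
    return transform(current);
  }, rawText);
  return { selector, value };
}

console.log(applyPipes('.name|trim', '  Ada  ')); // { selector: '.name', value: 'Ada' }
console.log(applyPipes('.age|parseInt', '36'));   // { selector: '.age', value: 36 }
```

Note that this naive split would misread a selector that itself contains a `|` character, so a real parser would need a stricter grammar.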

If you would like to see a particular transformation function added, please open an issue.

Grabbing an attribute

Use an at sign (@) after the selector to reference an attribute.

{
  "user": {
    "name": ".name",
    "nickname": ".name@data-nickname"
  }
}
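
The attribute syntax can be sketched the same way. `splitAttribute` below is a hypothetical helper, not part of parsz. It assumes the attribute name follows the last `@` in the expression and tolerates surrounding spaces, since the Yelp example writes ` @content` with a space while this tip writes `@data-nickname` without one.

```javascript
// Hypothetical helper (not parsz's own parser): separate the element
// selector from the attribute name that follows the last "@".
function splitAttribute(expression) {
  const at = expression.lastIndexOf('@');
  if (at === -1) {
    // No "@": the expression selects the element's text content.
    return { selector: expression.trim(), attribute: null };
  }
  return {
    selector: expression.slice(0, at).trim(),
    attribute: expression.slice(at + 1).trim(),
  };
}

console.log(splitAttribute('.name@data-nickname'));
// { selector: '.name', attribute: 'data-nickname' }
```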

Grabbing remote data

Use a tilde (~) followed by a link selector in the key to reference external content. The mapping (value) is then evaluated relative to that external scope.

{
  "user": {
    "name": ".name",
    "company~(a.company)": {
      "name": ".company-name",
      "address": ".company-address"
    }
  }
}
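
A minimal sketch of the remote-key syntax follows. Both functions are hypothetical and are not parsz's implementation. `splitRemoteKey` separates the field name from the link selector. `resolveLink` assumes the external content is found by resolving the matched link's `href` against the current page URL, using the standard `URL` class.

```javascript
// Hypothetical helper: split "company~(a.company)" into the field name
// and the selector of the link that points at the external page.
function splitRemoteKey(key) {
  const match = /^([^~()]+)~\((.+)\)$/.exec(key);
  if (!match) {
    return null; // Not a remote key.
  }
  return { name: match[1], linkSelector: match[2] };
}

// Assumption: a relative href is resolved against the page it was found on.
function resolveLink(href, pageUrl) {
  return new URL(href, pageUrl).href;
}

console.log(splitRemoteKey('company~(a.company)'));
// { name: 'company', linkSelector: 'a.company' }
console.log(resolveLink('/companies/acme', 'https://example.com/users/1'));
// https://example.com/companies/acme
```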

Have fun!
