goodtables

Frictionless is a framework to describe, extract, validate, and transform tabular data.

Showing:

Popularity

Downloads/wk

0

GitHub Stars

458

Maintenance

Last Commit

19d ago

Contributors

51

Package

Dependencies

16

License

MIT

Categories

Readme

Frictionless Framework

Build Coverage Release Citation Codebase Support

Frictionless is a framework to describe, extract, validate, and transform tabular data (DEVT Framework). It supports a great deal of data schemes and formats, as well as provides popular platforms integrations. The framework is powered by the lightweight yet comprehensive Frictionless Data Specifications.

Purpose

  • Describe your data: You can infer, edit and save metadata of your data tables. It's a first step for ensuring data quality and usability. Frictionless metadata includes general information about your data like textual description, as well as, field types and other tabular data details.
  • Extract your data: You can read your data using a unified tabular interface. Data quality and consistency are guaranteed by a schema. Frictionless supports various file schemes like HTTP, FTP, and S3 and data formats like CSV, XLS, JSON, SQL, and others.
  • Validate your data: You can validate data tables, resources, and datasets. Frictionless generates a unified validation report, as well as supports a lot of options to customize the validation process.
  • Transform your data: You can clean, reshape, and transfer your data tables and datasets. Frictionless provides a pipeline capability and a lower-level interface to work with the data.

Features

  • Open Source (MIT)
  • Powerful Python framework
  • Convenient command-line interface
  • Low memory consumption for data of any size
  • Reasonable performance on big data
  • Support for compressed files
  • Custom checks and formats
  • Fully pluggable architecture
  • The included API server
  • More than 1000+ tests

Example

$ frictionless validate data/invalid.csv
[invalid] data/invalid.csv

  row    field  code              message
-----  -------  ----------------  --------------------------------------------
             3  blank-header      Header in field at position "3" is blank
             4  duplicate-header  Header "name" in field "4" is duplicated
    2        3  missing-cell      Row "2" has a missing cell in field "field3"
    2        4  missing-cell      Row "2" has a missing cell in field "name2"
    3        3  missing-cell      Row "3" has a missing cell in field "field3"
    3        4  missing-cell      Row "3" has a missing cell in field "name2"
    4           blank-row         Row "4" is completely blank
    5        5  extra-cell        Row "5" has an extra value in field  "5"

Documentation

Please visit our documentation portal:

Rate & Review

Great Documentation0
Easy to Use0
Performant0
Highly Customizable0
Bleeding Edge0
Responsive Maintainers0
Poor Documentation0
Hard to Use0
Slow0
Buggy0
Abandoned0
Unwelcoming Community0
100