warc

Python library for reading and writing warc files

warc: Python library to work with WARC files
============================================

.. image:: https://secure.travis-ci.org/anandology/warc.png?branch=master
   :alt: build status
   :target: http://travis-ci.org/anandology/warc

WARC (Web ARChive) is a file format for storing web crawls.

http://bibnum.bnf.fr/WARC/

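The warc library lets you open a WARC file and iterate over its records, reading header fields by name::

    import warc

    # Open a WARC file and print the target URI and content length of each record.
    f = warc.open("test.warc")
    for record in f:
        print("%s %s" % (record['WARC-Target-URI'], record['Content-Length']))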
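
To make the record structure concrete, here is a minimal standard-library sketch of what iterating over an uncompressed WARC stream involves. It is not how the warc library is implemented: the function name ``iter_warc_records`` and the sample data are hypothetical and exist only for illustration. It assumes each record is a ``WARC/...`` version line, ``Name: value`` header lines, a blank line, ``Content-Length`` bytes of payload, and two trailing CRLFs.

```python
import io


def iter_warc_records(stream):
    """Yield (headers, payload) pairs from an uncompressed WARC byte stream.

    Hypothetical helper for illustration; it does no validation beyond
    checking the version line and keeps header names exactly as written.
    """
    while True:
        version = stream.readline()
        if not version:
            return  # end of stream
        if not version.strip():
            continue  # skip the blank lines that separate records
        if not version.startswith(b"WARC/"):
            raise ValueError("unexpected line: %r" % version)

        # Read "Name: value" header lines up to the blank line.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()

        # The payload is exactly Content-Length bytes long.
        payload = stream.read(int(headers["Content-Length"]))
        yield headers, payload


# Made-up sample record, repeated twice to form a two-record stream.
sample = (
    b"WARC/1.0\r\n"
    b"WARC-Type: resource\r\n"
    b"WARC-Target-URI: http://example.com/\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
    b"\r\n\r\n"
)

records = list(iter_warc_records(io.BytesIO(sample * 2)))
for headers, payload in records:
    print("%s %s" % (headers["WARC-Target-URI"], headers["Content-Length"]))
```

For real files, which are often gzip-compressed and may contain malformed records, the library's own ``warc.open`` is the better choice.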

Documentation
-------------

The documentation of the warc library is available at http://warc.readthedocs.org/.

License
-------

This software is licensed under GPL v2. See the LICENSE_ file for details.

.. _LICENSE: http://github.com/internetarchive/warc/blob/master/LICENSE
