9 Best Vanilla Python Scraper API Libraries

List hand-picked by Openbase Experts

trafilatura

Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)

License: GPLv3+
GitHub Stars: 473
Weekly Downloads: 0
Last Commit: 2 months ago
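
To give a feel for what "main content" extraction involves, here is a minimal standard-library sketch. It is not trafilatura's algorithm or API: the `ParagraphText` class, the `extract_main_text` function, and the `SKIP` tag set are hypothetical names chosen for illustration, and the heuristic (keep `<p>` text, ignore navigation and script blocks) is a deliberate simplification.

```python
from html.parser import HTMLParser

# Assumption: tags whose contents are treated as boilerplate in this sketch.
SKIP = {"script", "style", "nav", "header", "footer", "aside"}


class ParagraphText(HTMLParser):
    """Collects the text of <p> elements that sit outside boilerplate tags."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # how many boilerplate tags we are nested in
        self.in_p = False     # True while inside a kept <p>
        self.paragraphs = []  # cleaned paragraph strings
        self._buf = []        # text fragments of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self.in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag in SKIP and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p" and self.in_p:
            # Collapse runs of whitespace before storing the paragraph.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self.in_p = False

    def handle_data(self, data):
        if self.in_p and self.skip_depth == 0:
            self._buf.append(data)


def extract_main_text(html):
    """Return the kept paragraphs joined by newlines."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.paragraphs)


sample = (
    "<html><body><nav><p>Home | About</p></nav>"
    "<article><p>First   paragraph.</p><p>Second <b>one</b>.</p></article>"
    "<script>var x = 1;</script></body></html>"
)
print(extract_main_text(sample))
```

Real pages need far more than this (link-density checks, fallbacks, metadata handling), which is the gap a dedicated library is meant to fill.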

htmldate

Fast and robust date extraction from web pages, in Python or on the command line

License: GPLv3+
GitHub Stars: 41
Weekly Downloads: 0
Last Commit: 2 months ago
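
As a rough illustration of the date-extraction problem, the sketch below reads a publication date from `<meta>` tags using only the standard library. It is not htmldate's implementation or API: `MetaDateParser`, `find_meta_date`, and the `DATE_KEYS` tuple are hypothetical names, and the list of meta keys is an assumption about commonly used markup.

```python
from datetime import datetime
from html.parser import HTMLParser

# Assumption: meta keys that often carry a publication date.
DATE_KEYS = ("article:published_time", "date", "dc.date", "pubdate")


class MetaDateParser(HTMLParser):
    """Remembers the first date-like <meta> content value in the page."""

    def __init__(self):
        super().__init__()
        self.found = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.found is not None:
            return
        attr = dict(attrs)
        key = (attr.get("property") or attr.get("name") or "").lower()
        if key in DATE_KEYS and attr.get("content"):
            self.found = attr["content"]


def find_meta_date(html):
    """Return the date as YYYY-MM-DD, or None if missing or unparseable."""
    parser = MetaDateParser()
    parser.feed(html)
    if parser.found is None:
        return None
    try:
        # Keep only the date part of an ISO 8601 timestamp.
        parsed = datetime.strptime(parser.found[:10], "%Y-%m-%d")
    except ValueError:
        return None
    return parsed.strftime("%Y-%m-%d")


sample = (
    '<html><head><meta property="article:published_time" '
    'content="2021-03-04T10:00:00Z"></head></html>'
)
print(find_meta_date(sample))
```

Pages without such tags need fallbacks such as URL patterns or dates in the visible text, which this sketch does not attempt.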

data-extractor

Combine XPath, CSS selectors, and JSONPath for web data extraction.

License: MIT
GitHub Stars: 23
Weekly Downloads: 0
Last Commit: 7 months ago

lassie

Web Content Retrieval for Humans™

License: MIT
GitHub Stars: 535
Weekly Downloads: 0
Last Commit: 9 months ago
Top Feedback: Easy to Use (1), Performant (1)

goose3

A Python 3 compatible version of goose. Documentation: http://goose3.readthedocs.io/en/latest/index.html

License: Apache
GitHub Stars: 545
Weekly Downloads: 0
Last Commit: 4 months ago

gazpacho

🥫 The simple, fast, and modern web scraping library

License: MIT
GitHub Stars: 605
Weekly Downloads: 0
Last Commit: 1 year ago

goose-extractor

HTML content and article extractor; a web scraping library in Python

License: Apache
GitHub Stars: 3.7K
Weekly Downloads: 0
Last Commit: 7 years ago

newspaper

News, full-text, and article metadata extraction in Python 3. Advanced docs:

License: MIT
GitHub Stars: 11.8K
Weekly Downloads: 0
Last Commit: 2 years ago

twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

License: MIT
GitHub Stars: 12.7K
Weekly Downloads: 0
Last Commit: 1 year ago