
9 Best Vanilla Python Scraper API Libraries

List hand-picked by Openbase Experts

trafilatura

Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)

GPLv3+
GitHub Stars
385
Weekly Downloads
0
Last Commit
3d ago

data-extractor

Combines XPath, CSS selectors, and JSONPath for web data extraction.

MIT
GitHub Stars
23
Weekly Downloads
0
Last Commit
4mos ago

htmldate

Fast and robust date extraction from web pages, with Python or on the command-line

GPLv3+
GitHub Stars
33
Weekly Downloads
0
Last Commit
3mos ago

lassie

Web Content Retrieval for Humans™

The MIT License
GitHub Stars
535
Weekly Downloads
0
Last Commit
6mos ago
Top Feedback
Easy to Use (1)
Performant (1)

goose3

A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html

Apache
GitHub Stars
540
Weekly Downloads
0
Last Commit
1mo ago

gazpacho

🥫 The simple, fast, and modern web scraping library

MIT
GitHub Stars
605
Weekly Downloads
0
Last Commit
10mos ago

twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

MIT
GitHub Stars
11.9K
Weekly Downloads
0
Last Commit
1yr ago

newspaper

News, full-text, and article metadata extraction in Python 3. Advanced docs:

MIT
GitHub Stars
11.5K
Weekly Downloads
0
Last Commit
1yr ago

goose-extractor

HTML Content / Article Extractor, web scraping lib in Python

Apache
GitHub Stars
3.7K
Weekly Downloads
0
Last Commit
7yrs ago