isbot

by omrilotan
Version 3.4.3

💻 JavaScript module that detects bots/crawlers/spiders via the user agent

Popularity

  • Downloads/wk: 88.3K
  • GitHub Stars: 409

Maintenance

  • Last Commit: 9d ago
  • Contributors: 26

Package

  • Dependencies: 0
  • License: Unlicense
  • Type Definitions: Built-In
  • Tree-Shakeable: Yes?

Categories

  • Vanilla JavaScript User Agent Parsing

Reviews

Average Rating: 5.0/5 (2 ratings)

Top Feedback

  • Great Documentation: 1
  • Easy to Use: 1
  • Performant: 1
  • Responsive Maintainers: 1

Readme

isbot 🤖/👨‍🦰

Detect bots/crawlers/spiders using the user agent string.

Usage

import isbot from 'isbot'

// Node.js HTTP (incoming request headers are lower-cased)
isbot(request.headers['user-agent'])

// ExpressJS
isbot(req.get('user-agent'))

// Browser
isbot(navigator.userAgent)

// User Agent string
isbot('Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)') // true
isbot('Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36') // false

Additional functionality

Extend: Add user agent patterns

Add rules to the RegExp that matches user agents. Accepts an array of pattern strings.

isbot('Mozilla/5.0') // false
isbot.extend([
    'istat',
    '^mozilla/\\d\\.\\d$'
])
isbot('Mozilla/5.0') // true

Exclude: Remove matches of known crawlers

Remove rules from the RegExp that matches user agents (see the existing rules in the src/list.json file).

isbot('Chrome-Lighthouse') // true
isbot.exclude(['chrome-lighthouse']) // pattern is case insensitive
isbot('Chrome-Lighthouse') // false

Find: Verbose result

Return the part of the user agent string that matched a bot rule.

isbot.find('Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 DejaClick/2.9.7.2') // 'DejaClick'

Matches: Get patterns

Return all patterns that match the user agent string

isbot.matches('Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0') // ['bot', 'search']

Clear: Remove matching patterns

Remove every pattern that matches the given user agent string, so that this string is no longer detected as a bot.

const ua = 'Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0';
isbot(ua) // true
isbot.clear(ua)
isbot(ua) // false

Spawn: Create new instances

Create new instances of isbot. Each instance starts from the spawning instance's list.

const one = isbot.spawn()
const two = isbot.spawn()

two.exclude(['chrome-lighthouse'])
one('Chrome-Lighthouse') // true
two('Chrome-Lighthouse') // false

Create an isbot instance from a custom list (instead of the maintained list).

const lean = isbot.spawn([ 'bot' ])
lean('Googlebot') // true
lean('Chrome-Lighthouse') // false
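
The relationship between the pattern list and the methods above can be illustrated with a small stand-in. This is a simplified sketch, not the isbot source: `createMatcher` is a hypothetical name, and it assumes only what the sections above state, namely that rules are an array of pattern strings combined into one case-insensitive RegExp.

```javascript
// Hypothetical, simplified matcher for illustration only (not the isbot source).
function createMatcher(initialPatterns) {
  let patterns = [...initialPatterns]; // private copy, so instances stay independent
  let regex = null;

  // Rebuild a single case-insensitive RegExp from all pattern strings.
  const update = () => {
    regex = patterns.length ? new RegExp(patterns.join('|'), 'i') : null;
  };
  update();

  // Boolean check, like calling isbot(ua).
  const matcher = (ua) => Boolean(ua) && regex !== null && regex.test(ua);

  // First matching substring, or null when nothing matches.
  matcher.find = (ua) => {
    const match = regex && ua ? ua.match(regex) : null;
    return match ? match[0] : null;
  };

  // Every pattern that matches the user agent string.
  matcher.matches = (ua) => patterns.filter((p) => new RegExp(p, 'i').test(ua));

  // Add patterns, skipping ones already present.
  matcher.extend = (extra) => {
    patterns = [...new Set([...patterns, ...extra])];
    update();
  };

  // Remove patterns; the comparison ignores case.
  matcher.exclude = (removed) => {
    const drop = new Set(removed.map((p) => p.toLowerCase()));
    patterns = patterns.filter((p) => !drop.has(p.toLowerCase()));
    update();
  };

  // Remove every pattern that matches this user agent string.
  matcher.clear = (ua) => matcher.exclude(matcher.matches(ua));

  // New instance from a custom list, or from the current list when none is given.
  matcher.spawn = (list) => createMatcher(list || patterns);

  return matcher;
}

const demo = createMatcher(['bot', 'search', 'chrome-lighthouse']);
const copy = demo.spawn();
copy.exclude(['chrome-lighthouse']);
demo('Chrome-Lighthouse') // true
copy('Chrome-Lighthouse') // false
```

Keeping the list private to each instance is what lets a spawned copy exclude a pattern without affecting the instance it came from.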

Definitions

  • Bot. An autonomous program that imitates or replaces some aspect of human behaviour, performing repetitive tasks much faster than human users could.
  • Good bot. An automated program that visits websites in order to collect useful information. Web crawlers, site scrapers, stress testers, preview builders and similar programs are welcome on most websites because they serve mutually beneficial purposes.
  • Bad bot. A program designed to perform malicious actions, ultimately hurting businesses. Examples: testing credential databases, DDoS attacks, spam bots.

Clarifications

What does "isbot" do?

This package aims to identify "Good bots": those that voluntarily identify themselves with a unique, preferably descriptive, user agent, usually sent in a dedicated request header.

What doesn't "isbot" do?

It does not try to recognise malicious bots or programs disguising themselves as real users.

Why would I want to identify good bots?

Recognising good bots such as web crawlers is useful for multiple purposes. Although it is not recommended to serve different content to web crawlers like Googlebot, you can still elect to:

  • Flag pageviews so that business analysis can take them into account.
  • Prefer to serve cached content and relieve service load.
  • Omit third party solutions' code (tags, pixels) and reduce costs.
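
As a rough illustration of the three options above, the decisions can be derived from a single detection result. This is a minimal sketch: `planResponse` and its field names are hypothetical, and `standInDetector` is a toy replacement used only so the snippet runs on its own; in an application the `isbot` function from this package would be passed instead.

```javascript
// Hypothetical helper: turns one bot check into the three decisions listed above.
// `isBot` is any function that returns true for bot user agents.
function planResponse(userAgent, isBot) {
  const bot = isBot(userAgent);
  return {
    flagPageviewAsBot: bot,      // mark the pageview for business analysis
    preferCachedContent: bot,    // serve cached content to relieve service load
    includeThirdPartyTags: !bot, // omit tags and pixels for bots to reduce costs
  };
}

// Toy stand-in detector (an assumption for this example, not the isbot rules).
const standInDetector = (ua) => /bot/i.test(ua || '');

const crawlerPlan = planResponse('Googlebot/2.1', standInDetector);
const visitorPlan = planResponse('Mozilla/5.0 (Windows NT 6.1)', standInDetector);
```

Note that none of these decisions changes the page content itself, in line with the recommendation above.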

    It is not recommended to whitelist requests for any reason based on the user agent header alone. Instead, add other methods of identification, such as a reverse DNS lookup.
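
A reverse DNS check of this kind could look roughly like the sketch below. It is not part of isbot: `verifyCrawlerIp` is a hypothetical helper built on Node's `dns.promises` API, it handles IPv4 only, and both the list of trusted hostname suffixes and the injectable `resolver` parameter are assumptions made for illustration and testing.

```javascript
const dns = require('node:dns').promises;

// Hypothetical helper: confirm that an IP really belongs to a crawler by doing
// a reverse (PTR) lookup followed by a forward (A record) lookup.
// `allowedSuffixes` is an assumed list such as ['.googlebot.com'].
async function verifyCrawlerIp(ip, allowedSuffixes, resolver = dns) {
  let hostnames;
  try {
    hostnames = await resolver.reverse(ip); // IP -> hostnames
  } catch {
    return false; // no PTR record: treat as unverified
  }

  for (const hostname of hostnames) {
    const host = hostname.toLowerCase();
    if (!allowedSuffixes.some((suffix) => host.endsWith(suffix))) continue;
    try {
      // The hostname must resolve back to the same IP, otherwise the PTR
      // record could have been set by whoever controls the IP range.
      const addresses = await resolver.resolve4(host);
      if (addresses.includes(ip)) return true;
    } catch {
      // Forward lookup failed: try the next hostname, if any.
    }
  }
  return false;
}
```

A request would then be trusted only when both the user agent check and this lookup agree. Suffixes should start with a dot so that a lookalike domain ending in the same letters does not pass.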

Data sources

We use external data sources, on top of our own lists, to keep up to date.

Crawlers user agents:

Non bot user agents:

Missing something? Please open an issue

Breaking changes in major releases (full changelog)

Version 3

Remove testing for node 6 and 8

Version 2

Change return value for isbot: true instead of matched string

Version 1

No functional change

Real world data

Execution times in milliseconds (chart not included)

Alternatives

  • ua-parser. A multi-language port of Browserscope's user agent parser. GitHub Stars: 2K. Weekly Downloads: 4K. User Rating: 5.0/5 (1 rating).
  • user-agent-parser. UAParser.js - Detect Browser, Engine, OS, CPU, and Device type/model from User-Agent data. Supports browser & node.js environment. GitHub Stars: 7K. Weekly Downloads: 4K.
  • device-detector-js. A precise user agent parser and device detector written in TypeScript. GitHub Stars: 268. Weekly Downloads: 38K.
  • useragent. Useragent parser for Node.js, ported from browserscope.org. GitHub Stars: 865. Weekly Downloads: 1M.
  • device. Device type detection library based on the useragent string. Refactored from my express-device repo. GitHub Stars: 77. Weekly Downloads: 24K.
