Wappalyzer

Date: 2022-02-12

Wappalyzer identifies technologies on websites, such as CMS, web frameworks, ecommerce platforms, JavaScript libraries, analytics tools and more.

Prerequisites

Git

Node.js version 14 or higher

Yarn

Quick start

git clone https://github.com/AliasIO/wappalyzer.git
cd wappalyzer
yarn install
yarn run link

Usage

Command line

node src/drivers/npm/cli.js https://example.com

Chrome extension

Go to about:extensions

Enable 'Developer mode'

Click 'Load unpacked'

Select src/drivers/webextension

Firefox extension

Go to about:debugging#/runtime/this-firefox

Click 'Load Temporary Add-on'

Select src/drivers/webextension/manifest.json

Specification

A long list of regular expressions is used to identify technologies on web pages. Wappalyzer inspects HTML code, as well as JavaScript variables, response headers and more.

Patterns (regular expressions) are kept in src/technologies/. The following is an example of an application fingerprint.

Example

"Example": {
  "description": "A short description of the technology.",
  "cats": [
    "1"
  ],
  "cookies": {
    "cookie_name": "Example"
  },
  "dom": {
    "#example-id": {
      "exists": "",
      "attributes": {
        "class": "example-class"
      },
      "properties": {
        "example-property": ""
      },
      "text": "Example text content"
    }
  },
  "dns": {
    "MX": [
      "example\\.com"
    ]
  },
  "js": {
    "Example.method": ""
  },
  "excludes": "Example",
  "headers": {
    "X-Powered-By": "Example"
  },
  "html": "<link[^>]+example\\.css",
  "text": "\\bexample\\b",
  "css": "\\.example-class",
  "robots": "Disallow: /unique-path/",
  "implies": "PHP\\;confidence:50",
  "requires": "WordPress",
  "requiresCategory": "Ecommerce",
  "meta": {
    "generator": "(?:Example|Another Example)"
  },
  "scriptSrc": "example-([0-9.]+)\\.js\\;confidence:50\\;version:\\1",
  "scripts": "function webpackJsonpCallback\\(data\\) {",
  "url": "example\\.com",
  "xhr": "example\\.com",
  "oss": true,
  "saas": true,
  "pricing": [
    "mid",
    "freemium",
    "recurring"
  ],
  "website": "https://example.com"
}

JSON fields

Find the JSON schema at schema.json.

Required properties

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| cats | Array | One or more category IDs. | [1, 6] |
| website | String | URL of the application's website. | "https://example.com" |

Optional properties

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| description | String | A short description of the technology in British English (max. 250 characters). Write in a neutral, factual tone; not like an ad. | "A short description." |
| icon | String | Application icon filename. | "WordPress.svg" |
| cpe | String | The CPE is a structured naming scheme for applications, see the specification. | "cpe:/a:apache:http_server" |
| saas | Boolean | The technology is offered as a Software-as-a-Service (SaaS), i.e. hosted or cloud-based. | true |
| oss | Boolean | The technology has an open-source license. | true |
| pricing | Array | Cost indicator (based on a typical plan or average monthly price) and available pricing models. For paid products only. One of: low (less than US $100 / mo), mid (between US $100 and $1,000 / mo), high (more than US $1,000 / mo). Plus any of: freemium (free plan available), onetime (one-time payments accepted), recurring (subscriptions available), poa (price on asking), payg (pay as you go, e.g. commissions or usage-based fees). | ["low", "freemium"] |
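The pricing rules can be checked mechanically. The following is a minimal sketch, not part of Wappalyzer: the function name checkPricing is hypothetical, and it assumes that "one of" means at most one cost indicator per technology.

```javascript
// Values taken from the pricing description above.
const COST_INDICATORS = ['low', 'mid', 'high']
const PRICING_MODELS = ['freemium', 'onetime', 'recurring', 'poa', 'payg']

// Hypothetical validator: returns a list of problems (empty when valid).
// Assumption: at most one cost indicator is allowed.
function checkPricing(pricing) {
  if (!Array.isArray(pricing)) {
    return ['pricing must be an array']
  }

  const problems = []
  const costs = pricing.filter((value) => COST_INDICATORS.includes(value))

  if (costs.length > 1) {
    problems.push(`more than one cost indicator: ${costs.join(', ')}`)
  }

  for (const value of pricing) {
    if (!COST_INDICATORS.includes(value) && !PRICING_MODELS.includes(value)) {
      problems.push(`unknown value: ${value}`)
    }
  }

  return problems
}

console.log(checkPricing(['low', 'freemium']))
console.log(checkPricing(['low', 'high', 'cheap']))
```

The first call reports no problems; the second reports both the duplicate cost indicator and the unknown value.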

Implies, requires and excludes (optional)

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| implies | String or Array | The presence of one application can imply the presence of another, e.g. WordPress means PHP is also in use. | "PHP" |
| requires | String or Array | Similar to implies but detection only runs if the required technology has been identified. Useful for themes for a specific CMS. | "WordPress" |
| requiresCategory | String or Array | Similar to requires; detection only runs if a technology in the required category has been identified. | "Ecommerce" |
| excludes | String or Array | Opposite of implies. The presence of one application can exclude the presence of another. | "Apache" |
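To illustrate how implies and excludes interact, here is a minimal sketch under stated assumptions. It is not Wappalyzer's implementation: the technologies object, the helper toArray and the function resolve are hypothetical, and the sketch assumes that implied technologies are added transitively before exclusions are applied.

```javascript
// Hypothetical fingerprints; only the relationship fields are shown.
const technologies = {
  WordPress: { implies: ['PHP\\;confidence:50', 'MySQL'] },
  PHP: {},
  MySQL: {},
  Nginx: { excludes: 'Apache' },
  Apache: {},
}

// Fields may hold a string or an array, so normalise to an array.
const toArray = (value) => (value === undefined ? [] : [].concat(value))

// Drop any tags (such as confidence) appended after the "\;" separator.
const stripTags = (value) => value.split('\\;')[0]

function resolve(detected) {
  const result = new Set(detected)

  // implies: keep adding implied technologies until nothing new appears.
  const queue = [...result]
  while (queue.length) {
    const name = queue.shift()
    for (const implied of toArray(technologies[name]?.implies).map(stripTags)) {
      if (!result.has(implied)) {
        result.add(implied)
        queue.push(implied)
      }
    }
  }

  // excludes: remove technologies ruled out by a detected one.
  for (const name of [...result]) {
    for (const excluded of toArray(technologies[name]?.excludes).map(stripTags)) {
      result.delete(excluded)
    }
  }

  return [...result].sort()
}

console.log(resolve(['WordPress', 'Nginx', 'Apache']))
// → [ 'MySQL', 'Nginx', 'PHP', 'WordPress' ]
```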

Patterns (optional)

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| cookies | Object | Cookies. | { "cookie_name": "Cookie value" } |
| dom | String, Array or Object | Uses a query selector to inspect element properties, attributes and text content. | { "#example-id": { "properties": { "example-prop": "" } } } |
| dns | Object | DNS records: supports MX, TXT, SOA and NS (NPM driver only). | { "MX": "example\\.com" } |
| js | Object | JavaScript properties (case sensitive). Avoid short property names to prevent matching minified code. | { "jQuery.fn.jquery": "" } |
| headers | Object | HTTP response headers. | { "X-Powered-By": "^WordPress$" } |
| html | String or Array | HTML source code. Patterns must include an HTML opening tag to avoid matching plain text. For performance reasons, avoid html where possible and use dom instead. | "<a [^>]*href=\"index.html" |
| text | String or Array | Matches plain text. Should only be used in very specific cases where other methods can't be used. | "\\bexample\\b" |
| css | String or Array | CSS rules. Unavailable when a website enforces a same-origin policy. For performance reasons, only a portion of the available CSS rules are used to find matches. | "\\.example-class" |
| robots | String or Array | Robots.txt contents. | "Disallow: /unique-path/" |
| url | String or Array | Full URL of the page. | "^https?://.+\\.wordpress\\.com" |
| xhr | String or Array | Hostnames of XHR requests. | "cdn\\.netlify\\.com" |
| meta | Object | HTML meta tags, e.g. generator. | { "generator": "^WordPress$" } |
| scriptSrc | String or Array | URLs of JavaScript files included on the page. | "jquery\\.js" |
| scripts | String or Array | JavaScript source code. Inspects inline and external scripts. For performance reasons, avoid scripts where possible and use js instead. | "function webpackJsonpCallback\\(data\\) {" |

Patterns

Patterns are essentially JavaScript regular expressions written as strings, but with some additions.

Quirks and pitfalls

Because of the string format, the escape character itself must be escaped when using special characters such as the dot ( \\. ). Double quotes must be escaped only once ( \" ). Slashes do not need to be escaped ( / ).

Flags are not supported. Regular expressions are treated as case-insensitive.

Capture groups ( () ) are used for version detection. In other cases, use non-capturing groups ( (?:) ).

Use start and end of string anchors ( ^ and $ ) where possible for optimal performance.

Short or generic patterns can cause applications to be identified incorrectly. Try to find unique strings to match.
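As a rough illustration of the escaping rules above, the sketch below parses a small JSON fragment and compiles its patterns. The helper toRegExp and the sample patterns are hypothetical and not taken from Wappalyzer; the only assumption about its behaviour is that patterns are compiled case-insensitively, as stated above.

```javascript
// Hypothetical helper (not Wappalyzer's actual code): compile a pattern
// string into a case-insensitive regular expression.
function toRegExp(pattern) {
  return new RegExp(pattern, 'i')
}

// String.raw keeps the text exactly as it would appear in a JSON file,
// including the doubled backslash in "example\\.css".
const fingerprint = JSON.parse(
  String.raw`{
    "html": "<link[^>]+example\\.css",
    "meta": "^(?:Example|Another Example)$"
  }`
)

// After JSON parsing, a single backslash remains and escapes the dot.
console.log(fingerprint.html)

const htmlRegex = toRegExp(fingerprint.html)
const metaRegex = toRegExp(fingerprint.meta)

// Matching ignores case, and the escaped dot only matches a literal dot.
console.log(htmlRegex.test('<LINK rel="stylesheet" href="/css/Example.css">')) // true
console.log(htmlRegex.test('<link rel="stylesheet" href="/css/exampleXcss">')) // false

// The non-capturing group and the anchors require a whole-string match.
console.log(metaRegex.test('another example')) // true
console.log(metaRegex.test('Not another example')) // false
```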

Tags (a non-standard syntax) can be appended to patterns (and implies and excludes, separated by \\; ) to store additional information.

| Tag | Description | Example |
| --- | --- | --- |
| confidence | Indicates a less reliable pattern that may cause false positives. The aim is to achieve a combined confidence of 100%. Defaults to 100% if not specified. | "js": { "Mage": "\\;confidence:50" } |
| version | Gets the version number from a pattern match using a special syntax. | "scriptSrc": "jquery-([0-9.]+)\\.js\\;version:\\1" |
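The tag syntax can be sketched as a small parser that separates the regular expression from its tags. This is an illustration only: parsePattern is a hypothetical name and the code is not Wappalyzer's implementation. It assumes tags take the form name:value and that unknown or malformed tags are ignored.

```javascript
// Hypothetical parser, not Wappalyzer's actual implementation.
function parsePattern(pattern) {
  // In JSON the separator is written as "\\;". Once parsed, it is the two
  // characters backslash and semicolon.
  const [value, ...tags] = pattern.split('\\;')

  const parsed = {
    regex: new RegExp(value, 'i'),
    confidence: 100, // default when no confidence tag is present
    version: '',
  }

  for (const tag of tags) {
    const separator = tag.indexOf(':')
    if (separator === -1) continue // ignore malformed tags in this sketch

    const name = tag.slice(0, separator)
    const content = tag.slice(separator + 1)

    if (name === 'confidence') {
      parsed.confidence = Number.parseInt(content, 10)
    } else if (name === 'version') {
      parsed.version = content
    }
  }

  return parsed
}

// The JavaScript string literal below equals the parsed JSON value of
// "jquery-([0-9.]+)\\.js\\;confidence:50\\;version:\\1".
const example = parsePattern('jquery-([0-9.]+)\\.js\\;confidence:50\\;version:\\1')

console.log(example.regex.source)
console.log(example.confidence)
console.log(example.version)
```

A pattern that consists only of tags, such as "\\;confidence:50", yields an empty regular expression that matches any value, so only the presence of the property is checked.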

Version syntax

Application version information can be obtained from a pattern using a capture group. A condition can be evaluated using the ternary operator ( ?: ).
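A minimal sketch of how such a version template might be resolved is shown below. The function resolveVersion and its behaviour are assumptions for illustration, not Wappalyzer's actual code: it assumes that \\N refers to capture group N, and that \\N?a:b yields a when group N matched and b otherwise.

```javascript
// Hypothetical sketch, not Wappalyzer's actual implementation.
// template: the value of the version tag, e.g. the string \1 or \1?2:1
// regex:    the compiled pattern
// input:    the text the pattern is matched against
function resolveVersion(template, regex, input) {
  const match = regex.exec(input)
  if (!match) return ''

  let resolved = template

  match.forEach((group, index) => {
    // Ternary form: \N?valueIfMatched:valueIfNotMatched
    const ternary = new RegExp(`\\\\${index}\\?([^:]*):(.*)$`).exec(resolved)
    if (ternary) {
      resolved = resolved.replace(ternary[0], () => (group ? ternary[1] : ternary[2]))
    }

    // Plain back reference: \N is replaced by the captured text.
    resolved = resolved.replace(new RegExp(`\\\\${index}`, 'g'), () => group || '')
  })

  return resolved.trim()
}

// Back reference: the first capture group becomes the version.
console.log(resolveVersion('\\1', /jquery-([0-9.]+)\.js/i, 'jquery-3.6.0.js')) // 3.6.0

// Ternary: "2" if the optional group matched, "1" otherwise.
console.log(resolveVersion('\\1?2:1', /example(-v2)?\.js/i, 'example-v2.js')) // 2
console.log(resolveVersion('\\1?2:1', /example(-v2)?\.js/i, 'example.js')) // 1
```

Replacement functions are used instead of replacement strings so that a captured value containing a dollar sign is inserted literally.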