Date: 2021-01-06

Finds the degree of similarity between two strings, based on Dice's Coefficient, which generally gives better results than Levenshtein distance.

Usage

For Node.js

Install using:

npm install string-similarity --save

In your code:

var stringSimilarity = require("string-similarity");

var similarity = stringSimilarity.compareTwoStrings("healed", "sealed");

var matches = stringSimilarity.findBestMatch("healed", ["edward", "sealed", "theatre"]);

For browser apps

Include <script src="//unpkg.com/string-similarity/umd/string-similarity.min.js"></script> to get the latest version.

Or <script src="//unpkg.com/string-similarity@4.0.1/umd/string-similarity.min.js"></script> to get a specific version (4.0.1 in this case).

This exposes a global variable called stringSimilarity which you can start using.

<script>
  stringSimilarity.compareTwoStrings('what!', 'who?');
</script>

(The package is exposed as UMD, so you can consume it as such)

API

The package contains two methods:

compareTwoStrings(string1, string2)

Returns a fraction between 0 and 1 that indicates the degree of similarity between the two strings. 0 indicates completely different strings; 1 indicates identical strings. The comparison is case-sensitive.

Arguments

string1 (string): The first string
string2 (string): The second string

Order does not make a difference.
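Because the comparison is case-sensitive, callers who want case-insensitive matching can normalize both strings before comparing. The following is a minimal sketch, not part of the package: compareIgnoringCase is a hypothetical wrapper, and its compare parameter stands in for stringSimilarity.compareTwoStrings. A stub comparer is used here so the snippet runs on its own.

```javascript
// Hypothetical wrapper: `compare` stands in for stringSimilarity.compareTwoStrings.
function compareIgnoringCase(compare, first, second) {
  // Lower-case both inputs so that case differences do not affect the rating.
  return compare(first.toLowerCase(), second.toLowerCase());
}

// Stub comparer for demonstration only: 1 for equal strings, 0 otherwise.
const exactMatch = (a, b) => (a === b ? 1 : 0);

console.log(exactMatch("Healed", "HEALED"));                      // 0
console.log(compareIgnoringCase(exactMatch, "Healed", "HEALED")); // 1
```

In real use, the first argument would be stringSimilarity.compareTwoStrings rather than the stub.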

Returns

(number): A fraction from 0 to 1, both inclusive. Higher number indicates more similarity.

Examples

stringSimilarity.compareTwoStrings("healed", "sealed");
// → 0.8

stringSimilarity.compareTwoStrings("Olive-green table for sale, in extremely good condition.", "For sale: table in very good condition, olive green in colour.");
// → 0.6060606060606061

stringSimilarity.compareTwoStrings("Olive-green table for sale, in extremely good condition.", "For sale: green Subaru Impreza, 210,000 miles");
// → 0.2558139534883721

stringSimilarity.compareTwoStrings("Olive-green table for sale, in extremely good condition.", "Wanted: mountain bike with at least 21 gears.");
// → 0.1411764705882353
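To make the rating less of a black box, here is a minimal sketch of Dice's Coefficient computed over character bigrams (pairs of adjacent characters). It is an illustration under assumptions, not the package's source: diceSketch is a hypothetical name, whitespace is stripped because the release notes say spaces are disregarded, and the handling of very short strings is a guess. The score is twice the number of shared bigrams divided by the total number of bigrams in both strings.

```javascript
// Hypothetical helper: an illustration of Dice's Coefficient on bigrams,
// not the string-similarity package's actual implementation.
function diceSketch(first, second) {
  // Disregard whitespace (assumption based on the release notes).
  const a = first.replace(/\s+/g, "");
  const b = second.replace(/\s+/g, "");

  if (a === b) return 1;                      // identical strings
  if (a.length < 2 || b.length < 2) return 0; // no bigrams to compare (assumed behaviour)

  // Count every bigram of the first string.
  const counts = new Map();
  for (let i = 0; i < a.length - 1; i++) {
    const bigram = a.substring(i, i + 2);
    counts.set(bigram, (counts.get(bigram) || 0) + 1);
  }

  // Walk the second string once, consuming matching bigrams.
  let shared = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const bigram = b.substring(i, i + 2);
    const remaining = counts.get(bigram) || 0;
    if (remaining > 0) {
      counts.set(bigram, remaining - 1);
      shared++;
    }
  }

  // Dice: 2 * shared / (bigrams in a + bigrams in b)
  return (2 * shared) / (a.length - 1 + (b.length - 1));
}

// "healed" and "sealed" share 4 of their 5 bigrams each: 2 * 4 / 10
console.log(diceSketch("healed", "sealed")); // 0.8
// "night" and "nacht" share only "ht": 2 * 1 / 8
console.log(diceSketch("night", "nacht"));   // 0.25
```

Each string is scanned once, so the work grows linearly with the input length. Swapping the arguments gives the same score, since both the shared count and the denominator are symmetric.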

findBestMatch(mainString, targetStrings)

Compares mainString against each string in targetStrings.

Arguments

mainString (string): The string to match each target string against.
targetStrings (Array): Each string in this array will be matched against the main string.

Returns

(Object): An object with a ratings property, which gives a similarity rating for each target string, a bestMatch property, which specifies which target string was most similar to the main string, and a bestMatchIndex property, which specifies the index of the bestMatch in the targetStrings array.

Examples

stringSimilarity.findBestMatch('Olive-green table for sale, in extremely good condition.', [
  'For sale: green Subaru Impreza, 210,000 miles',
  'For sale: table in very good condition, olive green in colour.',
  'Wanted: mountain bike with at least 21 gears.'
]);
// →
{
  ratings: [
    { target: 'For sale: green Subaru Impreza, 210,000 miles', rating: 0.2558139534883721 },
    { target: 'For sale: table in very good condition, olive green in colour.', rating: 0.6060606060606061 },
    { target: 'Wanted: mountain bike with at least 21 gears.', rating: 0.1411764705882353 }
  ],
  bestMatch: { target: 'For sale: table in very good condition, olive green in colour.', rating: 0.6060606060606061 },
  bestMatchIndex: 1
}
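The shape of the returned object can be reproduced with a short sketch. This is an illustration only, not the package's source: bestMatchSketch and positionalScore are hypothetical names, the score parameter stands in for any pairwise similarity function, and the tie-breaking rule (earliest entry wins) is an assumption. The toy scorer is there so the snippet runs without the package installed.

```javascript
// Hypothetical sketch: builds an object with the same three properties
// (ratings, bestMatch, bestMatchIndex) from any pairwise scorer.
function bestMatchSketch(mainString, targetStrings, score) {
  const ratings = targetStrings.map((target) => ({
    target,
    rating: score(mainString, target),
  }));

  // Track the index of the highest rating; the earliest entry wins ties here.
  let bestMatchIndex = 0;
  for (let i = 1; i < ratings.length; i++) {
    if (ratings[i].rating > ratings[bestMatchIndex].rating) {
      bestMatchIndex = i;
    }
  }

  return { ratings, bestMatch: ratings[bestMatchIndex], bestMatchIndex };
}

// Toy scorer for illustration only: fraction of positions holding equal characters.
function positionalScore(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  let same = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] === b[i]) same++;
  }
  return same / longest;
}

const result = bestMatchSketch("healed", ["edward", "sealed", "theatre"], positionalScore);

// bestMatchIndex lets you look up related data kept in a parallel array.
const labels = ["Edward", "Sealed", "Theatre"];
console.log(result.bestMatch.target);       // sealed
console.log(labels[result.bestMatchIndex]); // Sealed
```

With the real package, the same lookup works on the object returned by stringSimilarity.findBestMatch, using its bestMatchIndex against the targetStrings array that was passed in.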

Release Notes

Removed production dependencies

Updated to ES6 (this breaks backward-compatibility for pre-ES6 apps)

Performance improvement for compareTwoStrings(..): now O(n) instead of O(n^2)

The algorithm has been tweaked slightly to disregard spaces and word boundaries. This will change the rating values slightly, but not enough to make a significant difference.

Adding a bestMatchIndex to the results for findBestMatch(..) to point to the best match in the supplied targetStrings array

Refactoring: removed unused functions; used substring instead of substr

Updated dependencies

Distributing as a UMD build to be used in browsers.

Update dependencies to latest versions.

Make compatible with IE and ES5. Also, update deps. (see PR56)