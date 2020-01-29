skmeans

Super fast, simple k-means and k-means++ implementation for unidimensional and multidimensional data. Works in Node.js and in the browser.

Installation

npm install skmeans

Usage

NodeJS

const skmeans = require("skmeans");

var data = [1, 12, 13, 4, 25, 21, 22, 3, 14, 5, 11, 2, 23, 24, 15];
var res = skmeans(data, 3);

Browser

<html>
<head>
  <script src="skmeans.js"></script>
</head>
<body>
  <script>
    var data = [1, 12, 13, 4, 25, 21, 22, 3, 14, 5, 11, 2, 23, 24, 15];
    var res = skmeans(data, 3);
    console.log(res);
  </script>
</body>
</html>

Results

{
  it: 2,
  k: 3,
  idxs: [2, 0, 0, 2, 1, 1, 1, 2, 0, 2, 0, 2, 1, 1, 0],
  centroids: [13, 23, 3]
}
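
To make the relationship between idxs and centroids concrete, here is a minimal sketch that groups the input values by their assigned centroid. It hard-codes the data and result shown above so that it runs without the library; groupByCluster is a hypothetical helper introduced for illustration and is not part of skmeans.

```javascript
// Data and result copied from the usage example; no call to skmeans is made here.
const data = [1, 12, 13, 4, 25, 21, 22, 3, 14, 5, 11, 2, 23, 24, 15];
const res = {
  it: 2,
  k: 3,
  idxs: [2, 0, 0, 2, 1, 1, 1, 2, 0, 2, 0, 2, 1, 1, 0],
  centroids: [13, 23, 3]
};

// Hypothetical helper (not part of skmeans): bucket values by centroid index.
function groupByCluster(values, result) {
  // One empty bucket per centroid.
  const clusters = result.centroids.map(() => []);
  // idxs[i] is the index of the centroid assigned to values[i].
  result.idxs.forEach((centroidIdx, i) => {
    clusters[centroidIdx].push(values[i]);
  });
  return clusters;
}

const clusters = groupByCluster(data, res);
console.log(clusters);
// [ [ 12, 13, 14, 11, 15 ], [ 25, 21, 22, 23, 24 ], [ 1, 4, 3, 5, 2 ] ]
```

Each bucket averages to its centroid (13, 23 and 3 respectively), which is a quick way to sanity-check a result.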

API

Calculates unidimensional and multidimensional k-means clustering on data. Parameters are:

data Unidimensional or multidimensional array of values to be clustered. For unidimensional data, it takes the form of a simple array [1,2,3,...,n]. For multidimensional data, it takes an NxM array [[1,2],[2,3],...,[n,m]].

k Number of clusters.

centroids Optional. Initial centroid values. If not provided, the algorithm will try to choose appropriate ones. Alternative values can be:

"kmrand" Cluster initialization will be random, but with extra checking, so that no two initial centroids are equal.

"kmpp" The algorithm will use the k-means++ cluster initialization method.

iterations Optional. Maximum number of iterations. If not provided, it will be set to 10000.

distance function Optional. Custom distance function. Takes two points as arguments and returns a scalar number.
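
As an illustration of what a custom distance function could look like, the sketch below implements Manhattan distance for both unidimensional and multidimensional points. The name manhattan is hypothetical, and the commented-out call assumes the parameters are passed positionally in the order listed above; check the library itself before relying on that.

```javascript
// Hypothetical custom distance function: Manhattan (L1) distance.
// Takes two points (numbers or equal-length arrays) and returns a scalar.
function manhattan(a, b) {
  // Treat a plain number as a one-dimensional point.
  const p = Array.isArray(a) ? a : [a];
  const q = Array.isArray(b) ? b : [b];
  let sum = 0;
  for (let i = 0; i < p.length; i++) {
    sum += Math.abs(p[i] - q[i]);
  }
  return sum;
}

console.log(manhattan(3, 7));           // 4
console.log(manhattan([1, 2], [4, 6])); // 7

// Assumed call shape (positional arguments in the documented order):
// const res = skmeans(data, 3, "kmpp", 100, manhattan);
```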

The function will return an object with the following data:

it The number of iterations performed until the algorithm converged

k The cluster size (number of clusters)

centroids The value of each centroid

idxs The index of the centroid assigned to each value of the data array

test Function to test new point membership
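
Testing membership of a new point amounts to finding its nearest centroid. The sketch below shows that idea for unidimensional data, using the centroids from the Results section. nearestCentroid and its return shape are assumptions made for illustration; this is not the library's test function, whose exact signature is not documented here.

```javascript
// Hypothetical helper (not the library's test function):
// find the centroid closest to a new unidimensional point.
function nearestCentroid(point, centroids) {
  let bestIdx = 0;
  let bestDist = Infinity;
  centroids.forEach((c, i) => {
    const d = Math.abs(point - c); // absolute difference as 1-D distance
    if (d < bestDist) {
      bestDist = d;
      bestIdx = i;
    }
  });
  return { idx: bestIdx, centroid: centroids[bestIdx] };
}

const centroids = [13, 23, 3]; // centroids from the Results section
console.log(nearestCentroid(6, centroids));  // { idx: 2, centroid: 3 }
console.log(nearestCentroid(19, centroids)); // { idx: 1, centroid: 23 }
```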

Examples