tSNE for TensorFlow.js

Date: 2021-02-19

This library contains an improved tSNE implementation that runs in the browser.

Installation & Usage

You can use tfjs-tsne via a script tag or via NPM.

Script tag

To use tfjs-tsne via a script tag, you need to load tfjs first. The following tags can be put into the head section of your HTML page to load the library:

    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@0.14.1"></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-tsne"></script>

This library will create a tsne variable on the global scope. You can then do the following:

    // 2000 random data points with 10 dimensions each
    const data = tf.randomUniform([2000, 10]);
    const tsneOpt = tsne.tsne(data);

    tsneOpt.compute().then(() => {
      const coordinates = tsneOpt.coordinates();
      coordinates.print();
    });

Via NPM

    yarn add @tensorflow/tfjs-tsne

or

    npm install @tensorflow/tfjs-tsne

Then:

    // tfjs itself must also be imported, since the example uses tf
    import * as tf from '@tensorflow/tfjs';
    import * as tsne from '@tensorflow/tfjs-tsne';

    const data = tf.randomUniform([2000, 10]);
    const tsneOpt = tsne.tsne(data);

    tsneOpt.compute().then(() => {
      const coordinates = tsneOpt.coordinates();
      coordinates.print();
    });

API

tsne.tsne(data, config)

Creates and returns a TSNE optimizer.

- data must be a Rank 2 tensor. Shape is [numPoints, dataPointDimensions].
- config is an optional object with the following params (all are optional):
  - perplexity: number, defaults to 18. Max value is defined by hardware limitations.
  - verbose: boolean, defaults to false
  - exaggeration: number, defaults to 4
  - exaggerationIter: number, defaults to 300
  - exaggerationDecayIter: number, defaults to 200
  - momentum: number, defaults to 0.8
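To make the defaults above concrete, here is a minimal sketch of how a partial config could be merged with them, and how the three exaggeration parameters might interact. The helpers `withDefaults` and `exaggerationAt` are hypothetical and not part of tfjs-tsne. The schedule assumes the exaggeration factor is held for `exaggerationIter` iterations and then decays linearly to 1 over `exaggerationDecayIter` iterations; this README does not specify the decay curve, so treat that part as an illustration only.

```javascript
// Default values as listed in the parameter list above.
const TSNE_DEFAULTS = {
  perplexity: 18,
  verbose: false,
  exaggeration: 4,
  exaggerationIter: 300,
  exaggerationDecayIter: 200,
  momentum: 0.8,
};

// Hypothetical helper (not part of tfjs-tsne): fill in unspecified options.
function withDefaults(config = {}) {
  return { ...TSNE_DEFAULTS, ...config };
}

// Hypothetical illustration of an exaggeration schedule.
// ASSUMPTION: the factor is constant for `exaggerationIter` iterations and
// then decays linearly to 1 over `exaggerationDecayIter` iterations.
function exaggerationAt(iteration, config = {}) {
  const { exaggeration, exaggerationIter, exaggerationDecayIter } =
    withDefaults(config);
  if (iteration < exaggerationIter) return exaggeration;
  if (iteration >= exaggerationIter + exaggerationDecayIter) return 1;
  const t = (iteration - exaggerationIter) / exaggerationDecayIter;
  return exaggeration - t * (exaggeration - 1);
}

// Override one option and keep the remaining defaults.
const config = withDefaults({ perplexity: 30 });
console.log(config);
console.log(exaggerationAt(0), exaggerationAt(400), exaggerationAt(500));
```

An object like `config` would then be passed along with the data when creating the optimizer.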

.compute(iterations: number): Promise

The most direct way to get a tsne projection. Automatically runs the knn preprocessing and the tsne optimization. Returns a promise to indicate when it is done.

iterations: the number of iterations to run the tsne optimization for. (The number of knn steps is calculated automatically.)

.iterateKnn(iterations: number): Promise

When running tsne iteratively (see the section below), this runs the knn preprocessing for the specified number of iterations.

.iterate(iterations: number): Promise

When running tsne iteratively (see the section below), this runs the tsne step for the specified number of iterations.

.coordinates(normalized: boolean): tf.Tensor

Gets the current x, y coordinates of the projected data as a tensor. By default the coordinates are normalized to the range 0-1.

.coordsArray(normalized: boolean): Promise<number[][]>

Gets the current x, y coordinates of the projected data as a JavaScript array. By default the coordinates are normalized to the range 0-1. This function is async and returns a promise.
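As an illustration of what "normalized to the range 0-1" can mean for an array of [x, y] pairs, here is a minimal sketch. `normalizeCoords` is a hypothetical helper, not a function of this library, and it assumes independent min-max scaling per axis; the library's own normalization may differ (for example, by preserving the aspect ratio).

```javascript
// Hypothetical helper (not part of tfjs-tsne): rescale [x, y] pairs so that
// each axis spans the range 0-1.
// ASSUMPTION: per-axis min-max scaling.
function normalizeCoords(points) {
  let minX = Infinity, maxX = -Infinity;
  let minY = Infinity, maxY = -Infinity;
  for (const [x, y] of points) {
    minX = Math.min(minX, x);
    maxX = Math.max(maxX, x);
    minY = Math.min(minY, y);
    maxY = Math.max(maxY, y);
  }
  // Avoid division by zero when all points share the same x or y value.
  const spanX = maxX - minX || 1;
  const spanY = maxY - minY || 1;
  return points.map(([x, y]) => [(x - minX) / spanX, (y - minY) / spanY]);
}

console.log(normalizeCoords([[-2, 10], [0, 20], [2, 30]]));
```

Normalized coordinates are convenient for drawing, since they can be multiplied directly by a canvas width and height.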

Computing tSNE iteratively

While the .compute method provides the most direct way to get an embedding, you can also compute the embedding iteratively and have more control over the process.

The first step is computing the KNN graph using iterateKnn.

Then you can compute the tSNE iteratively and examine the result as it evolves.

The code below shows what that would look like:

    const data = tf.randomUniform([2000, 10]);
    const tsneOpt = tsne.tsne(data);

    async function iterativeTsne() {
      // The knn preprocessing must complete before the tsne optimization starts.
      const knnIterations = tsneOpt.knnIterations();
      for (let i = 0; i < knnIterations; ++i) {
        await tsneOpt.iterateKnn();
      }

      // Run the optimization one step at a time and inspect the embedding.
      const tsneIterations = 1000;
      for (let i = 0; i < tsneIterations; ++i) {
        await tsneOpt.iterate();
        const coordinates = tsneOpt.coordinates();
        coordinates.print();
      }
    }

    iterativeTsne();
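Reading the coordinates after every single step can be more than a UI needs. The following sketch runs the optimization in batches and reports progress through a callback. `runIterative`, `batchSize` and `onBatch` are hypothetical names introduced for illustration; the function only relies on the optimizer methods named in this README (knnIterations, iterateKnn, iterate, coordinates) and takes the optimizer as an argument.

```javascript
// Hypothetical helper (not part of tfjs-tsne): drives any optimizer object
// that exposes knnIterations(), iterateKnn(), iterate(n) and coordinates().
async function runIterative(
  optimizer,
  { tsneIterations = 1000, batchSize = 10, onBatch = () => {} } = {}
) {
  // Phase 1: the knn preprocessing has to finish before tsne can start.
  const knnIterations = optimizer.knnIterations();
  for (let i = 0; i < knnIterations; ++i) {
    await optimizer.iterateKnn();
  }

  // Phase 2: run the tsne optimization in batches and report after each one.
  let done = 0;
  while (done < tsneIterations) {
    const step = Math.min(batchSize, tsneIterations - done);
    await optimizer.iterate(step);
    done += step;
    onBatch(done, optimizer.coordinates());
  }
  return done;
}

// Possible usage once the library is loaded (not executed here):
//   runIterative(tsne.tsne(data), {
//     tsneIterations: 500,
//     batchSize: 25,
//     onBatch: (done, coords) => console.log(`iteration ${done}`),
//   });
```

Larger batches mean fewer redraws and less overhead between steps, at the cost of a less smooth animation of the evolving embedding.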

Example

An example that uses this library to perform tSNE on the MNIST dataset is also available.

Limitations

This library requires WebGL 2 support and thus will not work on certain devices, especially mobile devices. It currently works best on desktop devices.
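Because of this requirement, it can be useful to check for WebGL 2 before creating an optimizer. The sketch below uses the standard canvas `getContext('webgl2')` call; `hasWebGL2` is a hypothetical helper and not part of this library. A successful check only shows that a context can be created, not that the device is fast enough.

```javascript
// Hypothetical helper (not part of tfjs-tsne): returns true when the given
// document can create a WebGL 2 rendering context.
function hasWebGL2(doc = globalThis.document) {
  // Outside the browser (e.g. in Node.js) there is no document at all.
  if (!doc || typeof doc.createElement !== 'function') return false;
  try {
    const canvas = doc.createElement('canvas');
    // getContext returns null when the requested context type is unsupported.
    return Boolean(canvas.getContext && canvas.getContext('webgl2'));
  } catch (err) {
    return false;
  }
}

if (!hasWebGL2()) {
  console.warn('WebGL 2 is not available; tfjs-tsne will likely not run here.');
}
```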

Based on our current experiments, we suggest limiting the data passed to this implementation to a shape of [10000, 100], i.e. up to 10000 points with 100 dimensions each. Larger inputs will work but may run noticeably slower.

Above a certain number of data points, the computation of the similarities becomes a bottleneck. We plan to address this problem in the future.

Implementation

This work uses linear tSNE optimization to optimize the embedding and an optimized brute-force computation of the kNN graph on the GPU.
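For readers unfamiliar with the term, a brute-force kNN graph compares every point with every other point and keeps the k closest ones. The sketch below shows the idea on the CPU in plain JavaScript. It is an illustration only and not the library's GPU implementation; `squaredDistance` and `bruteForceKnn` are hypothetical names.

```javascript
// Squared Euclidean distance between two points of equal dimension.
function squaredDistance(a, b) {
  let sum = 0;
  for (let d = 0; d < a.length; ++d) {
    const diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

// Hypothetical CPU illustration of a brute-force kNN graph: for each point,
// return the indices of its k nearest neighbours, closest first.
// The cost grows with numPoints^2 * dimensions.
function bruteForceKnn(points, k) {
  return points.map((p, i) => {
    const candidates = [];
    for (let j = 0; j < points.length; ++j) {
      if (j !== i) {
        candidates.push({ index: j, dist: squaredDistance(p, points[j]) });
      }
    }
    candidates.sort((a, b) => a.dist - b.dist);
    return candidates.slice(0, k).map((c) => c.index);
  });
}

console.log(bruteForceKnn([[0], [1], [3], [10]], 2));
```

The quadratic cost of this approach is one reason why the number of data points matters as much as it does for performance.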

Reference

