tf-recsys

Overview

tf-recsys contains collaborative filtering (CF) models based on the well-known SVD and SVD++ algorithms. Both are implemented in TensorFlow to take advantage of GPU acceleration.

Installation

pip install tfcf

To use a GPU, install the GPU build of TensorFlow first, that is, run

pip install tensorflow-gpu

or follow the instructions at Installing TensorFlow.

Algorithms

SVD

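The SVD algorithm performs matrix factorization using the following formula:

$$\hat{r}_{ui} = \mu + b_u + b_i + p_u^{\top} q_i$$

The left-hand side is the predicted rating of user $u$ for item $i$. Here $\mu$ is the global mean rating, $b_u$ and $b_i$ are the user and item biases, and $p_u$ and $q_i$ are the user and item latent factor vectors. The objective function is the sum of the L2 loss between the predicted and actual ratings plus the regularization terms. Gradient descent is used to update the parameters and minimize this objective.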
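The prediction rule and the gradient-descent update described above can be sketched in plain Python. This is only an illustration under stated assumptions: the names `predict` and `sgd_step`, the learning rate `lr`, and the regularization weight `reg` are hypothetical and are not part of the tfcf API, and the package itself performs the optimization in TensorFlow rather than with hand-written updates.

```python
def predict(mu, b_u, b_i, p_u, q_i):
    """Predicted rating: mu + b_u + b_i + <p_u, q_i>."""
    return mu + b_u + b_i + sum(p * q for p, q in zip(p_u, q_i))


def sgd_step(r, mu, b_u, b_i, p_u, q_i, lr=0.01, reg=0.02):
    """One gradient-descent step on a single observed rating r.

    Minimizes 0.5 * (r - r_hat)**2
              + 0.5 * reg * (b_u**2 + b_i**2 + |p_u|**2 + |q_i|**2).
    The global mean mu is treated as a constant (an assumption of this sketch).
    """
    err = r - predict(mu, b_u, b_i, p_u, q_i)
    new_b_u = b_u + lr * (err - reg * b_u)
    new_b_i = b_i + lr * (err - reg * b_i)
    # Both factor updates use the old values of p_u and q_i.
    new_p_u = [p + lr * (err * q - reg * p) for p, q in zip(p_u, q_i)]
    new_q_i = [q + lr * (err * p - reg * q) for p, q in zip(p_u, q_i)]
    return new_b_u, new_b_i, new_p_u, new_q_i


# Toy example: one user, one item, two latent factors, true rating 5.0.
mu, b_u, b_i = 3.5, 0.0, 0.0
p_u, q_i = [0.1, -0.2], [0.3, 0.1]
error_before = abs(5.0 - predict(mu, b_u, b_i, p_u, q_i))
for _ in range(100):
    b_u, b_i, p_u, q_i = sgd_step(5.0, mu, b_u, b_i, p_u, q_i)
error_after = abs(5.0 - predict(mu, b_u, b_i, p_u, q_i))
print(error_before, error_after)
```

Repeating the step on the same rating shrinks the absolute error, while the regularization term keeps the parameters from fitting that single rating exactly.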

SVD++

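Building on SVD, the original SVD++ algorithm also incorporates the implicit feedback of users:

$$\hat{r}_{ui} = \mu + b_u + b_i + \Big(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\Big)^{\top} q_i$$

where $N(u)$ is the set of implicit feedback of user $u$ and $y_j$ is the implicit factor vector of item $j$.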
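As a rough illustration of the user-side implicit term, the sketch below writes N(u) for the set of items a user has interacted with and y_j for a per-item implicit factor vector, following the usual SVD++ notation. The function names and the dictionary-based data layout are assumptions made for this example and do not reflect how tfcf stores its data.

```python
import math
from collections import defaultdict


def build_user_feedback(pairs):
    """Map each user to the set of items it interacted with, i.e. N(u)."""
    feedback = defaultdict(set)
    for user, item in pairs:
        feedback[user].add(item)
    return feedback


def svdpp_user_vector(p_u, items, y):
    """Return p_u + |N(u)|^(-1/2) * sum of y[j] over j in N(u)."""
    if not items:
        return list(p_u)  # no implicit feedback: fall back to plain p_u
    scale = 1.0 / math.sqrt(len(items))
    return [p + scale * sum(y[j][f] for j in items) for f, p in enumerate(p_u)]


def predict_svdpp(mu, b_u, b_i, p_u, q_i, items, y):
    """SVD++ prediction: biases plus <implicit-augmented user vector, q_i>."""
    user_vec = svdpp_user_vector(p_u, items, y)
    return mu + b_u + b_i + sum(a * b for a, b in zip(user_vec, q_i))


# Toy data: user 0 interacted with items 0..3, user 1 with item 2 only.
pairs = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 2)]
feedback = build_user_feedback(pairs)
y = {j: [0.2, 0.4] for j in range(4)}  # hypothetical implicit item factors
rating = predict_svdpp(3.0, 0.1, -0.1, [1.0, 1.0], [1.0, 0.5], feedback[0], y)
print(rating)
```

With an empty feedback set the user vector is simply `p_u`, so the prediction reduces to the plain SVD formula.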

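This package also provides a dual option for SVD++, which additionally incorporates the implicit feedback of items. The equation can be rewritten as follows:

$$\hat{r}_{ui} = \mu + b_u + b_i + \Big(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\Big)^{\top} \Big(q_i + |H(i)|^{-1/2} \sum_{k \in H(i)} g_k\Big)$$

where $N(u)$ and $H(i)$ denote the sets of implicit feedback of user $u$ and of item $i$, and $y_j$ and $g_k$ are the corresponding implicit factor vectors.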
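The dual variant can be sketched by applying the same implicit-feedback adjustment to the item vector as to the user vector. This assumes the item side mirrors the user side exactly. The helper `add_implicit`, the symbols `n_u` and `h_i` for the two feedback sets, and `y` and `g` for the two families of implicit factors are all hypothetical names chosen for the example, not tfcf identifiers.

```python
import math


def add_implicit(base, neighbours, factors):
    """Return base + |S|^(-1/2) * sum(factors[s] for s in S), with S = neighbours."""
    if not neighbours:
        return list(base)
    scale = 1.0 / math.sqrt(len(neighbours))
    return [b + scale * sum(factors[s][f] for s in neighbours)
            for f, b in enumerate(base)]


def predict_dual_svdpp(mu, b_u, b_i, p_u, q_i, n_u, y, h_i, g):
    """Dual SVD++ sketch: both the user and the item vector get an implicit term."""
    user_vec = add_implicit(p_u, n_u, y)  # user side, as in SVD++
    item_vec = add_implicit(q_i, h_i, g)  # item side, the dual term
    return mu + b_u + b_i + sum(a * b for a, b in zip(user_vec, item_vec))


y = {j: [0.2, 0.4] for j in range(4)}  # hypothetical per-item implicit factors
g = {v: [0.1, 0.1] for v in range(4)}  # hypothetical per-user implicit factors
rating = predict_dual_svdpp(3.0, 0.0, 0.0, [1.0, 1.0], [1.0, 0.5],
                            {0, 1, 2, 3}, y, {0, 1, 2, 3}, g)
print(rating)
```

Passing an empty `h_i` recovers the original SVD++ prediction, and passing both sets empty recovers plain SVD, which makes the three models easy to compare on the same inputs.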

In our experiments, dual SVD++ outperforms both the original SVD++ and SVD, at the cost of slower training.

Example

import numpy as np
import tensorflow as tf
from tfcf.metrics import mae
from tfcf.metrics import rmse
from tfcf.datasets import ml1m
from tfcf.config import Config
from tfcf.models.svd import SVD
from tfcf.models.svd import SVDPP
from sklearn.model_selection import train_test_split

# Note that x is a 2D numpy array, 
# x[i, :] contains the user-item pair, and y[i] is the corresponding rating.
x, y = ml1m.load_data()

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0)

config = Config()
config.num_users = np.max(x[:, 0]) + 1
config.num_items = np.max(x[:, 1]) + 1
config.min_value = np.min(y)
config.max_value = np.max(y)

# tf.Session is part of the TensorFlow 1.x API.
with tf.Session() as sess:
    # For the SVD++ algorithm, if `dual` is True, the dual term for the items'
    # implicit feedback is added to the original SVD++ algorithm.
    # model = SVDPP(config, sess, dual=False)
    # model = SVDPP(config, sess, dual=True)
    model = SVD(config, sess)
    model.train(x_train, y_train, validation_data=(
        x_test, y_test), epochs=20, batch_size=1024)

    y_pred = model.predict(x_test)
    print('rmse: {}, mae: {}'.format(rmse(y_test, y_pred), mae(y_test, y_pred)))

    # Save model
    model.save_model('model/')

    # Load model
    # model = model.load_model('model/')
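For readers unfamiliar with the two metrics printed in the example, here is a minimal plain-Python sketch of what RMSE and MAE compute. The functions `rmse_ref` and `mae_ref` are hypothetical stand-ins written for illustration. They are not the implementations in `tfcf.metrics`, which may differ in details such as input types.

```python
import math


def rmse_ref(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae_ref(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


y_true = [4.0, 3.0, 5.0]
y_pred = [3.0, 3.0, 3.0]
print(rmse_ref(y_true, y_pred), mae_ref(y_true, y_pred))
```

RMSE penalizes large errors more heavily than MAE, which is why the two numbers can rank models differently.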

Performance

The experiments use MovieLens 100K and MovieLens 1M. The reported results are averages over 5-fold cross-validation with random seed 0. All models use the default configuration. The batch size is 128 for MovieLens 100K and 1024 for MovieLens 1M, a much larger dataset. With GPU acceleration, both SVD and SVD++ train significantly faster than Surprise, a CPython-based implementation. The following is the performance on a GTX 1080:

MovieLens 100K

| Model | RMSE | MAE | Time (sec/epoch) |
| --- | --- | --- | --- |
| SVD | 0.91572 | 0.71964 | < 1 |
| SVD++ | 0.90484 | 0.70982 | 4 |
| Dual SVD++ | 0.89334 | 0.70020 | 7 |

MovieLens 1M

| Model | RMSE | MAE | Time (sec/epoch) |
| --- | --- | --- | --- |
| SVD | 0.85524 | 0.66922 | 4 |
| SVD++ | 0.84846 | 0.66306 | 40 |
| Dual SVD++ | 0.83672 | 0.65256 | 50 |
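The evaluation protocol used for these tables (5-fold cross-validation with a fixed seed, scores averaged over folds) can be sketched as follows. This is not the script that produced the numbers: `kfold_indices` and `cross_validate` are hypothetical helpers, and seeding Python's `random` module with 0 will not reproduce the splits used in the experiments.

```python
import random


def kfold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) index lists for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    for fold in range(n_folds):
        test_idx = indices[fold::n_folds]  # every n_folds-th shuffled index
        test_set = set(test_idx)
        train_idx = [i for i in indices if i not in test_set]
        yield train_idx, test_idx


def cross_validate(n_samples, evaluate, n_folds=5, seed=0):
    """Average evaluate(train_idx, test_idx) over all folds."""
    scores = [evaluate(train_idx, test_idx)
              for train_idx, test_idx in kfold_indices(n_samples, n_folds, seed)]
    return sum(scores) / len(scores)


# Placeholder evaluation: a real callback would train a model on train_idx
# and return its RMSE or MAE on test_idx.
mean_test_size = cross_validate(10, lambda train_idx, test_idx: len(test_idx))
print(mean_test_size)
```

In practice the `evaluate` callback would index `x` and `y` with the two index lists, train a fresh model on the training part, and return the metric on the held-out part.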

Similar experiments are reported by MyMediaLite, Surprise, and LibRec.

References

TensorFlow

MyMediaLite

Surprise

LibRec

Also see my ML2017 repo, which contains a Keras implementation of SVD and SVD++ in hw6.

Contact

Issues and pull requests are welcome. Feel free to contact me if you run into any problems.
