paddle-rec

大规模推荐模型训练工具

Showing:

Popularity

Downloads/wk

0

GitHub Stars

645

Maintenance

Last Commit

11d ago

Contributors

39

Package

Dependencies

2

License

Categories

Readme

(中文文档|简体中文|English)

什么是推荐系统?

  • 推荐系统是在互联网信息爆炸式增长的时代背景下,帮助用户高效获得感兴趣信息的关键;

  • 推荐系统也是帮助产品最大限度吸引用户、留存用户、增加用户粘性、提高用户转化率的银弹。

  • 有无数优秀的产品依靠用户可感知的推荐系统建立了良好的口碑,也有无数的公司依靠直击用户痛点的推荐系统在行业中占领了一席之地。

    可以说,谁能掌握和利用好推荐系统,谁就能在信息分发的激烈竞争中抢得先机。 但与此同时,有着许多问题困扰着推荐系统的开发者,比如:庞大的数据量,复杂的模型结构,低效的分布式训练环境,波动的在离线一致性,苛刻的上线部署要求,以上种种,不胜枚举。

什么是PaddleRec?

  • 源于飞桨生态的搜索推荐模型 一站式开箱即用工具
  • 适合初学者,开发者,研究者的推荐系统全流程解决方案
  • 包含内容理解、匹配、召回、排序、 多任务、重排序等多个任务的完整推荐搜索算法库

快速使用

环境要求

  • Python 2.7.15 / 3.5 / 3.6 / 3.7, 推荐使用python3.7,示例中的python默认表示python3.7
  • PaddlePaddle >=2.0
  • 操作系统: Windows/Mac/Linux

Windows下PaddleRec目前仅支持单机训练,分布式训练建议使用Linux环境

安装Paddle

  • gpu环境pip安装
    python -m pip install paddlepaddle-gpu==2.0.0 
    
  • cpu环境pip安装
    python -m pip install paddlepaddle # gcc8 
    
    更多版本下载可参考paddle官网下载安装

下载PaddleRec

git clone https://github.com/PaddlePaddle/PaddleRec/
cd PaddleRec

快速运行

我们以排序模型中的dnn模型为例介绍PaddleRec的一键启动。训练数据来源为Criteo数据集,我们从中截取了100条数据:

python -u tools/trainer.py -m models/rank/dnn/config.yaml # 动态图训练 
python -u tools/static_trainer.py -m models/rank/dnn/config.yaml #  静态图训练

帮助文档

项目背景

入门教程

进阶教程

FAQ

支持模型列表

方向模型单机CPU单机GPU分布式CPU分布式GPU支持版本论文
内容理解TextCnnx2.1.0[EMNLP 2014][Convolutional neural networks for sentence classication](https://www.aclweb.org/anthology/D14-1181.pdf)
内容理解TagSpacex2.1.0[EMNLP 2014][TagSpace: Semantic Embeddings from Hashtags](https://www.aclweb.org/anthology/D14-1194.pdf)
匹配DSSMx2.1.0[CIKM 2013][Learning Deep Structured Semantic Models for Web Search using Clickthrough Data](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/cikm2013_DSSM_fullversion.pdf)
匹配MultiView-Simnetx2.1.0[WWW 2015][A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/frp1159-songA.pdf)
召回TDM>=1.8.0>=1.8.01.8.5[KDD 2018][Learning Tree-based Deep Model for Recommender Systems](https://arxiv.org/pdf/1801.02294.pdf)
召回fasttextxx1.8.5[EACL 2017][Bag of Tricks for Efficient Text Classification](https://www.aclweb.org/anthology/E17-2068.pdf)
召回MINDxx2.1.0[2019][Multi-Interest Network with Dynamic Routing for Recommendation at Tmall](https://arxiv.org/pdf/1904.08030.pdf)
召回Word2Vecx2.1.0[NIPS 2013][Distributed Representations of Words and Phrases and their Compositionality](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf)
召回SSR1.8.5[SIGIR 2016][Multtti-Rate Deep Learning for Temporal Recommendation](http://sonyis.me/paperpdf/spr209-song_sigir16.pdf)
召回Gru4Rec1.8.5[2015][Session-based Recommendations with Recurrent Neural Networks](https://arxiv.org/abs/1511.06939)
召回Youtube_dnn1.8.5[RecSys 2016][Deep Neural Networks for YouTube Recommendations](https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/45530.pdf)
召回NCF2.1.0[WWW 2017][Neural Collaborative Filtering](https://arxiv.org/pdf/1708.05031.pdf)
召回GNN1.8.5[AAAI 2019][Session-based Recommendation with Graph Neural Networks](https://arxiv.org/abs/1811.00855)
召回RALM1.8.5[KDD 2019][Real-time Attention Based Look-alike Model for Recommender System](https://arxiv.org/pdf/1906.05022.pdf)
排序Logistic Regressionx2.1.0/
排序Dnn2.1.0/
排序FMx2.1.0[IEEE Data Mining 2010][Factorization machines](https://analyticsconsultores.com.mx/wp-content/uploads/2019/03/Factorization-Machines-Steffen-Rendle-Osaka-University-2010.pdf)
排序FFMx2.1.0[RECSYS 2016][Field-aware Factorization Machines for CTR Prediction](https://dl.acm.org/doi/pdf/10.1145/2959100.2959134)
排序FNNx1.8.5[ECIR 2016][Deep Learning over Multi-field Categorical Data](https://arxiv.org/pdf/1601.02376.pdf)
排序Deep Crossingx1.8.5[ACM 2016][Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features](https://www.kdd.org/kdd2016/papers/files/adf0975-shanA.pdf)
排序Pnnx1.8.5[ICDM 2016][Product-based Neural Networks for User Response Prediction](https://arxiv.org/pdf/1611.00144.pdf)
排序DCNx1.8.5[KDD 2017][Deep & Cross Network for Ad Click Predictions](https://dl.acm.org/doi/pdf/10.1145/3124749.3124754)
排序NFMx1.8.5[SIGIR 2017][Neural Factorization Machines for Sparse Predictive Analytics](https://dl.acm.org/doi/pdf/10.1145/3077136.3080777)
排序AFMx1.8.5[IJCAI 2017][Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks](https://arxiv.org/pdf/1708.04617.pdf)
排序DeepFMx2.1.0[IJCAI 2017][DeepFM: A Factorization-Machine based Neural Network for CTR Prediction](https://arxiv.org/pdf/1703.04247.pdf)
排序xDeepFMx2.1.0[KDD 2018][xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems](https://dl.acm.org/doi/pdf/10.1145/3219819.3220023)
排序DINx1.8.5[KDD 2018][Deep Interest Network for Click-Through Rate Prediction](https://dl.acm.org/doi/pdf/10.1145/3219819.3219823)
排序DIENx1.8.5[AAAI 2019][Deep Interest Evolution Network for Click-Through Rate Prediction](https://www.aaai.org/ojs/index.php/AAAI/article/view/4545/4423)
排序BSTx1.8.5[DLP_KDD 2019][Behavior Sequence Transformer for E-commerce Recommendation in Alibaba](https://arxiv.org/pdf/1905.06874v1.pdf)
排序AutoIntx1.8.5[CIKM 2019][AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks](https://arxiv.org/pdf/1810.11921.pdf)
排序Wide&Deepx2.1.0[DLRS 2016][Wide & Deep Learning for Recommender Systems](https://dl.acm.org/doi/pdf/10.1145/2988450.2988454)
排序FGCNN1.8.5[WWW 2019][Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction](https://arxiv.org/pdf/1904.04447.pdf)
排序Fibinet1.8.5[RecSys19][FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction]( https://arxiv.org/pdf/1905.09433.pdf)
排序Flen1.8.5[2019][FLEN: Leveraging Field for Scalable CTR Prediction]( https://arxiv.org/pdf/1911.04690.pdf)
多任务PLE2.1.0[RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236)
多任务ESMM2.1.0[SIGIR 2018][Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate](https://arxiv.org/abs/1804.07931)
多任务MMOE2.1.0[KDD 2018][Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts](https://dl.acm.org/doi/abs/10.1145/3219819.3220007)
多任务ShareBottom2.1.0[1998][Multitask learning](http://reports-archive.adm.cs.cmu.edu/anon/1997/CMU-CS-97-203.pdf)
重排序Listwisex1.8.5[2019][Sequential Evaluation and Generation Framework for Combinatorial Recommender System](https://arxiv.org/pdf/1902.00245.pdf)

社区


Release License Slack

版本历史

  • 2021.05.19 - PaddleRec v2.1.0
  • 2021.01.29 - PaddleRec v2.0.0
  • 2020.10.12 - PaddleRec v1.8.5
  • 2020.06.17 - PaddleRec v0.1.0
  • 2020.06.03 - PaddleRec v0.0.2
  • 2020.05.14 - PaddleRec v0.0.1

许可证书

本项目的发布受Apache 2.0 license许可认证。

联系我们

如有意见、建议及使用中的BUG,欢迎在GitHub Issue提交

亦可通过以下方式与我们沟通交流:

  • QQ群号码:861717190
  • 微信小助手微信号:wxid_0xksppzk5p7f22
  • 备注REC自动加群

     

PaddleRec交流QQ群               PaddleRec微信小助手

Rate & Review

Great Documentation0
Easy to Use0
Performant0
Highly Customizable0
Bleeding Edge0
Responsive Maintainers0
Poor Documentation0
Hard to Use0
Slow0
Buggy0
Abandoned0
Unwelcoming Community0
100