Python Recommender System Library: Surprise (Theory)

Surprise

Surprise is one of the scikits (SciPy toolkits) and is installed as scikit-surprise. Its User Guide explains the library in detail.

Supports multiple recommendation algorithms

Baseline algorithms

Neighborhood methods (collaborative filtering)

Matrix factorization-based methods (SVD, PMF, SVD++, NMF)

Several of these algorithms are introduced below.

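Basic algorithms:

　　1. random_pred.NormalPredictor

　　Description: Algorithm predicting a random rating based on the distribution of the training set, which is assumed to be normal.

　　That is, the mean and standard deviation of the training ratings are estimated, and each prediction is a random draw from the resulting normal distribution.

　　2. baseline_only.BaselineOnly

　　Description: Algorithm predicting the baseline estimate for given user and item.

　　That is, the prediction is the global mean rating adjusted by a per-user bias and a per-item bias.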
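
The two basic algorithms above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Surprise's own code: the toy ratings, the function names (`normal_predictor`, `fit_baselines`, `baseline_estimate`) and the regularization and epoch values are all chosen here for illustration. The baseline is taken to be b_ui = mu + b_u + b_i, with the biases fitted by alternating regularized averages.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev

# Toy (user, item, rating) triples; the values are made up for illustration.
ratings = [
    ("alice", "A", 5.0), ("alice", "B", 3.0),
    ("bob", "A", 4.0), ("bob", "C", 2.0),
    ("carol", "B", 1.0), ("carol", "C", 2.0),
]


def normal_predictor(ratings, rng):
    """Draw one rating from N(mu, sigma^2) fitted to the training ratings."""
    values = [r for _, _, r in ratings]
    mu, sigma = mean(values), pstdev(values)  # maximum-likelihood estimates
    return rng.gauss(mu, sigma)


def fit_baselines(ratings, n_epochs=10, reg_u=15.0, reg_i=10.0):
    """Estimate mu, b_u and b_i by alternating regularized averages."""
    mu = mean(r for _, _, r in ratings)
    b_u, b_i = defaultdict(float), defaultdict(float)
    by_user, by_item = defaultdict(list), defaultdict(list)
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(n_epochs):
        # Item biases given the current user biases, then the reverse.
        for i, rows in by_item.items():
            b_i[i] = sum(r - mu - b_u[u] for u, r in rows) / (reg_i + len(rows))
        for u, rows in by_user.items():
            b_u[u] = sum(r - mu - b_i[i] for i, r in rows) / (reg_u + len(rows))
    return mu, b_u, b_i


def baseline_estimate(mu, b_u, b_i, user, item):
    """b_ui = mu + b_u + b_i; unknown users or items contribute a zero bias."""
    return mu + b_u.get(user, 0.0) + b_i.get(item, 0.0)


mu, b_u, b_i = fit_baselines(ratings)
print("random guess:", normal_predictor(ratings, random.Random(0)))
print("baseline for (alice, C):", baseline_estimate(mu, b_u, b_i, "alice", "C"))
```

The random predictor ignores the user and the item entirely, which is why it is mainly useful as a lower bound when comparing other algorithms.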

Collaborative filtering algorithms:

　　3. knns.KNNBasic

　　Description: A basic collaborative filtering algorithm.

　　That is, the prediction is a similarity-weighted average of the ratings given by the nearest neighbors.

　　4. knns.KNNWithMeans

　　Description: A basic collaborative filtering algorithm, taking into account the mean ratings of each user.

　　5. knns.KNNBaseline

　　Description: A basic collaborative filtering algorithm taking into account a baseline rating.
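
The difference between the plain and the mean-centered neighborhood predictions can be sketched as follows. This is an illustrative sketch rather than Surprise's implementation: the toy ratings, the precomputed similarity table `sims` and the helper names are assumptions made here. The plain version averages the neighbors' ratings weighted by similarity; the mean-centered version averages the neighbors' deviations from their own means and adds the result to the target user's mean.

```python
# Toy data: ratings[user][item] = rating. Values are made up for illustration.
ratings = {
    "alice": {"A": 5.0, "B": 3.0},
    "bob": {"A": 4.0, "B": 2.0, "C": 4.0},
    "carol": {"A": 1.0, "B": 5.0, "C": 2.0},
}

# Hypothetical precomputed user-user similarities; any similarity measure
# could supply these numbers.
sims = {("alice", "bob"): 0.9, ("alice", "carol"): 0.1}


def neighbors(user, item, k):
    """Return up to k (similarity, neighbor) pairs for users who rated item."""
    candidates = [
        (s, v) for (u, v), s in sims.items()
        if u == user and item in ratings[v] and s > 0
    ]
    return sorted(candidates, reverse=True)[:k]


def user_mean(user):
    return sum(ratings[user].values()) / len(ratings[user])


def knn_basic(user, item, k=2):
    """Similarity-weighted average of the neighbors' ratings."""
    nbrs = neighbors(user, item, k)
    if not nbrs:
        return None  # this sketch simply gives up without neighbors
    return sum(s * ratings[v][item] for s, v in nbrs) / sum(s for s, _ in nbrs)


def knn_with_means(user, item, k=2):
    """User mean plus the weighted average of neighbors' mean-centered ratings."""
    nbrs = neighbors(user, item, k)
    if not nbrs:
        return user_mean(user)
    dev = sum(s * (ratings[v][item] - user_mean(v)) for s, v in nbrs)
    return user_mean(user) + dev / sum(s for s, _ in nbrs)


print(knn_basic("alice", "C"), knn_with_means("alice", "C"))
```

Mean-centering compensates for users who rate systematically high or low; the baseline variant follows the same pattern with baseline estimates in place of the user means.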

Matrix factorization methods:

　　6. matrix_factorization.SVD

　　Description: The famous SVD algorithm, as popularized by Simon Funk during the Netflix Prize.

　　7. matrix_factorization.SVDpp

　　Description: The SVD++ algorithm, an extension of SVD taking into account implicit ratings.

　　Here an implicit rating means the fact that a user rated an item at all, regardless of the rating value.

　　8. matrix_factorization.NMF

　　Description: A collaborative filtering algorithm based on Non-negative Matrix Factorization.

Other collaborative filtering algorithms:

　　9. slope_one.SlopeOne

　　Description: A simple yet accurate collaborative filtering algorithm.

　　10. co_clustering.CoClustering

　　Description: A collaborative filtering algorithm based on co-clustering.
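
To make the SVD entry above more concrete, here is a minimal sketch of a Funk-style factorization trained by stochastic gradient descent. It assumes the usual model, prediction = mu + b_u + b_i + q_i . p_u, with a regularized squared-error loss. The toy ratings, the function names and the hyperparameter values are illustrative choices made here, not Surprise's defaults or its code.

```python
import math
import random

# Toy (user, item, rating) triples; the values are made up for illustration.
ratings = [
    ("alice", "A", 5.0), ("alice", "B", 3.0), ("alice", "C", 4.0),
    ("bob", "A", 4.0), ("bob", "B", 2.0),
    ("carol", "B", 5.0), ("carol", "C", 1.0),
]


def train_svd(ratings, n_factors=2, n_epochs=200, lr=0.02, reg=0.02, seed=0):
    """Fit biases and latent factors with SGD on regularized squared error."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    b_u = {u: 0.0 for u in users}
    b_i = {i: 0.0 for i in items}
    # Small random initial factors, one vector per user and per item.
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for u, i, r in ratings:
            dot = sum(pf * qf for pf, qf in zip(p[u], q[i]))
            err = r - (mu + b_u[u] + b_i[i] + dot)
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(n_factors):
                pf, qf = p[u][f], q[i][f]
                p[u][f] += lr * (err * qf - reg * pf)
                q[i][f] += lr * (err * pf - reg * qf)
    return mu, b_u, b_i, p, q


def predict(model, user, item):
    """Unknown users or items fall back to zero bias and zero factors."""
    mu, b_u, b_i, p, q = model
    dot = sum(pf * qf for pf, qf in zip(p.get(user, []), q.get(item, [])))
    return mu + b_u.get(user, 0.0) + b_i.get(item, 0.0) + dot


def train_rmse(model, ratings):
    sq = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(sq / len(ratings))


untrained = train_svd(ratings, n_epochs=0)
trained = train_svd(ratings)
print("train RMSE before/after:", train_rmse(untrained, ratings), train_rmse(trained, ratings))
```

SVD++ and NMF change this model (extra implicit-feedback factors, or non-negativity constraints on the factors) and are not covered by this sketch.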

The neighborhood-based methods (collaborative filtering) can be configured with different similarity measures.

Similarity measures

　　1. cosine

　　Description: Compute the cosine similarity between all pairs of users (or items).

　　2. msd

　　Description: Compute the Mean Squared Difference similarity between all pairs of users (or items).

　　3. pearson

　　Description: Compute the Pearson correlation coefficient between all pairs of users (or items).

　　4. pearson_baseline

　　Description: Compute the (shrunk) Pearson correlation coefficient between all pairs of users (or items) using baselines for centering instead of means.
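
The first three measures can be sketched for a single pair of users as follows. This is an illustration under assumptions made here, not Surprise's code: each user is a dict mapping items to ratings, only co-rated items are used, the Pearson means are taken over the co-rated items, and the MSD value is turned into a similarity as 1 / (msd + 1). The shrunk `pearson_baseline` variant is left out because it additionally needs baseline estimates and a shrinkage parameter.

```python
import math


def common(ru, rv):
    """Pairs of ratings for the items both users have rated."""
    return [(ru[i], rv[i]) for i in ru if i in rv]


def cosine(ru, rv):
    pairs = common(ru, rv)
    num = sum(a * b for a, b in pairs)
    den = math.sqrt(sum(a * a for a, _ in pairs)) * math.sqrt(sum(b * b for _, b in pairs))
    return num / den if den else 0.0


def msd(ru, rv):
    """Similarity derived from the mean squared difference: 1 / (msd + 1)."""
    pairs = common(ru, rv)
    if not pairs:
        return 0.0
    diff = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
    return 1.0 / (diff + 1.0)


def pearson(ru, rv):
    """Pearson correlation, centered on each user's mean over co-rated items."""
    pairs = common(ru, rv)
    if not pairs:
        return 0.0
    ma = sum(a for a, _ in pairs) / len(pairs)
    mb = sum(b for _, b in pairs) / len(pairs)
    num = sum((a - ma) * (b - mb) for a, b in pairs)
    den = math.sqrt(sum((a - ma) ** 2 for a, _ in pairs)) * math.sqrt(sum((b - mb) ** 2 for _, b in pairs))
    return num / den if den else 0.0


# Toy users; the values are made up for illustration.
u = {"A": 4.0, "B": 2.0, "C": 5.0}
v = {"A": 2.0, "B": 1.0}
w = {"A": 1.0, "B": 3.0}
print(cosine(u, v), msd(u, v), pearson(u, v), pearson(u, w))
```

In Surprise itself the measure is selected through the `sim_options` argument of the KNN algorithms, for example `sim_options={'name': 'msd', 'user_based': True}`.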

Supports different evaluation metrics

Evaluation metrics

　　1. rmse: Root Mean Squared Error

　　2. mae: Mean Absolute Error

　　3. fcp: Fraction of Concordant Pairs
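
The three metrics can be computed by hand as in the sketch below. It is illustrative only: predictions are assumed to be `(user, true_rating, estimated_rating)` tuples, which is not Surprise's prediction format, and the FCP version pools the concordant and discordant pair counts over all users, which may differ in detail from how the library aggregates them.

```python
import math
from collections import defaultdict

# (user, true_rating, estimated_rating); the values are made up for illustration.
predictions = [
    ("u", 5.0, 4.5),
    ("u", 3.0, 3.5),
    ("u", 1.0, 4.0),
]


def rmse(preds):
    """Root mean squared error between true and estimated ratings."""
    return math.sqrt(sum((r - est) ** 2 for _, r, est in preds) / len(preds))


def mae(preds):
    """Mean absolute error between true and estimated ratings."""
    return sum(abs(r - est) for _, r, est in preds) / len(preds)


def fcp(preds):
    """Fraction of same-user item pairs whose true order is kept by the estimates."""
    by_user = defaultdict(list)
    for user, r, est in preds:
        by_user[user].append((r, est))
    concordant = discordant = 0
    for rows in by_user.values():
        for r_i, est_i in rows:
            for r_j, est_j in rows:
                if r_i > r_j:  # item i is truly preferred over item j
                    if est_i > est_j:
                        concordant += 1
                    else:
                        discordant += 1
    total = concordant + discordant
    return concordant / total if total else float("nan")


print(rmse(predictions), mae(predictions), fcp(predictions))
```

RMSE and MAE measure how far the estimates are from the true ratings (lower is better), while FCP only looks at whether pairs of items are ranked in the right order for each user (higher is better).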

Reference article: https://blog.csdn.net/mycafe_/article/details/79146764

Original article: https://www.cnblogs.com/gezhuangzhuang/p/10206359.html
