  • The Wisdom of the Few

    Zheng Yun (郑昀) @ 玩聚SR, 2009-11-05

    1. Cold start

    Greg Linden commented on a recent paper, "The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web" (PDF), as follows:

    What they do say is that using a very small pool of experts works surprisingly well.

    The paper's point: a very small pool of experts produces surprisingly good recommendations.

    In particular, I think it suggests a good alternative to content-based methods for bootstrapping a recommender system.

    In Linden's view, this points to a good alternative way of bootstrapping a recommender system.

    If you can create a high quality pool of experts, even a fairly small one, you may have good results starting with that while you work to gather ratings from the broader community.

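    That is, pick a high-quality expert pool, whether a team you assemble yourself or a group of experts you select. Even if the pool is quite small, your recommender system gets off to a very good start. At that stage, the wisdom of the few can solve the recommender system's cold-start problem. This is also why 玩聚SR started from an Experts Pool and worked well as an information filter from day one.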
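    To make the cold-start idea concrete, here is a minimal sketch, not the paper's algorithm and not 玩聚SR's implementation. It assumes a brand-new user with no ratings at all, so no similarity can be computed yet, and simply ranks items by the mean rating of a small expert pool. The expert names, item names, ratings, and the `min_votes` cutoff are all hypothetical.

```python
from statistics import mean

# Hypothetical expert pool: expert -> {item: rating on a 1-5 scale}.
EXPERT_RATINGS = {
    "critic_a": {"film_1": 5, "film_2": 2, "film_3": 4},
    "critic_b": {"film_1": 4, "film_3": 5},
    "critic_c": {"film_2": 3, "film_3": 4, "film_4": 1},
}


def expert_consensus(expert_ratings, min_votes=2):
    """Return {item: mean expert rating} for items rated by >= min_votes experts."""
    votes = {}
    for ratings in expert_ratings.values():
        for item, score in ratings.items():
            votes.setdefault(item, []).append(score)
    return {
        item: mean(scores)
        for item, scores in votes.items()
        if len(scores) >= min_votes
    }


def cold_start_recommend(expert_ratings, top_n=2, min_votes=2):
    """Rank items for a user with no ratings, using expert consensus only."""
    consensus = expert_consensus(expert_ratings, min_votes)
    # Highest mean first; ties broken by item name for a stable order.
    ranked = sorted(consensus.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


if __name__ == "__main__":
    print(cold_start_recommend(EXPERT_RATINGS))
```

    Once the user has rated a few items, this unpersonalized ranking could be replaced by a similarity-weighted one, while ratings from the broader community are still being gathered.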

    2. Summary of the paper:

    To make it easier to follow, here is a loose paraphrase of the paper:

    Nearest-neighbor collaborative filtering is a very effective recommendation method, but it is persistently troubled by the following problems:

    data sparsity and noise; the cold-start problem; scalability.

    The authors therefore propose a new method, a variant of traditional collaborative filtering:

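    Instead of running the nearest-neighbor algorithm over all user-rating data, it uses a set of expert neighbors as the comparison sample and computes the similarity between those experts and the target user.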
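    The idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the post does not say which similarity measure or prediction formula the authors use, so plain cosine similarity over co-rated items and a mean-centered weighted average are assumptions here, as are the function names, the `min_sim` threshold, and the sample ratings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over the items both parties rated; 0.0 if no overlap."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(user_ratings, expert_ratings, item, min_sim=0.0):
    """Predict the user's rating for item from experts only.

    Neighbors are drawn from the expert pool, never from other users.
    Returns None when no sufficiently similar expert rated the item.
    """
    user_mean = sum(user_ratings.values()) / len(user_ratings)
    weighted_dev = 0.0
    total_sim = 0.0
    for ratings in expert_ratings.values():
        if item not in ratings:
            continue
        sim = cosine_similarity(user_ratings, ratings)
        if sim <= min_sim:
            continue
        expert_mean = sum(ratings.values()) / len(ratings)
        # How far above or below their own average this expert rated the item.
        weighted_dev += sim * (ratings[item] - expert_mean)
        total_sim += sim
    if total_sim == 0:
        return None
    return user_mean + weighted_dev / total_sim


# Hypothetical data: one target user and two experts.
USER = {"film_1": 4, "film_2": 2}
EXPERTS = {
    "critic_a": {"film_1": 5, "film_2": 1, "film_3": 5},
    "critic_b": {"film_1": 2, "film_2": 5, "film_3": 1},
}

if __name__ == "__main__":
    print(predict(USER, EXPERTS, "film_3"))
```

    Raising `min_sim` narrows the neighborhood to the experts who agree most with the user; here `critic_a` agrees with the user far more than `critic_b` does.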

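    This method at least has no serious scalability problem, because it shrinks the reference set that each user is compared against. The original nearest-neighbor method can be roughly understood as comparing users pairwise, which is inevitably expensive to compute. And when new users are everywhere (especially when some "tour group" shows up en masse and floods the data with noise), few of them have enough ratings for a similarity computation.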
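    A back-of-the-envelope count shows why the reference set matters. The sketch below compares the number of user-to-user similarity computations in the "pairwise" reading above with the number of user-to-expert computations. The figure of 169 experts comes from the paper excerpt quoted in the appendix; the population of 100,000 users is a hypothetical number chosen only for illustration.

```python
def pairwise_comparisons(num_users):
    """Similarity computations if every pair of users is compared once."""
    return num_users * (num_users - 1) // 2


def expert_comparisons(num_users, num_experts):
    """Similarity computations if each user is compared only to the experts."""
    return num_users * num_experts


if __name__ == "__main__":
    users = 100_000   # hypothetical community size
    experts = 169     # expert-set size quoted from the paper
    full = pairwise_comparisons(users)
    small = expert_comparisons(users, experts)
    print(f"user-user pairs:   {full:,}")
    print(f"user-expert pairs: {small:,}")
    print(f"reduction factor:  {full / small:.0f}x")
```

    The pairwise count grows quadratically with the number of users, while the expert-based count grows only linearly, so the gap widens as the community grows.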

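    The authors define an expert as an independent individual whom we can trust to produce thoughtful, consistent, and reliable evaluations (ratings) in a given domain.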

    (Original text:

    We define an expert as an individual that we can trust to have produced thoughtful, consistent and reliable evaluations (ratings) of items in a given domain.)

    We are most interested in the following two angles from which the authors examine the problem:

    (a) study how preferences of a large population can be predicted by using a very small set of users;

    That is, to study how much predictive value a small group of users really has for a very large user population;

    (c) analyze whether professional raters are good predictors for general users;

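    If these angles hold up, then you do not actually need all the data from a massive user community; locking onto an Experts Pool is enough to make recommendations for users.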

    Appendix:

    Greg Linden's original post, hosted on the blocked BlogSpot, reads as follows:

    Wednesday, November 04, 2009

    Using only experts for recommendations
    A recent paper from SIGIR, "The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web" (PDF), has a very useful exploration into the effectiveness of recommendations using only a small pool of trusted experts.
    The results suggest that using a small pool of a couple hundred experts, possibly your own experts or experts selected and mined from the web, has quite a bit of value, especially in cases where big data from a large community is unavailable.
    A brief excerpt from the paper:
    Recommending items to users based on expert opinions .... addresses some of the shortcomings of traditional CF: data sparsity, scalability, noise in user feedback, privacy, and the cold-start problem .... [Our] method's performance is comparable to traditional CF algorithms, even when using an extremely small expert set .... [of] 169 experts.
    Our approach requires obtaining a set of ... experts ... [We] crawled the Rotten Tomatoes web site -- which aggregates the opinions of movie critics from various media sources -- to obtain expert ratings of the movies in the Netflix data set.
    The authors certainly do not claim that using a small pool of experts is better than traditional collaborative filtering.
    What they do say is that using a very small pool of experts works surprisingly well. In particular, I think it suggests a good alternative to content-based methods for bootstrapping a recommender system. If you can create a high quality pool of experts, even a fairly small one, you may have good results starting with that while you work to gather ratings from the broader community.


  • Original article: https://www.cnblogs.com/zhengyun_ustc/p/The_Wisdom_of_the_Few.html