  • Recommender Systems

Recommender Systems: Predicting Movie Ratings

Example: predicting movie ratings

We are given the following ratings:

Movie                 Alice (1)  Bob (2)  Carol (3)  Dave (4)
Love at last              5         5        0          0
Romance forever           5         ?        ?          0
Cute puppies of love      ?         4        0          ?
Nonstop car chases        0         0        5          4
Swords vs. karate         0         0        5          ?

Definitions:

n_u = the number of users

n_m = the number of movies

r(i, j) = 1 if user j has rated movie i

y(i, j) = the rating (0-5) that user j gave movie i, defined only where r(i, j) = 1

Goal: predict the "?" entries (the ratings of movies a user has not yet rated)
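As a running reference, here is a minimal sketch (assuming NumPy) that encodes this table; the names Y, R, n_m, and n_u follow the definitions above and are reused in the later sketches.

```python
import numpy as np

nan = np.nan
Y = np.array([
    [5,   5,   0,   0  ],   # Love at last
    [5,   nan, nan, 0  ],   # Romance forever
    [nan, 4,   0,   nan],   # Cute puppies of love
    [0,   0,   5,   4  ],   # Nonstop car chases
    [0,   0,   5,   nan],   # Swords vs. karate
])
R = (~np.isnan(Y)).astype(int)   # R[i, j] = 1 exactly where r(i, j) = 1
n_m, n_u = Y.shape               # n_m = 5 movies, n_u = 4 users
```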

Now suppose each movie has two features:

Movie                 Alice (1)  Bob (2)  Carol (3)  Dave (4)  x1 (romance)  x2 (action)
Love at last              5         5        0          0          0.9           0
Romance forever           5         ?        ?          0          1.0           0.01
Cute puppies of love      ?         4        0          ?          0.99          0
Nonstop car chases        0         0        5          4          0.1           1.0
Swords vs. karate         0         0        5          ?          0             0.9

This gives us a training set of movie features. For example, the feature vector of the movie "Love at last" (with the intercept term x_0 = 1) is

\[
x^{(1)} = \left[ \begin{array}{c} x_0 \\ x_1 \\ x_2 \end{array} \right]
        = \left[ \begin{array}{c} 1 \\ 0.9 \\ 0 \end{array} \right]
\]

For each user j, we learn a parameter vector θ(j) ∈ R^3 and then predict user j's rating of movie i as (θ(j))^T x(i).

Let m(j) denote the number of movies user j has rated. The learning objective for user j is

\[
\min_{\theta^{(j)}} \frac{1}{2m^{(j)}} \sum_{i : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2
  + \frac{\lambda}{2m^{(j)}} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2
\]

In recommender systems the m(j) factor is dropped: it is a constant, so it does not change the θ(j) that minimizes the objective.

\[
\min_{\theta^{(j)}} \frac{1}{2} \sum_{i : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2
  + \frac{\lambda}{2} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2
\]
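As a concrete reference, here is a minimal sketch of this per-user objective (assuming NumPy; the names X, y_j, rated, and lam are illustrative):

```python
import numpy as np

def user_cost(theta_j, X, y_j, rated, lam=1.0):
    """theta_j: (n+1,) parameters; X: (n_m, n+1) movie features with x_0 = 1;
    y_j: (n_m,) ratings by user j; rated: boolean mask where r(i, j) = 1."""
    err = X[rated] @ theta_j - y_j[rated]        # theta^T x - y over rated movies
    reg = (lam / 2) * np.sum(theta_j[1:] ** 2)   # the intercept theta_0 is not regularized
    return 0.5 * np.sum(err ** 2) + reg
```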

Summing over all users, the objective becomes

\[
\min_{\theta^{(1)}, \dots, \theta^{(n_u)}} \frac{1}{2} \sum_{j=1}^{n_u}
  \left[ \sum_{i : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2
  + \lambda \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2 \right]
\]

Gradient descent then yields the optimal θ:

\[
\begin{array}{ll}
\theta_k^{(j)} := \theta_k^{(j)} - \alpha \sum_{i : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right) x_k^{(i)} & \text{for } k = 0 \\
\theta_k^{(j)} := \theta_k^{(j)} - \alpha \left( \sum_{i : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right) x_k^{(i)} + \lambda \theta_k^{(j)} \right) & \text{for } k \neq 0
\end{array}
\]
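A matching update step, sketched under the same assumptions as the cost above (lr is an illustrative learning rate; k = 0 is left unregularized):

```python
import numpy as np

def user_gradient_step(theta_j, X, y_j, rated, lr=0.01, lam=1.0):
    """One gradient-descent step on user_cost for a single user's theta."""
    err = X[rated] @ theta_j - y_j[rated]   # theta^T x - y over rated movies
    grad = X[rated].T @ err                 # sum of err * x_k over rated movies
    grad[1:] += lam * theta_j[1:]           # regularize every k except k = 0
    return theta_j - lr * grad
```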


Recommender Systems: Collaborative Filtering

Take the same data:

Movie                 Alice (1)  Bob (2)  Carol (3)  Dave (4)  x1 (romance)  x2 (action)
Love at last              5         5        0          0          0.9           0
Romance forever           5         ?        ?          0          1.0           0.01
Cute puppies of love      ?         4        0          ?          0.99          0
Nonstop car chases        0         0        5          4          0.1           1.0
Swords vs. karate         0         0        5          ?          0             0.9

In practice, though, it is hard to know how "romantic" a movie is (e.g., a romance degree of 0.9) or how "action-packed" it is, so the data really looks like this:

Movie                 Alice (1)  Bob (2)  Carol (3)  Dave (4)  x1 (romance)  x2 (action)
Love at last              5         5        0          0          ?             ?
Romance forever           5         ?        ?          0          ?             ?
Cute puppies of love      ?         4        0          ?          ?             ?
Nonstop car chases        0         0        5          4          ?             ?
Swords vs. karate         0         0        5          ?          ?             ?

However, we can ask each user how much they like romance movies and how much they like action movies. Suppose this gives us

\[
\theta^{(1)} = \left[ \begin{array}{c} 0 \\ 5 \\ 0 \end{array} \right], \;
\theta^{(2)} = \left[ \begin{array}{c} 0 \\ 5 \\ 0 \end{array} \right], \;
\theta^{(3)} = \left[ \begin{array}{c} 0 \\ 0 \\ 5 \end{array} \right], \;
\theta^{(4)} = \left[ \begin{array}{c} 0 \\ 0 \\ 5 \end{array} \right]
\]

Analysis: for the movie "Love at last", Alice and Bob like it while Carol and Dave do not. Alice and Bob like romance movies, and Carol and Dave do not, so we can infer that it is a romance movie rather than an action movie, i.e. (x1 = 1.0, x2 = 0.0).

In formulas:

\[
\begin{array}{l}
(\theta^{(1)})^T x^{(1)} \approx 5 \\
(\theta^{(2)})^T x^{(1)} \approx 5 \\
(\theta^{(3)})^T x^{(1)} \approx 0 \\
(\theta^{(4)})^T x^{(1)} \approx 0
\end{array}
\]

So, with θ known, we can recover

\[
x^{(1)} = \left[ \begin{array}{c} 1 \\ 1.0 \\ 0.0 \end{array} \right]
\]
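A quick numeric check of this example (assuming NumPy): given the four θ vectors above and the ratings of "Love at last", solve the least-squares problem for the free features (x1, x2) while keeping x_0 = 1 fixed.

```python
import numpy as np

Theta = np.array([[0, 5, 0],    # theta^(1), Alice
                  [0, 5, 0],    # theta^(2), Bob
                  [0, 0, 5],    # theta^(3), Carol
                  [0, 0, 5]])   # theta^(4), Dave
y = np.array([5, 5, 0, 0])      # their ratings of "Love at last"

# theta^T x = theta_0 * 1 + theta_1 * x_1 + theta_2 * x_2, with x_0 = 1 fixed
A = Theta[:, 1:]                # coefficients of the unknowns (x_1, x_2)
b = y - Theta[:, 0] * 1.0       # move the intercept contribution to the right side
x_free, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_free)                   # -> [1. 0.], i.e. x^(1) = (1, 1.0, 0.0)
```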

    -------------------------------------------------------------------------

The problem has thus turned around: given θ(1), ..., θ(n_u), learn x(i). The optimization problem for a single movie i is

\[
\min_{x^{(i)}} \frac{1}{2} \sum_{j : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2
  + \frac{\lambda}{2} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2
\]

and over all movies x(1), ..., x(n_m), the problem is

\[
\min_{x^{(1)}, \dots, x^{(n_m)}} \frac{1}{2} \sum_{i=1}^{n_m}
  \left[ \sum_{j : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2
  + \lambda \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2 \right]
\]


So: given x(1), ..., x(n_m) and the movie ratings, we can estimate θ(1), ..., θ(n_u);

and given θ(1), ..., θ(n_u) and the movie ratings, we can estimate x(1), ..., x(n_m).

Since at the start we know neither, one approach is to

randomly guess θ, then alternate θ --> x --> θ --> x --> θ --> x --> ...

    -------------------------------------------------------------------------

In the actual collaborative filtering algorithm we do not alternate θ --> x --> θ --> x --> ...; instead, both sets of parameters are optimized together in a single cost function:

\[
J(x^{(1)}, \dots, x^{(n_m)}, \theta^{(1)}, \dots, \theta^{(n_u)})
  = \frac{1}{2} \sum_{(i,j) : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2
  + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2
  + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2
\]

\[
\min_{x^{(1)}, \dots, x^{(n_m)}, \, \theta^{(1)}, \dots, \theta^{(n_u)}}
  J(x^{(1)}, \dots, x^{(n_m)}, \theta^{(1)}, \dots, \theta^{(n_u)})
\]
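A sketch of this joint cost and its gradients (assuming NumPy; here X stacks the x(i) as rows, Theta stacks the θ(j) as rows, and, as in this formulation, there is no fixed x_0 = 1 intercept):

```python
import numpy as np

def cofi_cost(X, Theta, Y, R, lam):
    """Joint cost J and its gradients. Y: (n_m, n_u) ratings (any value,
    e.g. nan, where unrated); R: (n_m, n_u) indicator, R[i, j] = 1 iff rated."""
    err = np.where(R == 1, X @ Theta.T - Y, 0.0)   # errors only where r(i, j) = 1
    J = (0.5 * np.sum(err ** 2)
         + (lam / 2) * np.sum(X ** 2)
         + (lam / 2) * np.sum(Theta ** 2))
    X_grad = err @ Theta + lam * X                 # dJ/dX
    Theta_grad = err.T @ X + lam * Theta           # dJ/dTheta
    return J, X_grad, Theta_grad
```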

    -------------------------------------------------------------------------

Summary: the collaborative filtering algorithm

• Initialize x(1), ..., x(n_m), θ(1), ..., θ(n_u) to small random values
• Minimize J(x(1), ..., x(n_m), θ(1), ..., θ(n_u)) with gradient descent or another optimization algorithm
• For a user with learned parameters θ(j) and a movie with learned features x, predict the rating (θ(j))^T x (a sketch of the full loop follows this list)
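A minimal end-to-end sketch of these three steps, reusing Y, R, n_m, and n_u from the first sketch and cofi_cost from the previous one; the feature count, learning rate, λ, and iteration count are illustrative, not tuned.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2                                          # number of latent features (illustrative)
X = rng.normal(scale=0.1, size=(n_m, n))       # step 1: small random init
Theta = rng.normal(scale=0.1, size=(n_u, n))

lr, lam = 0.02, 0.1
for _ in range(2000):                          # step 2: gradient descent on J
    J, X_grad, Theta_grad = cofi_cost(X, Theta, Y, R, lam)
    X -= lr * X_grad
    Theta -= lr * Theta_grad

pred = X @ Theta.T                             # step 3: predictions, including the "?" entries
```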

    -------------------------------------------------------------------------

A matrix implementation of collaborative filtering

For the data

Movie                 Alice (1)  Bob (2)  Carol (3)  Dave (4)
Love at last              5         5        0          0
Romance forever           5         ?        ?          0
Cute puppies of love      ?         4        0          ?
Nonstop car chases        0         0        5          4
Swords vs. karate         0         0        5          ?

if we define

\[
Y = \left[ \begin{array}{cccc}
5 & 5 & 0 & 0 \\
5 & ? & ? & 0 \\
? & 4 & 0 & ? \\
0 & 0 & 5 & 4 \\
0 & 0 & 5 & ?
\end{array} \right]
\]

\[
\text{Predicted ratings} = \left[ \begin{array}{cccc}
(\theta^{(1)})^T x^{(1)} & (\theta^{(2)})^T x^{(1)} & \cdots & (\theta^{(n_u)})^T x^{(1)} \\
(\theta^{(1)})^T x^{(2)} & (\theta^{(2)})^T x^{(2)} & \cdots & (\theta^{(n_u)})^T x^{(2)} \\
\vdots & \vdots & \ddots & \vdots \\
(\theta^{(1)})^T x^{(n_m)} & (\theta^{(2)})^T x^{(n_m)} & \cdots & (\theta^{(n_u)})^T x^{(n_m)}
\end{array} \right]
\]

\[
X = \left[ \begin{array}{c}
- \, (x^{(1)})^T - \\
- \, (x^{(2)})^T - \\
\vdots \\
- \, (x^{(n_m)})^T -
\end{array} \right], \quad
\Theta = \left[ \begin{array}{c}
- \, (\theta^{(1)})^T - \\
- \, (\theta^{(2)})^T - \\
\vdots \\
- \, (\theta^{(n_u)})^T -
\end{array} \right]
\]

\[
\text{Predicted ratings} = X \Theta^T
\]
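In code this is a single matrix product. A small sketch (assuming NumPy; the X rows take feature values from the earlier table, and the Theta rows are illustrative two-feature parameter vectors):

```python
import numpy as np

X = np.array([[0.9, 0.0],        # rows: x^(1), x^(2)
              [1.0, 0.01]])
Theta = np.array([[5.0, 0.0],    # rows: two illustrative theta vectors
                  [0.0, 5.0]])

predicted = X @ Theta.T          # entry (i, j) is (theta^(j))^T x^(i)
assert np.isclose(predicted[0, 1], Theta[1] @ X[0])
```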

    -------------------------------------------------------------------------

Finding related movies

Suppose the algorithm above has learned features x(i) for every movie.

Now, given 5 movies and their features, how do we tell which of them is most closely related to the movie with features x(i)?

Compute the "distance" between each of the five movies and this one; the movie at the smallest distance is the most closely related:

\[
\left\| x^{(i)} - x^{(j)} \right\|
\]
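A sketch of this lookup (assuming NumPy; X is the learned (n_m, n) feature matrix and the function name is illustrative):

```python
import numpy as np

def most_related(X, i, k=5):
    """Return the indices of the k movies whose features are closest to movie i."""
    dists = np.linalg.norm(X - X[i], axis=1)   # ||x^(i) - x^(j)|| for every j
    dists[i] = np.inf                          # exclude the movie itself
    return np.argsort(dists)[:k]
```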

    -------------------------------------------------------------------------

Mean normalization in collaborative filtering

Suppose the data contains a user (Eve) who has not rated any movie:

Movie                 Alice (1)  Bob (2)  Carol (3)  Dave (4)  Eve (5)
Love at last              5         5        0          0         ?
Romance forever           5         ?        ?          0         ?
Cute puppies of love      ?         4        0          ?         ?
Nonstop car chases        0         0        5          4         ?
Swords vs. karate         0         0        5          0         ?

In this case, when we minimize the cost function

\[
\min_{x^{(1)}, \dots, x^{(n_m)}, \, \theta^{(1)}, \dots, \theta^{(n_u)}}
  \frac{1}{2} \sum_{(i,j) : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2
  + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2
  + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2
\]

the first part of the formula,

\[
\frac{1}{2} \sum_{(i,j) : r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2
\]

plays no role for Eve: she has no entries with r(i, j) = 1, so this term places no constraint on θ(5).

Meanwhile, for

\[
\frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2
\]

minimizing this term gives (assuming each movie has two features)

\[
\theta^{(5)} = \left[ \begin{array}{c} 0 \\ 0 \end{array} \right]
\]

which forces every prediction for Eve to be

\[
(\theta^{(5)})^T x^{(i)} = 0
\]

Predicting 0 for every movie is clearly not useful.

Mean normalization fixes this by subtracting from each row of Y the mean of that movie's ratings:

\[
Y = \left[ \begin{array}{ccccc}
5 & 5 & 0 & 0 & ? \\
5 & ? & ? & 0 & ? \\
? & 4 & 0 & ? & ? \\
0 & 0 & 5 & 4 & ? \\
0 & 0 & 5 & 0 & ?
\end{array} \right], \quad
\mu = \left[ \begin{array}{c} 2.5 \\ 2.5 \\ 2 \\ 2.25 \\ 1.25 \end{array} \right]
\]

\[
Y := Y - \mu = \left[ \begin{array}{ccccc}
2.5 & 2.5 & -2.5 & -2.5 & ? \\
2.5 & ? & ? & -2.5 & ? \\
? & 2 & -2 & ? & ? \\
-2.25 & -2.25 & 2.75 & 1.75 & ? \\
-1.25 & -1.25 & 3.75 & -1.25 & ?
\end{array} \right]
\]

Train the model on this new Y to obtain the parameters. When predicting user j's rating of movie i, add the mean back:

\[
(\theta^{(j)})^T x^{(i)} + \mu_i
\]

For the user Eve, since θ(5) = 0, every prediction falls back to the movie's mean rating:

\[
(\theta^{(5)})^T x^{(i)} + \mu_i = \mu_i, \quad
\mu = \left[ \begin{array}{c} 2.5 \\ 2.5 \\ 2 \\ 2.25 \\ 1.25 \end{array} \right]
\]
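A sketch of mean normalization (assuming NumPy; the helper name is illustrative): subtract each movie's mean over its rated entries, train on the normalized Y, and add the mean back at prediction time.

```python
import numpy as np

def normalize_ratings(Y, R):
    """Return (Y_norm, mu): Y with each movie's mean rating subtracted
    from its rated entries, and the vector of those means."""
    mu = np.zeros(Y.shape[0])
    Y_norm = np.array(Y, dtype=float)
    for i in range(Y.shape[0]):
        rated = R[i] == 1
        mu[i] = Y[i, rated].mean()            # mean over rated entries only
        Y_norm[i, rated] = Y[i, rated] - mu[i]
    return Y_norm, mu

# After training on Y_norm, predict user j's rating of movie i as
#   X[i] @ Theta[j] + mu[i]
# so a user with no ratings (like Eve) falls back to the movie means mu.
```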
