  • Derivation of Logistic Regression

    There are \(m\) samples \((x_{i}, y_{i})\):

    \[\{x_{1},x_{2},x_{3},\dots,x_{m}\} \]

    \[\{y_{1},y_{2},y_{3},\dots,y_{m}\} \]

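    Each \(x_{i}\) is an \((n-1)\)-dimensional feature vector with a constant 1 appended as its last component, so that it has the same dimension \(n\) as \(w\) and the two can be multiplied as vectors. Each label \(y_{i}\) is binary:

    \[x_{i}=\{x_{i1},x_{i2},x_{i3},\dots,x_{i(n-1)},1\} \]

    \[y_{i}\in\{0,1\} \]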
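
    As a rough illustration of the appended 1, the sketch below builds augmented vectors from raw features and takes one inner product with \(w\). The helper name `add_bias` and the toy numbers are assumptions made for this example, not part of the original derivation.

```python
def add_bias(features):
    """Return a copy of each feature vector with a constant 1.0 appended."""
    return [list(row) + [1.0] for row in features]

# Hypothetical toy data: m = 3 samples, n - 1 = 2 raw features each.
raw_x = [[0.5, 1.2], [1.5, -0.3], [-1.0, 0.8]]
x = add_bias(raw_x)    # each x[i] now has n = 3 components
w = [0.4, -0.2, 0.1]   # the last weight multiplies the appended 1 (intercept)

# Inner product w . x_i for the first sample.
z0 = sum(w_j * x_j for w_j, x_j in zip(w, x[0]))
print(x[0], z0)
```

    With this layout the last component of \(w\) plays the role of the intercept, so no separate bias term is needed in the formulas that follow.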

    The weight vector \(w\) is \(n\)-dimensional:

    \[w=\{w_{1},w_{2},w_{3},\dots,w_{n}\} \]

    Regression (hypothesis) function, the sigmoid of the inner product \(wx_{i}\):

    \[h_{w}(x_{i})=\frac{1}{1+e^{-wx_{i}}} \]
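
    A minimal sketch of the hypothesis function, assuming the standard logistic form \(1/(1+e^{-z})\) with \(z=wx_{i}\). The names `sigmoid` and `h` are chosen for this example, and the two-branch evaluation is one common way to avoid overflow for large \(|z|\), not something the derivation itself requires.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)            # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)

def h(w, x_i):
    """Hypothesis h_w(x_i): sigmoid of the inner product w . x_i."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x_i))
    return sigmoid(z)

# Hypothetical values: all-zero weights give z = 0 and probability 0.5.
print(h([0.0, 0.0, 0.0], [1.0, 2.0, 1.0]))
```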

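    Probability distribution of the label. The two cases can be combined into a single expression because \(y\in\{0,1\}\):

    \[P(y=1\mid x;w)=h_{w}(x) \]

    \[P(y=0\mid x;w)=1-h_{w}(x) \]

    \[P(y\mid x;w)=h_{w}(x)^{y}\,(1-h_{w}(x))^{1-y} \]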
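
    The combined expression can be checked numerically: for \(y=1\) it reduces to \(h_{w}(x)\), and for \(y=0\) to \(1-h_{w}(x)\). The function name `label_probability` and the value 0.8 are assumptions for illustration.

```python
def label_probability(y, p):
    """P(y | x; w) = p^y * (1 - p)^(1 - y), where p = h_w(x) and y is 0 or 1."""
    return (p ** y) * ((1.0 - p) ** (1 - y))

p = 0.8  # hypothetical model output h_w(x)
print(label_probability(1, p))  # same value as p
print(label_probability(0, p))  # same value as 1 - p
```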

    Likelihood function, assuming the samples are independent:

    \[L(w)=\prod_{i=1}^{m}P(y_{i}\mid x_{i};w)=\prod_{i=1}^{m}h_{w}(x_{i})^{y_{i}}\,(1-h_{w}(x_{i}))^{1-y_{i}} \]

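    Taking the logarithm of both sides turns the product into a sum:

    \[\ln L(w)=\sum_{i=1}^{m}\left[y_{i}\ln h_{w}(x_{i})+(1-y_{i})\ln(1-h_{w}(x_{i}))\right] \]

    The goal is to find the \(w\) that maximizes it:

    \[w^{*}=\arg\max_{w}\ln L(w) \]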
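
    One practical reason to work with the logarithm is numerical: a product of many probabilities underflows to zero in floating point, while the sum of their logarithms stays representable. The sketch below shows this with a hypothetical set of 2000 samples that each have predicted probability 0.5; the function names are assumptions for this example.

```python
import math

def likelihood(probs, labels):
    """Product of P(y_i | x_i; w), given probs[i] = h_w(x_i)."""
    prod = 1.0
    for p, y in zip(probs, labels):
        prod *= p if y == 1 else 1.0 - p
    return prod

def log_likelihood(probs, labels):
    """Sum of ln P(y_i | x_i; w), given probs[i] = h_w(x_i)."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for p, y in zip(probs, labels))

# Hypothetical data: 2000 positive samples, each predicted with p = 0.5.
probs = [0.5] * 2000
labels = [1] * 2000
print(likelihood(probs, labels))      # underflows in double precision
print(log_likelihood(probs, labels))  # remains a finite negative number
```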

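    Loss function. Negating the log-likelihood and averaging over the \(m\) samples turns the maximization into a minimization:

    \[J(w)=-\frac{1}{m}\sum_{i=1}^{m}\left[y_{i}\ln h_{w}(x_{i})+(1-y_{i})\ln(1-h_{w}(x_{i}))\right] \]

    The goal is to find the \(w\) that minimizes it:

    \[w^{*}=\arg\min_{w}J(w) \]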
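
    A direct, unoptimized sketch of \(J(w)\) follows. The names `loss` and `sigmoid` and the toy data are assumptions for this example. It does not guard against \(h_{w}(x_{i})\) rounding to exactly 0 or 1, which would make `math.log` fail, so real code would typically clip the probabilities.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def loss(w, xs, ys):
    """Average negative log-likelihood J(w) over all samples."""
    total = 0.0
    for x_i, y_i in zip(xs, ys):
        p = sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, x_i)))
        total += y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    return -total / len(xs)

# Hypothetical toy data; the last component of each x_i is the appended 1.
xs = [[0.5, 1.0], [1.5, 1.0], [3.0, 1.0]]
ys = [0, 1, 1]
print(loss([0.0, 0.0], xs, ys))  # with w = 0 every prediction is 0.5
```

    With all weights at zero every sample contributes \(\ln 2\), which gives a convenient sanity check for an implementation.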

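    To minimize the loss by gradient descent, take the partial derivative of the loss with respect to each component \(w_{j}\) of \(w\). The derivation uses the chain rule, the sigmoid identity \(\frac{\partial h_{w}(x_{i})}{\partial (wx_{i})}=h_{w}(x_{i})(1-h_{w}(x_{i}))\), and \(\frac{\partial (wx_{i})}{\partial w_{j}}=x_{ij}\):

    \[\frac{\partial J(w)}{\partial w_{j}}=\frac{\partial}{\partial w_{j}}\left(-\frac{1}{m}\sum_{i=1}^{m}\left[y_{i}\ln h_{w}(x_{i})+(1-y_{i})\ln(1-h_{w}(x_{i}))\right]\right) \]

    \[=-\frac{1}{m}\sum_{i=1}^{m}\left[\frac{y_{i}}{h_{w}(x_{i})}\cdot\frac{\partial h_{w}(x_{i})}{\partial w_{j}}+\frac{1-y_{i}}{1-h_{w}(x_{i})}\cdot\frac{\partial (1-h_{w}(x_{i}))}{\partial w_{j}}\right] \]

    \[=-\frac{1}{m}\sum_{i=1}^{m}\left(\frac{y_{i}}{h_{w}(x_{i})}-\frac{1-y_{i}}{1-h_{w}(x_{i})}\right)\cdot\frac{\partial h_{w}(x_{i})}{\partial w_{j}} \]

    \[=-\frac{1}{m}\sum_{i=1}^{m}\left(\frac{y_{i}}{h_{w}(x_{i})}-\frac{1-y_{i}}{1-h_{w}(x_{i})}\right)\cdot\frac{\partial h_{w}(x_{i})}{\partial (wx_{i})}\cdot\frac{\partial (wx_{i})}{\partial w_{j}} \]

    \[=-\frac{1}{m}\sum_{i=1}^{m}\left(\frac{y_{i}}{h_{w}(x_{i})}-\frac{1-y_{i}}{1-h_{w}(x_{i})}\right)\cdot h_{w}(x_{i})\,(1-h_{w}(x_{i}))\cdot\frac{\partial (wx_{i})}{\partial w_{j}} \]

    \[=\frac{1}{m}\sum_{i=1}^{m}(h_{w}(x_{i})-y_{i})\,x_{ij} \]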
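
    The closed-form gradient can be sanity-checked against central finite differences of the loss. The sketch below is one way to do that; all function names, the step size `eps`, and the toy data are assumptions for this example.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(w, x_i):
    return sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, x_i)))

def loss(w, xs, ys):
    """Average negative log-likelihood J(w)."""
    total = 0.0
    for x_i, y_i in zip(xs, ys):
        p = predict(w, x_i)
        total += y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    return -total / len(xs)

def gradient(w, xs, ys):
    """Analytic gradient: (1/m) * sum_i (h_w(x_i) - y_i) * x_ij for each j."""
    grad = [0.0] * len(w)
    for x_i, y_i in zip(xs, ys):
        err = predict(w, x_i) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += err * x_ij
    return [g / len(xs) for g in grad]

def numerical_gradient(w, xs, ys, eps=1e-6):
    """Central finite-difference approximation of dJ/dw_j."""
    grad = []
    for j in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grad.append((loss(w_plus, xs, ys) - loss(w_minus, xs, ys)) / (2 * eps))
    return grad

# Hypothetical toy data; the last component of each x_i is the appended 1.
xs = [[0.5, 1.2, 1.0], [1.5, -0.3, 1.0], [-1.0, 0.8, 1.0], [2.0, 0.1, 1.0]]
ys = [0, 1, 0, 1]
w = [0.3, -0.2, 0.1]

analytic = gradient(w, xs, ys)
numeric = numerical_gradient(w, xs, ys)
max_diff = max(abs(a - b) for a, b in zip(analytic, numeric))
print(analytic, numeric, max_diff)
```

    If the sign of the exponent in the sigmoid or of the final gradient were wrong, the two gradients would disagree in sign, so this check catches the most common slips in the derivation.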

    Update each component \(w_{j}\) of \(w\), where \(\alpha\) is the learning rate:

    \[w_{j}:=w_{j}-\alpha\,\frac{\partial J(w)}{\partial w_{j}} \]

    Batch gradient descent uses all \(m\) samples in every update of each \(w_{j}\):

    \[w_{j}:=w_{j}-\alpha\,\frac{1}{m}\sum_{i=1}^{m}(h_{w}(x_{i})-y_{i})\,x_{ij} \]
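
    Putting the pieces together, a minimal batch gradient descent loop might look like the following. It is a sketch under stated assumptions: the names `train` and `predict`, the learning rate 0.1, the 2000 iterations, and the one-feature toy data are all chosen for illustration, and there is no convergence test or regularization.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(w, x_i):
    return sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, x_i)))

def loss(w, xs, ys):
    """Average negative log-likelihood J(w)."""
    total = 0.0
    for x_i, y_i in zip(xs, ys):
        p = predict(w, x_i)
        total += y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    return -total / len(xs)

def train(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent: every update uses all m samples."""
    m, n = len(xs), len(xs[0])
    w = [0.0] * n
    for _ in range(iterations):
        # Errors h_w(x_i) - y_i are computed once with the current w,
        # so all w_j are updated simultaneously.
        errors = [predict(w, x_i) - y_i for x_i, y_i in zip(xs, ys)]
        grad = [sum(e * x_i[j] for e, x_i in zip(errors, xs)) / m
                for j in range(n)]
        w = [w_j - alpha * g_j for w_j, g_j in zip(w, grad)]
    return w

# Hypothetical toy data: one raw feature plus the appended 1.
xs = [[0.5, 1.0], [1.0, 1.0], [1.5, 1.0], [2.0, 1.0], [2.5, 1.0], [3.0, 1.0]]
ys = [0, 0, 1, 0, 1, 1]

loss_before = loss([0.0, 0.0], xs, ys)
w = train(xs, ys)
loss_after = loss(w, xs, ys)
print(w, loss_before, loss_after)
print(predict(w, [0.5, 1.0]), predict(w, [3.0, 1.0]))
```

    Computing all errors before touching \(w\) matters: updating \(w_{1}\) first and then reusing the modified value when computing the gradient for \(w_{2}\) would no longer be the update rule derived above.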

  • Original article: https://www.cnblogs.com/smallredness/p/11045121.html