  [Machine learning] Logistic regression

    1. Variable definitions

    \(m\) : the number of training examples

    \(X\) : the design matrix. Each row of \(X\) is a training example and each column is a feature.

    \[X = \begin{pmatrix} 1 & x^{(1)}_1 & \dots & x^{(1)}_n \\ 1 & x^{(2)}_1 & \dots & x^{(2)}_n \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x^{(m)}_1 & \dots & x^{(m)}_n \end{pmatrix}\]

    \[\theta = \begin{pmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{pmatrix}\]
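
    A minimal Octave sketch of these definitions, assuming the raw features sit in an m-by-n matrix named data (one example per row; the variable names are illustrative):

    % data: assumed m x n matrix of raw features, one training example per row
    m = size(data, 1);           % number of training examples
    n = size(data, 2);           % number of features
    X = [ones(m, 1), data];      % design matrix with the intercept column x_0 = 1
    theta = zeros(n + 1, 1);     % parameters theta_0 ... theta_n, initialized to zero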

    2. Hypothesis

    \[x = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{pmatrix}\]

    \[h_\theta(x) = g(\theta^T x) = g(x_0\theta_0 + x_1\theta_1 + \dots + x_n\theta_n) = \frac{1}{1 + e^{-\theta^T x}}\]

    where \(g\) is the sigmoid function

    \[g(z) = \frac{1}{1 + e^{-z}}\]

    % element-wise sigmoid; z may be a scalar, vector, or matrix
    g = 1 ./ (1 + exp(-z));
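
    The cost function code below calls this through a function named sigmoid; a minimal sketch of such a function file (the name sigmoid.m is assumed to match that call):

    function g = sigmoid(z)
      % element-wise logistic function; z may be a scalar, vector, or matrix
      g = 1 ./ (1 + exp(-z));
    end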
    

    3. Cost function

    \[J(\theta) = \frac{1}{m}\sum_{i=1}^m\left[-y^{(i)}\log(h_\theta(x^{(i)})) - (1-y^{(i)})\log(1 - h_\theta(x^{(i)}))\right]\]

    Vectorized implementation in Octave:

    h = sigmoid(X * theta);                                 % m x 1 vector of predictions
    J = -(1 / m) * (y' * log(h) + (1 - y)' * log(1 - h));   % the inner products already sum over the examples
    

    4. Goal

    Find \(\theta\) that minimizes \(J(\theta)\); here \(\theta\) is a vector.

    4.1 Gradient descent

    \[\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x^{(i)}_j\]

    repeat until convergence {
         \(\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x^{(i)}_j\)   (update all \(\theta_j\) simultaneously)
    }
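
    A loop-based Octave version of this update, accumulating the sum over the training examples explicitly; alpha (learning rate) and num_iters are assumed hyperparameters, and the vectorized form derived below removes the remaining loop:

    for iter = 1:num_iters
      grad = zeros(size(theta));
      for i = 1:m
        h_i = sigmoid(X(i, :) * theta);           % prediction for example i
        grad = grad + (h_i - y(i)) * X(i, :)';    % accumulate (h - y) * x^(i)
      end
      theta = theta - (alpha / m) * grad;         % simultaneous update of all theta_j
    end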

    Vectorization

    Define the \(1 \times (n+1)\) row vector \(S\):

    \[S = \begin{pmatrix} h_\theta(x^{(1)})-y^{(1)} & h_\theta(x^{(2)})-y^{(2)} & \dots & h_\theta(x^{(m)})-y^{(m)} \end{pmatrix} \begin{pmatrix} x^{(1)}_0 & x^{(1)}_1 & \dots & x^{(1)}_n \\ x^{(2)}_0 & x^{(2)}_1 & \dots & x^{(2)}_n \\ \vdots & \vdots & \ddots & \vdots \\ x^{(m)}_0 & x^{(m)}_1 & \dots & x^{(m)}_n \end{pmatrix}\]

    \[= \begin{pmatrix} \sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}_0 & \sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}_1 & \dots & \sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}_n \end{pmatrix}\]

    \[\theta := \theta - \frac{\alpha}{m} S^T\]

    \[h_\theta(X) = g(X\theta) = \frac{1}{1 + e^{-X\theta}}\]

    \(X\theta\) is \(m \times 1\) and \(y\) is \(m \times 1\), so

    \(\frac{1}{1+e^{-X\theta}} - y\) is also \(m \times 1\):

    \[\frac{1}{1 + e^{-X\theta}} - y = \begin{pmatrix} h_\theta(x^{(1)})-y^{(1)} \\ h_\theta(x^{(2)})-y^{(2)} \\ \vdots \\ h_\theta(x^{(m)})-y^{(m)} \end{pmatrix}\]

    \[\theta := \theta - \frac{\alpha}{m} X^T \left(\frac{1}{1 + e^{-X\theta}} - y\right)\]
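
    Putting the pieces together, a sketch of the vectorized gradient-descent loop in Octave (alpha and num_iters are assumed hyperparameters):

    for iter = 1:num_iters
      h = sigmoid(X * theta);                        % m x 1 predictions g(X*theta)
      theta = theta - (alpha / m) * (X' * (h - y));  % vectorized update, no loop over i or j
    end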

    5. Regularized logistic regression

    Regularization is added to reduce overfitting; too large a \(\lambda\) can instead cause underfitting.

    Cost function

    \[J(\theta) = \frac{1}{m}\sum_{i=1}^m\left[-y^{(i)}\log(h_\theta(x^{(i)})) - (1-y^{(i)})\log(1 - h_\theta(x^{(i)}))\right] + \frac{\lambda}{2m} \sum_{j=1}^n \theta^2_j\]

    Gradient descent

    \[\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x^{(i)}_0\]

    \[\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x^{(i)}_j + \frac{\lambda}{m}\theta_j, \quad (j \ge 1)\]
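
    A vectorized Octave sketch of the regularized cost and gradient; lambda is the assumed regularization parameter, and theta_0 is excluded from the penalty:

    h = sigmoid(X * theta);                   % m x 1 predictions
    theta_reg = [0; theta(2:end)];            % copy of theta with theta_0 zeroed out
    J = -(1 / m) * (y' * log(h) + (1 - y)' * log(1 - h)) ...
        + (lambda / (2 * m)) * (theta_reg' * theta_reg);
    grad = (1 / m) * (X' * (h - y)) + (lambda / m) * theta_reg;   % (n+1) x 1 gradient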
