  • [Machine Learning] Logistic regression

    1. Variable definitions

    \(m\) : number of training examples

    \(y\) : vector of labels, one per training example (\(m \times 1\))

    \(X\) : design matrix; each row of \(X\) is a training example, each column of \(X\) is a feature

    \[X = \begin{pmatrix} 1 & x^{(1)}_1 & \dots & x^{(1)}_n \\ 1 & x^{(2)}_1 & \dots & x^{(2)}_n \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x^{(m)}_1 & \dots & x^{(m)}_n \end{pmatrix}\]

    \[\theta = \begin{pmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{pmatrix}\]
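As a concrete illustration (not from the original post), the design matrix with its leading column of ones can be built in NumPy; the feature values below are made up:

```python
import numpy as np

# Hypothetical toy data: m = 3 training examples, n = 2 features.
features = np.array([[2.0, 3.0],
                     [4.0, 5.0],
                     [6.0, 7.0]])
m = features.shape[0]

# Prepend a column of ones so that theta_0 acts as the intercept term.
X = np.hstack([np.ones((m, 1)), features])
```

Each row of `X` is then \((1, x^{(i)}_1, \dots, x^{(i)}_n)\), matching the matrix above.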

    2. Hypothesis

    \[x = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{pmatrix}\]

    \[h_\theta(x) = g(\theta^T x) = g(\theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n)\]

    where \(g\) is the sigmoid function

    \[g(z) = \frac{1}{1 + e^{-z}}\]

    g = 1 ./ (1 + e .^ (-z));
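A NumPy equivalent of the Octave line above (a sketch; the function name is mine):

```python
import numpy as np

def sigmoid(z):
    """Elementwise logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))
```

Like the Octave version, this is elementwise, so `z` may be a scalar, a vector, or a matrix.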
    

    3. Cost function

    \[J(\theta) = \frac{1}{m}\sum_{i=1}^m\left[-y^{(i)}\log(h_\theta(x^{(i)})) - (1-y^{(i)})\log(1 - h_\theta(x^{(i)}))\right]\]

    Vectorized Octave implementation (the inner products \(y^T \log(h)\) and \((1-y)^T \log(1-h)\) are already scalars, so no `sum` is needed):

    J = -(1 / m) * (y' * log(sigmoid(X * theta)) + (1 - y)' * log(1 - sigmoid(X * theta)));
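The same vectorized cost in NumPy (a sketch; `cost` and the demo arrays are my names, not from the post):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Vectorized J(theta): cross-entropy averaged over the m examples."""
    m = y.shape[0]
    h = sigmoid(X @ theta)                 # m x 1 vector of predictions
    return -(1.0 / m) * (y @ np.log(h) + (1.0 - y) @ np.log(1.0 - h))

# Hypothetical toy data: with theta = 0 every prediction is 0.5,
# so J equals log(2) regardless of the labels.
X_demo = np.array([[1.0, 0.0], [1.0, 1.0]])
y_demo = np.array([0.0, 1.0])
J = cost(np.zeros(2), X_demo, y_demo)
```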
    

    4. Goal

    find \(\theta\) to minimize \(J(\theta)\); \(\theta\) is a vector here

    4.1 Gradient descent

    \[\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}_j\]

    repeat until convergence {
         \(\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x^{(i)}_j\)
    }

    where the update is applied to all \(\theta_j\) simultaneously.

    vectorization

    Let \(S\) be the row vector of per-feature gradient sums:

    \[S = \begin{pmatrix} h_\theta(x^{(1)})-y^{(1)} & h_\theta(x^{(2)})-y^{(2)} & \dots & h_\theta(x^{(m)})-y^{(m)} \end{pmatrix} \begin{pmatrix} x^{(1)}_0 & x^{(1)}_1 & \dots & x^{(1)}_n \\ x^{(2)}_0 & x^{(2)}_1 & \dots & x^{(2)}_n \\ \vdots & \vdots & \ddots & \vdots \\ x^{(m)}_0 & x^{(m)}_1 & \dots & x^{(m)}_n \end{pmatrix}\]

    \[= \begin{pmatrix} \sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}_0 & \sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}_1 & \dots & \sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})x^{(i)}_n \end{pmatrix}\]

    so the update for all components at once is

    \[\theta := \theta - \frac{\alpha}{m} S^T\]

    In matrix form, the hypothesis applied to every training example is

    \[h_\theta(X) = g(X\theta) = \frac{1}{1 + e^{-X\theta}}\]

    \(X\theta\) is \(m \times 1\) and \(y\) is \(m \times 1\), so the residual \(\frac{1}{1+e^{-X\theta}} - y\) is also \(m \times 1\):

    \[\frac{1}{1 + e^{-X\theta}} - y = \begin{pmatrix} h_\theta(x^{(1)})-y^{(1)} \\ h_\theta(x^{(2)})-y^{(2)} \\ \vdots \\ h_\theta(x^{(m)})-y^{(m)} \end{pmatrix}\]

    Since \(S^T = X^T(h_\theta(X) - y)\), the whole update becomes

    \[\theta := \theta - \frac{\alpha}{m} X^T\left(\frac{1}{1 + e^{-X\theta}} - y\right)\]
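Putting the vectorized update into a loop gives a complete batch gradient descent. This NumPy sketch and its toy data are my illustration, not code from the post:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=5000):
    """Batch gradient descent with the vectorized update
    theta := theta - (alpha/m) * X^T (g(X*theta) - y)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        residual = sigmoid(X @ theta) - y        # m x 1 residual vector
        theta = theta - (alpha / m) * (X.T @ residual)
    return theta

# Made-up, linearly separable toy data (first column is the intercept x_0 = 1).
X_toy = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_toy = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X_toy, y_toy)
preds = (sigmoid(X_toy @ theta) >= 0.5).astype(float)
```

On separable data like this the fitted boundary sits between the two classes, so the thresholded predictions recover the labels.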

  • Original article: https://www.cnblogs.com/arcsinw/p/9105812.html