There are \(m\) samples \((x_{i}, y_{i})\):
\[
\{x_{1}, x_{2}, x_{3}, \dots, x_{m}\}
\]
\[
\{y_{1}, y_{2}, y_{3}, \dots, y_{m}\}
\]
Each \(x_{i}\) is an \((n-1)\)-dimensional feature vector with a constant 1 appended at the end, so that its dimension matches that of \(w\) for the vector product:
\[
x_{i} = \{x_{i1}, x_{i2}, x_{i3}, \dots, x_{i(n-1)}, 1\}
\]
where \(w\) is an \(n\)-dimensional vector:
\[
w = \{w_{1}, w_{2}, w_{3}, \dots, w_{n}\}
\]
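As a sketch of the setup above, the constant 1 can be appended to every sample at once by stacking a ones column onto the feature matrix (the data values below are hypothetical):

```python
import numpy as np

# Hypothetical toy data: m = 4 samples, each with n - 1 = 2 raw features.
X_raw = np.array([[1.0, 2.0],
                  [2.0, 1.0],
                  [3.0, 4.0],
                  [4.0, 3.0]])

# Append a constant 1 to each sample so x_i has the same dimension n as w;
# the last weight w_n then acts as the intercept term.
X = np.hstack([X_raw, np.ones((X_raw.shape[0], 1))])
```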
Regression (hypothesis) function:
\[
h_{w}(x_{i}) = w x_{i}
\]
Loss function:
\[
J(w) = \frac{1}{2}\sum_{i=1}^{m}\left(h_{w}(x_{i}) - y_{i}\right)^{2}
\]
The goal is to find the \(w\) that minimizes \(J(w)\):
\[
\min_{w} J(w)
\]
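A minimal NumPy sketch of the hypothesis and loss above (function names are illustrative; `X` is assumed to already carry the appended 1-column):

```python
import numpy as np

def h(w, X):
    # h_w(x_i) = w·x_i, computed for all m samples at once.
    return X @ w

def loss(w, X, y):
    # J(w) = 1/2 * sum_i (h_w(x_i) - y_i)^2
    return 0.5 * np.sum((h(w, X) - y) ** 2)
```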
Take the partial derivative of the loss function with respect to each \(w_{j}\) in \(w\):
\[
\frac{\partial J(w)}{\partial w_{j}} = \frac{\partial}{\partial w_{j}}\,\frac{1}{2}\sum_{i=1}^{m}\left(h_{w}(x_{i}) - y_{i}\right)^{2}
\]
\[
= \frac{1}{2}\cdot 2\cdot\sum_{i=1}^{m}\left(h_{w}(x_{i}) - y_{i}\right)\cdot\frac{\partial\left(h_{w}(x_{i}) - y_{i}\right)}{\partial w_{j}}
\]
\[
= \sum_{i=1}^{m}\left(h_{w}(x_{i}) - y_{i}\right)\cdot\frac{\partial\left(w x_{i} - y_{i}\right)}{\partial w_{j}}
\]
Since \(\frac{\partial (w x_{i})}{\partial w_{j}} = x_{ij}\):
\[
\frac{\partial J(w)}{\partial w_{j}} = \sum_{i=1}^{m}\left(h_{w}(x_{i}) - y_{i}\right)\cdot x_{ij}
\]
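The derived partial derivatives can be computed for every \(j\) at once as the matrix product \(X^{T}(Xw - y)\); a sketch:

```python
import numpy as np

def gradient(w, X, y):
    # For each j: dJ/dw_j = sum_i (h_w(x_i) - y_i) * x_ij,
    # vectorized over all j as X^T (Xw - y).
    return X.T @ (X @ w - y)
```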
Update each \(w_{j}\) in \(w\), where \(\alpha\) is the learning rate:
\[
w_{j} := w_{j} - \alpha\cdot\frac{\partial J(w)}{\partial w_{j}}
\]
Batch gradient descent: use all \(m\) samples to update each \(w_{j}\) in \(w\):
\[
w_{j} := w_{j} - \alpha\cdot\sum_{i=1}^{m}\left(h_{w}(x_{i}) - y_{i}\right)\cdot x_{ij}
\]
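The batch update above can be sketched as a full training loop; the learning rate `alpha` and iteration count `n_iters` below are hypothetical choices, not values from the text:

```python
import numpy as np

def batch_gradient_descent(X, y, alpha=0.01, n_iters=1000):
    # X: (m, n) sample matrix with the 1-column already appended;
    # every update uses all m samples (batch gradient descent).
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y)   # sum_i (h_w(x_i) - y_i) * x_ij for each j
        w = w - alpha * grad
    return w
```

For example, on points drawn from the line \(y = 2x + 1\), the loop should recover weights close to \((2, 1)\), provided \(\alpha\) is small enough for the updates to converge.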