1: Definition
In simple terms: y = ax + b, where x is the feature and y is the label. Fitting the model means computing the values of a and b.
2: The optimal model
We want each predicted value to be as close as possible to the true value y(i). Summing the raw differences would let positive and negative errors cancel, so instead we minimize the sum of squared errors, sum over i of (y_pred(i) - y(i))^2.
This is the least-squares method: take the partial derivatives of the sum with respect to a and b, set them to zero, and solve, which gives
a = sum over i of (x(i) - x_mean) * (y(i) - y_mean) / sum over i of (x(i) - x_mean)^2
b = y_mean - a * x_mean
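The closed-form solution above can be checked directly in NumPy. A minimal sketch on made-up sample data (the variable names and data here are illustrative, not from the original notes); np.polyfit with degree 1 solves the same least-squares problem, so the two results should agree:

```python
import numpy as np

# illustrative sample data (not from the original notes)
x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 4., 5., 4., 6.])

x_mean, y_mean = np.mean(x), np.mean(y)

# a = sum((x_i - x_mean)*(y_i - y_mean)) / sum((x_i - x_mean)^2)
a = (x - x_mean).dot(y - y_mean) / (x - x_mean).dot(x - x_mean)
# b = y_mean - a * x_mean
b = y_mean - a * x_mean

# np.polyfit(x, y, 1) returns [slope, intercept] for the least-squares line
slope, intercept = np.polyfit(x, y, 1)
print(a, b)
print(slope, intercept)
```

The dot-product form is just the vectorized version of the two sums in the formulas above.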
3: Testing a hand-rolled linear regression class
import numpy as np

class Simple_Linear_Regression:
    def __init__(self):
        self.a_ = None
        self.b_ = None

    def fit(self, x, y):
        x_mean = np.mean(x)
        y_mean = np.mean(y)
        # Loop version:
        # num, d = 0.0, 0.0
        # for x_i, y_i in zip(x, y):
        #     num += (x_i - x_mean) * (y_i - y_mean)
        #     d += (x_i - x_mean) ** 2
        # Vectorized version:
        num = (x - x_mean).dot(y - y_mean)
        d = (x - x_mean).dot(x - x_mean)
        self.a_ = num / d
        self.b_ = y_mean - self.a_ * x_mean
        return self

    def predict(self, x):
        x = np.array(x)
        return [self._pre(i) for i in x]

    def _pre(self, x):
        return self.a_ * x + self.b_

    def __repr__(self):
        # __repr__ must return a string, not print it
        return "Simple_Linear_Regression"

x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])

s = Simple_Linear_Regression()
s.fit(x, y)
# fitted parameters
print(s.a_)
print(s.b_)

x_test = np.array([6, 4, 8])
y_pre = s.predict(x_test)
print(y_pre)
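The per-element predict loop can also be replaced by NumPy broadcasting, which applies a*x + b to the whole array at once. A standalone sketch using the parameters fitted on the sample data above (a = 0.8, b = 0.4):

```python
import numpy as np

a_, b_ = 0.8, 0.4  # values fitted on the sample data above

x_test = np.array([6, 4, 8])
# Broadcasting computes a_*x + b_ for every element at once,
# avoiding a Python-level loop or list comprehension.
y_pre = a_ * x_test + b_
print(y_pre)  # [5.2 3.6 6.8]
```

Returning an ndarray this way is also more idiomatic than returning a Python list.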
4: Plotting
import matplotlib.pyplot as plt

plt.scatter(x, y)
plt.plot(x, s.a_ * x + s.b_, color='r')
plt.show()