  • AdaBoost

    • Algorithm features:
      ①. Weighted cascade of models (cascade weights); ②. Sample feature selection; ③. Sample weight update (associated weights)
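    • Algorithm principle:
      Part Ⅰ:
      Given the following original dataset:
      \begin{equation}
      D = \{(x^{(1)}, \bar{y}^{(1)}), (x^{(2)}, \bar{y}^{(2)}), \cdots, (x^{(n)}, \bar{y}^{(n)})\}, \quad \text{where }\bar{y}^{(i)} \in \{-1, +1\}
      \label{eq_1}
      \end{equation}
      Let $T$ be the maximum number of weak models in the weighted cascade. For the $t$-th weak model $h_t(x)$, the distribution of the associated sample weights is:
      \begin{equation}
      W_t = \{ w_t^{(1)}, w_t^{(2)}, \cdots, w_t^{(n)} \}
      \label{eq_2}
      \end{equation}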
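      The main procedure is as follows:
      Step 1. Initialize the associated sample weights uniformly:
      \begin{equation}
      W_1 = \{ w_1^{(1)}, w_1^{(2)}, \cdots, w_1^{(n)}\}, \quad \text{where }w_1^{(i)} = \frac{1}{n}
      \label{eq_3}
      \end{equation}
      Step 2. For iteration $t = 1, 2, \cdots, T$, train a weak model $h_t(x) = \pm 1$ on the training samples according to the associated weight distribution $W_t$.
      Note:
      ①. Building this weak model depends on feature selection over the training samples;
      ②. If the weak model supports weighted samples, update it by re-weighting (adjusting the loss function); otherwise, update it by re-sampling (sampling with replacement).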
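      The re-sampling route in note ② can be sketched as follows. This is a minimal illustration assuming NumPy; the helper name `resample_by_weight` is hypothetical and does not appear in the original post. Rows with a large associated weight are drawn more often, so a learner without weight support still sees them more frequently.

```python
import numpy as np


def resample_by_weight(data, weights, rng):
    """Draw len(data) rows with replacement; row i is picked with probability weights[i]."""
    n = data.shape[0]
    idx = rng.choice(n, size=n, replace=True, p=weights)
    return data[idx]


rng = np.random.default_rng(0)
data = np.arange(10).reshape(5, 2)               # 5 toy samples, one per row
weights = np.array([0.6, 0.1, 0.1, 0.1, 0.1])    # sample 0 carries most of the weight

resampled = resample_by_weight(data, weights, rng)
print(resampled)                                 # row [0, 1] tends to appear several times
```

      With a learner that accepts per-sample weights, the same `weights` vector would instead be passed into its loss function and no re-sampling would be needed.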
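      Step 3. Compute the weighted error rate of the weak model $h_t(x)$ on the training samples:
      \begin{equation}
      \epsilon_t = \sum_{i=1}^nw_t^{(i)}I(h_t(x^{(i)}) \neq \bar{y}^{(i)})
      \label{eq_4}
      \end{equation}
      Compute the cascade weight of the weak model $h_t(x)$ in the final decision model (this requires $0 < \epsilon_t < 1$, and $\alpha_t > 0$ exactly when $\epsilon_t < 0.5$):
      \begin{equation}
      \alpha_t = \frac{1}{2}\mathrm{ln}\frac{1-\epsilon_t}{\epsilon_t}
      \label{eq_5}
      \end{equation}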
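      As a small worked example of Step 3, the sketch below evaluates the weighted error rate and the cascade weight for a fixed prediction vector. The function names and the toy labels are assumptions made for illustration only.

```python
import numpy as np


def weighted_error(w, y_true, y_pred):
    """Weighted error rate: sum of the weights of the misclassified samples."""
    return float(np.sum(w[y_true != y_pred]))


def cascade_weight(eps):
    """Cascade weight alpha = 0.5 * ln((1 - eps) / eps); requires 0 < eps < 1."""
    return 0.5 * np.log((1.0 - eps) / eps)


y_true = np.array([1, 1, -1, -1, 1])
y_pred = np.array([1, -1, -1, -1, 1])      # only sample 1 is misclassified
w = np.full(5, 0.2)                        # uniform weights, as in Step 1

eps = weighted_error(w, y_true, y_pred)    # 0.2
alpha = cascade_weight(eps)                # 0.5 * ln(4) = ln(2), about 0.693
print(eps, alpha)
```

      A model with a smaller weighted error receives a larger cascade weight, and a model at $\epsilon_t = 0.5$ would receive a weight of zero.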
      Step 4. Update the associated sample weights:
      \begin{equation}
      \begin{split}
      W_{t+1} &= \{ w_{t+1}^{(1)}, w_{t+1}^{(2)}, \cdots, w_{t+1}^{(n)} \}   \\
      w_{t+1}^{(i)} &= \frac{w_{t}^{(i)}}{Z_t}\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
      Z_t &= \sum_{i=1}^{n}w_{t}^{(i)}\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))
      \end{split}
      \label{eq_6}
      \end{equation}
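      The weight update of Step 4 can be sketched in a few lines. The helper `update_weights` and the toy data are hypothetical. The example also shows a direct consequence of the formulas for $\alpha_t$ and $Z_t$: after the update, the samples misclassified by $h_t$ hold a total weight of 0.5, so the next weak model cannot simply repeat $h_t$.

```python
import numpy as np


def update_weights(w, alpha, y_true, y_pred):
    """Scale each weight by exp(-alpha * y * h(x)), then normalize by Z_t."""
    unnormalized = w * np.exp(-alpha * y_true * y_pred)
    z = unnormalized.sum()                 # normalization factor Z_t
    return unnormalized / z, z


y_true = np.array([1, 1, -1, -1, 1])
y_pred = np.array([1, -1, -1, -1, 1])      # sample 1 is misclassified
w = np.full(5, 0.2)

eps = np.sum(w[y_true != y_pred])          # weighted error rate
alpha = 0.5 * np.log((1 - eps) / eps)      # cascade weight
w_next, z = update_weights(w, alpha, y_true, y_pred)

print(w_next)                              # about [0.125, 0.5, 0.125, 0.125, 0.125]
print(np.sum(w_next[y_true != y_pred]))    # about 0.5
```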
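      Step 5. Return to Step 2 for the next $t$, until $T$ weak models have been obtained.
      Step 6. The final decision model is:
      \begin{equation}
      \begin{split}
      f(x) &= \sum_{t=1}^{T}\alpha_th_t(x)   \\
      h_{final}(x) &= \mathrm{sign}(f(x)) = \mathrm{sign}\left(\sum_{t=1}^T\alpha_th_t(x)\right)
      \end{split}
      \label{eq_7}
      \end{equation}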
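      Steps 1 to 6 can be put together in a compact sketch. To keep it short, it swaps the SVM used later in this post for a one-dimensional threshold stump that supports weighted samples directly (the re-weighting route); all function names and the six-point toy dataset are assumptions for illustration.

```python
import numpy as np


def stump_predict(x, thr, polarity):
    """Predict `polarity` for x > thr and `-polarity` otherwise."""
    return np.where(x > thr, polarity, -polarity)


def fit_stump(x, y, w):
    """Return (thr, polarity, eps) of the stump with the lowest weighted error."""
    xs = np.sort(x)
    # Candidate thresholds: one below all points, then the midpoints of neighbours.
    thresholds = np.concatenate(([xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2.0))
    best = None
    for thr in thresholds:
        for polarity in (1, -1):
            eps = np.sum(w[stump_predict(x, thr, polarity) != y])
            if best is None or eps < best[2]:
                best = (thr, polarity, eps)
    return best


def adaboost_fit(x, y, T):
    n = len(x)
    w = np.full(n, 1.0 / n)                           # Step 1
    models, alphas = [], []
    for _ in range(T):                                # Steps 2 to 5
        thr, polarity, eps = fit_stump(x, y, w)
        if eps <= 0 or eps >= 0.5:                    # alpha undefined or not positive
            break
        alpha = 0.5 * np.log((1 - eps) / eps)         # Step 3
        w = w * np.exp(-alpha * y * stump_predict(x, thr, polarity))
        w /= w.sum()                                  # Step 4
        models.append((thr, polarity))
        alphas.append(alpha)
    return models, alphas


def adaboost_predict(x, models, alphas):
    f = sum(a * stump_predict(x, thr, pol) for (thr, pol), a in zip(models, alphas))
    return np.where(f >= 0, 1, -1)                    # Step 6, with sign(0) taken as +1


x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1, 1, -1, -1, 1, 1])                    # no single threshold fits all points
models, alphas = adaboost_fit(x, y, T=3)
pred = adaboost_predict(x, models, alphas)
print("training error:", np.mean(pred != y))
```

      On this toy set the best single stump misclassifies two of the six points, while the weighted vote of three stumps fits all of them.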
      Part Ⅱ:
      Theorem:
      If every weak model $h_t(x)$ has a weighted training error $\epsilon_t < 0.5$, then as $T$ increases, the upper bound on the training error $E$ of the final AdaBoost decision model $h_{final}(x)$ keeps shrinking, that is:
      \begin{equation*}
      E = \frac{1}{n}\sum_{i=1}^{n}I(h_{final}(x^{(i)}) \neq \bar{y}^{(i)}) \leq \frac{1}{n}\sum_{i=1}^{n}\mathrm{exp}(-\bar{y}^{(i)}f(x^{(i)})) = \prod_{t=1}^{T}Z_t
      \end{equation*}
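      Both the inequality and the equality in the theorem can be checked numerically. The sketch below uses hand-written weak-model outputs (the matrix `H` is purely hypothetical, one row per round) instead of trained models, since the relation only depends on the weight update and not on how each $h_t$ was obtained.

```python
import numpy as np

y = np.array([1, 1, 1, -1, -1, -1])
# Hypothetical weak-model outputs; every row makes at least one mistake.
H = np.array([
    [-1, 1, 1, -1, -1, -1],    # misclassifies sample 0
    [1, 1, 1, 1, -1, -1],      # misclassifies sample 3
    [1, -1, 1, -1, 1, -1],     # misclassifies samples 1 and 4
])

n = len(y)
w = np.full(n, 1.0 / n)
alphas, Zs = [], []
for h in H:
    eps = np.sum(w[h != y])                     # weighted error rate
    alpha = 0.5 * np.log((1 - eps) / eps)       # cascade weight
    unnormalized = w * np.exp(-alpha * y * h)
    Z = unnormalized.sum()                      # normalization factor Z_t
    w = unnormalized / Z
    alphas.append(alpha)
    Zs.append(Z)

f = np.array(alphas) @ H                        # f(x) for every sample
E = np.mean(np.where(f >= 0, 1, -1) != y)       # training error of sign(f)
exp_loss = np.mean(np.exp(-y * f))              # (1/n) * sum of exp(-y * f)
bound = np.prod(Zs)                             # product of all Z_t

print(E, exp_loss, bound)                       # E <= exp_loss, and exp_loss equals bound
```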
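      Proof:
      A misclassified sample satisfies $\bar{y}^{(i)}f(x^{(i)}) \leq 0$, hence $I(h_{final}(x^{(i)}) \neq \bar{y}^{(i)}) \leq \mathrm{exp}(-\bar{y}^{(i)}f(x^{(i)}))$, which gives the first inequality. The rest uses $w_1^{(i)} = 1/n$ and, repeatedly, the weight update of Step 4 in the form $w_t^{(i)}\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)})) = Z_tw_{t+1}^{(i)}$:
      \begin{equation*}
      \begin{split}
      E &\leq \frac{1}{n}\sum_{i=1}^n\mathrm{exp}(-\bar{y}^{(i)}f(x^{(i)}))   \\
      &=\frac{1}{n}\sum_{i=1}^n\mathrm{exp}(-\bar{y}^{(i)}\sum_{t=1}^T\alpha_th_t(x^{(i)}))   \\
      &=\frac{1}{n}\sum_{i=1}^n\mathrm{exp}(\sum_{t=1}^T-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
      &=\sum_{i=1}^nw_1^{(i)}\prod_{t=1}^T\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
      &=\sum_{i=1}^nw_1^{(i)}\mathrm{exp}(-\alpha_1\bar{y}^{(i)}h_1(x^{(i)}))\prod_{t=2}^T\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
      &=\sum_{i=1}^nZ_1w_2^{(i)}\prod_{t=2}^T\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
      &=Z_1\sum_{i=1}^nw_2^{(i)}\prod_{t=2}^T\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
      &=\cdots   \\
      &=\left(\prod_{t=1}^TZ_t\right)\sum_{i=1}^nw_{T+1}^{(i)}   \\
      &=\prod_{t=1}^TZ_t
      \end{split}
      \end{equation*}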
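      The normalization factor $Z_t$ itself evaluates to:
      \begin{equation*}
      \begin{split}
      Z_t &= \sum_{i=1}^{n}w_{t}^{(i)}\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
      &=\sum_{i=1;\ \bar{y}^{(i)}=h_t(x^{(i)})}^nw_t^{(i)}\mathrm{exp}(-\alpha_t) + \sum_{i=1;\ \bar{y}^{(i)} \neq h_t(x^{(i)})}^nw_t^{(i)}\mathrm{exp}(\alpha_t)   \\
      &=\mathrm{exp}(-\alpha_t)\sum_{i=1;\ \bar{y}^{(i)}=h_t(x^{(i)})}^nw_t^{(i)} + \mathrm{exp}(\alpha_t)\sum_{i=1;\ \bar{y}^{(i)} \neq h_t(x^{(i)})}^nw_t^{(i)}   \\
      &=\mathrm{exp}(-\alpha_t)(1 - \epsilon_t) + \mathrm{exp}(\alpha_t)\epsilon_t   \\
      &=2\sqrt{\epsilon_t(1 - \epsilon_t)}
      \end{split}
      \end{equation*}
      where the last step substitutes $\alpha_t = \frac{1}{2}\mathrm{ln}\frac{1-\epsilon_t}{\epsilon_t}$. For a weak model $h_t(x)$ whose weighted training error satisfies $\epsilon_t < 0.5$, it follows that $Z_t < 1$. Therefore, as $T$ increases, the bound $\prod_{t=1}^TZ_t$ on the training error $E$ of the final AdaBoost decision model $h_{final}(x)$ becomes smaller and smaller. Q.E.D.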
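      To get a feel for how fast the bound shrinks, the closed form $Z_t = 2\sqrt{\epsilon_t(1-\epsilon_t)}$ can be evaluated directly. The scenario in the second half (the same weighted error 0.4 in every round, $n = 200$ samples) is a hypothetical assumption chosen only for the arithmetic.

```python
import math


def z_factor(eps):
    """Closed form Z_t = 2 * sqrt(eps * (1 - eps)); requires 0 < eps < 1."""
    return 2.0 * math.sqrt(eps * (1.0 - eps))


for eps in (0.1, 0.3, 0.45, 0.49):
    print(f"eps = {eps:4.2f}  ->  Z = {z_factor(eps):.4f}")

# Hypothetical scenario: every round has the same weighted error eps = 0.4.
eps, n = 0.4, 200
Z = z_factor(eps)                       # about 0.9798
# Smallest T with Z**T < 1/n. Below 1/n the bound forces E = 0,
# because E can only take the values 0, 1/n, 2/n, ...
T_needed = math.floor(math.log(n) / -math.log(Z)) + 1
print(Z, T_needed)
```

      The closer $\epsilon_t$ is to 0.5, the closer $Z_t$ is to 1 and the more rounds are needed, which is why barely-better-than-chance weak models converge slowly.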
    • Code implementation:
      This post uses a linear SVM as the weak model to implement AdaBoost. For details on the SVM, see:
      Smooth Support Vector Machine - Python实现 (Python implementation)
        # AdaBoost implementation
        # Note: implemented as a weighted cascade of SVMs

        import math

        import numpy
        from matplotlib import pyplot as plt


        def spiral_point(val, center=(0, 0)):
            rn = 0.4 * (105 - val) / 104
            an = numpy.pi * (val - 1) / 25

            x0 = center[0] + rn * numpy.sin(an)
            y0 = center[1] + rn * numpy.cos(an)
            z0 = -1
            x1 = center[0] - rn * numpy.sin(an)
            y1 = center[1] - rn * numpy.cos(an)
            z1 = 1

            return (x0, y0, z0), (x1, y1, z1)


        def spiral_data(valList):
            dataList = list(spiral_point(val) for val in valList)
            data0 = numpy.array(list(item[0] for item in dataList))
            data1 = numpy.array(list(item[1] for item in dataList))
            return data0, data1


        class SSVM(object):

            def __init__(self, trainingSet, c=1, mu=1, beta=100):
                self.__trainingSet = trainingSet                     # training set
                self.__c = c                                         # weight of the error term
                self.__mu = mu                                       # Gaussian kernel parameter
                self.__beta = beta                                   # smoothing parameter

                self.__A, self.__D = self.__get_AD()


            def get_cls(self, x, alpha, b):
                A, D = self.__A, self.__D
                mu = self.__mu

                x = numpy.array(x).reshape((-1, 1))
                KAx = self.__get_KAx(A, x, mu)
                clsVal = self.__calc_hVal(KAx, D, alpha, b)
                if clsVal >= 0:
                    return 1
                else:
                    return -1


            def optimize(self, maxIter=1000, epsilon=1.e-9):
                '''
                maxIter: maximum number of iterations
                epsilon: convergence criterion; converged once the gradient approaches 0
                '''
                A, D = self.__A, self.__D
                c = self.__c
                mu = self.__mu
                beta = self.__beta

                alpha, b = self.__init_alpha_b((A.shape[1], 1))
                KAA = self.__get_KAA(A, mu)

                JVal = self.__calc_JVal(KAA, D, c, beta, alpha, b)
                grad = self.__calc_grad(KAA, D, c, beta, alpha, b)
                DMat = self.__init_D(KAA.shape[0] + 1)

                for i in range(maxIter):
                    # print("iterCnt: {:3d},   JVal: {}".format(i, JVal))
                    if self.__converged1(grad, epsilon):
                        return alpha, b, True

                    dCurr = -numpy.matmul(DMat, grad)
                    ALPHA = self.__calc_ALPHA_by_ArmijoRule(alpha, b, JVal, grad, dCurr, KAA, D, c, beta)

                    delta = ALPHA * dCurr
                    alphaNew = alpha + delta[:-1, :]
                    bNew = b + delta[-1, -1]
                    JValNew = self.__calc_JVal(KAA, D, c, beta, alphaNew, bNew)
                    if self.__converged2(delta, JValNew - JVal, epsilon ** 2):
                        return alphaNew, bNew, True

                    gradNew = self.__calc_grad(KAA, D, c, beta, alphaNew, bNew)
                    DMatNew = self.__update_D_by_BFGS(delta, gradNew - grad, DMat)

                    alpha, b, JVal, grad, DMat = alphaNew, bNew, JValNew, gradNew, DMatNew
                else:
                    if self.__converged1(grad, epsilon):
                        return alpha, b, True
                return alpha, b, False


            def __update_D_by_BFGS(self, sk, yk, D):
                rk = 1 / (numpy.matmul(yk.T, sk)[0, 0] + 1.e-30)

                term1 = rk * numpy.matmul(sk, yk.T)
                term2 = rk * numpy.matmul(yk, sk.T)
                I = numpy.identity(term1.shape[0])
                term3 = numpy.matmul(I - term1, D)
                term4 = numpy.matmul(term3, I - term2)
                term5 = rk * numpy.matmul(sk, sk.T)

                DNew = term4 + term5
                return DNew


            def __calc_ALPHA_by_ArmijoRule(self, alphaCurr, bCurr, JCurr, gCurr, dCurr, KAA, D, c, beta, C=1.e-4, v=0.5):
                i = 0
                ALPHA = v ** i
                delta = ALPHA * dCurr
                alphaNext = alphaCurr + delta[:-1, :]
                bNext = bCurr + delta[-1, -1]
                JNext = self.__calc_JVal(KAA, D, c, beta, alphaNext, bNext)
                while True:
                    if JNext <= JCurr + C * ALPHA * numpy.matmul(dCurr.T, gCurr)[0, 0]: break
                    i += 1
                    ALPHA = v ** i
                    delta = ALPHA * dCurr
                    alphaNext = alphaCurr + delta[:-1, :]
                    bNext = bCurr + delta[-1, -1]
                    JNext = self.__calc_JVal(KAA, D, c, beta, alphaNext, bNext)
                return ALPHA


            def __converged1(self, grad, epsilon):
                if numpy.linalg.norm(grad, ord=numpy.inf) <= epsilon:
                    return True
                return False


            def __converged2(self, delta, JValDelta, epsilon):
                val1 = numpy.linalg.norm(delta, ord=numpy.inf)
                val2 = numpy.abs(JValDelta)
                if val1 <= epsilon or val2 <= epsilon:
                    return True
                return False


            def __init_D(self, n):
                D = numpy.identity(n)
                return D


            def __calc_grad(self, KAA, D, c, beta, alpha, b):
                grad_J1 = self.__calc_grad_J1(alpha)
                grad_J2 = self.__calc_grad_J2(KAA, D, c, beta, alpha, b)
                grad = grad_J1 + grad_J2
                return grad


            def __calc_grad_J2(self, KAA, D, c, beta, alpha, b):
                grad_J2 = numpy.zeros((KAA.shape[0] + 1, 1))
                Y = numpy.matmul(D, numpy.ones((D.shape[0], 1)))
                YY = numpy.matmul(Y, Y.T)
                KAAYY = KAA * YY

                z = 1 - numpy.matmul(KAAYY, alpha) - Y * b
                p = numpy.array(list(self.__p(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
                s = numpy.array(list(self.__s(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
                term = p * s

                for k in range(grad_J2.shape[0] - 1):
                    val = -c * Y[k, 0] * numpy.sum(Y * KAA[:, k:k+1] * term)
                    grad_J2[k, 0] = val
                grad_J2[-1, 0] = -c * numpy.sum(Y * term)
                return grad_J2


            def __calc_grad_J1(self, alpha):
                grad_J1 = numpy.vstack((alpha, [[0]]))
                return grad_J1


            def __calc_JVal(self, KAA, D, c, beta, alpha, b):
                J1 = self.__calc_J1(alpha)
                J2 = self.__calc_J2(KAA, D, c, beta, alpha, b)
                JVal = J1 + J2
                return JVal


            def __calc_J2(self, KAA, D, c, beta, alpha, b):
                tmpOne = numpy.ones((KAA.shape[0], 1))
                x = tmpOne - numpy.matmul(numpy.matmul(numpy.matmul(D, KAA), D), alpha) - numpy.matmul(D, tmpOne) * b
                p = numpy.array(list(self.__p(x[i, 0], beta) for i in range(x.shape[0])))
                J2 = numpy.sum(p * p) * c / 2
                return J2


            def __calc_J1(self, alpha):
                J1 = numpy.sum(alpha * alpha) / 2
                return J1


            def __get_KAA(self, A, mu):
                KAA = numpy.zeros((A.shape[1], A.shape[1]))
                for rowIdx in range(KAA.shape[0]):
                    for colIdx in range(rowIdx + 1):
                        x1 = A[:, rowIdx:rowIdx+1]
                        x2 = A[:, colIdx:colIdx+1]
                        val = self.__calc_gaussian(x1, x2, mu)
                        KAA[rowIdx, colIdx] = KAA[colIdx, rowIdx] = val
                return KAA


            def __get_KAx(self, A, x, mu):
                KAx = numpy.zeros((A.shape[1], 1))
                for rowIdx in range(KAx.shape[0]):
                    x1 = A[:, rowIdx:rowIdx+1]
                    val = self.__calc_gaussian(x1, x, mu)
                    KAx[rowIdx, 0] = val
                return KAx


            def __calc_hVal(self, KAx, D, alpha, b):
                hVal = numpy.matmul(numpy.matmul(alpha.T, D), KAx)[0, 0] + b
                return hVal


            def __calc_gaussian(self, x1, x2, mu):
                # The name is kept from the SSVM post; only the linear kernel is active here.
                # val = math.exp(-mu * numpy.linalg.norm(x1 - x2) ** 2)          # Gaussian kernel
                # val = (numpy.sum(x1 * x2) + 1) ** 1                            # polynomial kernel
                val = numpy.sum(x1 * x2)                                         # linear kernel
                return val


            def __init_alpha_b(self, shape):
                '''
                Initialize alpha and b
                '''
                alpha, b = numpy.zeros(shape), 0
                return alpha, b


            def __get_AD(self):
                A = self.__trainingSet[:, :2].T
                D = numpy.diag(self.__trainingSet[:, 2])
                return A, D


            def __p(self, x, beta):
                # math.* replaces numpy.math.*, an alias removed in NumPy 2.0
                term = x * beta
                if term > 10:
                    val = x + math.log(1 + math.exp(-term)) / beta
                else:
                    val = math.log(math.exp(term) + 1) / beta
                return val


            def __s(self, x, beta):
                term = x * beta
                if term > 10:
                    val = 1 / (math.exp(-beta * x) + 1)
                else:
                    term1 = math.exp(term)
                    val = term1 / (1 + term1)
                return val


        class AdaBoost(object):

            def __init__(self, trainingSet):
                self.__trainingSet = trainingSet            # training dataset

                self.__W = self.__init_weight()             # initialize the associated weights


            def get_weakModels(self, T=100):
                '''
                T: maximum number of weak models
                '''
                T = T if T >= 1 else 1
                W = self.__W
                trainingSet = self.__trainingSet

                weakModels = list()                               # list of cascaded weak models (SVM)
                alphaList = list()                                # list of cascade weights
                for t in range(T):
                    print("getting the {}th weak model...".format(t))
                    weakModel, alpha = self.__get_weakModel(trainingSet, W, 0.49)
                    weakModels.append(weakModel)
                    alphaList.append(alpha)

                    W = self.__update_W(W, weakModel, alpha)      # update the associated weights
                else:
                    realErr = self.__calc_realErr(weakModels, alphaList, trainingSet)  # actual error rate
                    print("Final error rate: {}".format(realErr))
                return weakModels, alphaList


            def get_realErr(self, weakModels, alphaList, dataSet=None):
                '''
                Compute the error rate of AdaBoost on the given dataset
                weakModels: list of weak models
                alphaList: list of cascade weights
                dataSet: the given dataset (defaults to the training set)
                '''
                if dataSet is None:
                    dataSet = self.__trainingSet

                realErr = self.__calc_realErr(weakModels, alphaList, dataSet)
                return realErr


            def get_cls(self, x, weakModels, alphaList):
                hVal = self.__calc_hVal(x, weakModels, alphaList)
                if hVal >= 0:
                    return 1
                else:
                    return -1


            def __calc_realErr(self, weakModels, alphaList, dataSet):
                cnt = 0
                num = dataSet.shape[0]
                for sample in dataSet:
                    x, y_ = sample[:-1], sample[-1]
                    y = self.get_cls(x, weakModels, alphaList)
                    if y_ != y:
                        cnt += 1
                err = cnt / num
                return err


            def __calc_hVal(self, x, weakModels, alphaList):
                if len(weakModels) == 0:
                    raise Exception("Weak model list is empty!")

                hVal = 0
                for (ssvmObj, ssvmRet), alpha in zip(weakModels, alphaList):
                    hVal += ssvmObj.get_cls(x, ssvmRet[0], ssvmRet[1]) * alpha
                return hVal


            def __update_W(self, W, weakModel, alpha):
                ssvmObj, ssvmRet = weakModel
                trainingSet = self.__trainingSet

                WNew = list()
                for sample in trainingSet:
                    x, y_ = sample[:-1], sample[-1]
                    val = math.exp(-alpha * y_ * ssvmObj.get_cls(x, ssvmRet[0], ssvmRet[1]))
                    WNew.append(val)
                WNew = numpy.array(WNew) * W
                WNew = WNew / numpy.sum(WNew)
                return WNew


            def __get_weakModel(self, trainingSet, W, maxEpsilon=0.5, maxIter=100):
                '''
                Obtain a weak model and its cascade weight
                W: associated weights
                maxEpsilon: maximum weighted error rate
                maxIter: maximum number of attempts
                '''
                roulette = self.__build_roulette(W)
                for idx in range(maxIter):
                    dataSet = self.__get_dataSet(trainingSet, roulette)
                    weakModel = self.__build_weakModel(dataSet)

                    epsilon = self.__calc_weightedErr(trainingSet, weakModel, W)
                    if epsilon == 0:
                        raise Exception("The model is not weak enough with epsilon = 0")
                    elif epsilon < maxEpsilon:
                        alpha = self.__calc_alpha(epsilon)
                        return weakModel, alpha
                else:
                    raise Exception("Fail to get weak model after {} iterations!".format(maxIter))


            def __calc_alpha(self, epsilon):
                '''
                Compute the cascade weight
                '''
                alpha = math.log(1 / epsilon - 1) / 2
                return alpha


            def __calc_weightedErr(self, trainingSet, weakModel, W):
                '''
                Compute the weighted error rate
                '''
                ssvmObj, (alpha, b, tab) = weakModel

                epsilon = 0
                for idx, w in enumerate(W):
                    x, y_ = trainingSet[idx, :-1], trainingSet[idx, -1]
                    y = ssvmObj.get_cls(x, alpha, b)
                    if y_ != y:
                        epsilon += w
                return epsilon


            def __build_weakModel(self, dataSet):
                '''
                Build the SVM weak model
                '''
                ssvmObj = SSVM(dataSet, c=0.1, mu=250, beta=100)
                ssvmRet = ssvmObj.optimize()
                return (ssvmObj, ssvmRet)


            def __get_dataSet(self, trainingSet, roulette):
                # Roulette-wheel sampling with replacement, driven by sorted uniform darts
                randomDart = numpy.sort(numpy.random.uniform(0, 1, trainingSet.shape[0]))
                dataSet = list()
                idxRoulette = idxDart = 0
                while idxDart < len(randomDart):
                    # The index guard covers a cumulative sum that ends slightly below 1 due to rounding
                    if randomDart[idxDart] > roulette[idxRoulette] and idxRoulette < len(roulette) - 1:
                        idxRoulette += 1
                    else:
                        dataSet.append(trainingSet[idxRoulette])
                        idxDart += 1
                return numpy.array(dataSet)


            def __build_roulette(self, W):
                # Cumulative sums of the associated weights
                roulette = list()
                val = 0
                for ele in W:
                    val += ele
                    roulette.append(val)
                return roulette


            def __init_weight(self):
                num = self.__trainingSet.shape[0]
                W = numpy.ones(num) / num
                return W


        class AdaBoostPlot(object):

            @staticmethod
            def data_plot(trainingData0, trainingData1):
                fig = plt.figure(figsize=(5, 5))
                ax1 = plt.subplot()

                ax1.scatter(trainingData1[:, 0], trainingData1[:, 1], c="red", marker="o", s=10, label="Positive")
                ax1.scatter(trainingData0[:, 0], trainingData0[:, 1], c="blue", marker="o", s=10, label="Negative")

                ax1.set(xlim=(-0.5, 0.5), ylim=(-0.5, 0.5), xlabel="$x_1$", ylabel="$x_2$")
                ax1.legend(fontsize="x-small")

                fig.savefig("data.png", dpi=100)
                # plt.show()
                plt.close()


            @staticmethod
            def pred_plot(trainingData0, trainingData1, adaObj, weakModels, alphaList):
                x = numpy.linspace(-0.5, 0.5, 500)
                y = numpy.linspace(-0.5, 0.5, 500)
                x, y = numpy.meshgrid(x, y)
                z = numpy.zeros(x.shape)
                for rowIdx in range(x.shape[0]):
                    print("on the {}th row".format(rowIdx))
                    for colIdx in range(x.shape[1]):
                        z[rowIdx, colIdx] = adaObj.get_cls((x[rowIdx, colIdx], y[rowIdx, colIdx]), weakModels, alphaList)

                errList = list()
                for idx in range(len(weakModels)):
                    tmpWeakModels = weakModels[:idx+1]
                    tmpAlphaList = alphaList[:idx+1]
                    realErr = adaObj.get_realErr(tmpWeakModels, tmpAlphaList)
                    print("idx = {}; realErr = {}".format(idx, realErr))
                    errList.append(realErr)

                fig = plt.figure(figsize=(10, 3))
                ax1 = plt.subplot(1, 2, 1)
                ax2 = plt.subplot(1, 2, 2)

                ax1.plot(numpy.arange(len(errList))+1, errList, linestyle="--", marker=".")
                ax1.set(xlabel="T", ylabel="error rate")

                ax2.contourf(x, y, z, levels=[-1.5, 0, 1.5], colors=["blue", "red"], alpha=0.3)
                ax2.scatter(trainingData1[:, 0], trainingData1[:, 1], c="red", marker="o", s=10, label="Positive")
                ax2.scatter(trainingData0[:, 0], trainingData0[:, 1], c="blue", marker="o", s=10, label="Negative")
                ax2.set(xlim=(-0.5, 0.5), ylim=(-0.5, 0.5), xlabel="$x_1$", ylabel="$x_2$")
                ax2.legend(loc="upper left", fontsize="x-small")
                fig.tight_layout()
                fig.savefig("pred.png", dpi=100)
                # plt.show()
                plt.close()


        if __name__ == "__main__":
            # generate the training dataset
            trainingValList = numpy.arange(1, 101, 1)
            trainingData0, trainingData1 = spiral_data(trainingValList)
            trainingSet = numpy.vstack((trainingData0, trainingData1))

            adaObj = AdaBoost(trainingSet)
            weakModels, alphaList = adaObj.get_weakModels(200)

            AdaBoostPlot.data_plot(trainingData0, trainingData1)
            AdaBoostPlot.pred_plot(trainingData0, trainingData1, adaObj, weakModels, alphaList)

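      The training dataset used in this post is distributed as follows:
      [Figure: scatter plot of the two-spiral training set, saved by the code as data.png]

      Clearly, this dataset is not linearly separable, so applying a linear SVM directly gives poor classification results.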
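      The difficulty for a single linear model can be illustrated with a simplified sketch. It rebuilds the two-spiral dataset with the same formulas as `spiral_point` and evaluates linear classifiers without a bias term (an assumption made here for simplicity; the SVM in this post does include a bias $b$). Every negative point $p$ has a positive mirror point $-p$, so such a classifier gets each pair either both right or both wrong, and the spiral's two full turns leave roughly half of the pairs on the wrong side.

```python
import numpy as np


def spiral_set(n_per_class=100):
    """Rebuild the two-spiral training set; rows are (x1, x2, label)."""
    val = np.arange(1, n_per_class + 1)
    rn = 0.4 * (105 - val) / 104
    an = np.pi * (val - 1) / 25
    neg = np.column_stack((rn * np.sin(an), rn * np.cos(an), -np.ones_like(rn)))
    pos = np.column_stack((-rn * np.sin(an), -rn * np.cos(an), np.ones_like(rn)))
    return np.vstack((neg, pos))


def linear_error(data, phi):
    """Error rate of sign(w . x) with w = (sin(phi), cos(phi)) and no bias term."""
    w = np.array([np.sin(phi), np.cos(phi)])
    pred = np.where(data[:, :2] @ w >= 0, 1, -1)
    return np.mean(pred != data[:, 2])


data = spiral_set()
for phi in (0.3, 1.0, 2.0, 4.0):
    print(f"phi = {phi}: error = {linear_error(data, phi):.2f}")
```

      Whatever direction is chosen, the error stays close to 0.5, which is consistent with the weak starting point reported in the results below.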
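    • Results:
      The left panel shows how the training error $E$ changes with the number of weak models $T$; the right panel shows the final classification produced by AdaBoost on this training set. Compared with a single linear SVM, AdaBoost cascades multiple linear SVMs and lowers the training error from the initial 0.45 to 0.12, which greatly strengthens the expressive power of the weak model.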
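    • Usage suggestions:
      ①. Distinguish the cascade weight $\alpha$ from the associated sample weight $w$;
      ②. Distinguish the error rate $E$ from the weighted error rate $\epsilon$.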
    • References:
      Boosting之AdaBoost算法 (Boosting: the AdaBoost algorithm)
  • Original article: https://www.cnblogs.com/xxhbdk/p/13585110.html