  • Machine Learning in Action study notes, Chapter 5: Logistic Regression

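    I. Related notes:

    1. Andrew Ng Machine Learning Notes (2): Logistic Regression

    2. Andrew Ng Machine Learning Notes (11): Large Scale Machine Learning

    II. Python source code (without a regularization term):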

    '''
    Created on Oct 27, 2010
    Logistic Regression Working Module
    @author: Peter
    '''
    import numpy as np


    def sigmoid(inX):
        return 1.0 / (1 + np.exp(-inX))


    def gradAscent(dataMatIn, classLabels):
        # Batch gradient ascent: every update uses the whole training set.
        dataMatrix = np.asarray(dataMatIn, dtype=float)  # shape (m, n)
        labelMat = np.asarray(classLabels, dtype=float).reshape(-1, 1)  # column vector, shape (m, 1)
        m, n = dataMatrix.shape
        alpha = 0.001  # learning rate
        maxCycles = 500
        weights = np.ones((n, 1))
        for k in range(maxCycles):
            h = sigmoid(dataMatrix @ weights)  # predicted probabilities, shape (m, 1)
            error = labelMat - h
            weights = weights + alpha * (dataMatrix.T @ error)
        return weights


    def stocGradAscent0(dataMatrix, classLabels, numIter=150):
        # Stochastic gradient ascent: one sample per update, fixed order, fixed step size.
        m, n = dataMatrix.shape
        alpha = 0.01
        weights = np.ones(n)  # initialize to all ones
        for j in range(numIter):
            for i in range(m):
                h = sigmoid(np.sum(dataMatrix[i] * weights))
                error = classLabels[i] - h
                weights = weights + alpha * error * dataMatrix[i]
        return weights


    def stocGradAscent1(dataMatrix, classLabels, numIter=150):
        # Stochastic gradient ascent with random sample order and a decaying step size.
        m, n = dataMatrix.shape
        weights = np.ones(n)  # initialize to all ones
        for j in range(numIter):
            dataIndex = list(range(m))  # rows not yet used in this pass
            for i in range(m):
                # alpha decreases with iteration but never reaches 0 because of the constant
                alpha = 4 / (1.0 + j + i) + 0.0001
                randIndex = int(np.random.uniform(0, len(dataIndex)))
                sample = dataIndex[randIndex]  # map the drawn position to an actual row
                h = sigmoid(np.sum(dataMatrix[sample] * weights))
                error = classLabels[sample] - h
                weights = weights + alpha * error * dataMatrix[sample]
                del dataIndex[randIndex]  # sample without replacement within a pass
        return weights


    def classifyVector(inX, weights):
        # np.ravel lets this accept both the (n,) and the (n, 1) weight shapes returned above
        prob = sigmoid(np.dot(inX, np.ravel(weights)))
        if prob > 0.5:
            return 1.0
        else:
            return 0.0


    def colicTest():
        trainingSet = []
        trainingLabels = []
        with open('horseColicTraining.txt') as frTrain:
            for line in frTrain:
                currLine = line.strip().split('\t')
                lineArr = [float(currLine[i]) for i in range(21)]  # first 21 columns are features
                trainingSet.append(lineArr)
                trainingLabels.append(float(currLine[21]))  # column 22 is the class label
        trainWeights = stocGradAscent1(np.array(trainingSet), trainingLabels, 500)
        errorCount = 0
        numTestVec = 0.0
        with open('horseColicTest.txt') as frTest:
            for line in frTest:
                numTestVec += 1.0
                currLine = line.strip().split('\t')
                lineArr = [float(currLine[i]) for i in range(21)]
                # int(float(...)) also accepts labels written as "1.000000"
                if int(classifyVector(np.array(lineArr), trainWeights)) != int(float(currLine[21])):
                    errorCount += 1
        errorRate = float(errorCount) / numTestVec
        print("the error rate of this test is: %f" % errorRate)
        return errorRate


    def multiTest():
        numTests = 10
        errorSum = 0.0
        for k in range(numTests):
            errorSum += colicTest()
        print("after %d iterations the average error rate is: %f" % (numTests, errorSum / float(numTests)))


    if __name__ == "__main__":
        multiTest()
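The sigmoid above evaluates exp(-inX) directly, so a large negative input such as -1000 triggers an overflow RuntimeWarning in NumPy (the result still rounds to 0). A minimal sketch of a variant that avoids the overflow is shown below; the name stableSigmoid is hypothetical and not part of the original module.

```python
import numpy as np

def stableSigmoid(inX):
    """Sigmoid written as exp(-log(1 + exp(-x))), evaluated via np.logaddexp."""
    inX = np.asarray(inX, dtype=float)
    # np.logaddexp(0, -x) computes log(exp(0) + exp(-x)) without overflowing
    return np.exp(-np.logaddexp(0.0, -inX))

# Extreme inputs that would overflow inside a direct exp(-x)
values = stableSigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(values)
```

It accepts scalars and arrays alike, so it could stand in for sigmoid in the functions above without other changes.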

    III. Performance comparison of batch gradient descent, stochastic gradient descent, and mini-batch gradient descent

    1. Batch gradient descent

    import numpy as np  # sigmoid() comes from the module listing in section II

    def gradAscent(dataMatIn, classLabels):
        dataMatrix = np.asarray(dataMatIn, dtype=float)  # shape (m, n)
        labelMat = np.asarray(classLabels, dtype=float).reshape(-1, 1)  # column vector, shape (m, 1)
        m, n = dataMatrix.shape
        alpha = 0.001  # learning rate
        maxCycles = 500
        weights = np.ones((n, 1))
        for k in range(maxCycles):  # each cycle uses the full training set
            h = sigmoid(dataMatrix @ weights)  # predicted probabilities
            error = labelMat - h
            weights = weights + alpha * (dataMatrix.T @ error)
        return weights

    Result:

    Error rate: 28.4%
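The update inside the loop of gradAscent can be read off from the gradient of the log-likelihood. As a sketch, write X for dataMatrix, y for labelMat, w for weights, and α for alpha (this notation is an assumption; the post itself gives only the code):

```latex
\begin{aligned}
h &= \sigma(Xw), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}, \\
\ell(w) &= \sum_{i=1}^{m} \left[ y_i \log h_i + (1 - y_i) \log (1 - h_i) \right], \\
\nabla_w \ell(w) &= X^{\top} (y - h), \\
w &\leftarrow w + \alpha \, X^{\top} (y - h).
\end{aligned}
```

The last line corresponds to the weights update in the code, where error = labelMat - h. The plus sign reflects that the code maximizes the log-likelihood (gradient ascent), which is equivalent to gradient descent on its negative.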

    2. Stochastic gradient descent

    import numpy as np  # sigmoid() comes from the module listing in section II

    def stocGradAscent0(dataMatrix, classLabels, numIter=150):
        m, n = dataMatrix.shape
        alpha = 0.01  # fixed step size
        weights = np.ones(n)  # initialize to all ones
        for j in range(numIter):
            for i in range(m):  # one sample per update, always in the same order
                h = sigmoid(np.sum(dataMatrix[i] * weights))
                error = classLabels[i] - h
                weights = weights + alpha * error * dataMatrix[i]
        return weights

    With numIter = 150, the error rate is 46.3%

    With numIter = 500, the error rate is 32.8%

    With numIter = 800, the error rate is 38.8%

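    3. Mini-batch gradient descent

    Note that stocGradAscent1, used for this case, still updates the weights from one randomly chosen sample at a time, with a decaying step size. Strictly speaking it is an improved stochastic method rather than a mini-batch one.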

    import numpy as np  # sigmoid() comes from the module listing in section II

    def stocGradAscent1(dataMatrix, classLabels, numIter=150):
        m, n = dataMatrix.shape
        weights = np.ones(n)  # initialize to all ones
        for j in range(numIter):
            dataIndex = list(range(m))  # rows not yet used in this pass
            for i in range(m):
                # alpha decreases with iteration but never reaches 0 because of the constant
                alpha = 4 / (1.0 + j + i) + 0.0001
                randIndex = int(np.random.uniform(0, len(dataIndex)))
                sample = dataIndex[randIndex]  # map the drawn position to an actual row
                h = sigmoid(np.sum(dataMatrix[sample] * weights))
                error = classLabels[sample] - h
                weights = weights + alpha * error * dataMatrix[sample]
                del dataIndex[randIndex]  # sample without replacement within a pass
        return weights

    With numIter = 150, the error rate is 37.8%

    With numIter = 500, the error rate is 35.2%

    With numIter = 800, the error rate is 37.3%
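stocGradAscent1 above draws a single random sample for each update. For comparison, a mini-batch update in the usual sense averages the gradient over a small group of samples. The following is a minimal sketch under stated assumptions: the function name miniBatchGradAscent, its batchSize, alpha, and seed parameters, and the synthetic toy data are all hypothetical and do not come from the original post.

```python
import numpy as np

def sigmoid(inX):
    return 1.0 / (1.0 + np.exp(-inX))

def miniBatchGradAscent(dataMatrix, classLabels, batchSize=16, numIter=150, alpha=0.01, seed=0):
    """Hypothetical helper: mini-batch gradient ascent for logistic regression."""
    X = np.asarray(dataMatrix, dtype=float)
    y = np.asarray(classLabels, dtype=float)
    m, n = X.shape
    rng = np.random.default_rng(seed)  # seeded so runs are repeatable
    weights = np.ones(n)
    for _ in range(numIter):
        order = rng.permutation(m)  # reshuffle the samples on every pass
        for start in range(0, m, batchSize):
            batch = order[start:start + batchSize]
            h = sigmoid(X[batch] @ weights)
            error = y[batch] - h
            # average the gradient over the batch so alpha does not scale with batchSize
            weights = weights + alpha * (X[batch].T @ error) / len(batch)
    return weights

# Synthetic, linearly separable toy data (not the horse colic set)
rng = np.random.default_rng(1)
features = rng.normal(size=(200, 2))
labels = (features[:, 0] + features[:, 1] > 0).astype(float)
X = np.hstack([np.ones((200, 1)), features])  # prepend a bias column
w = miniBatchGradAscent(X, labels, batchSize=16, numIter=200, alpha=0.1)
accuracy = np.mean((sigmoid(X @ w) > 0.5) == (labels == 1.0))
print("training accuracy on toy data: %.3f" % accuracy)
```

Setting batchSize to 1 reduces this to a stochastic update, and setting it to m gives a batch update with an averaged gradient, so the one function covers the spectrum compared in this section.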

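    4. Summary:

    1. When the training set is small and has few features, batch gradient descent gives the best results (28.4% error in these runs, against 32.8% to 46.3% for the other two methods). When that condition does not hold, mini-batch gradient descent can be used instead, with a suitably chosen number of iterations.

    2. For stochastic gradient descent and mini-batch gradient descent, more iterations do not necessarily give better results. The reason is not yet clear to me.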
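One way to judge whether gaps such as 35.2% versus 37.3% are meaningful is to repeat a randomized training run several times and look at the spread as well as the mean. The sketch below assumes a hypothetical helper named summarizeErrorRates. The error rates fed to it are made-up stand-ins for illustration, not measurements.

```python
import numpy as np

def summarizeErrorRates(runOnce, numRuns=10):
    """Hypothetical helper: call runOnce() numRuns times, return (mean, sample std)."""
    rates = np.array([runOnce() for _ in range(numRuns)], dtype=float)
    return rates.mean(), rates.std(ddof=1)

# Stand-in for colicTest(): replays fixed, made-up error rates for illustration only
fakeRates = iter([0.30, 0.40, 0.50])
meanRate, stdRate = summarizeErrorRates(lambda: next(fakeRates), numRuns=3)
print("mean error rate %.3f, sample std %.3f" % (meanRate, stdRate))
```

With the functions from this post, summarizeErrorRates(colicTest) would play the role of multiTest() while also reporting how much the error rate varies from run to run.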

  • Original post: https://www.cnblogs.com/DOLFAMINGO/p/9455453.html