  • [Homework 3] Hsuan-Tien Lin's Machine Learning Foundations

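    This post covers the three programming questions Q18~Q20, all of which are about Logistic Regression.

    Q18~19 implement Logistic Regression with full-batch gradient descent; Q20 asks for Logistic Regression with stochastic gradient descent.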


    import numpy as np
    # read input data ( train or test )
    def read_input_data(path):
        x = []
        y = []
        with open(path) as f:
            for line in f:
                items = line.strip().split()
                if not items:
                    continue  # skip blank lines
                # all columns but the last are features; the last is the label
                x.append([float(v) for v in items[:-1]])
                y.append(float(items[-1]))
        return np.array(x),np.array(y)
    # calculate gradient of Ein
    def calculate_gradient(w, x, y):
        s = np.dot(w, x.transpose())*y
        theta = 1.0/(1+np.exp(s))
        gradient_all = theta.reshape(-1,1)*(-1)*y.reshape(-1,1)*x
        gradient_average = np.sum(gradient_all, axis=0)
        return gradient_average / gradient_all.shape[0]
    # update w using the gradient and the learning rate (ita)
    def update_W(w, ita, gradient):
        return w - ita*gradient
    # test result
    def calculate_Eout(w, x, y):
        scores = 1/(1+np.exp((-1)*np.dot(w, x.transpose())))
        predicts = np.where(scores>=0.5,1.0,-1.0)
        Eout = sum(predicts!=y)
        return (Eout*1.0) / predicts.shape[0]
    if __name__ == '__main__':
        # read train data
        x,y = read_input_data("train.dat")
        # add '1' column
        x = np.hstack((np.ones(x.shape[0]).reshape(-1,1),x))
        ## fixed learning rate gradient descent
        T1 = 2000
        ita1 = 0.01
        w1 = np.zeros(x.shape[1])
        for i in range(0,T1):
            gradient = calculate_gradient(w1, x, y)
            w1 = update_W(w1, ita1, gradient)
        ## fixed learning rate stochastic gradient descent
        T2 = 20
        ita2 = 0.1
        w2 = np.zeros(x.shape[1])
        for i in range(0,T2):
            x_n = x[i%x.shape[0]]
            y_n = y[i%y.shape[0]]
            gradient = calculate_gradient(w2, x_n, y_n)
            w2 = update_W(w2, ita2, gradient)
        # test
        test_x,test_y = read_input_data("test.dat")
        test_x = np.hstack((np.ones(test_x.shape[0]).reshape(-1,1),test_x))
        Eout1 = calculate_Eout(w1, test_x, test_y)
        Eout2 = calculate_Eout(w2, test_x, test_y)
        print(Eout1)
        print(Eout2)

    The program runs fairly efficiently, mainly thanks to the convenient matrix operations that Python's numpy provides.
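To illustrate the point about matrix operations, here is a minimal sketch that computes the same averaged gradient twice, once with a per-example loop and once with whole-array numpy expressions. The names `gradient_loop` and `gradient_vectorized` and the random data are made up for this illustration and are not part of the original solution.

```python
import numpy as np

def gradient_loop(w, x, y):
    """Average logistic-regression gradient, one example at a time."""
    total = np.zeros_like(w)
    for x_n, y_n in zip(x, y):
        # theta(-y_n * w.x_n) written as 1 / (1 + exp(y_n * w.x_n))
        theta = 1.0 / (1.0 + np.exp(y_n * np.dot(w, x_n)))
        total += theta * (-y_n) * x_n
    return total / len(y)

def gradient_vectorized(w, x, y):
    """Same quantity computed with whole-array operations."""
    theta = 1.0 / (1.0 + np.exp(y * x.dot(w)))
    return (theta * -y).dot(x) / len(y)

# Illustrative synthetic data (hypothetical, not the homework files).
rng = np.random.RandomState(0)
x = rng.randn(200, 4)
y = np.where(rng.randn(200) >= 0, 1.0, -1.0)
w = rng.randn(4)

same = np.allclose(gradient_loop(w, x, y), gradient_vectorized(w, x, y))
print(same)
```

Both functions return the same vector; the vectorized form avoids the Python-level loop, which is where most of the speed difference comes from.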


    (1) Became familiar with the gradient formula for Logistic Regression.
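For reference, the quantity that `calculate_gradient` computes in the code above can be written as follows. This is reconstructed from the code, assuming the usual cross-entropy error, with $\theta(s) = 1/(1+e^{-s})$ the logistic function and $N$ the number of training examples.

```latex
\begin{align}
E_{\text{in}}(w) &= \frac{1}{N}\sum_{n=1}^{N} \ln\left(1 + \exp\left(-y_n w^{T} x_n\right)\right) \\
\nabla E_{\text{in}}(w) &= \frac{1}{N}\sum_{n=1}^{N} \theta\left(-y_n w^{T} x_n\right)\left(-y_n x_n\right)
\end{align}
```

In the code, `theta` holds $\theta(-y_n w^{T} x_n)$ for every example, and the final division by `gradient_all.shape[0]` is the $1/N$ factor.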

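    (2) Got a feel for the role of the learning rate. If the learning rate is very small, it may take many iterations to reach a satisfactory result (raising the number of iterations in Q18 to 200,000 brings the error rate to 0.22).

      However, my earlier experience is that one does not dare pick too large a learning rate (0.01 is usually already fairly large). Choosing the learning rate really takes skill: it depends on the algorithm and also on the actual data.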
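The effect of the learning rate can be tried out on a small experiment. The sketch below runs fixed learning rate gradient descent for the same number of iterations with two illustrative learning rates and prints the resulting in-sample cross-entropy error. The data is synthetic and hypothetical (not the homework's `train.dat`), and the helper names `cross_entropy` and `gradient_descent` are chosen for this example only.

```python
import numpy as np

def cross_entropy(w, x, y):
    """In-sample cross-entropy error of logistic regression."""
    return np.mean(np.log(1.0 + np.exp(-y * x.dot(w))))

def gradient_descent(x, y, eta, T):
    """Fixed learning rate batch gradient descent starting from w = 0."""
    w = np.zeros(x.shape[1])
    for _ in range(T):
        theta = 1.0 / (1.0 + np.exp(y * x.dot(w)))
        w = w - eta * (theta * -y).dot(x) / len(y)
    return w

# Synthetic data: a bias column plus three random features,
# labelled by a noisy linear rule (an assumption made for this sketch).
rng = np.random.RandomState(1)
x = np.hstack((np.ones((500, 1)), rng.randn(500, 3)))
true_w = np.array([0.5, 2.0, -1.0, 1.5])
y = np.where(x.dot(true_w) + rng.randn(500) >= 0, 1.0, -1.0)

results = {}
for eta in (0.001, 0.01):
    w = gradient_descent(x, y, eta, 2000)
    results[eta] = cross_entropy(w, x, y)
    print(eta, results[eta])
```

With the same iteration budget, the smaller learning rate ends at a higher error on this data, which matches the observation that a small learning rate needs many more iterations. How large a rate is safe still depends on the data, so the values here are only examples.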

  • Original article: https://www.cnblogs.com/xbf9xbf/p/4605599.html