  • Neural Networks

     

    Shallow neural networks:

    The output of a neural network

     

     

    Matrix formula: output = activation(input × weights + bias)
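    This maps directly to code. A minimal NumPy sketch of one layer's forward pass (the layer sizes and the tanh activation are illustrative choices, not from the original notes):

    import numpy as np

    def dense_forward(x, W, b, activation=np.tanh):
        """One layer: output = activation(x @ W + b)."""
        return activation(x @ W + b)

    x = np.random.randn(4, 3)           # 4 samples, 3 input features
    W = np.random.randn(3, 5) * 0.01    # maps 3 inputs to 5 hidden units
    b = np.zeros((1, 5))                # one bias per hidden unit
    a = dense_forward(x, W, b)          # activations, shape (4, 5)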

     

    Multilayer perceptron for recognizing handwritten digits:

    Key points:

    1. input: [None, 784]
    2. output: [None, 10]
    3. hidden layer: 256 units
    4. how to randomly initialize the parameters
    5. how to compute the loss function
     

    Random initialization

    weight:
    np.random.randn() or np.random.uniform()  # random values break the symmetry between units
    bias:
    Initializing the biases to 0 is fine.
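    A minimal sketch of this scheme for one layer (the 784 → 256 sizes echo the MNIST example above):

    import numpy as np

    n_in, n_out = 784, 256
    W = np.random.randn(n_in, n_out) * 0.01  # small random values break the symmetry between units
    b = np.zeros((1, n_out))                 # zero biases are fine once the weights are random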

    Deep neural networks:


     

     

    Why do deep networks outperform shallow ones on many problems?

     
    1. The first few layers learn simple, low-level features.
    2. Later layers combine many simple features to detect more complex features.
     

    A deep network can get by with relatively few hidden units per layer by stacking more hidden layers; for a shallow network to compute the same function, the number of units may have to grow exponentially.

     

    Parameters vs. hyperparameters

     
    1. Learning rate
    2. Number of gradient-descent iterations
    3. Number of hidden layers
    4. Number of units per hidden layer
    5. Choice of activation function
     

    Applied deep learning is to a large extent an empirical process; put plainly, you keep trying values until you find ones that work.

    Improving deep neural networks:

    On splitting the training, validation, and test sets.

    In the big-data era, the main purpose of the test set is to evaluate the classifier's performance accurately.
    So if we have a million examples, roughly 1,000 of them are enough to assess a single classifier reliably,
    which gives a split like 98% / 1% / 1%.
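    A sketch of such a 98/1/1 split using two chained calls to sklearn's train_test_split (the stand-in dataset is illustrative):

    import numpy as np
    from sklearn.model_selection import train_test_split

    X, y = np.random.randn(100000, 20), np.random.randint(0, 2, 100000)  # stand-in dataset
    # Carve off 2% for dev+test, then split that 2% half-and-half:
    X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.02)
    X_dev, X_test, y_dev, y_test = train_test_split(X_rest, y_rest, test_size=0.5)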

     

    Make sure the validation and test sets come from the same distribution.

    Data normalization:

    When a machine-learning model is fitted with gradient descent, normalization is usually essential; without it, training converges slowly or not at all. Two common schemes:

    1. Min-max normalization

     

    2. Mean/standard-deviation (z-score) normalization
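    A sketch of both schemes in NumPy, applied feature-wise:

    import numpy as np

    def minmax_normalize(X):
        """Scale each feature to [0, 1]."""
        return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

    def zscore_normalize(X):
        """Shift each feature to zero mean and unit standard deviation."""
        return (X - X.mean(axis=0)) / X.std(axis=0)

    In practice the statistics (min/max or mean/std) should be computed on the training set only and reused to transform the validation and test sets.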

    Cross-validation set:

    Regularization:

    1. Ridge (L2) and lasso (L1) regression penalties (see the sketch after this list)

    2. Dropout

    3. Data augmentation

    4. Early stopping (stop training the network early)
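    For item 1, a hedged TF1-style sketch of adding an L2 (ridge) penalty to a loss; the weight shapes and the value of lambda_ are assumptions for illustration:

    import tensorflow as tf

    # Hypothetical weights of a 784-256-10 network:
    w1 = tf.Variable(tf.random_normal((784, 256)))
    w2 = tf.Variable(tf.random_normal((256, 10)))

    lambda_ = 0.01  # regularization strength, an assumed value
    l2_penalty = lambda_ * (tf.reduce_sum(tf.square(w1)) + tf.reduce_sum(tf.square(w2)))
    # total_loss = data_loss + l2_penalty  # add the penalty to the usual data loss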

    Why does regularization reduce overfitting?

    Intuitively, if the regularization strength is set large enough, the weight matrices are pushed toward values close to 0. In effect, the weights of many hidden units are set near 0, which largely eliminates the influence of those hidden units and leaves a simpler network.

    Vanishing and exploding gradients:

    Remedy: carefully scaled random initialization of the network parameters.

    How should the weight parameters be initialized?

    For the ReLU activation (He initialization):

    w[i] = np.random.randn(shape) * np.sqrt(2/n[i-1])  # n[i-1]: number of units in the previous layer; w[i]: this layer's weight matrix

    Two other common variance scalings: np.sqrt(1/n[i-1]) (Xavier initialization, often used with tanh) and np.sqrt(2/(n[i-1]+n[i])) (the Glorot variant), as compared in the sketch below.
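    Side by side, the three scalings as a NumPy sketch (layer sizes are illustrative):

    import numpy as np

    n_prev, n_cur = 256, 128  # units in the previous and current layer
    W_he     = np.random.randn(n_prev, n_cur) * np.sqrt(2 / n_prev)            # He, for ReLU
    W_xavier = np.random.randn(n_prev, n_cur) * np.sqrt(1 / n_prev)            # Xavier, often for tanh
    W_glorot = np.random.randn(n_prev, n_cur) * np.sqrt(2 / (n_prev + n_cur))  # Glorot variant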

    Gradient checking (use only while debugging):

    For gradient checking we use the two-sided difference, (f(θ+ε) − f(θ−ε)) / (2ε), because the one-sided difference (f(θ+ε) − f(θ)) / ε is not accurate enough.

    If the check fails, the backpropagation code probably has a bug you need to track down.
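    A minimal NumPy sketch of such a check (function names are mine; the 1e-7 / 1e-3 thresholds are the usual rules of thumb):

    import numpy as np

    def grad_check(f, grad_f, theta, eps=1e-7):
        """Compare an analytic gradient against the two-sided numerical estimate."""
        numeric = np.zeros_like(theta)
        for i in range(theta.size):
            plus, minus = theta.copy(), theta.copy()
            plus[i] += eps
            minus[i] -= eps
            numeric[i] = (f(plus) - f(minus)) / (2 * eps)
        analytic = grad_f(theta)
        # Relative difference: ~1e-7 is good, anything above ~1e-3 suggests a bug.
        return np.linalg.norm(numeric - analytic) / (np.linalg.norm(numeric) + np.linalg.norm(analytic))

    # f(theta) = sum(theta^2) has gradient 2*theta:
    print(grad_check(lambda t: np.sum(t ** 2), lambda t: 2 * t, np.random.randn(5)))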

    The Adam optimization algorithm:
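    For reference, a minimal NumPy sketch of one Adam update with the standard defaults (variable names are mine, not from the original notes):

    import numpy as np

    def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        """One Adam update: momentum (m) and RMS scaling (v), both bias-corrected; t starts at 1."""
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the first moment
        v_hat = v / (1 - beta2 ** t)  # bias correction for the second moment
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        return theta, m, v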

    Tips for organizing the hyperparameter search systematically:

    Learning rate α > number of hidden units > mini-batch size > number of hidden layers > number of training iterations (roughly in order of tuning priority).

    Read other people's case studies extensively.

    My own DNN wrapper:

    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.metrics import r2_score
    from sklearn.model_selection import train_test_split
    
    
    
    class MlpClassifier():
        """DNN classifier for binary classification."""
        def __init__(self, hiddenNodes, hiddenDeep=3):
            """hiddenNodes: units per hidden layer; hiddenDeep: number of hidden layers."""
            self.hiddenNodes = hiddenNodes
            self.hiddenDeep = hiddenDeep
      
        def fit(self, trainX, trainY, AdamStep, learnRate=0.1, testX=None, testY=None):
            """trainY must be one-hot; 10% of the training data is held out as a validation set."""
            trainX, validationX, trainY, validationY = train_test_split(trainX, trainY, test_size=0.1)
            self.input_ = trainX.shape[1]
            self.output_ = 2
            self.trainX_ = trainX
            self.trainY_ = trainY
            self.AdamStep = AdamStep
            self.learnRate = learnRate
      
            dataInput = tf.placeholder(tf.float32, shape=(None, self.input_))
            labelInput = tf.placeholder(tf.float32, shape=(None, self.output_))
            var = {}  # dict of per-layer weights, biases and activations
            for i in range(1, self.hiddenDeep + 1):
                if self.hiddenDeep == 1:
                    # Special case, depth 1: a single linear layer produces the logits directly.
                    var["w" + str(i)] = tf.Variable(tf.random_uniform((self.input_, self.output_)) * tf.sqrt(2 / self.input_))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.output_)))
                    var["layer" + str(i)] = tf.add(tf.matmul(dataInput, var["w" + str(i)]), var["b" + str(i)])
                    break
      
                if i == 1:
                    var["w" + str(i)] = tf.Variable(tf.random_uniform((self.input_, self.hiddenNodes)) * tf.sqrt(2 / self.input_))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.hiddenNodes)))
                    var["layer" + str(i)] = tf.nn.relu(tf.add(tf.matmul(dataInput, var["w" + str(i)]), var["b" + str(i)]))
                elif i == self.hiddenDeep:
                    # Output layer: linear logits of shape [None, output_]; softmax is applied below.
                    var["w" + str(i)] = tf.Variable(tf.random_uniform((self.hiddenNodes, self.output_)) * tf.sqrt(2 / self.hiddenNodes))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.output_)))
                    var["layer" + str(i)] = tf.add(tf.matmul(var["layer" + str(i - 1)], var["w" + str(i)]), var["b" + str(i)])
                else:
                    var["w" + str(i)] = tf.Variable(
                        tf.random_uniform((self.hiddenNodes, self.hiddenNodes)) * tf.sqrt(2 / self.hiddenNodes))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.hiddenNodes)))
                    var["layer" + str(i)] = tf.nn.relu(tf.add(tf.matmul(var["layer" + str(i - 1)], var["w" + str(i)]), var["b" + str(i)]))
                      
            result = tf.nn.softmax(var["layer" + str(self.hiddenDeep)], axis=1)  # class probabilities, [None, 2]
            loss = tf.reduce_sum(-labelInput * tf.log(result + 1e-10))  # cross-entropy loss; epsilon avoids log(0)
            train = tf.train.AdamOptimizer(learnRate).minimize(loss)
      
              
            # session
            ratios = []
            validations = []
            with tf.Session() as sess:
                sess.run(tf.global_variables_initializer())
                sess.run(tf.local_variables_initializer())
                for i in range(AdamStep):
                    sess.run(train,feed_dict={dataInput:trainX,labelInput:trainY})
                    trainYHat = sess.run(result,feed_dict={dataInput:trainX})
                    ratio = np.sum(np.argmax(trainY,axis=1)==np.argmax(trainYHat,axis=1))/trainY.shape[0]
                    ratios.append(ratio)
                      
                    validationYHat = sess.run(result,feed_dict={dataInput:validationX})
                    validation = np.sum(np.argmax(validationY,axis=1)==np.argmax(validationYHat,axis=1))/validationY.shape[0]
                    validations.append(validation)
                      
                # predict
                if testY is not None and testX is not None:
                    YHat = sess.run(result, feed_dict={dataInput: testX})
                    ratio = np.sum(np.argmax(testY,axis=1)==np.argmax(YHat,axis=1))/testY.shape[0]
                    return ratio
                  
                elif testY is None and testX is not None:
                    YHat = sess.run(result, feed_dict={dataInput: testX})
                    return np.argmax(YHat,axis=1)
                  
                elif testY is None and testX is None:
                  
                    x = [i for i in range(1,AdamStep+1)]  
                    plt.plot(x,ratios,color="blue",label="train")
                    plt.plot(x,validations,color="yellow",label="validation")
                    plt.ylim(0,1)
                    plt.legend()
                    plt.show()
                else:
                    print("invalid combination of testX/testY arguments")
                  
        def predict(self, testX, AdamStep):
            """Retrains the network via fit() and returns predictions for testX."""
            YHat = self.fit(self.trainX_, self.trainY_, AdamStep, self.learnRate, testX=testX)
            return YHat

        def score(self, testX, testY, AdamStep):
            """Retrains the network via fit() and returns the accuracy on (testX, testY)."""
            score = self.fit(self.trainX_, self.trainY_, AdamStep, self.learnRate, testX=testX, testY=testY)
            return score
    
     
    class MlpRegression():
        """DNN regressor."""

        def __init__(self, hiddenNodes, hiddenDeep=3):
            """hiddenNodes: units per hidden layer; hiddenDeep: number of hidden layers."""
            self.hiddenNodes = hiddenNodes
            self.hiddenDeep = hiddenDeep
       
        def fit(self, trainX, trainY, AdamStep, learnRate=0.1, testX=None, testY=None):
            """trainY must have shape [None, 1]; 10% of the training data is held out as a validation set."""
            trainX, validationX, trainY, validationY = train_test_split(trainX, trainY, test_size=0.1)
            self.input_ = trainX.shape[1]
            self.output_ = 1
            self.trainX_ = trainX
            self.trainY_ = trainY
            self.AdamStep = AdamStep
            self.learnRate = learnRate
       
            dataInput = tf.placeholder(tf.float32, shape=(None, self.input_))
            labelInput = tf.placeholder(tf.float32, shape=(None, self.output_))
            var = {}  # dict of per-layer weights, biases and activations
            for i in range(1, self.hiddenDeep + 1):         
                if self.hiddenDeep == 1:
                    # Special case, depth 1: a single linear layer maps input straight to the output.
                    var["w" + str(i)] = tf.Variable(tf.random_uniform((self.input_, 1)) * tf.sqrt(2 / self.input_))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, 1)))
                    var["layer" + str(i)] = tf.add(tf.matmul(dataInput, var["w" + str(i)]), var["b" + str(i)])
                    break
       
                if i == 1:
                    var["w" + str(i)] = tf.Variable(tf.random_uniform((self.input_, self.hiddenNodes)) * tf.sqrt(2 / self.input_))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.hiddenNodes)))
                    var["layer" + str(i)] = tf.nn.relu(tf.add(tf.matmul(dataInput, var["w" + str(i)]), var["b" + str(i)]))
                elif i == self.hiddenDeep:
                    # Output layer: linear, so negative target values can also be predicted.
                    var["w" + str(i)] = tf.Variable(tf.random_uniform((self.hiddenNodes, 1)) * tf.sqrt(2 / self.hiddenNodes))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.output_)))
                    var["layer" + str(i)] = tf.add(tf.matmul(var["layer" + str(i - 1)], var["w" + str(i)]), var["b" + str(i)])
                else:
                    var["w" + str(i)] = tf.Variable(
                        tf.random_uniform((self.hiddenNodes, self.hiddenNodes)) * tf.sqrt(2 / self.hiddenNodes))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.hiddenNodes)))
                    var["layer" + str(i)] = tf.nn.relu(tf.add(tf.matmul(var["layer" + str(i - 1)], var["w" + str(i)]), var["b" + str(i)]))
                     
            result = var["layer" + str(self.hiddenDeep)]
            loss = tf.reduce_mean(tf.square(labelInput - result))  # mean squared error loss
            train = tf.train.AdamOptimizer(learnRate).minimize(loss)
       
               
            # session
            ratios = []
            validations = []
            with tf.Session() as sess:
                sess.run(tf.global_variables_initializer())
                sess.run(tf.local_variables_initializer())
                for i in range(AdamStep):
                    sess.run(train,feed_dict={dataInput:trainX,labelInput:trainY})
                    trainYHat = sess.run(result,feed_dict={dataInput:trainX})
                    ratio = r2_score(trainY,trainYHat)
                    ratios.append(ratio)
                       
                    validationYHat = sess.run(result,feed_dict={dataInput:validationX})
                    validation = r2_score(validationY,validationYHat)
                    validations.append(validation)
                       
                # predict
                if testY is not None and testX is not None:
                    YHat = sess.run(result, feed_dict={dataInput: testX})
                    ratio = r2_score(testY,YHat)
                    return ratio
                   
                elif testY is None and testX is not None:
                    YHat = sess.run(result, feed_dict={dataInput: testX})
                    return YHat
                   
                elif testY is None and testX is None:
                   
                    x = [i for i in range(1,AdamStep+1)] 
                    plt.plot(x,ratios,color="blue",label="train")
                    plt.plot(x,validations,color="yellow",label="validation")
                    plt.ylim(0,1)
                    plt.legend()
                    plt.show()
                else:
                    print("invalid combination of testX/testY arguments")
                   
        def predict(self, testX, AdamStep):
            """Retrains the network via fit() and returns predictions for testX."""
            YHat = self.fit(self.trainX_, self.trainY_, AdamStep, self.learnRate, testX=testX)
            return YHat

        def score(self, testX, testY, AdamStep):
            """Retrains the network via fit() and returns the R^2 score on (testX, testY)."""
            score = self.fit(self.trainX_, self.trainY_, AdamStep, self.learnRate, testX=testX, testY=testY)
            return score
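
    A hypothetical usage sketch of the classifier wrapper above (the toy dataset and hyperparameter values are illustrative):

    import numpy as np

    # Toy binary dataset: 200 samples, 10 features, label = sign of the first feature.
    X = np.random.randn(200, 10).astype(np.float32)
    labels = (X[:, 0] > 0).astype(int)
    Y = np.eye(2)[labels]  # one-hot labels, shape (200, 2)

    clf = MlpClassifier(hiddenNodes=16, hiddenDeep=3)
    # With neither testX nor testY given, fit() plots train/validation accuracy curves:
    clf.fit(X, Y, AdamStep=200, learnRate=0.01)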
    

      

  • Original post: https://www.cnblogs.com/zenan/p/9341137.html