  • Neural Networks

     

    Shallow neural networks:

    The output of a neural network:

    Matrix form: output = activation(input × weights + bias)
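
    A minimal NumPy sketch of this formula (the shapes here are illustrative assumptions):

    import numpy as np

    def dense_forward(x, W, b, activation=np.tanh):
        """One fully connected layer: activation(x @ W + b)."""
        return activation(x @ W + b)

    x = np.random.randn(5, 3)   # 5 samples, 3 input features
    W = np.random.randn(3, 4)   # 3 inputs -> 4 hidden units
    b = np.zeros((1, 4))        # one bias per hidden unit
    print(dense_forward(x, W, b).shape)  # (5, 4)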

     

    Multilayer perceptron for handwritten-digit recognition:

    Key points:

    1. input: [None, 784]
    2. output: [None, 10]
    3. hidden-layer size: 256
    4. how to randomly initialize the parameters
    5. how to compute the loss function
     

    Random initialization

    weight:
    np.random.randn() or np.random.uniform() # random values break the symmetry between units
    bias:
    Initializing the biases to 0 is fine.
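
    A minimal NumPy sketch of this initialization for the 784 → 256 → 10 MLP above (the sqrt(2/n) scaling anticipates the He initialization discussed later):

    import numpy as np

    n_in, n_hidden, n_out = 784, 256, 10

    # Hidden layer: small random weights break symmetry; zero biases are fine.
    W1 = np.random.randn(n_in, n_hidden) * np.sqrt(2 / n_in)
    b1 = np.zeros((1, n_hidden))

    # Output layer.
    W2 = np.random.randn(n_hidden, n_out) * np.sqrt(2 / n_hidden)
    b2 = np.zeros((1, n_out))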

    Deep neural networks:

    Why do deep networks outperform shallow ones on many problems?

     
    1. The early layers learn simple, low-level features.
    2. The later layers combine these simple features to detect more complex ones.
     

    A deep network can get by with relatively few hidden units per layer by stacking many layers; for a shallow network to compute the same function, the number of units may have to grow exponentially. The classic circuit-theory example is computing the XOR/parity of n inputs: a tree of XOR units needs O(n) units across O(log n) layers, whereas, informally, a single hidden layer needs on the order of 2^(n-1) units.

     

    Parameters vs. hyperparameters

     
    1. Learning rate
    2. Number of gradient-descent iterations
    3. Number of hidden layers
    4. Number of units per hidden layer
    5. Choice of activation function
     

    Applied deep learning is largely an empirical process; put plainly, you keep trying values until you find ones that work.

    Improving deep neural networks:

    On splitting the data into training, dev (validation), and test sets.

    In the big-data era, the test set's main purpose is to give an accurate estimate of the final classifier's performance,
    so with a million examples, 10,000 (1%) are enough to evaluate a single classifier accurately.
    A typical split is then 98% / 1% / 1%.

     

    Make sure the dev and test sets come from the same distribution.
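
    A minimal scikit-learn sketch of such a split (the stand-in data and ratios are illustrative):

    import numpy as np
    from sklearn.model_selection import train_test_split

    X = np.random.randn(100_000, 20)      # stand-in features
    y = np.random.randint(0, 2, 100_000)  # stand-in labels

    # 98% train, then split the remaining 2% evenly into dev and test.
    X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.02)
    X_dev, X_test, y_dev, y_test = train_test_split(X_rest, y_rest, test_size=0.5)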

    Data normalization:

    When a model is fit with gradient descent, normalization is usually essential: without it, optimization converges very slowly or not at all. Two common schemes:

    1. Min-max normalization

    2. Mean/standard-deviation normalization (standardization)
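
    A minimal NumPy sketch of both schemes, applied per feature column (the data is a stand-in):

    import numpy as np

    X = np.random.rand(100, 3) * 50  # stand-in feature matrix

    # 1. Min-max normalization: rescale each feature to [0, 1].
    X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

    # 2. Standardization: zero mean and unit variance per feature.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)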

    Cross-validation set:

    Regularization:

    1. Ridge (L2) and lasso (L1) regression

    2. Dropout (see the sketch after this list)

    3. Data augmentation

    4. Early stopping (stop training the network early)
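
    A minimal TF1-style sketch of L2 weight decay plus dropout, in the same graph-mode style as the implementation below (the shapes and the keep probability are illustrative assumptions):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=(None, 784))
    keep_prob = tf.placeholder(tf.float32)  # e.g. 0.8 while training, 1.0 at test time

    w = tf.Variable(tf.random_normal((784, 256)) * tf.sqrt(2 / 784))
    b = tf.Variable(tf.zeros((1, 256)))
    hidden = tf.nn.relu(tf.matmul(x, w) + b)
    hidden = tf.nn.dropout(hidden, keep_prob=keep_prob)  # inverted dropout

    # L2 (ridge) penalty added to the data loss; lam is a hyperparameter.
    lam = 0.01
    data_loss = tf.constant(0.0)  # stands in for the cross-entropy term
    loss = data_loss + lam * tf.nn.l2_loss(w)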

    Why does regularization reduce overfitting?

    Intuitively, if the regularization strength is set large enough, the weight matrices are driven toward values close to 0. That is roughly like setting the weights of many hidden units to 0, which essentially removes those units' influence and leaves a simpler network that overfits less.

    Vanishing and exploding gradients:

    Mitigation: careful random initialization of the network's parameters.

    How should the weight parameters be initialized?

    For the ReLU activation (He initialization):

    w[i] = np.random.randn(shape)*np.sqrt(2/n[i-1]) # n[i-1]: number of inputs from the previous layer; w[i]: this layer's weights

    Common alternatives for the scaling factor:

    np.sqrt(1/n[i-1])           # Xavier variant, often used with tanh
    np.sqrt(2/(n[i-1] + n[i]))  # Glorot/Xavier, averaging fan-in and fan-out

    Gradient checking (use only while debugging):

    Use the two-sided difference (f(θ+ε) - f(θ-ε)) / (2ε) to approximate derivatives, because the one-sided difference (f(θ+ε) - f(θ)) / ε is less accurate.

    If the check fails, the program probably has a bug that you need to track down.
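
    A minimal sketch of such a check (the quadratic f is an arbitrary example):

    import numpy as np

    def f(theta):
        return (theta ** 2).sum()  # example function; its true gradient is 2 * theta

    def numeric_grad(f, theta, eps=1e-7):
        """Two-sided-difference approximation of the gradient of f at theta."""
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            e = np.zeros_like(theta)
            e.flat[i] = eps
            grad.flat[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
        return grad

    theta = np.random.randn(5)
    print(np.allclose(numeric_grad(f, theta), 2 * theta, atol=1e-5))  # True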

    The Adam optimization algorithm:
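
    As a reminder, a minimal NumPy sketch of one Adam step with the standard default hyperparameters (grad is a hypothetical gradient):

    import numpy as np

    def adam_update(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        """One Adam step: momentum plus RMSProp, with bias correction."""
        m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment (uncentered variance) estimate
        m_hat = m / (1 - beta1 ** t)             # bias correction for step t (1-indexed)
        v_hat = v / (1 - beta2 ** t)
        return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v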

    Tips for organizing the hyperparameter search systematically:

    Tuning priority: learning rate α > number of hidden units > mini-batch size > number of hidden layers > number of training iterations.

    Read plenty of other people's case studies.
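
    A minimal sketch of random hyperparameter sampling (the ranges are illustrative assumptions; drawing the learning rate on a log scale is standard practice, though the notes above only give the priority order):

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_hyperparams():
        return {
            "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform in [1e-4, 1e-1]
            "hidden_nodes": int(rng.integers(64, 513)),
            "batch_size": int(2 ** rng.integers(5, 9)),  # 32, 64, 128, or 256
            "hidden_layers": int(rng.integers(1, 6)),
        }

    for _ in range(3):
        print(sample_hyperparams())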

    My own DNN wrapper implementation:

    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.metrics import r2_score
    from sklearn.model_selection import train_test_split
    
    
    
    class MlpClassifier():
        """DNN classifier (binary classification)"""
        def __init__(self, hiddenNodes, hiddenDeep=3):
            """hiddenNodes: units per hidden layer; hiddenDeep: number of hidden layers"""
            self.hiddenNodes = hiddenNodes
            self.hiddenDeep = hiddenDeep
      
        def fit(self, trainX, trainY, AdamStep, learnRate=0.1,testX=None,testY=None):
            """trainY must be one-hot"""
            trainX,validationX,trainY,validationY = train_test_split(trainX,trainY,test_size=0.1)
            self.input_ = trainX.shape[1]
            self.output_ = 2
            self.trainX_ = trainX
            self.trainY_ = trainY
            self.AdamStep = AdamStep
            self.learnRate = learnRate
      
            dataInput = tf.placeholder(tf.float32, shape=(None, self.input_))
            labelInput = tf.placeholder(tf.float32, shape=(None, self.output_))
            var = {}  # holds each layer's weights, biases, and activations by name
            for i in range(1, self.hiddenDeep + 1):
                if self.hiddenDeep == 1:
                    # Special case: one layer maps the input straight to the logits
                    var["w" + str(i)] = tf.Variable(tf.random_normal((self.input_, self.output_)) * tf.sqrt(2 / self.input_))  # He initialization
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.output_)))
                    var["layer" + str(i)] = tf.add(tf.matmul(dataInput, var["w" + str(i)]), var["b" + str(i)])
                    break

                if i == 1:
                    var["w" + str(i)] = tf.Variable(tf.random_normal((self.input_, self.hiddenNodes)) * tf.sqrt(2 / self.input_))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.hiddenNodes)))
                    var["layer" + str(i)] = tf.nn.relu(tf.add(tf.matmul(dataInput, var["w" + str(i)]), var["b" + str(i)]))
                elif i == self.hiddenDeep:
                    # Output layer: weights map hiddenNodes -> output_, and the logits stay linear
                    var["w" + str(i)] = tf.Variable(tf.random_normal((self.hiddenNodes, self.output_)) * tf.sqrt(2 / self.hiddenNodes))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.output_)))
                    var["layer" + str(i)] = tf.add(tf.matmul(var["layer" + str(i - 1)], var["w" + str(i)]), var["b" + str(i)])
                else:
                    var["w" + str(i)] = tf.Variable(
                        tf.random_normal((self.hiddenNodes, self.hiddenNodes)) * tf.sqrt(2 / self.hiddenNodes))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.hiddenNodes)))
                    var["layer" + str(i)] = tf.nn.relu(tf.add(tf.matmul(var["layer" + str(i - 1)], var["w" + str(i)]), var["b" + str(i)]))
                      
            logits = var["layer" + str(self.hiddenDeep)]
            result = tf.nn.softmax(logits, axis=1)  # [None, 2] class probabilities
            loss = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits_v2(labels=labelInput, logits=logits))  # numerically stable cross-entropy
            train = tf.train.AdamOptimizer(learnRate).minimize(loss)
      
              
            # session
            ratios = []
            validations = []
            with tf.Session() as sess:
                sess.run(tf.global_variables_initializer())
                sess.run(tf.local_variables_initializer())
                for i in range(AdamStep):
                    sess.run(train,feed_dict={dataInput:trainX,labelInput:trainY})
                    trainYHat = sess.run(result,feed_dict={dataInput:trainX})
                    ratio = np.sum(np.argmax(trainY,axis=1)==np.argmax(trainYHat,axis=1))/trainY.shape[0]
                    ratios.append(ratio)
                      
                    validationYHat = sess.run(result,feed_dict={dataInput:validationX})
                    validation = np.sum(np.argmax(validationY,axis=1)==np.argmax(validationYHat,axis=1))/validationY.shape[0]
                    validations.append(validation)
                      
                # predict
                if testY is not None and testX is not None:
                    YHat = sess.run(result, feed_dict={dataInput: testX})
                    ratio = np.sum(np.argmax(testY,axis=1)==np.argmax(YHat,axis=1))/testY.shape[0]
                    return ratio
                  
                elif testY is None and testX is not None:
                    YHat = sess.run(result, feed_dict={dataInput: testX})
                    return np.argmax(YHat,axis=1)
                  
                elif testY is None and testX is None:
                  
                    x = [i for i in range(1,AdamStep+1)]  
                    plt.plot(x,ratios,color="blue",label="train")
                    plt.plot(x,validations,color="yellow",label="validation")
                    plt.ylim(0,1)
                    plt.legend()
                    plt.show()
                else:
                    print("invalid argument combination")
                  
        def predict(self,testX,AdamStep):
            # Note: retrains from scratch on every call, since the TF1 session is not kept alive
            YHat = self.fit(self.trainX_,self.trainY_,AdamStep,self.learnRate,testX=testX)
            return YHat
      
        def score(self,testX,testY,AdamStep):
            score = self.fit(self.trainX_,self.trainY_,AdamStep,self.learnRate,testX=testX,testY=testY)
            return score
    
     
    class MlpRegression():
        """DNN Regression"""
       
        def __init__(self, hiddenNodes, hiddenDeep=3):
            """hiddenNodes: units per hidden layer; hiddenDeep: number of hidden layers"""
            self.hiddenNodes = hiddenNodes
            self.hiddenDeep = hiddenDeep
       
        def fit(self, trainX, trainY, AdamStep, learnRate=0.1,testX=None,testY=None):
            """trainY must have shape (None, 1)"""
            trainX,validationX,trainY,validationY = train_test_split(trainX,trainY,test_size=0.1)
            self.input_ = trainX.shape[1]
            self.output_ = 1
            self.trainX_ = trainX
            self.trainY_ = trainY
            self.AdamStep = AdamStep
            self.learnRate = learnRate
       
            dataInput = tf.placeholder(tf.float32, shape=(None, self.input_))
            labelInput = tf.placeholder(tf.float32, shape=(None, self.output_))
            var = {}  # holds each layer's weights, biases, and activations by name
            for i in range(1, self.hiddenDeep + 1):
                if self.hiddenDeep == 1:
                    # Special case: one linear layer maps the input straight to the prediction
                    var["w" + str(i)] = tf.Variable(tf.random_normal((self.input_, self.output_)) * tf.sqrt(2 / self.input_))  # He initialization
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.output_)))
                    var["layer" + str(i)] = tf.add(tf.matmul(dataInput, var["w" + str(i)]), var["b" + str(i)])
                    break

                if i == 1:
                    var["w" + str(i)] = tf.Variable(tf.random_normal((self.input_, self.hiddenNodes)) * tf.sqrt(2 / self.input_))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.hiddenNodes)))
                    var["layer" + str(i)] = tf.nn.relu(tf.add(tf.matmul(dataInput, var["w" + str(i)]), var["b" + str(i)]))
                elif i == self.hiddenDeep:
                    # Output layer: keep it linear, so predictions are not clipped at 0 by ReLU
                    var["w" + str(i)] = tf.Variable(tf.random_normal((self.hiddenNodes, self.output_)) * tf.sqrt(2 / self.hiddenNodes))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.output_)))
                    var["layer" + str(i)] = tf.add(tf.matmul(var["layer" + str(i - 1)], var["w" + str(i)]), var["b" + str(i)])
                else:
                    var["w" + str(i)] = tf.Variable(
                        tf.random_normal((self.hiddenNodes, self.hiddenNodes)) * tf.sqrt(2 / self.hiddenNodes))
                    var["b" + str(i)] = tf.Variable(tf.zeros((1, self.hiddenNodes)))
                    var["layer" + str(i)] = tf.nn.relu(tf.add(tf.matmul(var["layer" + str(i - 1)], var["w" + str(i)]), var["b" + str(i)]))
                     
            result = var["layer"+str(self.hiddenDeep)]
            loss = tf.reduce_mean(tf.square(labelInput-result)) # mean-squared-error (MSE) loss
            train = tf.train.AdamOptimizer(learnRate).minimize(loss)
       
               
            # session
            ratios = []
            validations = []
            with tf.Session() as sess:
                sess.run(tf.global_variables_initializer())
                sess.run(tf.local_variables_initializer())
                for i in range(AdamStep):
                    sess.run(train,feed_dict={dataInput:trainX,labelInput:trainY})
                    trainYHat = sess.run(result,feed_dict={dataInput:trainX})
                    ratio = r2_score(trainY,trainYHat)
                    ratios.append(ratio)
                       
                    validationYHat = sess.run(result,feed_dict={dataInput:validationX})
                    validation = r2_score(validationY,validationYHat)
                    validations.append(validation)
                       
                # predict
                if testY is not None and testX is not None:
                    YHat = sess.run(result, feed_dict={dataInput: testX})
                    ratio = r2_score(testY,YHat)
                    return ratio
                   
                elif testY is None and testX is not None:
                    YHat = sess.run(result, feed_dict={dataInput: testX})
                    return YHat
                   
                elif testY is None and testX is None:
                   
                    x = [i for i in range(1,AdamStep+1)] 
                    plt.plot(x,ratios,color="blue",label="train")
                    plt.plot(x,validations,color="yellow",label="validation")
                    plt.ylim(0,1)
                    plt.legend()
                    plt.show()
                else:
                    print("invalid argument combination")
                   
        def predict(self,testX,AdamStep):
            # Note: retrains from scratch on every call, since the TF1 session is not kept alive
            YHat = self.fit(self.trainX_,self.trainY_,AdamStep,self.learnRate,testX=testX)
            return YHat
       
        def score(self,testX,testY,AdamStep):
            score = self.fit(self.trainX_,self.trainY_,AdamStep,self.learnRate,testX=testX,testY=testY)
            return score
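
    A minimal usage sketch of the classifier on synthetic data (the moons dataset and the hyperparameter values are illustrative choices):

    import numpy as np
    from sklearn.datasets import make_moons

    X, y = make_moons(n_samples=1000, noise=0.2)
    Y = np.eye(2)[y]  # one-hot labels of shape (1000, 2), as fit() expects

    clf = MlpClassifier(hiddenNodes=16, hiddenDeep=3)
    clf.fit(X.astype(np.float32), Y, AdamStep=200, learnRate=0.01)  # plots train/validation accuracy
    print(clf.score(X.astype(np.float32), Y, AdamStep=200))         # retrains, then reports accuracy on the given set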
    

      
