  • A LeNet-5 MNIST Handwritten Digit Recognition Model with L2 Regularization and a Moving Average Model

    If you find this useful, you are welcome to join the discussion so we can learn from each other.

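    Reference: Tensorflow实战Google深度学习框架 (a Chinese-language book on using TensorFlow in practice)
    Experimental platform:
    TensorFlow 1.4.0
    Python 3.5.0
    MNIST dataset: download the four files and place them in the MNIST_data folder under the current directory
    L2 regularization
    Dropout
    Moving average method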
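The moving average method listed above keeps a "shadow" copy of each trainable variable and nudges it toward the current value after every training step. Below is a minimal pure-Python sketch of that update, following the rule documented for `tf.train.ExponentialMovingAverage` when a step counter is supplied. The function name `ema_update` and the sample values are hypothetical and chosen only for illustration.

```python
def ema_update(shadow, value, decay, num_updates=None):
    """Return the new shadow value after one moving-average update."""
    if num_updates is not None:
        # With a step counter, the effective decay starts small and
        # approaches `decay` as training progresses.
        decay = min(decay, (1.0 + num_updates) / (10.0 + num_updates))
    return decay * shadow + (1.0 - decay) * value


# Track a variable whose value stays at 5.0 for three steps (illustrative data).
shadow = 0.0
for step, value in enumerate([5.0, 5.0, 5.0]):
    shadow = ema_update(shadow, value, decay=0.99, num_updates=step)
    print(step, round(shadow, 4))
```

Because the effective decay is small in the first steps, the shadow value catches up with the variable quickly early in training and changes more slowly later.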

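    Defining the model architecture and forward propagation (saved as LeNet5_infernece.py, the module name the training script imports)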

    import tensorflow as tf
    
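    # Network configuration parameters
    INPUT_NODE = 784
    OUTPUT_NODE = 10
    
    IMAGE_SIZE = 28
    NUM_CHANNELS = 1
    NUM_LABELS = 10
    # Filter size and depth of the first convolutional layer
    CONV1_DEEP = 32
    CONV1_SIZE = 5
    # Filter size and depth of the second convolutional layer
    CONV2_DEEP = 64
    CONV2_SIZE = 5
    # Number of nodes in the fully connected layer
    FC_SIZE = 512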
    
    
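    # Forward pass of the convolutional network. The `train` argument distinguishes training from testing.
    # Dropout further improves robustness and helps prevent overfitting; it is applied only during training.
    def inference(input_tensor, train, regularizer):
        # Separate variable scopes isolate each layer's variables, so names only need to be unique within a layer.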
        with tf.variable_scope('layer1-conv1'):
            conv1_weights = tf.get_variable(
                "weight", [CONV1_SIZE, CONV1_SIZE, NUM_CHANNELS, CONV1_DEEP],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
            conv1_biases = tf.get_variable("bias", [CONV1_DEEP], initializer=tf.constant_initializer(0.0))
            conv1 = tf.nn.conv2d(input_tensor, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
            relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_biases))
    
        with tf.name_scope("layer2-pool1"):
            pool1 = tf.nn.max_pool(relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
    
        with tf.variable_scope("layer3-conv2"):
            conv2_weights = tf.get_variable(
                "weight", [CONV2_SIZE, CONV2_SIZE, CONV1_DEEP, CONV2_DEEP],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
            conv2_biases = tf.get_variable("bias", [CONV2_DEEP], initializer=tf.constant_initializer(0.0))
            conv2 = tf.nn.conv2d(pool1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
            relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_biases))
    
        with tf.name_scope("layer4-pool2"):
            pool2 = tf.nn.max_pool(relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
            # pool2.get_shape() returns the dimensions of the layer-4 output, so they need not be computed by hand.
            # Every layer's input and output is a batch, so the shape also includes the batch size.
            pool_shape = pool2.get_shape().as_list()
            # Length of the flattened vector: height * width * depth. pool_shape[0] is the batch size.
            nodes = pool_shape[1]*pool_shape[2]*pool_shape[3]
            # Use tf.reshape to turn the layer-4 output into a batch of vectors
            reshaped = tf.reshape(pool2, [pool_shape[0], nodes])
    
        # Dropout is generally applied only to fully connected layers, not to convolutional or pooling layers
        with tf.variable_scope('layer5-fc1'):
            fc1_weights = tf.get_variable("weight", [nodes, FC_SIZE],
                                          initializer=tf.truncated_normal_initializer(stddev=0.1))
            # Only the fully connected layers' weights are regularized
            if regularizer is not None:
                tf.add_to_collection('losses', regularizer(fc1_weights))
            fc1_biases = tf.get_variable("bias", [FC_SIZE], initializer=tf.constant_initializer(0.1))
    
            fc1 = tf.nn.relu(tf.matmul(reshaped, fc1_weights) + fc1_biases)
            # If train is true, apply dropout with keep_prob=0.5, so each fc1 activation is dropped with probability 0.5
            if train:
                fc1 = tf.nn.dropout(fc1, 0.5)
    
        with tf.variable_scope('layer6-fc2'):
            fc2_weights = tf.get_variable("weight", [FC_SIZE, NUM_LABELS],
                                          initializer=tf.truncated_normal_initializer(stddev=0.1))
            if regularizer is not None:
                tf.add_to_collection('losses', regularizer(fc2_weights))
            fc2_biases = tf.get_variable("bias", [NUM_LABELS], initializer=tf.constant_initializer(0.1))
            logit = tf.matmul(fc1, fc2_weights) + fc2_biases
    
        return logit
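To see where the value of `nodes` in the flattening step comes from, the layer shapes can be traced by hand. This is a minimal sketch in plain Python, assuming the constants defined at the top of the file and the `'SAME'` padding used by every convolution and pooling layer. The helper names `same_out` and `lenet5_shapes` are hypothetical and not part of the original code.

```python
import math


def same_out(size, stride):
    # With 'SAME' padding the output size is ceil(input / stride),
    # independent of the filter size.
    return math.ceil(size / stride)


def lenet5_shapes(image_size=28, channels=1, conv1_deep=32, conv2_deep=64):
    """Return (layer name, (height, width, depth)) for each layer up to pool2."""
    shapes = [("input", (image_size, image_size, channels))]
    s = same_out(image_size, 1)   # conv1: stride 1
    shapes.append(("layer1-conv1", (s, s, conv1_deep)))
    s = same_out(s, 2)            # pool1: stride 2
    shapes.append(("layer2-pool1", (s, s, conv1_deep)))
    s = same_out(s, 1)            # conv2: stride 1
    shapes.append(("layer3-conv2", (s, s, conv2_deep)))
    s = same_out(s, 2)            # pool2: stride 2
    shapes.append(("layer4-pool2", (s, s, conv2_deep)))
    return shapes


shapes = lenet5_shapes()
for name, shape in shapes:
    print(name, shape)

height, width, depth = shapes[-1][1]
nodes = height * width * depth  # length of the flattened vector fed to fc1
print("flattened nodes:", nodes)
```

Under these assumptions the first fully connected layer's weight matrix has shape `[nodes, FC_SIZE]`, which is by far the largest parameter tensor in the model.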
    

    Training the LeNet-based MNIST model

    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data
    import LeNet5_infernece
    import os
    import numpy as np
    
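    # #### 1. Define the neural network hyperparameters
    
    BATCH_SIZE = 100  # batch size
    LEARNING_RATE_BASE = 0.01  # base learning rate
    LEARNING_RATE_DECAY = 0.99  # learning rate decay rate
    REGULARIZATION_RATE = 0.0001  # L2 regularization coefficient
    TRAINING_STEPS = 6000  # number of training steps
    MOVING_AVERAGE_DECAY = 0.99  # moving average decay rate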
    
    
    # #### 2. Define the training procedure
    
    def train(mnist):
        # Placeholder for the input, a 4-D tensor
        x = tf.placeholder(tf.float32, [
            BATCH_SIZE,
            LeNet5_infernece.IMAGE_SIZE,
            LeNet5_infernece.IMAGE_SIZE,
            LeNet5_infernece.NUM_CHANNELS],
                           name='x-input')
        # y_ holds the ground-truth labels
        y_ = tf.placeholder(tf.float32, [None, LeNet5_infernece.OUTPUT_NODE], name='y-input')
    
        # Define the L2 regularizer
        regularizer = tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)
        y = LeNet5_infernece.inference(x, False, regularizer)  # dropout disabled (train=False), regularization enabled
        global_step = tf.Variable(0, trainable=False)
    
        # Define the loss function, learning rate, moving average op, and training step.
        variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
        # Maintain moving averages of all trainable variables
        variables_averages_op = variable_averages.apply(tf.trainable_variables())
        # Define the cross-entropy loss
        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
        cross_entropy_mean = tf.reduce_mean(cross_entropy)
        # Add the L2 regularization terms of the weights to the loss
        loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
        # Define the exponentially decaying learning rate
        learning_rate = tf.train.exponential_decay(
            LEARNING_RATE_BASE,
            global_step,
            mnist.train.num_examples/BATCH_SIZE, LEARNING_RATE_DECAY,
            staircase=True)
    
        train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
        # with tf.control_dependencies([train_step, variables_averages_op]):
        #     train_op = tf.no_op(name='train')
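        # Update the moving averages of the variables together with each gradient descent step
        train_op = tf.group(train_step, variables_averages_op)
        # Create the TensorFlow Saver (persistence class); note that this script never calls saver.save().
        saver = tf.train.Saver()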
        with tf.Session() as sess:
            tf.global_variables_initializer().run()
            for i in range(TRAINING_STEPS):
                xs, ys = mnist.train.next_batch(BATCH_SIZE)
    
                reshaped_xs = np.reshape(xs, (
                    BATCH_SIZE,
                    LeNet5_infernece.IMAGE_SIZE,
                    LeNet5_infernece.IMAGE_SIZE,
                    LeNet5_infernece.NUM_CHANNELS))
                _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: reshaped_xs, y_: ys})
    
                if i%1000 == 0:
                    print("After %d training step(s), loss on training batch is %g."%(step, loss_value))
    
    
    # #### 3. Main entry point
    
    def main(argv=None):
        mnist = input_data.read_data_sets("./MNIST_data", one_hot=True)
        train(mnist)
    
    
    if __name__ == '__main__':
        main()
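The training script combines a staircase exponentially decaying learning rate with an L2 penalty on the fully connected weights. The sketch below reproduces both formulas in plain Python so the numbers can be checked by hand. It assumes 55,000 training examples (the default split of `read_data_sets`) and that `tf.contrib.layers.l2_regularizer(scale)` computes `scale * sum(w**2) / 2`. The helper names `staircase_lr` and `l2_penalty` and the sample weights are hypothetical.

```python
import math


def staircase_lr(base, decay_rate, global_step, decay_steps):
    # staircase=True: the exponent is the whole number of completed decay periods.
    return base * decay_rate ** math.floor(global_step / decay_steps)


def l2_penalty(scale, weights):
    # scale * sum(w^2) / 2 over a flat list of weights.
    return scale * sum(w * w for w in weights) / 2.0


# Assumed: 55,000 training examples and BATCH_SIZE = 100, i.e. one decay period per epoch.
decay_steps = 55000 / 100

for step in (0, 549, 550, 5999):
    print(step, staircase_lr(0.01, 0.99, step, decay_steps))

# L2 penalty for three illustrative weights with REGULARIZATION_RATE = 0.0001.
print(l2_penalty(0.0001, [1.0, -2.0, 3.0]))
```

With these settings the learning rate drops by 1% once per pass over the training set, so over 6000 steps it only decays about ten times. The schedule is gentle by design.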
    
  • Original article: https://www.cnblogs.com/cloud-ken/p/9319341.html