  • TensorFlow Implementations of a Multilayer Perceptron and a Simple CNN

    This post uses TensorFlow to implement a multilayer perceptron and a simple convolutional neural network, and applies both to the MNIST dataset.

    All of the code and the dataset file can be downloaded from the author's GitHub. The Jupyter Notebook provided there contains the code together with detailed comments (the purpose of every function used in the code and explanations of its parameters).

    import tensorflow as tf
    from tensorflow import keras
    print(tf.__version__)  # 2.0.0

    The TensorFlow version used is 2.0.0.

    First, load the dataset:

    from tensorflow.keras.datasets import mnist
    (train_data, train_label), (test_data, test_label) = mnist.load_data('./mnist.npz')

    Note that downloading the dataset may fail with an HTTP connection timeout (a VPN may be required). Alternatively, download the mnist.npz file yourself and place it in the C:\Users\Administrator\.keras\datasets folder.
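
    If the download keeps failing, the file can also be read directly with NumPy instead of going through mnist.load_data(). A minimal sketch, assuming mnist.npz sits in the current directory (the key names below are the ones used in the archive distributed with Keras):

    import numpy as np
    
    # Read the four arrays straight out of the npz archive
    with np.load('./mnist.npz') as f:
        train_data, train_label = f['x_train'], f['y_train']
        test_data, test_label = f['x_test'], f['y_test']
    print(train_data.shape)  # (60000, 28, 28)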

    Implementation of the multilayer perceptron:

    # Define the model
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    Model structure:

    print(model.summary())
    """
    Model: "sequential"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    flatten (Flatten)            (None, 784)               0         
    _________________________________________________________________
    dense (Dense)                (None, 256)               200960    
    _________________________________________________________________
    dense_1 (Dense)              (None, 10)                2570      
    =================================================================
    Total params: 203,530
    Trainable params: 203,530
    Non-trainable params: 0
    _________________________________________________________________
    None
    """

    Normalize the input data, set the model's hyperparameters, and train the model:

    # Normalize the input data to [0, 1]
    train_data = train_data / 255.0
    test_data = test_data / 255.0
    
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.5),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    model.fit(train_data, train_label, epochs=5,
              batch_size=256,
              validation_data=(test_data, test_label),
              validation_freq=1)
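
    After training, the model can also be scored on the test set explicitly. A minimal sketch (evaluate returns the loss followed by the metrics listed in compile):

    # Returns [loss, accuracy], matching the metrics given to compile()
    test_loss, test_acc = model.evaluate(test_data, test_label, verbose=0)
    print(test_acc)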

    Training results:

    Train on 60000 samples, validate on 10000 samples
    Epoch 1/5
    60000/60000 [==============================] - 16s 259us/sample - loss: 0.3641 - accuracy: 0.8926 - val_loss: 0.2121 - val_accuracy: 0.9351
    Epoch 2/5
    60000/60000 [==============================] - 4s 63us/sample - loss: 0.1652 - accuracy: 0.9523 - val_loss: 0.1375 - val_accuracy: 0.9580
    Epoch 3/5
    60000/60000 [==============================] - 4s 63us/sample - loss: 0.1199 - accuracy: 0.9658 - val_loss: 0.1091 - val_accuracy: 0.9674
    Epoch 4/5
    60000/60000 [==============================] - 5s 85us/sample - loss: 0.0952 - accuracy: 0.9726 - val_loss: 0.1082 - val_accuracy: 0.9658
    Epoch 5/5
    60000/60000 [==============================] - 4s 70us/sample - loss: 0.0788 - accuracy: 0.9775 - val_loss: 0.0947 - val_accuracy: 0.9702
    <tensorflow.python.keras.callbacks.History at 0x23036b99320>

    Beyond this, the author also added fully connected layers to the multilayer perceptron above and varied the sizes of those layers, observing how these changes affect the training results. Since the purpose of this post is to provide a working example of a multilayer perceptron, those experiments are not expanded on here; the full code and results are on the author's GitHub. One hypothetical variant is sketched below for illustration.
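
    A sketch of one such variant (this exact configuration is illustrative, not the author's recorded experiment), inserting a second 128-unit hidden layer:

    model2 = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(128, activation='relu'),  # extra hidden layer
        tf.keras.layers.Dense(10, activation='softmax')
    ])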

    Implementation of the simple CNN:

    model5 = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(filters=6, 
                               kernel_size=5, 
                               activation='relu', 
                               input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPool2D(pool_size=2, strides=2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    Model structure:

    Model: "sequential_7"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    conv2d_4 (Conv2D)            (None, 24, 24, 6)         156       
    _________________________________________________________________
    max_pooling2d_4 (MaxPooling2 (None, 12, 12, 6)         0         
    _________________________________________________________________
    flatten_4 (Flatten)          (None, 864)               0         
    _________________________________________________________________
    dense_16 (Dense)             (None, 256)               221440    
    _________________________________________________________________
    dense_17 (Dense)             (None, 10)                2570      
    =================================================================
    Total params: 224,166
    Trainable params: 224,166
    Non-trainable params: 0
    _________________________________________________________________
    None
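
    The shapes and counts follow directly: a 5×5 kernel with no padding shrinks 28×28 to 24×24, the 2×2 max pooling halves that to 12×12, and the parameter counts can be verified as before:

    print(6 * (5 * 5 * 1 + 1))  # 156, conv kernel weights + biases
    print(12 * 12 * 6)          # 864, flattened pooling output
    print(864 * 256 + 256)      # 221440, first Dense layer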

    Before training the model, the shape of the data needs to be changed to add a channel dimension:

    train_data = tf.reshape(train_data, (-1, 28, 28, 1))
    test_data = tf.reshape(test_data, (-1, 28, 28, 1))
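
    An equivalent way to add the trailing channel axis (a sketch; both forms produce the same (N, 28, 28, 1) shape) is NumPy-style indexing:

    # Same effect as tf.reshape: append a channel dimension of size 1
    train_data = train_data[..., tf.newaxis]
    test_data = test_data[..., tf.newaxis]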

    Set the hyperparameters and train the model:

    model5.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
    
    model5.fit(train_data, train_label, epochs=5, validation_split=0.1)
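
    Once trained, the model can make predictions. A minimal sketch: predict returns the 10 softmax probabilities per image, and argmax picks the predicted digit:

    # Class probabilities for the first test image, then the predicted class
    probs = model5.predict(test_data[:1])
    print(tf.argmax(probs, axis=1).numpy())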

    Training results:

    Train on 54000 samples, validate on 6000 samples
    Epoch 1/5
    54000/54000 [==============================] - 34s 622us/sample - loss: 0.2047 - accuracy: 0.9400 - val_loss: 0.0763 - val_accuracy: 0.9797
    Epoch 2/5
    54000/54000 [==============================] - 32s 594us/sample - loss: 0.0688 - accuracy: 0.9792 - val_loss: 0.0605 - val_accuracy: 0.9833
    Epoch 3/5
    54000/54000 [==============================] - 32s 600us/sample - loss: 0.0479 - accuracy: 0.9846 - val_loss: 0.0476 - val_accuracy: 0.9870
    Epoch 4/5
    54000/54000 [==============================] - 32s 593us/sample - loss: 0.0338 - accuracy: 0.9892 - val_loss: 0.0566 - val_accuracy: 0.9855
    Epoch 5/5
    54000/54000 [==============================] - 35s 649us/sample - loss: 0.0258 - accuracy: 0.9916 - val_loss: 0.0522 - val_accuracy: 0.9858
    <tensorflow.python.keras.callbacks.History at 0x230380d9518>

    Before training model5, the author trained model4, which has the same architecture but uses an SGD optimizer with the learning rate set to 0.9. After training, that model's accuracy was about 0.1, no better than a randomly initialized, untrained model. The author therefore changed the optimizer to Adam with the learning rate set to 0.001 (the model5 above), and after training the accuracy reached 0.98. The author then changed only the learning rate while keeping the SGD optimizer; the accuracy after training, while not as good as the Adam version, still reached 0.96. Clearly, the choice of hyperparameters matters a great deal; a sketch of such a comparison follows.
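
    A sketch of that kind of comparison (the learning rates here are illustrative, not the author's exact runs), rebuilding a fresh model for each optimizer so earlier training does not carry over:

    def build_cnn():
        return tf.keras.models.Sequential([
            tf.keras.layers.Conv2D(6, 5, activation='relu', input_shape=(28, 28, 1)),
            tf.keras.layers.MaxPool2D(2, 2),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(256, activation='relu'),
            tf.keras.layers.Dense(10, activation='softmax')
        ])
    
    for opt in [tf.keras.optimizers.SGD(learning_rate=0.9),
                tf.keras.optimizers.SGD(learning_rate=0.01),
                tf.keras.optimizers.Adam(learning_rate=0.001)]:
        m = build_cnn()
        m.compile(optimizer=opt, loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
        m.fit(train_data, train_label, epochs=5, validation_split=0.1)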

     