  • Deep Learning in Practice (3): Building a notMNIST Neural Network with Keras

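    Previously in this series:

    Deep Learning in Practice (1): Building a notMNIST Logistic Regression Model from Scratch

    Deep Learning in Practice (2): Building a Deep Neural Network for notMNIST

    In the second post of this series, we built our first deep neural network with TensorFlow and tried a number of optimizations to make training more efficient and to raise accuracy. In this post, we rebuild a deep neural network using Keras, a powerful neural network framework, running on top of TensorFlow.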

    What is Keras?

    The official description of Keras reads:

    Keras: Deep Learning library for Theano and TensorFlow

    You have just found Keras.

    Keras is a high-level neural networks API, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

    Use Keras if you need a deep learning library that:

    • Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility).
    • Supports both convolutional networks and recurrent networks, as well as combinations of the two.
    • Runs seamlessly on CPU and GPU.

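    For community opinions, see the Zhihu thread: How do you evaluate the deep learning framework Keras?

    In January of this year, it was announced that Keras would be added to TensorFlow as its default API: Keras to be added to Google TensorFlow as the default API

    To summarize:

    1. Keras is a higher-level wrapper around TensorFlow and Theano, so it offers many ready-made features and makes models easier to implement.

    2. It is less flexible, and because the wrapped internals are a black box to the user, customization is harder.

    3. The community is very active and Keras has been endorsed by TensorFlow, so TensorFlow + Keras should be a good platform for beginners to get started.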

    Building a neural network with Keras

    Environment setup

    Install TensorFlow: See installation instructions.

    Install Keras

    sudo pip install keras

    Importing dependencies

    We import numpy, tensorflow, keras, and six.moves.

    from __future__ import print_function
    import numpy as np
    import tensorflow as tf
    from six.moves import cPickle as pickle
    from six.moves import range
    np.random.seed(1337)  # for reproducibility
    
    from keras.datasets import mnist
    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation
    from keras.optimizers import SGD
    from keras.utils import np_utils
    from keras.optimizers import RMSprop
    from keras.optimizers import Adam
    from keras.regularizers import l2

    Loading the data

    nb_classes = 10
    
    pickle_file = 'notMNIST.pickle'
    
    with open(pickle_file, 'rb') as f:
      save = pickle.load(f)
      X_train = save['train_dataset']
      y_train = save['train_labels']
      X_valid = save['valid_dataset']
      y_valid = save['valid_labels']
      X_test = save['test_dataset']
      y_test = save['test_labels']
      del save  # hint to help gc free up memory
      print('Training set', X_train.shape, y_train.shape)
      print('Validation set', X_valid.shape, y_valid.shape)
      print('Test set', X_test.shape, y_test.shape)
    
    X_train = X_train.reshape(200000, 784)
    X_valid = X_valid.reshape(10000, 784)
    X_test = X_test.reshape(10000, 784)
    X_train = X_train.astype('float32')
    X_valid = X_valid.astype('float32')
    X_test = X_test.astype('float32')
    X_train /= 255
    X_valid /= 255
    X_test /= 255
    print(X_train.shape[0], 'train samples')
    print(X_valid.shape[0], 'valid samples')
    print(X_test.shape[0], 'test samples')
    
    # convert class vectors to binary class matrices
    Y_train = np_utils.to_categorical(y_train, nb_classes)
    Y_valid = np_utils.to_categorical(y_valid, nb_classes)
    Y_test = np_utils.to_categorical(y_test, nb_classes)

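    We read the data from the file "notMNIST.pickle" produced in the earlier posts. It is split into three sets: training, validation, and test.

    The raw X_train has shape (200000, 28, 28); we reshape it to (200000, 784) so it can be used as training input.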

    X_train = X_train.reshape(200000, 784)
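As a small illustration of the reshape step, the sketch below flattens a stack of 28x28 images into 784-element rows. It uses a tiny random array as a stand-in for X_train (an assumption, so that it runs without the pickle file) and passes -1 so that NumPy infers the sample count instead of hard-coding 200000.

```python
import numpy as np

# Stand-in for X_train: 5 fake "images" of 28x28 pixels (hypothetical data).
images = np.random.RandomState(0).rand(5, 28, 28)

# -1 lets NumPy infer the number of samples from the array itself.
flat = images.reshape(-1, 28 * 28)

print(images.shape, '->', flat.shape)
```

Each row of the result holds the pixels of one image in row-major order, so the same call should work unchanged for the validation and test sets.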

    Convert the data to float32:

    X_train = X_train.astype('float32')

    Normalize the data so that all values fall within the range [0, 1]:

    X_train /= 255
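The scaling step assumes the raw pixel values lie in 0..255. Printing the value range before and after dividing is a cheap way to confirm that assumption. The sketch below uses a small hypothetical array in place of the real dataset.

```python
import numpy as np

# Hypothetical raw pixel values in 0..255, standing in for X_train.
raw = np.array([[0, 64, 128, 255]], dtype='uint8')

# Cast first: in-place true division is not allowed on integer arrays.
scaled = raw.astype('float32')
scaled /= 255

print('before:', raw.min(), raw.max())
print('after: ', scaled.min(), scaled.max())
```

If the arrays saved by the earlier posts were already normalized, the range printed before scaling would show it, and dividing again would not be appropriate.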

    Convert the integer labels to a binary class matrix (one-hot encoding) for training. (Converts a class vector (integers) to binary class matrix.)

    Y_train = np_utils.to_categorical(y_train, nb_classes)
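To make the one-hot encoding concrete, here is a NumPy-only sketch that produces the same kind of binary class matrix. It is not Keras's implementation, and the helper name one_hot is made up for this illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype='int64')
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype='float32')[labels]

encoded = one_hot([0, 3, 9], 10)
print(encoded.shape)
print(encoded[1])
```

Label 3 becomes a row of ten values with a 1 at index 3 and 0 everywhere else, which is the target format that a 10-way softmax output is compared against.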

    Designing the neural network model

    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    
    model.add(Dense(10))
    model.add(Activation('softmax'))
    
    model.summary()

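    Sequential is Keras's model type for a linear stack of layers. The model above breaks down as follows:

    1. Input layer: input vectors of size 784

    2. Hidden layer 1: 512 units, ReLU activation, 0.5 dropout

    3. Hidden layer 2: 512 units, ReLU activation, 0.5 dropout

    4. Output layer: 10 units, softmax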
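The layer sizes listed above determine the number of trainable parameters, which is the figure model.summary() reports for each layer. The arithmetic can be sketched in plain Python. The helper dense_params is a name invented here; Activation and Dropout layers contribute no parameters.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units

layer_sizes = [784, 512, 512, 10]  # input, hidden 1, hidden 2, output
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [401920, 262656, 5130]
print(sum(per_layer))  # 669706
```

Almost all of the parameters sit in the first two layers, so changing the hidden layer width is the quickest way to grow or shrink the model.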

    Training the neural network model

    batch_size = 128
    nb_epoch = 20
    model.compile(loss='categorical_crossentropy',
                  #optimizer=SGD(lr=0.01),
                  optimizer=Adam(),
                  metrics=['accuracy'])
    
    history = model.fit(X_train, Y_train,
                        batch_size=batch_size, nb_epoch=nb_epoch,
                        verbose=1, validation_data=(X_valid, Y_valid))
    score = model.evaluate(X_test, Y_test, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])
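With these settings, the number of weight updates follows from simple arithmetic. The sketch assumes one update per mini-batch and a smaller final batch when the sample count is not a multiple of the batch size.

```python
n_samples = 200000   # training set size used in this post
batch_size = 128
nb_epoch = 20

# Ceiling division: the last batch of an epoch may be smaller than the rest.
batches_per_epoch = (n_samples + batch_size - 1) // batch_size
total_updates = batches_per_epoch * nb_epoch

print(batches_per_epoch)  # 1563
print(total_updates)      # 31260
```

A larger batch_size means fewer updates per epoch, so the two values are usually tuned together.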

    Key arguments to compile:

    1. loss='categorical_crossentropy': use cross-entropy as the loss function

    2. optimizer=Adam(): an enhanced variant of SGD that adapts the learning rate dynamically and incorporates momentum
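To show what the loss measures, here is a NumPy sketch of softmax followed by categorical cross-entropy. It is an illustration under simplifying assumptions, not Keras's implementation: the logits are made up, and the clipping constant eps is chosen here only to avoid log(0).

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over samples of -sum(true * log(predicted)).
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-(y_true * np.log(y_pred)).sum(axis=1).mean())

logits = np.array([[2.0, 1.0, 0.1]])   # hypothetical network outputs, 3 classes
y_true = np.array([[1.0, 0.0, 0.0]])   # one-hot label: class 0
probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.round(3), round(loss, 3))
```

Because the label is one-hot, the loss reduces to the negative log of the probability assigned to the correct class, so it approaches zero as that probability approaches 1.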

    model.fit trains on the training data and evaluates on the validation data after each epoch

    model.evaluate computes the final loss and accuracy on the test data
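The accuracy reported alongside the loss can be read as the fraction of samples whose highest-probability class equals the true class. A minimal NumPy sketch with made-up predictions (not the Keras implementation):

```python
import numpy as np

# Hypothetical predicted probabilities for 4 samples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.2, 0.6]])
# One-hot ground truth: classes 0, 1, 2, 2.
y_true = np.eye(3)[[0, 1, 2, 2]]

# A sample counts as correct when the most probable class matches the label.
correct = probs.argmax(axis=1) == y_true.argmax(axis=1)
accuracy = correct.mean()
print(accuracy)  # 0.75
```

Only the third sample is misclassified here, giving 3 correct out of 4.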

    The output during training looks like this:

    Epoch 1/20
    200000/200000 [==============================] - 19s - loss: 0.6951 - acc: 0.7995 - val_loss: 0.5094 - val_acc: 0.8473
    Epoch 2/20
    200000/200000 [==============================] - 19s - loss: 0.5073 - acc: 0.8486 - val_loss: 0.4573 - val_acc: 0.8596
    Epoch 3/20
    200000/200000 [==============================] - 22s - loss: 0.4646 - acc: 0.8601 - val_loss: 0.4253 - val_acc: 0.8668
    Epoch 4/20
    200000/200000 [==============================] - 22s - loss: 0.4356 - acc: 0.8680 - val_loss: 0.3980 - val_acc: 0.8784
    Epoch 5/20
    200000/200000 [==============================] - 20s - loss: 0.4159 - acc: 0.8736 - val_loss: 0.3851 - val_acc: 0.8810
    Epoch 6/20
    200000/200000 [==============================] - 18s - loss: 0.3990 - acc: 0.8788 - val_loss: 0.3735 - val_acc: 0.8850
    Epoch 7/20
    200000/200000 [==============================] - 19s - loss: 0.3868 - acc: 0.8819 - val_loss: 0.3615 - val_acc: 0.8869
    Epoch 8/20
    200000/200000 [==============================] - 19s - loss: 0.3768 - acc: 0.8846 - val_loss: 0.3576 - val_acc: 0.8872
    Epoch 9/20
    200000/200000 [==============================] - 19s - loss: 0.3674 - acc: 0.8875 - val_loss: 0.3506 - val_acc: 0.8929
    Epoch 10/20
    200000/200000 [==============================] - 18s - loss: 0.3610 - acc: 0.8889 - val_loss: 0.3417 - val_acc: 0.8939
    Epoch 11/20
    200000/200000 [==============================] - 19s - loss: 0.3542 - acc: 0.8911 - val_loss: 0.3392 - val_acc: 0.8967
    Epoch 12/20
    200000/200000 [==============================] - 20s - loss: 0.3476 - acc: 0.8928 - val_loss: 0.3350 - val_acc: 0.8966
    Epoch 13/20
    200000/200000 [==============================] - 19s - loss: 0.3419 - acc: 0.8940 - val_loss: 0.3334 - val_acc: 0.8977
    Epoch 14/20
    200000/200000 [==============================] - 19s - loss: 0.3381 - acc: 0.8952 - val_loss: 0.3288 - val_acc: 0.9008
    Epoch 15/20
    200000/200000 [==============================] - 20s - loss: 0.3326 - acc: 0.8971 - val_loss: 0.3286 - val_acc: 0.8994
    Epoch 16/20
    200000/200000 [==============================] - 20s - loss: 0.3273 - acc: 0.8989 - val_loss: 0.3248 - val_acc: 0.9001
    Epoch 17/20
    200000/200000 [==============================] - 19s - loss: 0.3237 - acc: 0.8996 - val_loss: 0.3246 - val_acc: 0.8998
    Epoch 18/20
    200000/200000 [==============================] - 20s - loss: 0.3198 - acc: 0.9003 - val_loss: 0.3180 - val_acc: 0.9028
    Epoch 19/20
    200000/200000 [==============================] - 18s - loss: 0.3181 - acc: 0.9009 - val_loss: 0.3209 - val_acc: 0.9015
    Epoch 20/20
    200000/200000 [==============================] - 18s - loss: 0.3131 - acc: 0.9022 - val_loss: 0.3155 - val_acc: 0.9028
    Test score: 0.139541323698
    Test accuracy: 0.957

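    We end up with a test accuracy of about 95.7%. Feel free to keep adjusting the network architecture to see whether you can improve on it. Have fun!

    The complete code: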

    from __future__ import print_function
    import numpy as np
    import tensorflow as tf
    from six.moves import cPickle as pickle
    from six.moves import range
    np.random.seed(1337)  # for reproducibility
    
    from keras.datasets import mnist
    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation
    from keras.optimizers import SGD
    from keras.utils import np_utils
    from keras.optimizers import RMSprop
    from keras.optimizers import Adam
    from keras.regularizers import l2
    
    batch_size = 128
    nb_classes = 10
    nb_epoch = 20
    
    pickle_file = 'notMNIST.pickle'
    
    with open(pickle_file, 'rb') as f:
      save = pickle.load(f)
      X_train = save['train_dataset']
      y_train = save['train_labels']
      X_valid = save['valid_dataset']
      y_valid = save['valid_labels']
      X_test = save['test_dataset']
      y_test = save['test_labels']
      del save  # hint to help gc free up memory
      print('Training set', X_train.shape, y_train.shape)
      print('Validation set', X_valid.shape, y_valid.shape)
      print('Test set', X_test.shape, y_test.shape)
    
    X_train = X_train.reshape(200000, 784)
    X_valid = X_valid.reshape(10000, 784)
    X_test = X_test.reshape(10000, 784)
    X_train = X_train.astype('float32')
    X_valid = X_valid.astype('float32')
    X_test = X_test.astype('float32')
    X_train /= 255
    X_valid /= 255
    X_test /= 255
    print(X_train.shape[0], 'train samples')
    print(X_valid.shape[0], 'valid samples')
    print(X_test.shape[0], 'test samples')
    
    # convert class vectors to binary class matrices
    Y_train = np_utils.to_categorical(y_train, nb_classes)
    Y_valid = np_utils.to_categorical(y_valid, nb_classes)
    Y_test = np_utils.to_categorical(y_test, nb_classes)
    
    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    
    model.add(Dense(10))
    model.add(Activation('softmax'))
    
    model.summary()
    
    model.compile(loss='categorical_crossentropy',
                  #optimizer=SGD(lr=0.01),
                  optimizer=Adam(),
                  metrics=['accuracy'])
    
    history = model.fit(X_train, Y_train,
                        batch_size=batch_size, nb_epoch=nb_epoch,
                        verbose=1, validation_data=(X_valid, Y_valid))
    score = model.evaluate(X_test, Y_test, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])
  • Original article: https://www.cnblogs.com/wdsunny/p/6672937.html