  • Handwritten Digit Recognition on MNIST in Python

    MNIST (Modified National Institute of Standards and Technology)

    MNIST is often called the beginner's village of computer vision: the CNN flavor of "hello world", and a first taste of TensorFlow. The dataset provides 28×28 grayscale matrices, and the task is to analyze them and recognize the digit in the original handwritten image.

    #### Loading the Dataset

        import pandas as pd

        train = pd.read_csv('./input/train.csv')
        test = pd.read_csv('./input/test.csv')
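
    The later snippets use X_train and Y_train, which are never derived in these excerpts. In the Kaggle Digit Recognizer data, train.csv packs the label column together with the 784 pixel columns, so a plausible split (a sketch; the exact handling is in the full source linked at the end) is:

    ```python
        # Separate the target digit from the 784 pixel columns
        Y_train = train["label"]
        X_train = train.drop(labels=["label"], axis=1)
    ```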
    

    #### Training Set Digit Overview

        # Count how many times each digit appears, as a bar chart
        import seaborn as sns
        import matplotlib.pyplot as plt

        g = sns.countplot(Y_train)
        plt.show()
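
    To back the visual impression with numbers, the per-class counts can also be printed directly (a quick check, not part of the original excerpt):

    ```python
        # Exact count of each digit class; a balanced set shows ten similar numbers
        print(Y_train.value_counts())
    ```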
    
    The digits all occur roughly the same number of times; there are no extreme imbalances.

    #### Original Data Processing

    The training set consists of 28×28 grayscale matrices whose values are integers from 0 to 255, where a larger value means a darker pixel, so we divide by 255 to convert to floats. The flat rows are then reshaped into single-channel images and the labels one-hot encoded.

    ```python
        from keras.utils.np_utils import to_categorical

        # Normalize pixel values from [0, 255] to [0, 1]
        X_train = X_train / 255.0
        test = test / 255.0

        # Reshape flat 784-pixel rows into 28x28 single-channel images
        X_train = X_train.values.reshape(-1, 28, 28, 1)
        test = test.values.reshape(-1, 28, 28, 1)

        # One-hot encode the labels, e.g. 3 -> [0,0,0,1,0,0,0,0,0,0]
        Y_train = to_categorical(Y_train, num_classes=10)
    ```
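
    The training call further down validates against X_val and Y_val, which never appear in these excerpts; presumably the full script carves a validation split out of the training data first. A minimal sketch (the split ratio and random_state are assumptions, not taken from the original):

    ```python
        from sklearn.model_selection import train_test_split

        # Hold out 10% of the training data for validation (ratio assumed)
        X_train, X_val, Y_train, Y_val = train_test_split(
            X_train, Y_train, test_size=0.1, random_state=2)
    ```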
    
    #### CNN Modeling

    The network stacks two [Conv2D → Conv2D → MaxPool → Dropout] blocks, flattens the result into a 256-unit fully connected layer, and finishes with a 10-way softmax. It is compiled with RMSprop, trained with a learning-rate-reduction callback, and fed augmented images in real time.

    ```python
        from datetime import datetime

        from keras.models import Sequential
        from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
        from keras.optimizers import RMSprop
        from keras.callbacks import ReduceLROnPlateau
        from keras.preprocessing.image import ImageDataGenerator

        model_begin = datetime.now()
        print(str(model_begin) + " model begin")
    
        model = Sequential()
        model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='Same',
                         activation='relu', input_shape=(28, 28, 1)))
        model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='Same',
                         activation='relu'))
        model.add(MaxPool2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))
    
        model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='Same',
                         activation='relu'))
        model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='Same',
                         activation='relu'))
        model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
        model.add(Dropout(0.25))
    
        model.add(Flatten())
        model.add(Dense(256, activation='relu'))
        model.add(Dropout(0.5))
        model.add(Dense(10, activation="softmax"))
    
        optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
    
        model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])
    
        learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc',
                                                    patience=3,
                                                    verbose=1,
                                                    factor=0.5,
                                                    min_lr=0.00001)
        # epochs=1 ,- 340s - loss: 0.4151 - acc: 0.8693 - val_loss: 0.0748 - val_acc: 0.9779
        # epochs=10,- 309s - loss: 0.0633 - acc: 0.9823 - val_loss: 0.0222 - val_acc: 0.9945
        epochs = 1
        batch_size = 86
    
        datagen = ImageDataGenerator(
            featurewise_center=False,             # do not center over the dataset
            samplewise_center=False,              # do not center each sample
            featurewise_std_normalization=False,  # no dataset-wide std normalization
            samplewise_std_normalization=False,   # no per-sample std normalization
            zca_whitening=False,                  # no ZCA whitening
            rotation_range=10,                    # randomly rotate up to 10 degrees
            zoom_range=0.1,                       # randomly zoom by up to 10%
            width_shift_range=0.1,                # random horizontal shift up to 10%
            height_shift_range=0.1,               # random vertical shift up to 10%
            horizontal_flip=False,                # flips would change digit identity
            vertical_flip=False)
    
        datagen.fit(X_train)
    
        history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                                      epochs=epochs, validation_data=(X_val, Y_val),
                                      verbose=2, steps_per_epoch=X_train.shape[0] // batch_size,
                                      callbacks=[learning_rate_reduction])
    ```
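
    As a sanity check on the architecture (not part of the original excerpt), model.summary() prints the per-layer output shapes. With 'same' padding, the two pooling stages take the 28×28 input down to 14×14 and then 7×7, so Flatten feeds 7*7*64 = 3136 values into the 256-unit dense layer:

    ```python
        # Expected output shapes, computed by hand:
        #   Conv2D x2 ('same' padding): (None, 28, 28, 32)
        #   MaxPool2D 2x2:              (None, 14, 14, 32)
        #   Conv2D x2 ('same' padding): (None, 14, 14, 64)
        #   MaxPool2D 2x2, stride 2:    (None, 7, 7, 64)
        #   Flatten: (None, 3136) -> Dense(256) -> Dense(10)
        model.summary()
    ```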
    

    #### Training Set Error Analysis

    ```python
        import itertools
        import numpy as np

        # Wrapped as a function so cm/classes/normalize/title/cmap become
        # parameters (the signature is assumed; the excerpt showed only the body)
        def plot_confusion_matrix(cm, classes, normalize=False,
                                  title='Confusion matrix', cmap=plt.cm.Blues):
            # Normalize first (if requested) so the colors and the printed
            # values agree; the excerpt normalized after plt.imshow
            if normalize:
                cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

            plt.imshow(cm, interpolation='nearest', cmap=cmap)
            plt.title(title)
            plt.colorbar()
            tick_marks = np.arange(len(classes))
            plt.xticks(tick_marks, classes, rotation=45)
            plt.yticks(tick_marks, classes)

            thresh = cm.max() / 2.
            for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
                plt.text(j, i, cm[i, j],
                         horizontalalignment="center",
                         color="white" if cm[i, j] > thresh else "black")

            plt.tight_layout()
            plt.ylabel('True label')
            plt.xlabel('Predicted label')

            plt.savefig('./output_cnn/matrix.png')
            plt.show()
    ```
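
    The matrix itself has to be computed from the validation predictions before it can be plotted; that step is missing from the excerpt, but it presumably looks something like this (sklearn's confusion_matrix; the variable names are assumptions):

    ```python
        from sklearn.metrics import confusion_matrix

        # Predict on the validation split and collapse one-hot vectors to digits
        Y_pred = model.predict(X_val)
        Y_pred_classes = np.argmax(Y_pred, axis=1)
        Y_true = np.argmax(Y_val, axis=1)

        confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
        plot_confusion_matrix(confusion_mtx, classes=range(10))
    ```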
    
    The x-axis is the predicted digit and the y-axis the true digit. 5 is mistaken for 6, and 3 for 8, more often than other pairs; these digits are close in shape, so handwriting leaves some room for confusion.

    #### Viewing the Actual Images of Misclassified Digits

    The error index and label arrays used here are prepared from the validation predictions, as sketched after this block.

    ```python
        # Plot a 3x3 grid of misclassified digits, titled predicted vs. true label
        n = 0
        nrows = 3
        ncols = 3
        fig, ax = plt.subplots(nrows, ncols, sharex=True, sharey=True)
        for row in range(nrows):
            for col in range(ncols):
                error = errors_index[n]
                ax[row, col].imshow((img_errors[error]).reshape((28, 28)))
                ax[row, col].set_title("Predicted label :{} True label :{}".format(
                    pred_errors[error], obs_errors[error]))
                n += 1

        plt.savefig('./output_cnn/errors.png')
        plt.show()
    ```
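
    The excerpt never derives those arrays; a plausible preparation (all of these names are assumptions) is:

    ```python
        # Boolean mask of validation samples the model got wrong
        errors = (Y_pred_classes - Y_true != 0)

        pred_errors = Y_pred_classes[errors]   # predicted labels of the errors
        obs_errors = Y_true[errors]            # true labels of the errors
        img_errors = X_val[errors]             # the misclassified images

        # Indices into the error arrays to display; the first nine here
        # (assumes at least nine misclassifications)
        errors_index = np.arange(len(pred_errors))[:9]
    ```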
    
    ![Examples of misclassified digits](https://img2018.cnblogs.com/blog/841731/201905/841731-20190519142832900-1437250228.png)

    Some of the handwriting is quite scrawled; even the human eye could plausibly misread a few of these.
    #### Outputting the Predictions

    ```python
        # Predict class probabilities on the test set and keep the most likely digit
        results = model.predict(test)
        results = np.argmax(results, axis=1)
        results = pd.Series(results, name="Label")

        # Kaggle expects an ImageId column (1-based) next to the Label column
        submission = pd.concat([pd.Series(range(1, 28001), name="ImageId"), results], axis=1)
        submission.to_csv("./output_cnn/mnist_cnn.csv", index=False)
    ```
    

    #### Output Log

    2019-05-12 18:43:05.861004 digit-recongizer begin
    2019-05-12 18:43:09.434510 model begin
    Epoch 1/1
    2019-05-12 18:43:10.537447: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
     - 306s - loss: 0.4166 - acc: 0.8679 - val_loss: 0.0808 - val_acc: 0.9726
    2019-05-12 18:48:16.955573 error begin
    2019-05-12 18:48:25.481250 matrix begin
    2019-05-12 18:48:26.292335 display_errors begin
    2019-05-12 18:48:27.402511 predict begin
    2019-05-12 18:49:28.578289 digit-recongizer end
    

    #### Uploading the Prediction Results to Kaggle

    #### Second Run: epochs = 10

    Using TensorFlow backend.
    2019-05-19 13:10:45.624923 digit-recongizer begin
    2019-05-19 13:10:49.557691 model begin
    Epoch 1/10
    2019-05-19 13:10:51.337148: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
     - 311s - loss: 0.4112 - acc: 0.8695 - val_loss: 0.0765 - val_acc: 0.9771
    Epoch 2/10
     - 294s - loss: 0.1281 - acc: 0.9622 - val_loss: 0.0400 - val_acc: 0.9860
    Epoch 3/10
     - 298s - loss: 0.0940 - acc: 0.9717 - val_loss: 0.0367 - val_acc: 0.9895
    Epoch 4/10
     - 318s - loss: 0.0785 - acc: 0.9765 - val_loss: 0.0317 - val_acc: 0.9895
    Epoch 5/10
     - 303s - loss: 0.0701 - acc: 0.9798 - val_loss: 0.0384 - val_acc: 0.9888
    Epoch 6/10
     - 301s - loss: 0.0678 - acc: 0.9799 - val_loss: 0.0315 - val_acc: 0.9910
    Epoch 7/10
     - 291s - loss: 0.0635 - acc: 0.9811 - val_loss: 0.0342 - val_acc: 0.9898
    Epoch 8/10
     - 293s - loss: 0.0585 - acc: 0.9830 - val_loss: 0.0312 - val_acc: 0.9921
    Epoch 9/10
     - 292s - loss: 0.0606 - acc: 0.9829 - val_loss: 0.0202 - val_acc: 0.9943
    Epoch 10/10
     - 309s - loss: 0.0633 - acc: 0.9823 - val_loss: 0.0222 - val_acc: 0.9945
    2019-05-19 14:01:01.464350 error begin
    2019-05-19 14:01:09.997218 matrix begin
    2019-05-19 14:01:10.969481 display_errors begin
    2019-05-19 14:01:13.028658 predict begin
    2019-05-19 14:02:23.559788 digit-recongizer end
    

    As epochs increases, accuracy keeps improving slowly, but training also takes longer and longer.

    #### Checking System Resources

    The MacBook Pro ran at close to full CPU load for about an hour: each epoch took roughly 5 minutes, so 10 epochs came to just under an hour.

    #### Uploading the Prediction Results to Kaggle

    Accuracy reached 0.992. That will do for now; I will look at other tuning and optimization options later.

    #### Full Code and Dataset Download

    GitHub source code

  • Original article: https://www.cnblogs.com/wanli002/p/10888379.html