  • [tf.keras] Training AlexNet on cifar: OOM caused by an oversized dataset

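    Each cifar-10 image is 32×32, whereas AlexNet expects 224×224 input. (Some sources say 227×227. Zero-padding a 224×224 image by 2 pixels on each side gives 228×228, which yields the same 55×55 feature map from the first 11×11, stride-4 convolution as a 227×227 input.) One option is therefore to resize the cifar-10 images to 224×224. A better option is to change the input layer size and adjust the filter sizes accordingly; see cifar10_cnn.py for an example, although the network in cifar10_cnn.py is not AlexNet.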
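
    The 224 versus 227 remark can be checked with the usual output-size formula for an unpadded convolution, floor((n - k) / s) + 1. The sketch below is only an arithmetic check: the helper name conv_out is hypothetical, and the kernel size 11 and stride 4 are taken from the first Conv2D layer in the listings later in this post.

```python
def conv_out(n, kernel, stride, pad=0):
    """Output width of a convolution with symmetric zero padding `pad`."""
    return (n + 2 * pad - kernel) // stride + 1


# 224 px input with 2 px of zero padding per side (228 px in total)
out_224_padded = conv_out(224, kernel=11, stride=4, pad=2)
# 227 px input without padding
out_227 = conv_out(227, kernel=11, stride=4)
# 224 px input without padding, for comparison
out_224 = conv_out(224, kernel=11, stride=4)

print(out_224_padded, out_227, out_224)
```

    The padded 224 input and the unpadded 227 input give the same feature-map width, which is why both numbers circulate.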

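    The problem is that once cifar-10 is resized to 224×224, even 32 GB of RAM cannot hold the whole dataset: the normalization step (dividing every pixel by 255) already triggers an OOM (out of memory) error.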
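
    A back-of-the-envelope calculation shows why normalization is the step that fails. The sketch assumes all 60,000 cifar-10 images (50,000 train plus 10,000 test) are held in one process and ignores any temporary copies NumPy makes; the variable names are illustrative only.

```python
n_images = 60_000        # 50,000 training + 10,000 test images
values = 224 * 224 * 3   # values per resized RGB image


def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


uint8_gib = gib(n_images * values * 1)    # raw resized images
float32_gib = gib(n_images * values * 4)  # after astype(np.float32)
float64_gib = gib(n_images * values * 8)  # what uint8 / 255 would give without the cast

print(f"uint8:   {uint8_gib:.1f} GiB")
print(f"float32: {float32_gib:.1f} GiB")
print(f"float64: {float64_gib:.1f} GiB")
```

    The uint8 images fit comfortably, but the float32 copy alone already exceeds 32 GiB, before counting the uint8 original that is still in memory.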

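    There are two ways around this:
    1) Make the resize part of the model, for example by adding a layer that resizes one batch of images at a time. The 32×32 cifar-10 data then still fits entirely in memory.
    2) A more general approach: load only part of the data into memory at a time, and load the rest when it is needed.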

    Note: the AlexNet architecture in this post matches the one in PyTorch. AlexNet in pytorch/vision

    Method 1: add a Lambda layer that resizes the input images

    import numpy as np  # needed for the astype(np.float32) casts below
    import tensorflow as tf
    from tensorflow.keras import layers
    from tensorflow.keras import backend as K
    
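    # TF 1.x session setup (ConfigProto and Session moved to tf.compat.v1 in TF 2.x)
    K.clear_session()
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of claiming all of it
    K.set_session(tf.Session(config=config))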
    
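    # Hyperparameters
    learning_rate = 0.001  # same as Adam's default; not passed to the optimizer below
    epochs = 120
    batch_size = 32
    num_classes = 10       # cifar-10 has 10 classes
    drop_rate = 0.5        # assumption: undefined in the original; 0.5 is the torchvision AlexNet default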
    
    cifar10 = tf.keras.datasets.cifar10
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    
    x_train = x_train.astype(np.float32)
    x_test = x_test.astype(np.float32)
    
    x_train = x_train / 255
    x_test = x_test / 255
    
    model = tf.keras.models.Sequential([
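        # Lambda layer that resizes the input images; here each side is enlarged 7× (32 → 224)
        # Resizing defaults to nearest-neighbour interpolation. At the time of writing, using another interpolation meant editing the source of K.resize_images (TF 2.x adds an `interpolation` argument).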
        layers.Lambda(lambda img: K.resize_images(img, 7, 7, data_format='channels_last'), input_shape=(32, 32, 3)),
        layers.ZeroPadding2D(padding=(2, 2)),
        layers.Conv2D(64, (11, 11), strides=(4, 4), padding='valid', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2)),
    
        layers.Conv2D(192, (5, 5), strides=(1, 1), padding='same', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2)),
    
        layers.Conv2D(384, (3, 3), strides=(1, 1), padding='same', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2)),
    
        layers.Flatten(),
    
        layers.Dense(4096, activation='relu', kernel_initializer='he_uniform'),
        layers.Dropout(drop_rate),
        layers.Dense(4096, activation='relu', kernel_initializer='he_uniform'),
        layers.Dropout(drop_rate),
        layers.Dense(num_classes, activation='softmax', kernel_initializer='he_uniform')
    ])
    
    model.summary()
    
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
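    # x_val / y_val are never defined in this script, so hold out the last 10% of
    # the training data for validation instead (assumption: 10% mirrors Method 2)
    model.fit(x_train, y_train,
              epochs=epochs,
              batch_size=batch_size,
              verbose=2,
              validation_split=0.1)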
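
    To see what a 7× nearest-neighbour resize does to a batch, here is a small NumPy sketch that runs without TensorFlow. It illustrates the technique only and is not the code path K.resize_images uses internally; the function name upsample_nearest is hypothetical.

```python
import numpy as np


def upsample_nearest(batch, factor):
    """Repeat every pixel `factor` times along height and width (NHWC layout)."""
    batch = np.repeat(batch, factor, axis=1)  # stretch the height axis
    return np.repeat(batch, factor, axis=2)   # stretch the width axis


# A fake batch of four 32x32 RGB images
batch = np.random.randint(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
resized = upsample_nearest(batch, 7)

print(batch.shape, "->", resized.shape)
```

    Each source pixel becomes a 7×7 block of identical values, so only one batch at a time ever exists at 224×224 while the dataset in memory stays at 32×32.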
    

    Method 2: build a data generator with tensorflow.keras.utils.Sequence

    import tensorflow as tf
    from tensorflow.keras import layers
    from tensorflow.keras import backend as K
    from tensorflow.keras.utils import Sequence
    
    from sklearn.model_selection import StratifiedShuffleSplit
    
    import cv2
    import os
    import numpy as np
    import h5py
    import time
    
    class CIFAR10Sequence(Sequence):
        def __init__(self, x_set, y_set, batch_size):
            """
            :param x_set: hdf5
            :param y_set: hdf5
            :param batch_size: int
            """
            self.x, self.y = x_set, y_set
            self.batch_size = batch_size
    
        def __len__(self):
            return int(np.ceil(len(self.x) / float(self.batch_size)))
    
        def __getitem__(self, idx):
            batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
            batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
    
            batch_x = batch_x.astype(np.float32)
            batch_x = batch_x / 255
    
            return batch_x, batch_y
    
    
    def _resized_data():
        """
        Resize cifar-10 and save the result to 'data/cifar-10.h5'.
        Image size: [224, 224, 3]
        :return: None
        """
        cifar10 = tf.keras.datasets.cifar10
    
        (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    
        # time.clock() was removed in Python 3.8; perf_counter() is the replacement for timing
        start_time = time.perf_counter()
    
        x_train = np.array([cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC) for img in x_train])
        x_test = np.array([cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC) for img in x_test])
    
        # initialize
        x_val = np.array([])
        y_val = np.array([])
    
        # Stratified 90/10 split of the training set into train and validation
        sss = StratifiedShuffleSplit(n_splits=1, test_size=0.1, random_state=32)
        for train_index, val_index in sss.split(x_train, y_train):
            print("TRAIN:", train_index, "VAL:", val_index)
            x_train, x_val = x_train[train_index], x_train[val_index]
            y_train, y_val = y_train[train_index], y_train[val_index]
    
        end_time = time.perf_counter()
        print('Time spent resizing and splitting (s):', end_time - start_time)
    
        # Write everything to disk; create the target directory first,
        # because h5py does not create missing parent directories
        filename = 'data/cifar-10.h5'
        os.makedirs(os.path.dirname(filename), exist_ok=True)
        with h5py.File(filename, 'w') as h5f:
            h5f.create_dataset('x_train', data=x_train)
            h5f.create_dataset('y_train', data=y_train)
            h5f.create_dataset('x_val', data=x_val)
            h5f.create_dataset('y_val', data=y_val)
            h5f.create_dataset('x_test', data=x_test)
            h5f.create_dataset('y_test', data=y_test)
    
    
    def load_resized_data(filename='data/cifar-10.h5'):
        if not os.path.exists(filename):
            _resized_data()
        
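        # Do not close the h5 file, or the datasets can no longer be read. Opening it does not load the data into memory.
        # h5 datasets support slice reads, and those are fast.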
        h5f = h5py.File(filename, 'r')
        x_train = h5f['x_train']
        y_train = h5f['y_train']
        x_val = h5f['x_val']
        y_val = h5f['y_val']
        x_test = h5f['x_test']
        y_test = h5f['y_test']
    
        return (x_train, y_train), (x_val, y_val), (x_test, y_test)
    
    
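    # TF 1.x session setup (ConfigProto and Session moved to tf.compat.v1 in TF 2.x)
    K.clear_session()
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of claiming all of it
    K.set_session(tf.Session(config=config))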
    
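    # Hyperparameters
    learning_rate = 0.001  # same as Adam's default; not passed to the optimizer below
    epochs = 120
    batch_size = 32
    num_classes = 10       # cifar-10 has 10 classes
    drop_rate = 0.5        # assumption: undefined in the original; 0.5 is the torchvision AlexNet default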
    
    (x_train, y_train), (x_val, y_val), (x_test, y_test) = load_resized_data()
    
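    # x_val and x_test are h5py datasets, not NumPy arrays; [:] reads them into
    # memory first (roughly 3 GB and 6 GB as float32) so the division works
    x_val = x_val[:].astype(np.float32) / 255
    x_test = x_test[:].astype(np.float32) / 255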
    
    model = tf.keras.models.Sequential([
        layers.ZeroPadding2D(padding=(2, 2), input_shape=(224, 224, 3)),
        layers.Conv2D(64, (11, 11), strides=(4, 4), padding='valid', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2)),
    
        layers.Conv2D(192, (5, 5), strides=(1, 1), padding='same', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2)),
    
        layers.Conv2D(384, (3, 3), strides=(1, 1), padding='same', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu',
                      kernel_initializer='he_uniform'),
        layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2)),
    
        layers.Flatten(),
    
        layers.Dense(4096, activation='relu', kernel_initializer='he_uniform'),
        layers.Dropout(drop_rate),
        layers.Dense(4096, activation='relu', kernel_initializer='he_uniform'),
        layers.Dropout(drop_rate),
        layers.Dense(num_classes, activation='softmax', kernel_initializer='he_uniform')
    ])
    
    model.summary()
    
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
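    # shuffle defaults to True: after each epoch the CIFAR10Sequence indices are visited in random rather than sequential order. Shuffling therefore happens at the batch level, and the order of samples within a batch stays fixed.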
    model.fit_generator(CIFAR10Sequence(x_train, y_train, batch_size=batch_size),
                        # steps_per_epoch=int(np.ceil(len(x_train)/batch_size)),
                        epochs=epochs,
                        verbose=2,
                        callbacks=None,
                        validation_data=(x_val[:], y_val[:]))
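
    The batch arithmetic and the batch-level shuffling can be illustrated without Keras. In the sketch below, BatchSlicer is a hypothetical stand-in for CIFAR10Sequence that returns index ranges instead of image data; 45,000 is the size of the 90% training split and 32 is the batch size used above. The shuffle only imitates what the training loop does with the batch indices and is not Keras code.

```python
import math
import random


class BatchSlicer:
    """Hypothetical stand-in for CIFAR10Sequence that yields index ranges only."""

    def __init__(self, n_samples, batch_size):
        self.n_samples = n_samples
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch, counting a final partial batch
        return math.ceil(self.n_samples / self.batch_size)

    def __getitem__(self, idx):
        start = idx * self.batch_size
        stop = min(start + self.batch_size, self.n_samples)
        return range(start, stop)


seq = BatchSlicer(n_samples=45_000, batch_size=32)
n_batches = len(seq)
last_batch = seq[n_batches - 1]

# Batch-level shuffle: the visiting order of batches changes,
# but the samples inside each batch stay the same.
order = list(range(n_batches))
random.Random(0).shuffle(order)
first_visited = seq[order[0]]

print(n_batches, len(last_batch), first_visited)
```

    Because only whole batches are permuted, neighbouring samples in the h5 file always end up in the same batch. The data is already shuffled once by StratifiedShuffleSplit before it is written, which limits the impact of that.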
    

    References

    class CIFAR10Sequence(Sequence) -- github
    keras.utils.Sequence()
    AlexNet in pytorch/vision

  • Original post: https://www.cnblogs.com/wuliytTaotao/p/11191702.html