I. Getting the slim module code from models
python -c "import tensorflow.contrib.slim as slim; eval = slim.evaluation.evaluate_once"
2. Download the models module
To use TF-Slim for image classification, you also have to install the TF-Slim image models library, which is not part of the core TF library. To do this, check out the tensorflow/models repository as follows:
cd $HOME/workspace
git clone
This will put the TF-Slim image models library in $HOME/workspace/models/research/slim. (It will also create a directory called models/inception, which contains an older version of slim; you can safely ignore this.)
To verify that this has worked, execute the following commands; it should run without raising any errors.
cd $HOME/workspace/models/research/slim
python -c "from nets import cifarnet; mynet = cifarnet.cifarnet"
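II. Directory structure of slim in models
slim lives under models-master\research\slim and contains 5 folders:
- datasets: code for handling datasets.
- deployment: deployment. By creating clones, it supports distributed training across machines, with synchronous or asynchronous computation on multiple CPUs and GPUs.
- nets: holds the various network models.
- preprocessing: image preprocessing functions for the various networks.
- scripts: example scripts for running the network models; these scripts only work on systems that provide a shell.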
imagenet_map = imagenet.create_readable_names_for_imagenet_labels()

# ==============================================================================
"""Contains the definition of the Inception Resnet V2 architecture.

As described in

  Inception-v4, Inception-ResNet and the Impact of Residual Connections
    on Learning
  Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

slim = tf.contrib.slim


def block35(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
  """Builds the 35x35 resnet block."""
  with tf.variable_scope(scope, 'Block35', [net], reuse=reuse):
    with tf.variable_scope('Branch_0'):
      tower_conv = slim.conv2d(net, 32, 1, scope='Conv2d_1x1')
    with tf.variable_scope('Branch_1'):
      tower_conv1_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1')
      tower_conv1_1 = slim.conv2d(tower_conv1_0, 32, 3, scope='Conv2d_0b_3x3')
    with tf.variable_scope('Branch_2'):
      tower_conv2_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1')
      tower_conv2_1 = slim.conv2d(tower_conv2_0, 48, 3, scope='Conv2d_0b_3x3')
      tower_conv2_2 = slim.conv2d(tower_conv2_1, 64, 3, scope='Conv2d_0c_3x3')
    mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_1, tower_conv2_2])
    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                     activation_fn=None, scope='Conv2d_1x1')
    scaled_up = up * scale
    if activation_fn == tf.nn.relu6:
      # Use clip_by_value to simulate bandpass activation.
      scaled_up = tf.clip_by_value(scaled_up, -6.0, 6.0)

    net += scaled_up
    if activation_fn:
      net = activation_fn(net)
  return net


def block17(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
  """Builds the 17x17 resnet block."""
  with tf.variable_scope(scope, 'Block17', [net], reuse=reuse):
    with tf.variable_scope('Branch_0'):
      tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1')
    with tf.variable_scope('Branch_1'):
      tower_conv1_0 = slim.conv2d(net, 128, 1, scope='Conv2d_0a_1x1')
      tower_conv1_1 = slim.conv2d(tower_conv1_0, 160, [1, 7],
                                  scope='Conv2d_0b_1x7')
      tower_conv1_2 = slim.conv2d(tower_conv1_1, 192, [7, 1],
                                  scope='Conv2d_0c_7x1')
    mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_2])
    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                     activation_fn=None, scope='Conv2d_1x1')

    scaled_up = up * scale
    if activation_fn == tf.nn.relu6:
      # Use clip_by_value to simulate bandpass activation.
      scaled_up = tf.clip_by_value(scaled_up, -6.0, 6.0)

    net += scaled_up
    if activation_fn:
      net = activation_fn(net)
  return net


def block8(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
  """Builds the 8x8 resnet block."""
  with tf.variable_scope(scope, 'Block8', [net], reuse=reuse):
    with tf.variable_scope('Branch_0'):
      tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1')
    with tf.variable_scope('Branch_1'):
      tower_conv1_0 = slim.conv2d(net, 192, 1, scope='Conv2d_0a_1x1')
      tower_conv1_1 = slim.conv2d(tower_conv1_0, 224, [1, 3],
                                  scope='Conv2d_0b_1x3')
      tower_conv1_2 = slim.conv2d(tower_conv1_1, 256, [3, 1],
                                  scope='Conv2d_0c_3x1')
    mixed = tf.concat(axis=3, values=[tower_conv, tower_conv1_2])
    up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                     activation_fn=None, scope='Conv2d_1x1')

    scaled_up = up * scale
    if activation_fn == tf.nn.relu6:
      # Use clip_by_value to simulate bandpass activation.
      scaled_up = tf.clip_by_value(scaled_up, -6.0, 6.0)

    net += scaled_up
    if activation_fn:
      net = activation_fn(net)
  return net


def inception_resnet_v2_base(inputs,
                             final_endpoint='Conv2d_7b_1x1',
                             output_stride=16,
                             align_feature_maps=False,
                             scope=None,
                             activation_fn=tf.nn.relu):
  """Inception model from

  Constructs an Inception Resnet v2 network from inputs to the given final
  endpoint. This method can construct the network up to the final inception
  block Conv2d_7b_1x1.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',
      'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3', 'MaxPool_5a_3x3',
      'Mixed_5b', 'Mixed_6a', 'PreAuxLogits', 'Mixed_7a', 'Conv2d_7b_1x1']
    output_stride: A scalar that specifies the requested ratio of input to
      output spatial resolution. Only supports 8 and 16.
    align_feature_maps: When true, changes all the VALID paddings in the network
      to SAME padding so that the feature maps are aligned.
    scope: Optional variable_scope.
    activation_fn: Activation function for block scopes.

  Returns:
    tensor_out: output tensor corresponding to the final_endpoint.
    end_points: a set of activations for external use, for example summaries or
                losses.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values,
      or if the output_stride is not 8 or 16, or if the output_stride is 8 and
      we request an end point after 'PreAuxLogits'.
  """
  if output_stride != 8 and output_stride != 16:
    raise ValueError('output_stride must be 8 or 16.')

  padding = 'SAME' if align_feature_maps else 'VALID'

  end_points = {}

  def add_and_check_final(name, net):
    end_points[name] = net
    return name == final_endpoint

  with tf.variable_scope(scope, 'InceptionResnetV2', [inputs]):
    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                        stride=1, padding='SAME'):
      # 149 x 149 x 32
      net = slim.conv2d(inputs, 32, 3, stride=2, padding=padding,
                        scope='Conv2d_1a_3x3')
      if add_and_check_final('Conv2d_1a_3x3', net): return net, end_points

      # 147 x 147 x 32
      net = slim.conv2d(net, 32, 3, padding=padding,
                        scope='Conv2d_2a_3x3')
      if add_and_check_final('Conv2d_2a_3x3', net): return net, end_points
      # 147 x 147 x 64
      net = slim.conv2d(net, 64, 3, scope='Conv2d_2b_3x3')
      if add_and_check_final('Conv2d_2b_3x3', net): return net, end_points
      # 73 x 73 x 64
      net = slim.max_pool2d(net, 3, stride=2, padding=padding,
                            scope='MaxPool_3a_3x3')
      if add_and_check_final('MaxPool_3a_3x3', net): return net, end_points
      # 73 x 73 x 80
      net = slim.conv2d(net, 80, 1, padding=padding,
                        scope='Conv2d_3b_1x1')
      if add_and_check_final('Conv2d_3b_1x1', net): return net, end_points
      # 71 x 71 x 192
      net = slim.conv2d(net, 192, 3, padding=padding,
                        scope='Conv2d_4a_3x3')
      if add_and_check_final('Conv2d_4a_3x3', net): return net, end_points
      # 35 x 35 x 192
      net = slim.max_pool2d(net, 3, stride=2, padding=padding,
                            scope='MaxPool_5a_3x3')
      if add_and_check_final('MaxPool_5a_3x3', net): return net, end_points

      # 35 x 35 x 320
      with tf.variable_scope('Mixed_5b'):
        with tf.variable_scope('Branch_0'):
          tower_conv = slim.conv2d(net, 96, 1, scope='Conv2d_1x1')
        with tf.variable_scope('Branch_1'):
          tower_conv1_0 = slim.conv2d(net, 48, 1, scope='Conv2d_0a_1x1')
          tower_conv1_1 = slim.conv2d(tower_conv1_0, 64, 5,
                                      scope='Conv2d_0b_5x5')
        with tf.variable_scope('Branch_2'):
          tower_conv2_0 = slim.conv2d(net, 64, 1, scope='Conv2d_0a_1x1')
          tower_conv2_1 = slim.conv2d(tower_conv2_0, 96, 3,
                                      scope='Conv2d_0b_3x3')
          tower_conv2_2 = slim.conv2d(tower_conv2_1, 96, 3,
                                      scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          tower_pool = slim.avg_pool2d(net, 3, stride=1, padding='SAME',
                                       scope='AvgPool_0a_3x3')
          tower_pool_1 = slim.conv2d(tower_pool, 64, 1,
                                     scope='Conv2d_0b_1x1')
        net = tf.concat(
            [tower_conv, tower_conv1_1, tower_conv2_2, tower_pool_1], 3)

      if add_and_check_final('Mixed_5b', net): return net, end_points
      # TODO(alemi): Register intermediate endpoints
      net = slim.repeat(net, 10, block35, scale=0.17,
                        activation_fn=activation_fn)

      # 17 x 17 x 1088 if output_stride == 8,
      # 33 x 33 x 1088 if output_stride == 16
      use_atrous = output_stride == 8

      with tf.variable_scope('Mixed_6a'):
        with tf.variable_scope('Branch_0'):
          tower_conv = slim.conv2d(net, 384, 3, stride=1 if use_atrous else 2,
                                   padding=padding,
                                   scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_1'):
          tower_conv1_0 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
          tower_conv1_1 = slim.conv2d(tower_conv1_0, 256, 3,
                                      scope='Conv2d_0b_3x3')
          tower_conv1_2 = slim.conv2d(tower_conv1_1, 384, 3,
                                      stride=1 if use_atrous else 2,
                                      padding=padding,
                                      scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_2'):
          tower_pool = slim.max_pool2d(net, 3, stride=1 if use_atrous else 2,
                                       padding=padding,
                                       scope='MaxPool_1a_3x3')
        net = tf.concat([tower_conv, tower_conv1_2, tower_pool], 3)

      if add_and_check_final('Mixed_6a', net): return net, end_points

      # TODO(alemi): register intermediate endpoints
      with slim.arg_scope([slim.conv2d], rate=2 if use_atrous else 1):
        net = slim.repeat(net, 20, block17, scale=0.10,
                          activation_fn=activation_fn)
      if add_and_check_final('PreAuxLogits', net): return net, end_points

      if output_stride == 8:
        # TODO(gpapan): Properly support output_stride for the rest of the net.
        raise ValueError('output_stride==8 is only supported up to the '
                         'PreAuxlogits end_point for now.')

      # 8 x 8 x 2080
      with tf.variable_scope('Mixed_7a'):
        with tf.variable_scope('Branch_0'):
          tower_conv = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
          tower_conv_1 = slim.conv2d(tower_conv, 384, 3, stride=2,
                                     padding=padding,
                                     scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_1'):
          tower_conv1 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
          tower_conv1_1 = slim.conv2d(tower_conv1, 288, 3, stride=2,
                                      padding=padding,
                                      scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_2'):
          tower_conv2 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
          tower_conv2_1 = slim.conv2d(tower_conv2, 288, 3,
                                      scope='Conv2d_0b_3x3')
          tower_conv2_2 = slim.conv2d(tower_conv2_1, 320, 3, stride=2,
                                      padding=padding,
                                      scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_3'):
          tower_pool = slim.max_pool2d(net, 3, stride=2,
                                       padding=padding,
                                       scope='MaxPool_1a_3x3')
        net = tf.concat(
            [tower_conv_1, tower_conv1_1, tower_conv2_2, tower_pool], 3)

      if add_and_check_final('Mixed_7a', net): return net, end_points

      # TODO(alemi): register intermediate endpoints
      net = slim.repeat(net, 9, block8, scale=0.20, activation_fn=activation_fn)
      net = block8(net, activation_fn=None)

      # 8 x 8 x 1536
      net = slim.conv2d(net, 1536, 1, scope='Conv2d_7b_1x1')
      if add_and_check_final('Conv2d_7b_1x1', net): return net, end_points

    raise ValueError('final_endpoint (%s) not recognized', final_endpoint)


def inception_resnet_v2(inputs, num_classes=1001, is_training=True,
                        dropout_keep_prob=0.8,
                        reuse=None,
                        scope='InceptionResnetV2',
                        create_aux_logits=True,
                        activation_fn=tf.nn.relu):
  """Creates the Inception Resnet V2 model.

  Args:
    inputs: a 4-D tensor of size [batch_size, height, width, 3].
      Dimension batch_size may be undefined. If create_aux_logits is false,
      also height and width may be undefined.
    num_classes: number of predicted classes. If 0 or None, the logits layer
      is omitted and the input features to the logits layer (before dropout)
      are returned instead.
    is_training: whether is training or not.
    dropout_keep_prob: float, the fraction to keep before final layer.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.
    create_aux_logits: Whether to include the auxilliary logits.
    activation_fn: Activation function for conv2d.

  Returns:
    net: the output of the logits layer (if num_classes is a non-zero integer),
      or the non-dropped-out input to the logits layer (if num_classes is 0 or
      None).
    end_points: the set of end_points from the inception model.
  """
  end_points = {}

  with tf.variable_scope(scope, 'InceptionResnetV2', [inputs],
                         reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):

      net, end_points = inception_resnet_v2_base(inputs, scope=scope,
                                                 activation_fn=activation_fn)

      if create_aux_logits and num_classes:
        with tf.variable_scope('AuxLogits'):
          aux = end_points['PreAuxLogits']
          aux = slim.avg_pool2d(aux, 5, stride=3, padding='VALID',
                                scope='Conv2d_1a_3x3')
          aux = slim.conv2d(aux, 128, 1, scope='Conv2d_1b_1x1')
          aux = slim.conv2d(aux, 768, aux.get_shape()[1:3],
                            padding='VALID', scope='Conv2d_2a_5x5')
          aux = slim.flatten(aux)
          aux = slim.fully_connected(aux, num_classes, activation_fn=None,
                                     scope='Logits')
          end_points['AuxLogits'] = aux

      with tf.variable_scope('Logits'):
        # TODO(sguada,arnoegw): Consider adding a parameter global_pool which
        # can be set to False to disable pooling here (as in resnet_*()).
        kernel_size = net.get_shape()[1:3]
        if kernel_size.is_fully_defined():
          net = slim.avg_pool2d(net, kernel_size, padding='VALID',
                                scope='AvgPool_1a_8x8')
        else:
          net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool')
        end_points['global_pool'] = net
        if not num_classes:
          return net, end_points
        net = slim.flatten(net)
        net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                           scope='Dropout')
        end_points['PreLogitsFlatten'] = net
        logits = slim.fully_connected(net, num_classes, activation_fn=None,
                                      scope='Logits')
        end_points['Logits'] = logits
        end_points['Predictions'] = tf.nn.softmax(logits, name='Predictions')

    return logits, end_points
inception_resnet_v2.default_image_size = 299


def inception_resnet_v2_arg_scope(weight_decay=0.00004,
                                  batch_norm_decay=0.9997,
                                  batch_norm_epsilon=0.001,
                                  activation_fn=tf.nn.relu):
  """Returns the scope with the default parameters for inception_resnet_v2.

  Args:
    weight_decay: the weight decay for weights variables.
    batch_norm_decay: decay for the moving average of batch_norm momentums.
    batch_norm_epsilon: small float added to variance to avoid dividing by zero.
    activation_fn: Activation function for conv2d.

  Returns:
    a arg_scope with the parameters needed for inception_resnet_v2.
  """
  # Set weight_decay for weights in conv2d and fully_connected layers.
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      weights_regularizer=slim.l2_regularizer(weight_decay),
                      biases_regularizer=slim.l2_regularizer(weight_decay)):

    batch_norm_params = {
        'decay': batch_norm_decay,
        'epsilon': batch_norm_epsilon,
        'fused': None,  # Use fused batch norm if possible.
    }
    # Set activation_fn and parameters for batch_norm.
    with slim.arg_scope([slim.conv2d], activation_fn=activation_fn,
                        normalizer_fn=slim.batch_norm,
                        normalizer_params=batch_norm_params) as scope:
      return scope
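- inception_resnet_v2.default_image_size: the default input image size (299).
- inception_resnet_v2_base: implements the base structure of inception_resnet_v2 and outputs the raw feature maps of the network. By default its output is passed on to the inception_resnet_v2 function, and you normally do not change its internals. When you want a custom output layer, you feed its output into your own function in place of inception_resnet_v2.
- inception_resnet_v2: the implementation of the full inception_resnet_v2 network. It returns two values: the prediction logits, and end_points, a dictionary of intermediate activations (including AuxLogits and Predictions). The end_points are meant for display or analysis, for example in summaries and losses.
- inception_resnet_v2_arg_scope: returns the arg_scope that holds the default arguments for the model (weight decay, batch-norm parameters, activation function). When modifying or using the model from outside, you can build it inside this same arg_scope.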
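The size comments in inception_resnet_v2_base (149 x 149, 147 x 147, 73 x 73 and so on) follow from the usual output-size arithmetic for VALID and SAME padding. Below is a minimal sketch in plain Python, assuming a 299 x 299 input (the default_image_size), align_feature_maps=False and output_stride=16. The helper names out_size and trace_sizes and the LAYERS table are made up for this illustration and are not part of slim.

```python
import math


def out_size(size, kernel, stride, padding):
    """Spatial output size of a conv/pool layer along one dimension."""
    if padding == 'SAME':
        return math.ceil(size / stride)
    # VALID: only positions where the kernel fits entirely inside the input.
    return math.ceil((size - kernel + 1) / stride)


# (scope, kernel, stride, padding) for the layers that can change the size.
# Mixed_5b and the repeated residual blocks use SAME padding with stride 1,
# so they keep the size and are left out of this table.
LAYERS = [
    ('Conv2d_1a_3x3', 3, 2, 'VALID'),
    ('Conv2d_2a_3x3', 3, 1, 'VALID'),
    ('Conv2d_2b_3x3', 3, 1, 'SAME'),
    ('MaxPool_3a_3x3', 3, 2, 'VALID'),
    ('Conv2d_3b_1x1', 1, 1, 'VALID'),
    ('Conv2d_4a_3x3', 3, 1, 'VALID'),
    ('MaxPool_5a_3x3', 3, 2, 'VALID'),
    ('Mixed_6a', 3, 2, 'VALID'),
    ('Mixed_7a', 3, 2, 'VALID'),
]


def trace_sizes(input_size=299):
    """Return {scope: spatial size} after each layer in LAYERS."""
    sizes = {}
    size = input_size
    for scope, kernel, stride, padding in LAYERS:
        size = out_size(size, kernel, stride, padding)
        sizes[scope] = size
    return sizes


if __name__ == '__main__':
    for scope, size in trace_sizes().items():
        print('%-15s %3d x %3d' % (scope, size, size))
```

Running it walks the input from 299 down to the 8 x 8 grid that feeds the final 1x1 convolution, which is a quick way to check what a different input size would do to the feature maps.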
III. Dataset handling in slim
As part of this library, we've included scripts to download several popular image datasets (listed below) and convert them to slim format.
2 Download a dataset and convert it to TFRecord format
For each dataset, we'll need to download the raw data and convert it to TensorFlow's native TFRecord format. Each TFRecord contains a TF-Example protocol buffer. Below we demonstrate how to do this for the Flowers dataset.
$ DATA_DIR=/tmp/data/flowers
$ python --dataset_name=flowers --dataset_dir="${DATA_DIR}"
When the script finishes you will find several TFRecord files created:
These represent the training and validation data, sharded over 5 files each. You will also find the $DATA_DIR/labels.txt file which contains the mapping from integer labels to class names.
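A labels file of this kind can be read back into a dictionary with a few lines of plain Python. This is only a sketch: it assumes each line has the form label:class name, and the function name parse_label_lines and the sample class names are hypothetical.

```python
def parse_label_lines(lines):
    """Turn lines such as '0:daisy' into {0: 'daisy'}."""
    labels_to_names = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline
        # Split on the first colon only, in case a class name contains one.
        label, name = line.split(':', 1)
        labels_to_names[int(label)] = name
    return labels_to_names


if __name__ == '__main__':
    sample = ['0:daisy\n', '1:dandelion\n', '2:roses\n']
    print(parse_label_lines(sample))
```

With a real file you would pass the open file object instead of the sample list, since iterating over a file yields its lines.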
You can use the same script to create the mnist and cifar10 datasets. However, for ImageNet, you have to follow the instructions here. Note that you first have to sign up for an account. Also, the download can take several hours, and could use up to 500GB.
Here I walk through the executed code in detail. Open the file; its contents are as follows:

# ==============================================================================
r"""Downloads and converts a particular dataset.

Usage:
```shell

$ python --dataset_name=mnist --dataset_dir=/tmp/mnist

$ python --dataset_name=cifar10 --dataset_dir=/tmp/cifar10

$ python --dataset_name=flowers --dataset_dir=/tmp/flowers
```
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

from datasets import download_and_convert_cifar10
from datasets import download_and_convert_flowers
from datasets import download_and_convert_mnist

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string(
    'dataset_name',
    None,
    'The name of the dataset to convert, one of "cifar10", "flowers", "mnist".')

tf.app.flags.DEFINE_string(
    'dataset_dir',
    None,
    'The directory where the output TFRecords and temporary files are saved.')


def main(_):
  if not FLAGS.dataset_name:
    raise ValueError('You must supply the dataset name with --dataset_name')
  if not FLAGS.dataset_dir:
    raise ValueError('You must supply the dataset directory with --dataset_dir')

  # Dispatch to the run() function of the matching datasets module.
  if FLAGS.dataset_name == 'cifar10':
    download_and_convert_cifar10.run(FLAGS.dataset_dir)
  elif FLAGS.dataset_name == 'flowers':
    download_and_convert_flowers.run(FLAGS.dataset_dir)
  elif FLAGS.dataset_name == 'mnist':
    download_and_convert_mnist.run(FLAGS.dataset_dir)
  else:
    raise ValueError(
        'dataset_name [%s] was not recognized.' % FLAGS.dataset_name)

if __name__ == '__main__':
  tf.app.run()
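- The program is launched through a function that parses the command-line arguments and passes them to flags. Running the command above is therefore equivalent to setting FLAGS.dataset_name='flowers' and FLAGS.dataset_dir='/tmp/data/flowers'.
- main runs and then calls the function below. That function downloads the dataset, unpacks it, converts it to TFRecord format, and deletes the downloaded dataset files.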
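The flag handling and dispatch described above can be imitated with the standard library. The sketch below uses argparse as a stand-in for the flags mechanism (it is not the TensorFlow API), and the CONVERTERS table with its placeholder functions is hypothetical: it only reports what a real converter would be asked to do.

```python
import argparse


def _placeholder(name):
    """Return a fake converter that just reports what it would do."""
    def run(dataset_dir):
        return 'would convert %s into %s' % (name, dataset_dir)
    return run


# Hypothetical dispatch table standing in for the three dataset modules.
CONVERTERS = {name: _placeholder(name)
              for name in ('cifar10', 'flowers', 'mnist')}


def main(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('--dataset_name', default=None)
    parser.add_argument('--dataset_dir', default=None)
    flags = parser.parse_args(argv)

    if not flags.dataset_name:
        raise ValueError('You must supply the dataset name with --dataset_name')
    if not flags.dataset_dir:
        raise ValueError('You must supply the dataset directory with --dataset_dir')
    if flags.dataset_name not in CONVERTERS:
        raise ValueError(
            'dataset_name [%s] was not recognized.' % flags.dataset_name)
    return CONVERTERS[flags.dataset_name](flags.dataset_dir)


if __name__ == '__main__':
    print(main(['--dataset_name=flowers', '--dataset_dir=/tmp/data/flowers']))
```

A dictionary lookup replaces the if/elif chain here purely to keep the sketch short; the checks and error messages mirror the ones in the script.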
def run(dataset_dir):
  """Runs the download and conversion operation.

  Args:
    dataset_dir: The dataset directory where the dataset is stored.
  """
  if not tf.gfile.Exists(dataset_dir):
    tf.gfile.MakeDirs(dataset_dir)

  if _dataset_exists(dataset_dir):
    print('Dataset files already exist. Exiting without re-creating them.')
    return

  dataset_utils.download_and_uncompress_tarball(_DATA_URL, dataset_dir)
  photo_filenames, class_names = _get_filenames_and_classes(dataset_dir)
  class_names_to_ids = dict(zip(class_names, range(len(class_names))))

  # Divide into train and test:
  random.seed(_RANDOM_SEED)
  random.shuffle(photo_filenames)
  training_filenames = photo_filenames[_NUM_VALIDATION:]
  validation_filenames = photo_filenames[:_NUM_VALIDATION]

  # First, convert the training and validation sets.
  _convert_dataset('train', training_filenames, class_names_to_ids,
                   dataset_dir)
  _convert_dataset('validation', validation_filenames, class_names_to_ids,
                   dataset_dir)

  # Finally, write the labels file:
  labels_to_class_names = dict(zip(range(len(class_names)), class_names))
  dataset_utils.write_label_file(labels_to_class_names, dataset_dir)

  _clean_up_temporary_files(dataset_dir)
  print('\nFinished converting the Flowers dataset!')
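- Check whether the dataset_dir folder exists, and create it if it does not.
- Check whether all the TFRecord files already exist under dataset_dir, and exit if they do.
- Download the dataset from _DATA_URL and unpack it into dataset_dir.
- Collect the full path of every image and the class names. Note that the folders are named after the classes, so each full path already contains the class.
- Build the dictionary that maps class names to integer ids.
- Shuffle the file names, then split them into a validation set and a training set.
- Write every training sample to the TFRecord files in TF-Example format.
- Write every validation sample to the TFRecord files in TF-Example format.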
def image_to_tfexample(image_data, image_format, height, width, class_id):
  return tf.train.Example(features=tf.train.Features(feature={
      'image/encoded': bytes_feature(image_data),
      'image/format': bytes_feature(image_format),
      'image/class/label': int64_feature(class_id),
      'image/height': int64_feature(height),
      'image/width': int64_feature(width),
  }))
- Write the labels .txt file. Each line has the form label:class name, followed by a newline.
- Remove the dataset .tgz file and the unpacked files.
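The class mapping, the shuffle-and-split step and the label-file lines from the steps above can be sketched in plain Python without TensorFlow. The constants below are toy values chosen for this illustration (the real module defines its own _RANDOM_SEED and _NUM_VALIDATION), and split_and_label is a hypothetical helper, not a function from slim.

```python
import os
import random

_RANDOM_SEED = 0      # assumed value for this sketch
_NUM_VALIDATION = 2   # toy value; the real split holds out far more images


def split_and_label(photo_filenames):
    """Derive classes from folder names, then split train/validation."""
    # The parent folder of each image is its class name.
    class_names = sorted(
        {os.path.basename(os.path.dirname(f)) for f in photo_filenames})
    class_names_to_ids = dict(zip(class_names, range(len(class_names))))

    # Shuffle a copy with a fixed seed so the split is reproducible.
    filenames = list(photo_filenames)
    random.seed(_RANDOM_SEED)
    random.shuffle(filenames)
    training = filenames[_NUM_VALIDATION:]
    validation = filenames[:_NUM_VALIDATION]

    # One "label:class name" line per class, in id order.
    label_lines = ['%d:%s\n' % (i, name)
                   for i, name in enumerate(class_names)]
    return training, validation, class_names_to_ids, label_lines


if __name__ == '__main__':
    files = ['flower_photos/daisy/1.jpg', 'flower_photos/daisy/2.jpg',
             'flower_photos/roses/3.jpg', 'flower_photos/roses/4.jpg',
             'flower_photos/roses/5.jpg']
    train, val, ids, lines = split_and_label(files)
    print(len(train), len(val), ids, lines)
```

Seeding before the shuffle is what makes the validation set identical from run to run, so accuracy numbers stay comparable between experiments.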
3 Reading data from TFRecord files with slim
# -*- coding: utf-8 -*-
"""
Created on Fri Jun 8 08:52:30 2018

@author: zy
"""

'''
Load the flowers dataset
'''

from datasets import download_and_convert_flowers
from preprocessing import vgg_preprocessing
from datasets import flowers
import matplotlib.pyplot as plt
import tensorflow as tf

slim = tf.contrib.slim


def read_flower_image_and_label(dataset_dir, is_training=False):
    '''
    Download the flower_photos.tgz dataset, split it into training and
    validation sets, and convert the data to TFRecord format:
    5 training files (3320 samples), 5 validation files (350 samples), and a
    labels file (holding the class name for each integer label)

    args:
        dataset_dir: directory that holds the dataset
        is_training: True loads the training split, otherwise the validation split
    return:
        image, label: one randomly read image and its label
    '''
    '''
    Read the data from the TFRecord files with slim
    '''
    # Select the split
    if is_training:
        dataset = flowers.get_split(split_name='train', dataset_dir=dataset_dir)
    else:
        dataset = flowers.get_split(split_name='validation', dataset_dir=dataset_dir)

    # Create a data provider
    provider = slim.dataset_data_provider.DatasetDataProvider(dataset)

    # provider.get returns one random sample as two tensors
    [image, label] = provider.get(['image', 'label'])

    return image, label


if __name__ == '__main__':
    #test()

    # Read one image and its label
    image, label = read_flower_image_and_label('./datasets/data/flowers')

    '''
    Start a session and read the data
    '''
    with tf.Session() as sess:
        # Create a coordinator to manage the threads
        coord = tf.train.Coordinator()
        # Start the QueueRunners; only now do file names begin to be enqueued.
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)

        img, lab = sess.run([image, label])
        plt.imshow(img)
        plt.title('Original image')

        # Stop the threads
        coord.request_stop()
        coord.join(threads)
- Deserialize each example back into the format it had before it was stored. This is done by tf.

keys_to_features = {
    'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
    'image/format': tf.FixedLenFeature((), tf.string, default_value='png'),
    'image/class/label': tf.FixedLenFeature(
        [], tf.int64, default_value=tf.zeros([], dtype=tf.int64)),
}

- Assemble the deserialized data into a higher-level format. This is done by slim.

items_to_handlers = {
    'image': slim.tfexample_decoder.Image('image/encoded', 'image/format'),
    'label': slim.tfexample_decoder.Tensor('image/class/label'),
}

- Create the decoder that performs the decoding.

decoder = slim.tfexample_decoder.TFExampleDecoder(
    keys_to_features, items_to_handlers)

- The dataset object defines metadata such as the location of the dataset files and how to decode them.

dataset = slim.dataset.Dataset(
    data_sources=file_pattern,
    reader=tf.TFRecordReader,
    decoder=decoder,
    num_samples=SPLITS_TO_SIZES[split_name],  # total number of samples in this split
    items_to_descriptions=_ITEMS_TO_DESCRIPTIONS,
    num_classes=_NUM_CLASSES,
    labels_to_names=labels_to_names  # dictionary of the form id: class name
)

- The provider object reads the data according to the dataset information.

provider = slim.dataset_data_provider.DatasetDataProvider(
    dataset,
    num_readers=FLAGS.num_readers,
    common_queue_capacity=20 * FLAGS.batch_size,
    common_queue_min=10 * FLAGS.batch_size)

- Fetch the data. What you get is a single sample, so it still has to be preprocessed and grouped into batches.

[image, label] = provider.get(['image', 'label'])
# Image preprocessing
image = preprocessing_image(image, train_image_size, train_image_size)

images, labels = tf.train.batch(
    [image, label],
    batch_size=FLAGS.batch_size,
    num_threads=FLAGS.num_preprocessing_threads,
    capacity=5 * FLAGS.batch_size)
labels = slim.one_hot_encoding(
    labels, dataset.num_classes - FLAGS.labels_offset)
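def get_batch_images_and_label(dataset_dir, batch_size, num_classes, is_training=False,
                               output_height=224, output_width=224, num_threads=10):
    '''
    Fetch batch_size samples at a time

    Note: the preprocessing here calls an image preprocessing function from the
    slim library. For example, if you use a vgg network, call the vgg
    preprocessing function. If you use a network you defined yourself, you can
    write a preprocessing function suited to your images (normalization, for
    instance), or reuse the preprocessing function already written for another
    network.

    args:
        dataset_dir: directory that holds the dataset
        batch_size: number of samples fetched at a time
        num_classes: number of output classes, used to one-hot encode the labels
        is_training: True loads the training split, otherwise the validation split
        output_height: height of the output images
        output_width: width of the output images
    return:
        images, labels: batch_size randomly read images and the one-hot encoding
        of their labels
    '''
    # Get a single image and label
    image, label = read_flower_image_and_label(dataset_dir, is_training)

    # Image preprocessing; the image data is expected to be tf.float32 here
    image = vgg_preprocessing.preprocess_image(image, output_height, output_width,
                                               is_training=is_training)

    # Rescaling
    #image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    #image = tf.image.resize_image_with_crop_or_pad(image, output_height, output_width)

    # shuffle_batch shuffles the order of the data
    # batch does not shuffle the order of the data
    images, labels = tf.train.batch([image, label],
                                    batch_size=batch_size,
                                    capacity=5 * batch_size,
                                    num_threads=num_threads)

    # One-hot encoding
    labels = slim.one_hot_encoding(labels, num_classes)

    return images, labels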
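To make the last two steps concrete, here is a plain-Python sketch of what batching in input order and one-hot encoding do to the data. It only illustrates the idea: the real tf.train.batch is queue-based and keeps producing batches, whereas this hypothetical batches helper simply drops a trailing partial batch, and one_hot is a stand-in for the slim encoding function.

```python
def one_hot(label, num_classes):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError('label %d is outside [0, %d)' % (label, num_classes))
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def batches(samples, batch_size, num_classes):
    """Yield (images, one-hot labels) for each full batch, in input order."""
    for start in range(0, len(samples) - batch_size + 1, batch_size):
        chunk = samples[start:start + batch_size]
        images = [image for image, _ in chunk]
        labels = [one_hot(label, num_classes) for _, label in chunk]
        yield images, labels


if __name__ == '__main__':
    # Strings stand in for decoded image tensors.
    samples = [('img0', 0), ('img1', 2), ('img2', 1), ('img3', 4), ('img4', 3)]
    for images, labels in batches(samples, batch_size=2, num_classes=5):
        print(images, labels)
```

Because nothing is shuffled, samples come out in the order they went in, which matches the comment in the function above about batch versus shuffle_batch.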
IV. Training models in slim
python --train_dir=./log/train_logs --dataset_name=flowers --dataset_split_name=train --dataset_dir=./datasets/data/flowers --model_name=inception_v3
2 Pre-trained models
Neural nets work best when they have many parameters, making them powerful function approximators. However, this means they must be trained on very large datasets. Because training models from scratch can be a very computationally intensive process requiring days or even weeks, we provide various pre-trained models, as listed below. These CNNs have been trained on the ILSVRC-2012-CLS image classification dataset.
In the table below, we list each model, the corresponding TensorFlow model file, the link to the model checkpoint, and the top 1 and top 5 accuracy (on the imagenet test set). Note that the VGG and ResNet V1 parameters have been converted from their original caffe formats (here and here), whereas the Inception and ResNet V2 parameters have been trained internally at Google. Also be aware that these accuracies were computed by evaluating using a single image crop. Some academic papers report higher accuracy by using multiple crops at multiple scales.
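For readers new to the metric, top-1 and top-5 accuracy can be computed as in the following sketch. It is plain Python over lists of scores, not the evaluation code used for the published numbers, and top_k_accuracy is a hypothetical helper name.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == '__main__':
    scores = [[0.1, 0.7, 0.2],
              [0.5, 0.3, 0.2],
              [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    print('top-1:', top_k_accuracy(scores, labels, 1))
    print('top-2:', top_k_accuracy(scores, labels, 2))
```

Top-5 is the same computation with k=5, which is why it is always at least as high as top-1 for the same model.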
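--checkpoint_path = path to the model
The model at checkpoint_path is only used to initialize the parameters from the pre-trained model and is not modified during training. Newly produced checkpoints are saved under the --train_dir path.
- Use the --checkpoint_exclude_scopes argument to specify which layers' weights are not loaded from the pre-trained checkpoint.
- Then use the --trainable_scopes argument to specify which layers' parameters are trained. When --trainable_scopes is given, every parameter that is not listed is frozen during training.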
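The effect of the two arguments can be pictured as simple filtering of variable names by scope. The sketch below assumes that a scope matches every variable whose name starts with it; the helper names and the three example variable names are illustrative only, and the snippet does not touch TensorFlow.

```python
def _split_scopes(scopes):
    """Turn 'A/B, A/C' into ['A/B', 'A/C']; an empty value gives []."""
    if not scopes:
        return []
    return [scope.strip() for scope in scopes.split(',') if scope.strip()]


def variables_to_restore(var_names, checkpoint_exclude_scopes):
    """Variables loaded from the checkpoint: everything not excluded."""
    exclusions = _split_scopes(checkpoint_exclude_scopes)
    return [name for name in var_names
            if not any(name.startswith(scope) for scope in exclusions)]


def variables_to_train(var_names, trainable_scopes):
    """Variables updated by training: all of them unless scopes are given."""
    scopes = _split_scopes(trainable_scopes)
    if not scopes:
        return list(var_names)
    return [name for name in var_names
            if any(name.startswith(scope) for scope in scopes)]


if __name__ == '__main__':
    names = ['InceptionV3/Conv2d_1a_3x3/weights',
             'InceptionV3/Logits/Conv2d_1c_1x1/weights',
             'InceptionV3/AuxLogits/Conv2d_2b_1x1/weights']
    scopes = 'InceptionV3/Logits,InceptionV3/AuxLogits'
    print('restored:', variables_to_restore(names, scopes))
    print('trained: ', variables_to_train(names, scopes))
```

With the same scope list passed to both arguments, the two sets are complementary: the convolutional body is restored and frozen, while the two logits layers start from fresh weights and are the only ones trained.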
python --train_dir=./log/in3 --dataset_dir=./datasets/data/flowers --dataset_name=flowers --dataset_split_name=train --model_name=inception_v3 --checkpoint_path=./inception_v3/inception_v3.ckpt --checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits --trainable_scopes=InceptionV3/Logits,InceptionV3/AuxLogits
4 Evaluating the model
To evaluate the performance of a model (whether pretrained or your own), you can use the script, as shown below.
Below we give an example of downloading the pretrained inception model and evaluating it on the imagenet dataset.
python --alsologtostderr --checkpoint_path=./log/in3/model.ckpt --dataset_split_name=validation --model_name=inception_v3
5 Packaging the model