  • An Analysis of TensorFlow's CNN Tutorial

    In an earlier post we took a rough look at the RNN model. As a sequence model, RNN's power needs no repetition here. Today, let's look at another special and widely known, powerful neural network besides RNN: the CNN. Today's discussion is based mainly on TensorFlow's CIFAR-10 tutorial, though for comparison we will also walk through and contrast TensorFlow's MNIST tutorial. You will quickly find that, logically speaking, the two are much the same; because they target different problems, the data processing differs somewhat. We take the CIFAR-10 version as our baseline; interested readers are welcome to read and study the MNIST walkthrough, available here. The English CIFAR-10 tutorial can be found on the TensorFlow website, and the tutorial code is available here.

    A Brief Introduction to CNNs

    CNN is a remarkable deep learning architecture, and something of an outlier within the field. During the so-called AI winter from the late 1990s to the early 2000s, when most researchers abandoned neural networks, the effectiveness of CNNs did not wane but grew. Thank you, Yann LeCun! The architecture lives up to its name, Convolutional Neural Network: it begins by applying convolutions, together with pooling, to the image repeatedly, producing multiple feature maps; the result is then flattened and fed into a fully connected network, and a softmax output determines the image's class. The architecture's history is also interesting: as early as the late 1990s, LeNet-5, named after LeCun, was already well known. After deep learning took off, many variants followed, among the best known being the University of Toronto's AlexNet, Google's GoogLeNet, and Oxford's OxfordNet, as well as Network in Network (NIN), VGG16, and others. More recently, object-detection research has produced the R-CNN framework. Even today, with deep learning advancing at full speed, CNNs remain an active topic for many prominent research groups, and CNNs can even be spotted inside AlphaGo, which shows their power. As for the basics of the CNN architecture itself, resources abound, so I will not enumerate them here.

    Analyzing the CIFAR-10 Code

    To run the CIFAR-10 code, just download it, cd into the code directory, and run python cifar10_train.py. The default number of training steps is 1,000,000; in the author's run each step took roughly 3–4 seconds, and about five hours of running completed close to 100,000 steps. Since, according to the notes in cifar10_train.py, accuracy at 100,000 steps is around 86%, running for roughly five hours is enough; there is no need to run the full 1,000,000 steps. To check the results, run python cifar10_eval.py. The model checkpoints are stored under the tmp directory, and the eval script locates and runs the most recently saved model, which is quite convenient. Once trained, the system can recognize 10 kinds of objects in photos, airplanes included. With a system this fun, let's see how it is implemented!

    First, let's look at the cifar10_train.py file. The core of the file is the train function, which looks like this:

    def train():
      """Train CIFAR-10 for a number of steps."""
      with tf.Graph().as_default():
        global_step = tf.Variable(0, trainable=False)
    
        # Get images and labels for CIFAR-10.
        # The input here comes from the distorted_inputs function.
        images, labels = cifar10.distorted_inputs()
    
        # Build a Graph that computes the logits predictions from the
        # inference model.
        logits = cifar10.inference(images)
    
        # Calculate loss.
        loss = cifar10.loss(logits, labels)
    
        # Build a Graph that trains the model with one batch of examples and
        # updates the model parameters.
        train_op = cifar10.train(loss, global_step)
    
        # Create a saver.
        saver = tf.train.Saver(tf.all_variables())
    
        # Build the summary operation based on the TF collection of Summaries.
        summary_op = tf.merge_all_summaries()
    
        # Build an initialization operation to run below.
        init = tf.initialize_all_variables()
    
        # Start running operations on the Graph.
        sess = tf.Session(config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement))
        sess.run(init)
    
        # Start the queue runners.
        tf.train.start_queue_runners(sess=sess)
    
        summary_writer = tf.train.SummaryWriter(FLAGS.train_dir, sess.graph)
        
        # Loop up to the maximum number of training steps.
        for step in xrange(FLAGS.max_steps):
          start_time = time.time()
          _, loss_value = sess.run([train_op, loss])
          duration = time.time() - start_time
    
          assert not np.isnan(loss_value), 'Model diverged with loss = NaN'
          # Every 10 steps, print the step, loss, and timing statistics.
          if step % 10 == 0:
            num_examples_per_step = FLAGS.batch_size
            examples_per_sec = num_examples_per_step / duration
            sec_per_batch = float(duration)
    
            format_str = ('%s: step %d, loss = %.2f (%.1f examples/sec; %.3f '
                          'sec/batch)')
            print (format_str % (datetime.now(), step, loss_value,
                                 examples_per_sec, sec_per_batch))
          # Every 100 steps, write the network's state to the summary.
          if step % 100 == 0:
            summary_str = sess.run(summary_op)
            summary_writer.add_summary(summary_str, step)
    
          # Save the model checkpoint periodically.
          # Every 1000 steps, save a model checkpoint.
          if step % 1000 == 0 or (step + 1) == FLAGS.max_steps:
            checkpoint_path = os.path.join(FLAGS.train_dir, 'model.ckpt')
            saver.save(sess, checkpoint_path, global_step=step)

    The training function itself is logically clear. Beyond its heavy use of functions from cifar10.py, the point worth noting is that the input comes from the distorted_inputs function. This is interesting: as the literature describes, applying certain transformations to the input data yields new data, which is a simple way to enlarge the effective dataset. So how exactly does it do that? Let's look at distorted_inputs. In cifar10.py, distorted_inputs is essentially a wrapper around the distorted_inputs() function in cifar10_input.py.
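
    The wrapper itself is only a few lines. Here is a minimal sketch of what it does (assuming the tutorial's FLAGS.data_dir and FLAGS.batch_size flag definitions and the standard cifar-10-batches-bin directory layout; see cifar10.py for the exact code):

    def distorted_inputs():
      """Wrapper: fill in the flags, then delegate to cifar10_input."""
      if not FLAGS.data_dir:
        raise ValueError('Please supply a data_dir')
      data_dir = os.path.join(FLAGS.data_dir, 'cifar-10-batches-bin')
      return cifar10_input.distorted_inputs(data_dir=data_dir,
                                            batch_size=FLAGS.batch_size)

    The wrapped function in cifar10_input.py is where the real work happens. Its logic is as follows: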

    def distorted_inputs(data_dir, batch_size):
      """Construct distorted input for CIFAR training using the Reader ops.
      Args:
        data_dir: Path to the CIFAR-10 data directory.
        batch_size: Number of images per batch.
      Returns:
        images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
        labels: Labels. 1D tensor of [batch_size] size.
      """
      filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i)
                   for i in xrange(1, 6)]
      for f in filenames:
        if not tf.gfile.Exists(f):
          raise ValueError('Failed to find file: ' + f)
    
      # Create a queue that produces the filenames to read.
      filename_queue = tf.train.string_input_producer(filenames)
    
      # Read examples from files in the filename queue.
      read_input = read_cifar10(filename_queue)
      reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    
      height = IMAGE_SIZE
      width = IMAGE_SIZE
    
      # Image processing for training the network. Note the many random
      # distortions applied to the image.
    
      # Randomly crop a [height, width] section of the image.
      # Step 1: randomly crop a [height, width] patch from the image.
      distorted_image = tf.random_crop(reshaped_image, [height, width, 3])
    
      # Randomly flip the image horizontally.
      # Step 2: randomly flip the image horizontally, with 50% probability.
      distorted_image = tf.image.random_flip_left_right(distorted_image)
    
      # Because these operations are not commutative, consider randomizing
      # the order of their operation.
      # Step 3: randomly change the image's brightness and contrast.
      distorted_image = tf.image.random_brightness(distorted_image,
                                                   max_delta=63)
      distorted_image = tf.image.random_contrast(distorted_image,
                                                 lower=0.2, upper=1.8)
    
      # Subtract off the mean and divide by the variance of the pixels.
      float_image = tf.image.per_image_whitening(distorted_image)
    
      # Ensure that the random shuffling has good mixing properties.
      min_fraction_of_examples_in_queue = 0.4
      min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                               min_fraction_of_examples_in_queue)
      print ('Filling queue with %d CIFAR images before starting to train. '
             'This will take a few minutes.' % min_queue_examples)
    
      # Generate a batch of images and labels by building up a queue of examples.
      return _generate_image_and_label_batch(float_image, read_input.label,
                                             min_queue_examples, batch_size,
                                             shuffle=True)
    

    Here each image is randomly cropped, then flipped with some probability, has its brightness and contrast perturbed, and so on. In addition, the last block means that training will not start until the queue holds at least 40% of an epoch's worth of examples, which guarantees good shuffling. Do we go through these steps at test time as well? The answer is no. In cifar10_input.py, just below distorted_inputs, a function named inputs implements the input logic used for eval. As for parameters, inputs keeps those of distorted_inputs and adds one named eval_data, a bool indicating whether to use the training data or the test data. Let's take a quick look at this function's logic.

    def inputs(eval_data, data_dir, batch_size):
      """Construct input for CIFAR evaluation using the Reader ops.
      Args:
        eval_data: bool, indicating if one should use the train or eval data set.
        data_dir: Path to the CIFAR-10 data directory.
        batch_size: Number of images per batch.
      Returns:
        images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
        labels: Labels. 1D tensor of [batch_size] size.
      """
      if not eval_data:
        filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i)
                     for i in xrange(1, 6)]
        num_examples_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN
      else:
        filenames = [os.path.join(data_dir, 'test_batch.bin')]
        num_examples_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_EVAL
    
      for f in filenames:
        if not tf.gfile.Exists(f):
          raise ValueError('Failed to find file: ' + f)
    
      # Create a queue that produces the filenames to read.
      filename_queue = tf.train.string_input_producer(filenames)
    
      # Read examples from files in the filename queue.
      read_input = read_cifar10(filename_queue)
      reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    
      height = IMAGE_SIZE
      width = IMAGE_SIZE
    
      # Image processing for evaluation.
      # Crop the central [height, width] of the image.
      resized_image = tf.image.resize_image_with_crop_or_pad(reshaped_image,
                                                             width, height)

      # Subtract off the mean and divide by the variance of the pixels.
      # (Normalizes the image's color values.)
      float_image = tf.image.per_image_whitening(resized_image)

      # Ensure that the random shuffling has good mixing properties.
      min_fraction_of_examples_in_queue = 0.4
      min_queue_examples = int(num_examples_per_epoch *
                               min_fraction_of_examples_in_queue)

      # Generate a batch of images and labels by building up a queue of examples.
      return _generate_image_and_label_batch(float_image, read_input.label,
                                             min_queue_examples, batch_size,
                                             shuffle=False)

    Here we see that the crop takes only the center of the image, and the only other processing is the color normalization. But an attentive reader will see the point: if a test photo of an airplane is centered on the nose, while every airplane photo in the training set is centered on the wings, our distorted_inputs function still has a chance of cropping the nose region during training, thereby exposing the network to information similar to the test image. Likewise, random color perturbation includes the average color as one of its possibilities. The training set therefore effectively covers a wider and more varied range of cases, so it is no surprise that this yields better results.
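
    Both input functions finish by calling _generate_image_and_label_batch, which is not shown above. Roughly speaking, it builds the example queue and emits batches; a sketch along the lines of the tutorial (treat the thread count and capacity arithmetic as approximate):

    def _generate_image_and_label_batch(image, label, min_queue_examples,
                                        batch_size, shuffle):
      """Queue up examples, then emit batches of images and labels."""
      num_preprocess_threads = 16
      if shuffle:
        # Training: shuffle_batch keeps at least min_queue_examples elements
        # waiting in the queue, which is what guarantees good mixing.
        images, label_batch = tf.train.shuffle_batch(
            [image, label],
            batch_size=batch_size,
            num_threads=num_preprocess_threads,
            capacity=min_queue_examples + 3 * batch_size,
            min_after_dequeue=min_queue_examples)
      else:
        # Evaluation: a plain FIFO batch; no shuffling is needed.
        images, label_batch = tf.train.batch(
            [image, label],
            batch_size=batch_size,
            num_threads=num_preprocess_threads,
            capacity=min_queue_examples + 3 * batch_size)
      return images, tf.reshape(label_batch, [batch_size])

    The min_after_dequeue=min_queue_examples argument is what enforces the 40%-of-an-epoch threshold mentioned above.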

    Now that we have covered these tricks on the input side, it's time to look at the CNN model itself. How do we build one? Let's first look at a simpler version, the model from the MNIST tutorial:

      # The variables below hold all the trainable weights. They are passed an
      # initial value which will be assigned when we call:
      # {tf.initialize_all_variables().run()}
      conv1_weights = tf.Variable(
          tf.truncated_normal([5, 5, NUM_CHANNELS, 32],  # 5x5 filter, depth 32.
                              stddev=0.1,
                              seed=SEED, dtype=data_type()))
      conv1_biases = tf.Variable(tf.zeros([32], dtype=data_type()))
      conv2_weights = tf.Variable(tf.truncated_normal(
          [5, 5, 32, 64], stddev=0.1,
          seed=SEED, dtype=data_type()))
      conv2_biases = tf.Variable(tf.constant(0.1, shape=[64], dtype=data_type()))
      fc1_weights = tf.Variable(  # fully connected, depth 512.
          tf.truncated_normal([IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * 64, 512],
                              stddev=0.1,
                              seed=SEED,
                              dtype=data_type()))
      fc1_biases = tf.Variable(tf.constant(0.1, shape=[512], dtype=data_type()))
      fc2_weights = tf.Variable(tf.truncated_normal([512, NUM_LABELS],
                                                    stddev=0.1,
                                                    seed=SEED,
                                                    dtype=data_type()))
      fc2_biases = tf.Variable(tf.constant(
          0.1, shape=[NUM_LABELS], dtype=data_type()))
    
      # We will replicate the model structure for the training subgraph, as well
      # as the evaluation subgraphs, while sharing the trainable parameters.
      def model(data, train=False):
        """The Model definition."""
        # 2D convolution, with 'SAME' padding (i.e. the output feature map has
        # the same size as the input). Note that {strides} is a 4D array whose
        # shape matches the data layout: [image index, y, x, depth].
        conv = tf.nn.conv2d(data,
                            conv1_weights,
                            strides=[1, 1, 1, 1],
                            padding='SAME')
        # Bias and rectified linear non-linearity.
        relu = tf.nn.relu(tf.nn.bias_add(conv, conv1_biases))
        # Max pooling. The kernel size spec {ksize} also follows the layout of
        # the data. Here we have a pooling window of 2, and a stride of 2.
        pool = tf.nn.max_pool(relu,
                              ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1],
                              padding='SAME')
        conv = tf.nn.conv2d(pool,
                            conv2_weights,
                            strides=[1, 1, 1, 1],
                            padding='SAME')
        relu = tf.nn.relu(tf.nn.bias_add(conv, conv2_biases))
        pool = tf.nn.max_pool(relu,
                              ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1],
                              padding='SAME')
        # Reshape the feature map cuboid into a 2D matrix to feed it to the
        # fully connected layers.
        pool_shape = pool.get_shape().as_list()
        reshape = tf.reshape(
            pool,
            [pool_shape[0], pool_shape[1] * pool_shape[2] * pool_shape[3]])
        # Fully connected layer. Note that the '+' operation automatically
        # broadcasts the biases.
        hidden = tf.nn.relu(tf.matmul(reshape, fc1_weights) + fc1_biases)
        # Add a 50% dropout during training only. Dropout also scales
        # activations such that no rescaling is needed at evaluation time.
        if train:
          hidden = tf.nn.dropout(hidden, 0.5, seed=SEED)
        return tf.matmul(hidden, fc2_weights) + fc2_biases
    
      # Training computation: logits + cross-entropy loss.
      logits = model(train_data_node, True)
      loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits, train_labels_node))
    
      # L2 regularization for the fully connected parameters.
      regularizers = (tf.nn.l2_loss(fc1_weights) + tf.nn.l2_loss(fc1_biases) +
                      tf.nn.l2_loss(fc2_weights) + tf.nn.l2_loss(fc2_biases))
      # Add the regularization term to the loss.
      loss += 5e-4 * regularizers
    
      # Optimizer: set up a variable that's incremented once per batch and
      # controls the learning rate decay.
      batch = tf.Variable(0, dtype=data_type())
      # Decay once per epoch, using an exponential schedule starting at 0.01.
      learning_rate = tf.train.exponential_decay(
          0.01,                # Base learning rate.
          batch * BATCH_SIZE,  # Current index into the dataset.
          train_size,          # Decay step.
          0.95,                # Decay rate.
          staircase=True)
      # Use simple momentum for the optimization.
      optimizer = tf.train.MomentumOptimizer(learning_rate,
                                             0.9).minimize(loss,
                                                           global_step=batch)
    
      # Predictions for the current training minibatch.
      train_prediction = tf.nn.softmax(logits)
    
      # Predictions for the test and validation, which we'll compute less often.
      eval_prediction = tf.nn.softmax(model(eval_data))
    

    This code is quite direct: after defining the weight and bias parameters for the convolution1, convolution2, fully_connected1, and fully_connected2 layers, the model function applies conv2d, relu, and max_pool twice, reshapes the result, and feeds it into the fully connected network, i.e. matmul(reshape, fc1_weights) + fc1_biases. Passing through the second fully connected layer then yields the logits. After the usual steps of defining the loss and the optimizer, the predictions are obtained via softmax.
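
    To see where fc1_weights' first dimension, IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * 64, comes from, here is a quick shape walkthrough for MNIST's 28×28 single-channel input ('SAME' convolutions preserve the spatial size; each 2×2, stride-2 pooling halves it):

    # Shapes through the MNIST model, for batch size B:
    #   data:    [B, 28, 28,  1]
    #   conv1:   [B, 28, 28, 32]   (5x5 kernel, SAME padding)
    #   pool1:   [B, 14, 14, 32]   (2x2 window, stride 2)
    #   conv2:   [B, 14, 14, 64]
    #   pool2:   [B,  7,  7, 64]
    #   reshape: [B, 7 * 7 * 64] = [B, 3136]
    IMAGE_SIZE = 28
    fc1_input_dim = IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * 64
    assert fc1_input_dim == 7 * 7 * 64 == 3136

    We will meet the same logic in the CIFAR-10 code, where it is expressed as follows: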

    def inference(images):
      """Build the CIFAR-10 model.
      Args:
        images: Images returned from distorted_inputs() or inputs().
      Returns:
        Logits.
      """
      # We instantiate all variables using tf.get_variable() instead of
      # tf.Variable() in order to share variables across multiple GPU training runs.
      # If we only ran this model on a single GPU, we could simplify this function
      # by replacing all instances of tf.get_variable() with tf.Variable().
      #
      # conv1
      with tf.variable_scope('conv1') as scope:
        # The input images are color images with three channels, so in conv2d
        # we map them to feature maps with 64 output channels.
        kernel = _variable_with_weight_decay('weights', shape=[5, 5, 3, 64],
                                             stddev=1e-4, wd=0.0)
        conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
        biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0))
        bias = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(bias, name=scope.name)
        _activation_summary(conv1)
    
      # pool1
      pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                             padding='SAME', name='pool1')
      # norm1
      norm1 = tf.nn.lrn(pool1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75,
                        name='norm1')
    
      # conv2
      with tf.variable_scope('conv2') as scope:
        # The previous layer output 64 channels, which is our input here, so the
        # kernel shape has 64 input channels; we set the output to 64 as well.
        kernel = _variable_with_weight_decay('weights', shape=[5, 5, 64, 64],
                                             stddev=1e-4, wd=0.0)
        conv = tf.nn.conv2d(norm1, kernel, [1, 1, 1, 1], padding='SAME')
        biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.1))
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name=scope.name)
        _activation_summary(conv2)
    
      # norm2
      norm2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75,
                        name='norm2')
      # pool2
      pool2 = tf.nn.max_pool(norm2, ksize=[1, 3, 3, 1],
                             strides=[1, 2, 2, 1], padding='SAME', name='pool2')
    
      # local3
      with tf.variable_scope('local3') as scope:
        # Move everything into depth so we can perform a single matrix multiply.
        reshape = tf.reshape(pool2, [FLAGS.batch_size, -1])
        dim = reshape.get_shape()[1].value
        # The -1 in the reshape above is inferred automatically from the tensor's
        # size given batch_size, so each row holds the entire contents of one
        # image; we then map it onto 384 neurons.
        weights = _variable_with_weight_decay('weights', shape=[dim, 384],
                                              stddev=0.04, wd=0.004)
        biases = _variable_on_cpu('biases', [384], tf.constant_initializer(0.1))
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)
        _activation_summary(local3)
    
      # local4
      with tf.variable_scope('local4') as scope:
        # The previous layer has 384 nodes; here we reduce that further to 192.
        weights = _variable_with_weight_decay('weights', shape=[384, 192],
                                              stddev=0.04, wd=0.004)
        biases = _variable_on_cpu('biases', [192], tf.constant_initializer(0.1))
        local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name=scope.name)
        _activation_summary(local4)
    
      # softmax, i.e. softmax(WX + b)
      with tf.variable_scope('softmax_linear') as scope:
        # This is the layer feeding the softmax output: we map from 192 nodes to
        # the number of output classes; with 10 classes here, NUM_CLASSES is 10.
        weights = _variable_with_weight_decay('weights', [192, NUM_CLASSES],
                                              stddev=1/192.0, wd=0.0)
        biases = _variable_on_cpu('biases', [NUM_CLASSES],
                                  tf.constant_initializer(0.0))
        softmax_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name)
        _activation_summary(softmax_linear)
    
      return softmax_linear
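
    The inference code above leans on two helpers defined elsewhere in cifar10.py, _variable_on_cpu and _variable_with_weight_decay. A rough sketch of what they do (an outline, not verbatim; the key point is that a nonzero wd adds an L2 penalty to the 'losses' collection, which the loss function later sums up):

    def _variable_on_cpu(name, shape, initializer):
      """Create a variable that lives in CPU memory, via tf.get_variable."""
      with tf.device('/cpu:0'):
        var = tf.get_variable(name, shape, initializer=initializer)
      return var

    def _variable_with_weight_decay(name, shape, stddev, wd):
      """Create a truncated-normal variable, optionally with L2 weight decay."""
      var = _variable_on_cpu(
          name, shape, tf.truncated_normal_initializer(stddev=stddev))
      if wd:
        # The decay term joins the 'losses' collection and so ends up in the
        # total loss alongside the cross entropy.
        weight_decay = tf.mul(tf.nn.l2_loss(var), wd, name='weight_loss')
        tf.add_to_collection('losses', weight_decay)
      return var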
    

    The logic here follows essentially the same skeleton as before, so where are the differences? First, the input this time is a color image. Anyone familiar with image processing knows a color image has 3 channels, whereas the MNIST images are single-channel grayscale. So where MNIST's feature maps went from 1 channel to 32 (note that its NUM_CHANNELS is 1), here we simply write the input channel count as 3. Second, we use variable scopes, a clean way to delimit when variables should be shared; it also spares us from inventing distinct names for every weight and bias.
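
    As a minimal illustration of the sharing mechanism (a toy sketch, not from the tutorial): creating a variable inside a scope and then reopening that scope with reuse=True hands back the very same variable, which is exactly what sharing across multiple GPU training runs relies on:

    with tf.variable_scope('conv1'):
      weights = tf.get_variable(
          'weights', shape=[5, 5, 3, 64],
          initializer=tf.truncated_normal_initializer(stddev=1e-4))

    with tf.variable_scope('conv1', reuse=True):
      weights_again = tf.get_variable('weights')  # no new variable is created

    # Both names refer to the single variable 'conv1/weights'.
    assert weights is weights_again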

    The loss is spelled out by the loss function, which amounts to applying sparse_softmax_cross_entropy_with_logits; the basic flow is the same as in MNIST, so I won't describe it in detail.
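
    For reference, a rough sketch of loss (following the tutorial's structure; note how it folds in the weight-decay terms that _variable_with_weight_decay placed in the 'losses' collection):

    def loss(logits, labels):
      """Cross entropy plus all of the L2 weight-decay terms."""
      labels = tf.cast(labels, tf.int64)
      cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits, labels, name='cross_entropy_per_example')
      cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
      tf.add_to_collection('losses', cross_entropy_mean)
      # Total loss = cross entropy + every term in the 'losses' collection.
      return tf.add_n(tf.get_collection('losses'), name='total_loss')

    Finally, although the train function in cifar10.py is logically simple, it too has points worth noting. The code is as follows: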

    def train(total_loss, global_step):
      """Train CIFAR-10 model.
      Create an optimizer and apply to all trainable variables. Add moving
      average for all trainable variables.
      Args:
        total_loss: Total loss from loss().
        global_step: Integer Variable counting the number of training steps
          processed.
      Returns:
        train_op: op for training.
      """
      # Variables that affect learning rate.
      num_batches_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN / FLAGS.batch_size
      decay_steps = int(num_batches_per_epoch * NUM_EPOCHS_PER_DECAY)
    
      # Decay the learning rate exponentially based on the number of steps.
      lr = tf.train.exponential_decay(INITIAL_LEARNING_RATE,
                                      global_step,
                                      decay_steps,
                                      LEARNING_RATE_DECAY_FACTOR,
                                      staircase=True)
      tf.scalar_summary('learning_rate', lr)
    
      # Generate moving averages of all losses and associated summaries.
      loss_averages_op = _add_loss_summaries(total_loss)
    
      # Compute gradients.
      # An application of control dependencies: only after loss_averages_op
      # has completed do we run the gradient descent optimization.
      with tf.control_dependencies([loss_averages_op]):
        opt = tf.train.GradientDescentOptimizer(lr)
        grads = opt.compute_gradients(total_loss)
    
      # Apply gradients.
      apply_gradient_op = opt.apply_gradients(grads, global_step=global_step)
    
      # Add histograms for trainable variables.
      for var in tf.trainable_variables():
        tf.histogram_summary(var.op.name, var)
    
      # Add histograms for gradients.
      for grad, var in grads:
        if grad is not None:
          tf.histogram_summary(var.op.name + '/gradients', grad)
    
      # Track the moving averages of all trainable variables.
      variable_averages = tf.train.ExponentialMovingAverage(
          MOVING_AVERAGE_DECAY, global_step)
      variables_averages_op = variable_averages.apply(tf.trainable_variables())
    
      with tf.control_dependencies([apply_gradient_op, variables_averages_op]):
        train_op = tf.no_op(name='train')
    
      return train_op
    

    The extra content here collects intermediate results of the network's computation, such as loss_averages_op = _add_loss_summaries(total_loss), which records all of the losses, and the parameter histograms written by tf.histogram_summary(var.op.name, var). The noteworthy part is the repeated use of control_dependencies: until the dependency condition is met, the code inside the block does not run. This concept carries real weight in TensorFlow, and this function is a nice worked instance of it; readers who are interested should study it further. With that, training the images is complete.
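
    A minimal self-contained sketch of the control_dependencies pattern (a toy example, not from the tutorial):

    import tensorflow as tf

    counter = tf.Variable(0.0)
    increment = tf.assign_add(counter, 1.0)

    # read_counter is only evaluated after `increment` has run.
    with tf.control_dependencies([increment]):
      read_counter = tf.identity(counter)

    with tf.Session() as sess:
      sess.run(tf.initialize_all_variables())
      print(sess.run(read_counter))  # prints 1.0: the increment ran first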

    So how does the eval file judge the model's quality? Let's take a brief look at its contents. In the evaluate function we first obtain the input images and their labels through cifar10.inputs, then obtain the logits through the inference function, i.e. the CNN described above, and then use TensorFlow's in_top_k function to check whether the true label ranks among the top k logits. Here k is set to 1, and the results are displayed and recorded. Interested readers can read this code closely; I won't go through it in detail.
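
    The heart of the evaluation fits in a few lines; a rough sketch (the loop structure and names such as num_examples are assumptions here, see cifar10_eval.py for the real code):

    import numpy as np

    # Does the true label rank in the top k=1 predictions?
    top_k_op = tf.nn.in_top_k(logits, labels, 1)

    def eval_once(sess, top_k_op, num_examples, batch_size):
      """One evaluation pass: count correct top-1 predictions."""
      num_iter = num_examples // batch_size
      true_count = 0
      for _ in range(num_iter):
        predictions = sess.run([top_k_op])  # array of booleans
        true_count += np.sum(predictions)
      precision = true_count / float(num_iter * batch_size)
      print('precision @ 1 = %.3f' % precision)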

    With that, the system is complete, and we now have a first understanding of how to build a CNN system.
