  • CS231n assignment2 Q5 TensorFlow on CIFAR-10

    Part I: Preparation

    Part II: Barebone TensorFlow

    First, implement a flatten function:

    def flatten(x):
        """    
        Input:
        - TensorFlow Tensor of shape (N, D1, ..., DM)
        
        Output:
        - TensorFlow Tensor of shape (N, D1 * ... * DM)
        """
        N = tf.shape(x)[0]
        return tf.reshape(x, (N, -1))
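
    To make the shape bookkeeping concrete, here is a minimal plain-Python sketch of what flattening does to a shape tuple. The helper name flattened_shape is hypothetical and only mirrors the reshape above; it does not touch any TensorFlow API.

```python
import math


def flattened_shape(shape):
    """Return the shape obtained by collapsing every axis except the first."""
    n, *rest = shape
    return (n, math.prod(rest))


# A CIFAR-10 minibatch of 64 images with shape (32, 32, 3):
print(flattened_shape((64, 32, 32, 3)))  # (64, 3072)
```

    The second dimension, 32 * 32 * 3 = 3072, is the D that the first weight matrix of a fully-connected network must match.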
    

    Implement a two-layer fully-connected network and test it:

    def two_layer_fc(x, params):
        """
        A fully-connected neural network; the architecture is:
        fully-connected layer -> ReLU -> fully connected layer.
        Note that we only need to define the forward pass here; TensorFlow will take
        care of computing the gradients for us.
        
        The input to the network will be a minibatch of data, of shape
        (N, d1, ..., dM) where d1 * ... * dM = D. The hidden layer will have H units,
        and the output layer will produce scores for C classes.
    
        Inputs:
        - x: A TensorFlow Tensor of shape (N, d1, ..., dM) giving a minibatch of
          input data.
        - params: A list [w1, w2] of TensorFlow Tensors giving weights for the
          network, where w1 has shape (D, H) and w2 has shape (H, C).
        
        Returns:
        - scores: A TensorFlow Tensor of shape (N, C) giving classification scores
          for the input data x.
        """
        w1, w2 = params  # Unpack the parameters
        x = flatten(x)   # Flatten the input; now x has shape (N, D)
        h = tf.nn.relu(tf.matmul(x, w1)) # Hidden layer: h has shape (N, H)
        scores = tf.matmul(h, w2)        # Compute scores of shape (N, C)
        return scores
    
    def two_layer_fc_test():
        # TensorFlow's default computational graph is essentially a hidden global
        # variable. To avoid adding to this default graph when you rerun this cell,
        # we clear the default graph before constructing the graph we care about.
        tf.reset_default_graph()
        hidden_layer_size = 42
    
        # Scoping our computational graph setup code under a tf.device context
        # manager lets us tell TensorFlow where we want these Tensors to be
        # placed.
        with tf.device(device):
            # Set up a placeholder for the input of the network, and constant
            # zero Tensors for the network weights. Here we declare w1 and w2
            # using tf.zeros instead of tf.placeholder as we've seen before - this
            # means that the values of w1 and w2 will be stored in the computational
            # graph itself and will persist across multiple runs of the graph; in
            # particular this means that we don't have to pass values for w1 and w2
            # using a feed_dict when we eventually run the graph.
            # Note: w1 and w2 are created with tf.zeros here, so no values need
            # to be fed for them.
            x = tf.placeholder(tf.float32)
            w1 = tf.zeros((32 * 32 * 3, hidden_layer_size))
            w2 = tf.zeros((hidden_layer_size, 10))
            
            # Call our two_layer_fc function to set up the computational
            # graph for the forward pass of the network.
            scores = two_layer_fc(x, [w1, w2])
        
        # Use numpy to create some concrete data that we will pass to the
        # computational graph for the x placeholder.
        x_np = np.zeros((64, 32, 32, 3))
        with tf.Session() as sess:
            # Note: tf.zeros creates constant Tensors rather than tf.Variable
            # objects, so there is nothing for the initializer to set up in this
            # test; the call is kept because the later training code, which uses
            # tf.Variable weights, does require it.
            sess.run(tf.global_variables_initializer())
            
            # Here we actually run the graph, using the feed_dict to pass the
            # value to bind to the placeholder for x; we ask TensorFlow to compute
            # the value of the scores Tensor, which it returns as a numpy array.
            scores_np = sess.run(scores, feed_dict={x: x_np})
            print(scores_np.shape)
    
    two_layer_fc_test()
    

    Implement a three-layer convolutional network and test it:

    The network architecture is as follows:

    1. A convolutional layer (with bias) with channel_1 filters, each with shape KW1 x KH1, and zero-padding of two
    2. ReLU nonlinearity
    3. A convolutional layer (with bias) with channel_2 filters, each with shape KW2 x KH2, and zero-padding of one
    4. ReLU nonlinearity
    5. Fully-connected layer with bias, producing scores for C classes.

    def three_layer_convnet(x, params):
        """
        A three-layer convolutional network with the architecture described above.
        
        Inputs:
        - x: A TensorFlow Tensor of shape (N, H, W, 3) giving a minibatch of images
        - params: A list of TensorFlow Tensors giving the weights and biases for the
          network; should contain the following:
          - conv_w1: TensorFlow Tensor of shape (KH1, KW1, 3, channel_1) giving
            weights for the first convolutional layer.
          - conv_b1: TensorFlow Tensor of shape (channel_1,) giving biases for the
            first convolutional layer.
          - conv_w2: TensorFlow Tensor of shape (KH2, KW2, channel_1, channel_2)
            giving weights for the second convolutional layer
          - conv_b2: TensorFlow Tensor of shape (channel_2,) giving biases for the
            second convolutional layer.
          - fc_w: TensorFlow Tensor giving weights for the fully-connected layer.
            Shape: (H * W * channel_2, C), because both convolutions preserve the
            spatial size; e.g. (32 * 32 * channel_2, 10) for CIFAR-10.
          - fc_b: TensorFlow Tensor giving biases for the fully-connected layer.
            Shape: (C,), e.g. (10,) for CIFAR-10.
        """
        conv_w1, conv_b1, conv_w2, conv_b2, fc_w, fc_b = params
        scores = None
        ############################################################################
        # TODO: Implement the forward pass for the three-layer ConvNet.            #
        ############################################################################
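        # With stride 1, padding='SAME' preserves the spatial size, which matches
        # zero-padding of 2 for the 5x5 filters and 1 for the 3x3 filters.
        h1 = tf.nn.conv2d(input=x, filter=conv_w1, strides=[1, 1, 1, 1],
                          padding='SAME', name='conv1') + conv_b1
        h11 = tf.nn.relu(h1)
        h2 = tf.nn.conv2d(input=h11, filter=conv_w2, strides=[1, 1, 1, 1],
                          padding='SAME', name='conv2') + conv_b2
        h22 = tf.nn.relu(h2)
        h = flatten(h22)  # shape (N, H * W * channel_2)
        scores = tf.matmul(h, fc_w) + fc_b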
        ############################################################################
        #                              END OF YOUR CODE                            #
        ############################################################################
        return scores
    
    def three_layer_convnet_test():
        tf.reset_default_graph()
    
        with tf.device(device):
            x = tf.placeholder(tf.float32)
            conv_w1 = tf.zeros((5, 5, 3, 6))
            conv_b1 = tf.zeros((6,))
            conv_w2 = tf.zeros((3, 3, 6, 9))
            conv_b2 = tf.zeros((9,))
            fc_w = tf.zeros((32 * 32 * 9, 10))
            fc_b = tf.zeros((10,))
            params = [conv_w1, conv_b1, conv_w2, conv_b2, fc_w, fc_b]
            scores = three_layer_convnet(x, params)
    
        # Inputs to convolutional layers are 4-dimensional arrays with shape
        # [batch_size, height, width, channels]
        x_np = np.zeros((64, 32, 32, 3))
        
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            scores_np = sess.run(scores, feed_dict={x: x_np})
            print('scores_np has shape: ', scores_np.shape)
    
    with tf.device('/gpu:0'):
        three_layer_convnet_test()
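
    The docstring asks what shape fc_w should have. A short sketch of the arithmetic, using the standard output-size formula (in + 2 * pad - kernel) // stride + 1; the helper name conv_output_size is hypothetical, and channel_2 = 9 is simply the value used in the test above.

```python
def conv_output_size(in_size, kernel, pad, stride=1):
    """Spatial output size of a convolution along one axis."""
    return (in_size + 2 * pad - kernel) // stride + 1


h1 = conv_output_size(32, kernel=5, pad=2)  # after the first conv layer
h2 = conv_output_size(h1, kernel=3, pad=1)  # after the second conv layer
channel_2 = 9                               # value used in three_layer_convnet_test
fc_in = h2 * h2 * channel_2                 # number of rows fc_w needs

print(h1, h2, fc_in)  # 32 32 9216
```

    Both layers keep the 32 x 32 spatial size, so the flattened feature vector has 32 * 32 * 9 = 9216 entries, which is why the test builds fc_w with shape (32 * 32 * 9, 10).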
    

    Implement the training step. A single step does the following:

    1. Compute the loss
    2. Compute the gradient of the loss with respect to all network weights
    3. Make a weight update step using (stochastic) gradient descent.

    def training_step(scores, y, params, learning_rate):
        """
        Set up the part of the computational graph which makes a training step.
    
        Inputs:
        - scores: TensorFlow Tensor of shape (N, C) giving classification scores for
          the model.
        - y: TensorFlow Tensor of shape (N,) giving ground-truth labels for scores;
          y[i] == c means that c is the correct class for scores[i].
        - params: List of TensorFlow Tensors giving the weights of the model
        - learning_rate: Python scalar giving the learning rate to use for gradient
          descent step.
          
        Returns:
        - loss: A TensorFlow Tensor of shape () (scalar) giving the loss for this
          batch of data; evaluating the loss also performs a gradient descent step
          on params (see above).
        """
        # First compute the loss; the first line gives losses for each example in
        # the minibatch, and the second averages the losses across the batch
        losses = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=scores)
        loss = tf.reduce_mean(losses)  # compute the loss
    
        # Compute the gradient of the loss with respect to each parameter of the
        # network. This is a very magical function call: TensorFlow internally
        # traverses the computational graph starting at loss backward to each element
        # of params, and uses backpropagation to figure out how to compute gradients;
        # it then adds new operations to the computational graph which compute the
        # requested gradients, and returns a list of TensorFlow Tensors that will
        # contain the requested gradients when evaluated.
        grad_params = tf.gradients(loss, params)  # compute the gradients
        
        # Make a gradient descent step on all of the model parameters.
        new_weights = []
        for w, grad_w in zip(params, grad_params):  # update the parameters
            new_w = tf.assign_sub(w, learning_rate * grad_w)
            new_weights.append(new_w)
    
        # Insert a control dependency so that evaluating the loss causes a weight
        # update to happen; see the discussion above.
        # This ties the weight updates to the evaluation of the loss.
        with tf.control_dependencies(new_weights):
            return tf.identity(loss)
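
    To see what the loss and the update compute numerically, here is a small plain-Python sketch. It is not TensorFlow code: sparse_softmax_cross_entropy and sgd_step are hypothetical helpers that mirror the per-example loss and the w -= learning_rate * grad_w update on ordinary lists.

```python
import math


def sparse_softmax_cross_entropy(scores, label):
    """Cross-entropy of one example, given raw class scores and an integer label."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[label]


def sgd_step(w, grad_w, learning_rate):
    """Plain gradient descent update, element by element."""
    return [wi - learning_rate * gi for wi, gi in zip(w, grad_w)]


# Two examples with all-zero scores over 10 classes (a uniform prediction).
batch_scores = [[0.0] * 10, [0.0] * 10]
batch_labels = [3, 7]
losses = [sparse_softmax_cross_entropy(s, y)
          for s, y in zip(batch_scores, batch_labels)]
loss = sum(losses) / len(losses)  # mean over the minibatch

print(round(loss, 4))  # 2.3026
```

    A uniform prediction over 10 classes gives a loss of ln(10), about 2.30, which is a handy reference point when reading the training logs in this post.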
    

    Implement the training loop:

    def train_part2(model_fn, init_fn, learning_rate):
        """
        Train a model on CIFAR-10.
        
        Inputs:
        - model_fn: A Python function that performs the forward pass of the model
          using TensorFlow (i.e. the network we designed); it should have the
          following signature: scores = model_fn(x, params), where x is a
          TensorFlow Tensor giving a minibatch of image data, params is a list of
          TensorFlow Tensors holding the model weights, and scores is a
          TensorFlow Tensor of shape (N, C) giving scores for all elements of x.
        - init_fn: A Python function that initializes the parameters of the model
          (i.e. the parameter-initialization function). It should have the
          signature params = init_fn(), where params is a list of TensorFlow
          Tensors holding the (randomly initialized) weights of the model.
        - learning_rate: Python float giving the learning rate to use for SGD.
        """
        # First clear the default graph
        tf.reset_default_graph()
        is_training = tf.placeholder(tf.bool, name='is_training')
        # Set up the computational graph for performing forward and backward passes,
        # and weight updates.
        with tf.device(device):
            # Set up placeholders for the data and labels
            x = tf.placeholder(tf.float32, [None, 32, 32, 3])
            y = tf.placeholder(tf.int32, [None])
            params = init_fn()           # Initialize the model parameters
            scores = model_fn(x, params) # Forward pass of the model
            loss = training_step(scores, y, params, learning_rate)
    
        # Now we actually run the graph many times using the training data
        with tf.Session() as sess:
            # Initialize variables that will live in the graph
            sess.run(tf.global_variables_initializer())
            for t, (x_np, y_np) in enumerate(train_dset):
                # Run the graph on a batch of training data; recall that asking
                # TensorFlow to evaluate loss will cause an SGD step to happen.
                feed_dict = {x: x_np, y: y_np}
                loss_np = sess.run(loss, feed_dict=feed_dict)
                
                # Periodically print the loss and check accuracy on the val set
                if t % print_every == 0:
                    print('Iteration %d, loss = %.4f' % (t, loss_np))
                    check_accuracy(sess, val_dset, x, scores, is_training)
    

    Kaiming normal initialization:

    def kaiming_normal(shape):
        if len(shape) == 2:
            fan_in, fan_out = shape[0], shape[1]
        elif len(shape) == 4:
            fan_in, fan_out = np.prod(shape[:3]), shape[3]
        return tf.random_normal(shape) * np.sqrt(2.0 / fan_in)
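
    As a quick check of the scaling factor, the sketch below computes the standard deviation sqrt(2 / fan_in) for weight shapes that appear in this post. The helper kaiming_std is hypothetical; it only reproduces the fan_in rule used above with the standard library.

```python
import math


def kaiming_std(shape):
    """Standard deviation sqrt(2 / fan_in) for a dense or conv weight shape."""
    if len(shape) == 2:            # dense weights: (fan_in, fan_out)
        fan_in = shape[0]
    elif len(shape) == 4:          # conv weights: (KH, KW, C_in, C_out)
        fan_in = shape[0] * shape[1] * shape[2]
    else:
        raise ValueError('expected a shape of length 2 or 4, got %r' % (shape,))
    return math.sqrt(2.0 / fan_in)


for shape in [(3 * 32 * 32, 4000), (5, 5, 3, 6), (3, 3, 6, 9)]:
    print(shape, kaiming_std(shape))
```

    Layers with a larger fan_in receive proportionally smaller initial weights, which keeps the variance of the ReLU activations roughly constant from layer to layer.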
    

    Train our two-layer network:

    def two_layer_fc_init():
        """
        Initialize the weights of a two-layer network, for use with the
        two_layer_fc function defined above.
        
        Inputs: None
        
        Returns: A list of:
        - w1: TensorFlow Variable giving the weights for the first layer
        - w2: TensorFlow Variable giving the weights for the second layer
        """
        hidden_layer_size = 4000
        w1 = tf.Variable(kaiming_normal((3 * 32 * 32, hidden_layer_size)))
        w2 = tf.Variable(kaiming_normal((hidden_layer_size, 10)))
        return [w1, w2]
    
    learning_rate = 1e-2
    train_part2(two_layer_fc, two_layer_fc_init, learning_rate)
    

    Iteration 0, loss = 2.8053
    Got 134 / 1000 correct (13.40%)
    Iteration 100, loss = 1.9526
    Got 383 / 1000 correct (38.30%)
    Iteration 200, loss = 1.4617
    Got 393 / 1000 correct (39.30%)
    Iteration 300, loss = 1.7108
    Got 372 / 1000 correct (37.20%)
    Iteration 400, loss = 1.8420
    Got 421 / 1000 correct (42.10%)
    Iteration 500, loss = 1.8536
    Got 429 / 1000 correct (42.90%)
    Iteration 600, loss = 1.8949
    Got 413 / 1000 correct (41.30%)
    Iteration 700, loss = 1.9321
    Got 424 / 1000 correct (42.40%)

    Train our three-layer network:

    def three_layer_convnet_init():
        """
        Initialize the weights of a Three-Layer ConvNet, for use with the
        three_layer_convnet function defined above.
        
        Inputs: None
        
        Returns a list containing:
        - conv_w1: TensorFlow Variable giving weights for the first conv layer
        - conv_b1: TensorFlow Variable giving biases for the first conv layer
        - conv_w2: TensorFlow Variable giving weights for the second conv layer
        - conv_b2: TensorFlow Variable giving biases for the second conv layer
        - fc_w: TensorFlow Variable giving weights for the fully-connected layer
        - fc_b: TensorFlow Variable giving biases for the fully-connected layer
        """
        params = None
        ############################################################################
        # TODO: Initialize the parameters of the three-layer network.              #
        ############################################################################
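        # Note: the biases are also drawn with kaiming_normal here (shape (1, C),
        # so fan_in = 1); zero initialization is the more common choice.
        w1 = tf.Variable(kaiming_normal((5, 5, 3, 6)))
        b1 = tf.Variable(kaiming_normal((1, 6)))
        w2 = tf.Variable(kaiming_normal((3, 3, 6, 9)))
        b2 = tf.Variable(kaiming_normal((1, 9)))
        w = tf.Variable(kaiming_normal((32 * 32 * 9, 10)))
        b = tf.Variable(kaiming_normal((1, 10)))
        params = [w1, b1, w2, b2, w, b]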
        ############################################################################
        #                             END OF YOUR CODE                             #
        ############################################################################
        return params
    
    learning_rate = 3e-3
    train_part2(three_layer_convnet, three_layer_convnet_init, learning_rate)
    

    Iteration 0, loss = 3.4851
    Got 96 / 1000 correct (9.60%)
    Iteration 100, loss = 1.8512
    Got 323 / 1000 correct (32.30%)
    Iteration 200, loss = 1.6490
    Got 372 / 1000 correct (37.20%)
    Iteration 300, loss = 1.8010
    Got 360 / 1000 correct (36.00%)
    Iteration 400, loss = 1.8237
    Got 394 / 1000 correct (39.40%)
    Iteration 500, loss = 1.8371
    Got 412 / 1000 correct (41.20%)
    Iteration 600, loss = 1.7767
    Got 428 / 1000 correct (42.80%)
    Iteration 700, loss = 1.6171
    Got 430 / 1000 correct (43.00%)

    Part III: Keras Model API

    Build a two-layer fully-connected network with the Model API (subclassing tf.keras.Model):

    class TwoLayerFC(tf.keras.Model):  # the model is defined as a class
        def __init__(self, hidden_size, num_classes):  # define the layers
            super().__init__()
            initializer = tf.variance_scaling_initializer(scale=2.0)
            # Fully-connected layer with ReLU activation and the initializer above.
            # Note that tf.layers.Dense is a class.
            self.fc1 = tf.layers.Dense(hidden_size, activation=tf.nn.relu,
                                       kernel_initializer=initializer)
            self.fc2 = tf.layers.Dense(num_classes,
                                       kernel_initializer=initializer)
        def call(self, x, training=None):  # forward pass, run when the model is called
            x = tf.layers.flatten(x)  # flatten x
            x = self.fc1(x)
            x = self.fc2(x)
            return x
    
    
    def test_TwoLayerFC():
        """ A small unit test to exercise the TwoLayerFC model above. """
        tf.reset_default_graph()
        input_size, hidden_size, num_classes = 50, 42, 10
    
        # As usual in TensorFlow, we first need to define our computational graph.
        # To this end we first construct a TwoLayerFC object, then use it to construct
        # the scores Tensor.
        model = TwoLayerFC(hidden_size, num_classes)
        with tf.device(device):
            x = tf.zeros((64, input_size))
            scores = model(x)
    
        # Now that our computational graph has been defined we can run the graph
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            scores_np = sess.run(scores)
            print(scores_np.shape)
            
    test_TwoLayerFC()
    

    Build a two-layer fully-connected network with the Functional API:

    def two_layer_fc_functional(inputs, hidden_size, num_classes):  # the model is defined as a function
        initializer = tf.variance_scaling_initializer(scale=2.0)
        flattened_inputs = tf.layers.flatten(inputs)
        # Note that tf.layers.dense (lowercase) is a function.
        fc1_output = tf.layers.dense(flattened_inputs, hidden_size, activation=tf.nn.relu,
                                     kernel_initializer=initializer)
        scores = tf.layers.dense(fc1_output, num_classes,
                                 kernel_initializer=initializer)
        return scores
    
    def test_two_layer_fc_functional():
        """ A small unit test to exercise the two_layer_fc_functional model above. """
        tf.reset_default_graph()
        input_size, hidden_size, num_classes = 50, 42, 10
    
        # As usual in TensorFlow, we first need to define our computational graph.
        # To this end we first construct a two layer network graph by calling the
        # two_layer_fc_functional() function. This function constructs the
        # computation graph and outputs the score tensor.
        with tf.device(device):
            x = tf.zeros((64, input_size))
            scores = two_layer_fc_functional(x, hidden_size, num_classes)
    
        # Now that our computational graph has been defined we can run the graph
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            scores_np = sess.run(scores)
            print(scores_np.shape)
            
    test_two_layer_fc_functional()
    

    Build a three-layer convolutional network with the Keras Model API:

    1. Convolutional layer with 5 x 5 kernels, with zero-padding of 2
    2. ReLU nonlinearity
    3. Convolutional layer with 3 x 3 kernels, with zero-padding of 1
    4. ReLU nonlinearity
    5. Fully-connected layer to give class scores

    class ThreeLayerConvNet(tf.keras.Model):
        def __init__(self, channel_1, channel_2, num_classes):
            super().__init__()
            ########################################################################
            # TODO: Implement the __init__ method for a three-layer ConvNet. You   #
            # should instantiate layer objects to be used in the forward pass.     #
            ########################################################################
            initializer = tf.variance_scaling_initializer(scale=2.0)
            # padding='SAME' with stride 1 preserves the spatial size, which is
            # equivalent to zero-padding of 2 for 5x5 kernels and 1 for 3x3 kernels.
            self.conv1 = tf.layers.Conv2D(filters=channel_1, kernel_size=[5, 5],
                                          strides=[1, 1], padding='SAME',
                                          activation=tf.nn.relu, use_bias=True,
                                          kernel_initializer=initializer,
                                          bias_initializer=initializer, name='conv1')
            self.conv2 = tf.layers.Conv2D(filters=channel_2, kernel_size=[3, 3],
                                          strides=[1, 1], padding='SAME',
                                          activation=tf.nn.relu, use_bias=True,
                                          kernel_initializer=initializer,
                                          bias_initializer=initializer, name='conv2')
            self.fc = tf.layers.Dense(units=num_classes, use_bias=True,
                                      kernel_initializer=initializer,
                                      bias_initializer=initializer, name='fc')
            ########################################################################
            #                           END OF YOUR CODE                           #
            ########################################################################
            
        def call(self, x, training=None):
            scores = None
            ########################################################################
            # TODO: Implement the forward pass for a three-layer ConvNet. You      #
            # should use the layer objects defined in the __init__ method.         #
            ########################################################################
            x = self.conv1(x)
            x = self.conv2(x)
            x = tf.layers.flatten(x)
            scores = self.fc(x)
            ########################################################################
            #                           END OF YOUR CODE                           #
            ########################################################################        
            return scores
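
    For a sense of where the parameters of this model live, here is a rough plain-Python count. It assumes 32 x 32 x 3 inputs and the channel sizes (32 and 16) that the post trains with later; conv_params and dense_params are hypothetical helper names.

```python
def conv_params(kh, kw, c_in, c_out):
    """Weights plus biases of a conv layer with c_out filters of size kh x kw."""
    return kh * kw * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of a fully-connected layer."""
    return n_in * n_out + n_out


channel_1, channel_2, num_classes = 32, 16, 10
counts = {
    'conv1': conv_params(5, 5, 3, channel_1),
    'conv2': conv_params(3, 3, channel_1, channel_2),
    # 'SAME' padding keeps the 32 x 32 spatial size, so the flattened
    # input to the last layer has 32 * 32 * channel_2 features.
    'fc': dense_params(32 * 32 * channel_2, num_classes),
}
total = sum(counts.values())

print(counts)  # {'conv1': 2432, 'conv2': 4624, 'fc': 163850}
print(total)   # 170906
```

    Almost all of the parameters sit in the final fully-connected layer, because no pooling or striding reduces the spatial size before flattening.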
    

    Keras Model API: Training Loop

    def train_part34(model_init_fn, optimizer_init_fn, num_epochs=1):
        """
        Simple training loop for use with models defined using tf.keras. It trains
        a model for num_epochs epochs (one by default) on the CIFAR-10 training set
        and periodically checks accuracy on the CIFAR-10 validation set.
        
        Inputs:
        - model_init_fn: A function that takes no parameters; when called it
          constructs the model we want to train: model = model_init_fn()
        - optimizer_init_fn: A function which takes no parameters; when called it
          constructs the Optimizer object we will use to optimize the model:
          optimizer = optimizer_init_fn()
        - num_epochs: The number of epochs to train for
        
        Returns: Nothing, but prints progress during training
        """
        tf.reset_default_graph()    
        with tf.device(device):
            # Construct the computational graph we will use to train the model. We
            # use the model_init_fn to construct the model, declare placeholders for
            # the data and labels
            x = tf.placeholder(tf.float32, [None, 32, 32, 3])
            y = tf.placeholder(tf.int32, [None])
            
            # We need a placeholder to explicitly specify whether the model is in the
            # training phase or not, because a number of layers behave differently in
            # training and in testing, e.g., dropout and batch normalization.
            # We pass this value to the computation graph through feed_dict as shown below.
            is_training = tf.placeholder(tf.bool, name='is_training')
            
            # Use the model function to build the forward pass.
            scores = model_init_fn(x, is_training)
    
            # Compute the loss like we did in Part II
            loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=scores)
            loss = tf.reduce_mean(loss)
    
            # Use the optimizer_fn to construct an Optimizer, then use the optimizer
            # to set up the training step. Asking TensorFlow to evaluate the
            # train_op returned by optimizer.minimize(loss) will cause us to make a
            # single update step using the current minibatch of data.
            
            # Note that we use tf.control_dependencies to force the model to run
            # the tf.GraphKeys.UPDATE_OPS at each training step. tf.GraphKeys.UPDATE_OPS
            # holds the operators that update the states of the network.
            # For example, the tf.layers.batch_normalization function adds the running mean
            # and variance update operators to tf.GraphKeys.UPDATE_OPS.
            optimizer = optimizer_init_fn()
            update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
            with tf.control_dependencies(update_ops):
                train_op = optimizer.minimize(loss)
    
        # Now we can run the computational graph many times to train the model.
        # When we call sess.run we ask it to evaluate train_op, which causes the
        # model to update.
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            t = 0
            for epoch in range(num_epochs):
                print('Starting epoch %d' % epoch)
                for x_np, y_np in train_dset:
                    feed_dict = {x: x_np, y: y_np, is_training: True}
                    loss_np, _ = sess.run([loss, train_op], feed_dict=feed_dict)
                    if t % print_every == 0:
                        print('Iteration %d, loss = %.4f' % (t, loss_np))
                        check_accuracy(sess, val_dset, x, scores, is_training=is_training)
                        print()
                    t += 1
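
    The check_accuracy helper called in the loop is not shown in this post. As a rough, hypothetical sketch of the counting and formatting it would need in order to print lines like the ones in the logs, assuming predictions are taken as the argmax of the scores (count_correct and format_accuracy are illustrative names, not the assignment's code):

```python
def count_correct(scores, labels):
    """Count the examples whose highest-scoring class equals the label."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        correct += int(predicted == label)
    return correct


def format_accuracy(num_correct, num_samples):
    """Format an accuracy line in the style of the logs in this post."""
    acc = 100.0 * num_correct / num_samples
    return 'Got %d / %d correct (%.2f%%)' % (num_correct, num_samples, acc)


scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
labels = [1, 0, 0]
print(format_accuracy(count_correct(scores, labels), len(labels)))
# Got 2 / 3 correct (66.67%)
```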
    

    Keras Model API: Train a Two-Layer Network

    hidden_size, num_classes = 4000, 10
    learning_rate = 1e-2
    
    def model_init_fn(inputs, is_training):
        return TwoLayerFC(hidden_size, num_classes)(inputs)
    
    def optimizer_init_fn():
        return tf.train.GradientDescentOptimizer(learning_rate)
    
    train_part34(model_init_fn, optimizer_init_fn)
    

    Starting epoch 0
    Iteration 0, loss = 2.9554
    Got 147 / 1000 correct (14.70%)

    Iteration 100, loss = 1.8660
    Got 374 / 1000 correct (37.40%)

    Iteration 200, loss = 1.5924
    Got 391 / 1000 correct (39.10%)

    Iteration 300, loss = 1.8491
    Got 390 / 1000 correct (39.00%)

    Iteration 400, loss = 1.7189
    Got 430 / 1000 correct (43.00%)

    Iteration 500, loss = 1.7548
    Got 432 / 1000 correct (43.20%)

    Iteration 600, loss = 1.8440
    Got 418 / 1000 correct (41.80%)

    Iteration 700, loss = 1.9507
    Got 451 / 1000 correct (45.10%)

    Keras Model API: Train a Two-Layer Network (functional API)

    hidden_size, num_classes = 4000, 10
    learning_rate = 1e-2
    
    def model_init_fn(inputs, is_training):
        return two_layer_fc_functional(inputs, hidden_size, num_classes)
    
    def optimizer_init_fn():
        return tf.train.GradientDescentOptimizer(learning_rate)
    
    train_part34(model_init_fn, optimizer_init_fn)
    

    Starting epoch 0
    Iteration 0, loss = 3.2064
    Got 113 / 1000 correct (11.30%)

    Iteration 100, loss = 1.8935
    Got 374 / 1000 correct (37.40%)

    Iteration 200, loss = 1.5011
    Got 384 / 1000 correct (38.40%)

    Iteration 300, loss = 1.9119
    Got 359 / 1000 correct (35.90%)

    Iteration 400, loss = 1.8919
    Got 416 / 1000 correct (41.60%)

    Iteration 500, loss = 1.7257
    Got 430 / 1000 correct (43.00%)

    Iteration 600, loss = 1.9092
    Got 414 / 1000 correct (41.40%)

    Iteration 700, loss = 2.0570
    Got 449 / 1000 correct (44.90%)

    Keras Model API: Train a Three-Layer ConvNet

    learning_rate = 3e-3
    channel_1, channel_2, num_classes = 32, 16, 10
    
    def model_init_fn(inputs, is_training):
        model = None
        ############################################################################
        # TODO: Complete the implementation of model_fn.                           #
        ############################################################################
        model = ThreeLayerConvNet(channel_1,channel_2,num_classes)
        ############################################################################
        #                           END OF YOUR CODE                               #
        ############################################################################
        return model(inputs)
    
    def optimizer_init_fn():
        optimizer = None
        ############################################################################
        # TODO: Complete the implementation of model_fn.                           #
        ############################################################################
        optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate,
                                               momentum=0.9, use_nesterov=True)
        ############################################################################
        #                           END OF YOUR CODE                               #
        ############################################################################
        return optimizer
    
    train_part34(model_init_fn, optimizer_init_fn)
    

    Starting epoch 0
    Iteration 0, loss = 3.5594
    Got 81 / 1000 correct (8.10%)

    Iteration 100, loss = 1.6427
    Got 394 / 1000 correct (39.40%)

    Iteration 200, loss = 1.4471
    Got 453 / 1000 correct (45.30%)

    Iteration 300, loss = 1.4377
    Got 472 / 1000 correct (47.20%)

    Iteration 400, loss = 1.4059
    Got 489 / 1000 correct (48.90%)

    Iteration 500, loss = 1.5382
    Got 535 / 1000 correct (53.50%)

    Iteration 600, loss = 1.3765
    Got 525 / 1000 correct (52.50%)

    Iteration 700, loss = 1.4015
    Got 518 / 1000 correct (51.80%)
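
    The ConvNet above is trained with SGD plus Nesterov momentum. A minimal scalar sketch of one common formulation of that update follows; the function name and the toy objective f(w) = w^2 are assumptions for illustration, and TensorFlow's internal implementation may differ in detail.

```python
def nesterov_momentum_step(w, grad, velocity, learning_rate, momentum):
    """One Nesterov-style update: accumulate the gradient, then look ahead."""
    velocity = momentum * velocity + grad
    w = w - learning_rate * (grad + momentum * velocity)
    return w, velocity


# Toy problem: minimize f(w) = w ** 2, whose gradient is 2 * w.
w, velocity = 1.0, 0.0
for _ in range(100):
    w, velocity = nesterov_momentum_step(w, 2.0 * w, velocity,
                                         learning_rate=0.1, momentum=0.9)

print(abs(w) < 1e-3)  # True
```

    The velocity term carries information from earlier gradients, which may be part of why this run reaches a higher validation accuracy than the plain-SGD runs earlier in the post, although the models differ as well.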

    Part IV: Keras Sequential API

    Keras Sequential API: Two-Layer Network

    learning_rate = 1e-2
    
    def model_init_fn(inputs, is_training): 
        input_shape = (32, 32, 3)
        hidden_layer_size, num_classes = 4000, 10
        initializer = tf.variance_scaling_initializer(scale=2.0)
        layers = [  # input_shape must be given for the first layer
            tf.layers.Flatten(input_shape=input_shape),
            tf.layers.Dense(hidden_layer_size, activation=tf.nn.relu,
                            kernel_initializer=initializer),
            tf.layers.Dense(num_classes, kernel_initializer=initializer),
        ]
        model = tf.keras.Sequential(layers)
        return model(inputs)
    
    def optimizer_init_fn():
        return tf.train.GradientDescentOptimizer(learning_rate)
    
    train_part34(model_init_fn, optimizer_init_fn)
    

    Starting epoch 0
    Iteration 0, loss = 3.0599
    Got 138 / 1000 correct (13.80%)

    Iteration 100, loss = 1.9839
    Got 363 / 1000 correct (36.30%)

    Iteration 200, loss = 1.4431
    Got 389 / 1000 correct (38.90%)

    Iteration 300, loss = 1.8575
    Got 375 / 1000 correct (37.50%)

    Iteration 400, loss = 1.7719
    Got 413 / 1000 correct (41.30%)

    Iteration 500, loss = 1.7979
    Got 438 / 1000 correct (43.80%)

    Iteration 600, loss = 1.8587
    Got 418 / 1000 correct (41.80%)

    Iteration 700, loss = 1.9053
    Got 442 / 1000 correct (44.20%)

    Keras Sequential API: Three-Layer ConvNet

    1. Convolutional layer with 16 5x5 kernels, using zero padding of 2
    2. ReLU nonlinearity
    3. Convolutional layer with 32 3x3 kernels, using zero padding of 1
    4. ReLU nonlinearity
    5. Fully-connected layer giving class scores
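
    The five-step specification above can be checked with a little arithmetic before building the model. The sketch below traces the feature-map shape and counts parameters for exactly that specification (5x5 kernels with padding 2, then 3x3 kernels with padding 1, stride 1 throughout). The helper `conv_out` is a hypothetical name introduced for this example.

```python
def conv_out(size, kernel, pad, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1

height = width = 32   # CIFAR-10 images are 32x32x3
channels = 3
params = 0

# (filters, kernel size, zero padding) for the two convolutional layers
for filters, kernel, pad in [(16, 5, 2), (32, 3, 1)]:
    params += kernel * kernel * channels * filters + filters  # weights + biases
    height = conv_out(height, kernel, pad)
    width = conv_out(width, kernel, pad)
    channels = filters
    print("after conv:", (height, width, channels))

flat = height * width * channels
params += flat * 10 + 10   # fully-connected layer producing 10 class scores
print("flattened size:", flat)
print("total parameters:", params)
```

    Because the padding is (kernel - 1) / 2 in both layers, the spatial size stays at 32x32, which is the same behaviour that `padding = 'SAME'` gives at stride 1.
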
    def model_init_fn(inputs, is_training):
        model = None
        ############################################################################
        # TODO: Construct a three-layer ConvNet using tf.keras.Sequential.         #
        ############################################################################
        initializer = tf.variance_scaling_initializer(scale=2.0)
        layers = [
            tf.layers.Conv2D(input_shape = (32,32,3),filters = 16,kernel_size = [5,5],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv1'),
            # Second conv layer: 32 3x3 kernels; 'SAME' at stride 1 equals zero padding of 1
            tf.layers.Conv2D(filters = 32,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv2'),
            tf.layers.Flatten(),
            tf.layers.Dense(units = 10,use_bias = True,
                                kernel_initializer = initializer,bias_initializer = initializer,
                                name = 'fc')]
        model = tf.keras.Sequential(layers)
        ############################################################################
        #                            END OF YOUR CODE                              #
        ############################################################################
        return model(inputs)
    
    learning_rate = 5e-4
    def optimizer_init_fn():
        optimizer = None
        ############################################################################
        # TODO: Complete the implementation of model_fn.                           #
        ############################################################################
        optimizer = tf.train.MomentumOptimizer(learning_rate = learning_rate,momentum = 0.9,use_nesterov = True)
        ############################################################################
        #                           END OF YOUR CODE                               #
        ############################################################################
        return optimizer
    
    train_part34(model_init_fn, optimizer_init_fn)
    

    Starting epoch 0
    Iteration 0, loss = 2.5582
    Got 103 / 1000 correct (10.30%)

    Iteration 100, loss = 1.5996
    Got 403 / 1000 correct (40.30%)

    Iteration 200, loss = 1.4355
    Got 461 / 1000 correct (46.10%)

    Iteration 300, loss = 1.5550
    Got 493 / 1000 correct (49.30%)

    Iteration 400, loss = 1.4755
    Got 484 / 1000 correct (48.40%)

    Iteration 500, loss = 1.5330
    Got 505 / 1000 correct (50.50%)

    Iteration 600, loss = 1.5811
    Got 523 / 1000 correct (52.30%)

    Iteration 700, loss = 1.3541
    Got 529 / 1000 correct (52.90%)
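
    The ConvNet above is trained with `tf.train.MomentumOptimizer(..., use_nesterov = True)`. As a rough illustration of what a Nesterov momentum step does, here is a minimal sketch on a one-dimensional toy problem. The update form is my understanding of how TensorFlow applies it (an assumption, not taken from the post), the function name `nesterov_momentum_step` is hypothetical, and the learning rate is chosen for the toy problem rather than copied from the training code.

```python
def nesterov_momentum_step(x, accum, grad, lr, momentum):
    """One Nesterov momentum update (assumed form, for illustration only)."""
    accum = momentum * accum + grad          # running accumulation of gradients
    x = x - lr * (grad + momentum * accum)   # look-ahead style parameter update
    return x, accum

# Toy problem: minimize f(x) = x^2, whose gradient is 2x.
x, accum = 5.0, 0.0
for _ in range(200):
    x, accum = nesterov_momentum_step(x, accum, grad=2.0 * x, lr=0.05, momentum=0.9)
print("x after 200 steps:", x)
```

    Setting `momentum=0.0` reduces the update to plain gradient descent, which is what `tf.train.GradientDescentOptimizer` does in the two-layer example.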

    Part V: CIFAR-10 open-ended challenge

    def model_init_fn(inputs, is_training):
        model = None
        ############################################################################
        # TODO: Construct a model that performs well on CIFAR-10                   #
        ############################################################################
        initializer = tf.variance_scaling_initializer(scale=2.0)
        layers = [
            tf.layers.Conv2D(input_shape = (32,32,3),filters = 64,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv1'),
            tf.layers.Conv2D(filters = 64,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv2'),
            tf.layers.Conv2D(filters = 128,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv3'),
            tf.layers.MaxPooling2D(pool_size = [2,2],strides = [2,2],name = 'pool1'),        
            tf.layers.Conv2D(filters = 128,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv4'),
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv5'),
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv6'),
            tf.layers.MaxPooling2D(pool_size = [2,2],strides = [2,2],name = 'pool2'),  
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv7'),
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv8'),
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv9'),
            tf.layers.MaxPooling2D(pool_size = [2,2],strides = [2,2],name = 'pool3'),  
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv10'),
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv11'),
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv12'),
            tf.layers.Conv2D(filters = 256,kernel_size = [3,3],
                                strides = [1,1],padding = 'SAME',activation = tf.nn.relu,
                                use_bias = True,kernel_initializer = initializer,
                                bias_initializer = initializer,name = 'conv13'),
            tf.layers.Flatten(),
            tf.layers.Dense(units = 1024,use_bias = True,
                                kernel_initializer = initializer,bias_initializer = initializer,
                                name = 'fc1'),
            tf.layers.Dense(units = 1024,use_bias = True,
                                kernel_initializer = initializer,bias_initializer = initializer,
                                name = 'fc2'),
            tf.layers.Dense(units = 10,use_bias = True,
                                kernel_initializer = initializer,bias_initializer = initializer,
                                name = 'fc3')
        ]
        model = tf.keras.Sequential(layers)
        ############################################################################
        #                            END OF YOUR CODE                              #
        ############################################################################
        return model(inputs)
    
    def optimizer_init_fn():
        optimizer = None
        ############################################################################
        # TODO: Construct an optimizer that performs well on CIFAR-10              #
        ############################################################################
        optimizer = tf.train.AdamOptimizer()
        ############################################################################
        #                            END OF YOUR CODE                              #
        ############################################################################
        return optimizer
    
    device = '/gpu:0'
    print_every = 700
    num_epochs = 10
    train_part34(model_init_fn, optimizer_init_fn, num_epochs)
    

    Starting epoch 0
    Iteration 0, loss = 3.8694
    Got 79 / 1000 correct (7.90%)

    Iteration 700, loss = 1.6052
    Got 484 / 1000 correct (48.40%)

    Starting epoch 1
    Iteration 1400, loss = 1.0688
    Got 616 / 1000 correct (61.60%)

    Starting epoch 2
    Iteration 2100, loss = 0.9978
    Got 643 / 1000 correct (64.30%)

    Starting epoch 3
    Iteration 2800, loss = 0.8107
    Got 678 / 1000 correct (67.80%)

    Starting epoch 4
    Iteration 3500, loss = 0.6718
    Got 717 / 1000 correct (71.70%)

    Starting epoch 5
    Iteration 4200, loss = 0.3733
    Got 750 / 1000 correct (75.00%)

    Starting epoch 6
    Iteration 4900, loss = 0.8152
    Got 697 / 1000 correct (69.70%)

    Starting epoch 7
    Iteration 5600, loss = 0.3667
    Got 704 / 1000 correct (70.40%)

    Starting epoch 8
    Iteration 6300, loss = 0.4429
    Got 753 / 1000 correct (75.30%)

    Starting epoch 9
    Iteration 7000, loss = 0.4751
    Got 761 / 1000 correct (76.10%)

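    This is a 16-layer model: 13 convolutional layers and 3 fully-connected layers (the 16 weight layers), plus 3 max-pooling layers, trained with Adam.
    After 10 epochs the accuracy reaches 76.10%, which still leaves a lot of room for improvement.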
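
    To make the size of this model concrete, the sketch below walks through the layer list from the code above and tracks the feature-map size and the number of trainable parameters. It assumes 3x3 convolutions with 'SAME' padding, 2x2 max pooling with stride 2, and biases on every layer, as written in the model definition; the tuple encoding of the layers is just a convenience for this example.

```python
# Layer list transcribed from the model: ("conv", filters) or ("pool",).
layers = (
    [("conv", 64), ("conv", 64), ("conv", 128), ("pool",)]
    + [("conv", 128), ("conv", 256), ("conv", 256), ("pool",)]
    + [("conv", 256)] * 3 + [("pool",)]
    + [("conv", 256)] * 4
)

size, channels, params = 32, 3, 0   # CIFAR-10 input is 32x32x3
for layer in layers:
    if layer[0] == "conv":
        filters = layer[1]
        params += 3 * 3 * channels * filters + filters  # weights + biases
        channels = filters
    else:
        size //= 2  # 2x2 max pooling halves height and width

flat = size * size * channels       # size of the flattened feature map
fan_in = flat
for units in (1024, 1024, 10):      # fc1, fc2, fc3
    params += fan_in * units + units
    fan_in = units

num_conv = sum(1 for layer in layers if layer[0] == "conv")
num_pool = sum(1 for layer in layers if layer[0] == "pool")
print("conv layers:", num_conv, "pool layers:", num_pool)
print("flattened features:", flat)
print("trainable parameters:", params)
```

    Under these assumptions, roughly half of the parameters sit in the three fully-connected layers, so the classifier head is a natural place to look when trying to shrink or regularize the model.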

  • Original post: https://www.cnblogs.com/bernieloveslife/p/10197063.html