  • Self-Taught Learning to Deep Networks

     

    In this section, we describe how you can fine-tune and further improve the learned features using labeled data. When you have a large amount of labeled training data, this can significantly improve your classifier's performance.

    In self-taught learning, we first trained a sparse autoencoder on the unlabeled data. Then, given a new example x, we used the hidden layer to extract features a. This is illustrated in the following diagram:

    [Figure: STL SparseAE Features.png — a sparse autoencoder's hidden layer extracting features a from an input x]
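The feature-extraction step above can be sketched in a few lines of NumPy. This is a minimal illustration, not the tutorial's own code: the layer sizes (64 inputs, 25 hidden units) and the randomly initialized weights are assumptions; in practice W1 and b1 would come from training the sparse autoencoder on unlabeled data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical shapes: 64 input units, 25 hidden units.
# In practice W1, b1 are learned by the sparse autoencoder (not shown).
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.01, size=(25, 64))
b1 = np.zeros(25)

def extract_features(x):
    """Hidden-layer activations a = f(W1 x + b1), used as features."""
    return sigmoid(W1 @ x + b1)

x = rng.normal(size=64)      # a new example
a = extract_features(x)      # its 25-dimensional feature vector
```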

    We are interested in solving a classification task, where our goal is to predict labels y. We have a labeled training set { (x_l^{(1)}, y^{(1)}), (x_l^{(2)}, y^{(2)}), ..., (x_l^{(m_l)}, y^{(m_l)}) } of m_l labeled examples. We showed previously that we can replace the original features x^{(i)} with features a^{(i)} computed by the sparse autoencoder (the "replacement" representation). This gives us a training set { (a^{(1)}, y^{(1)}), ..., (a^{(m_l)}, y^{(m_l)}) }. Finally, we train a logistic classifier to map from the features a^{(i)} to the classification labels y^{(i)}.
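Training the logistic classifier on the replacement representation can be sketched as plain batch gradient descent on the logistic loss. This is a minimal sketch under assumed data: the feature matrix A (rows a^{(i)}), the labels, the learning rate, and the iteration count are all illustrative placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical labeled set: m_l examples, each with 25 autoencoder features.
rng = np.random.default_rng(1)
m_l, n_features = 100, 25
A = rng.normal(size=(m_l, n_features))      # replacement features a^(i)
y = (rng.random(m_l) > 0.5).astype(float)   # binary labels y^(i)

# Batch gradient descent on the logistic (cross-entropy) loss.
theta = np.zeros(n_features)
alpha = 0.1
for _ in range(500):
    p = sigmoid(A @ theta)                  # predicted p(y = 1 | a)
    theta -= alpha * A.T @ (p - y) / m_l    # gradient step

predictions = (sigmoid(A @ theta) >= 0.5).astype(float)
```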

    As before, we can draw our logistic regression unit (shown in orange) as follows:

    [Figure: STL Logistic Classifier.png — a logistic regression unit mapping features a to the label y]

    Now, consider the overall classifier (i.e., the input-output mapping) that we have learned using this method. In particular, let us examine the function that our classifier uses to map from a new test example x to a new prediction p(y = 1 | x). We can draw a representation of this function by putting together the two pictures from above. In particular, the final classifier looks like this:

    [Figure: STL CombinedAE.png — the final classifier: the autoencoder's first layer followed by the logistic regression unit]

    The parameters of this model were trained in two stages: the first layer of weights W^{(1)}, mapping from the input x to the hidden unit activations a, was trained as part of the sparse autoencoder training process. The second layer of weights W^{(2)}, mapping from the activations a to the output y, was trained using logistic regression (or softmax regression).
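Putting the two stages together, the combined classifier's input-output mapping can be sketched as a two-layer forward pass. The shapes and randomly initialized parameters below are illustrative assumptions; in practice W1, b1 come from the autoencoder and W2, b2 from the logistic regression stage.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical pre-trained parameters
# (stage 1: autoencoder; stage 2: logistic regression).
rng = np.random.default_rng(2)
W1, b1 = rng.normal(scale=0.01, size=(25, 64)), np.zeros(25)
W2, b2 = rng.normal(scale=0.01, size=(1, 25)), np.zeros(1)

def predict(x):
    """p(y = 1 | x) for the combined two-layer classifier."""
    a = sigmoid(W1 @ x + b1)          # first layer: extract features
    return sigmoid(W2 @ a + b2)[0]    # second layer: logistic output

p = predict(rng.normal(size=64))      # a probability in (0, 1)
```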

    But the form of our overall/final classifier is clearly just a big neural network. So, having trained up an initial set of parameters for our model (the first layer using an autoencoder, and the second layer via logistic/softmax regression), we can further modify all the parameters in our model to try to further reduce the training error. In particular, we can fine-tune the parameters, meaning perform gradient descent (or use L-BFGS) from the current setting of the parameters to try to reduce the training error on our labeled training set { (x_l^{(1)}, y^{(1)}), (x_l^{(2)}, y^{(2)}), ..., (x_l^{(m_l)}, y^{(m_l)}) }.
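One fine-tuning step can be sketched as ordinary backpropagation through both layers: the logistic loss gradient is propagated back not only to W^{(2)} but also to the autoencoder weights W^{(1)}. The shapes, randomly initialized parameters, single training example, and learning rate below are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
# Hypothetical pre-trained parameters and one labeled example.
W1, b1 = rng.normal(scale=0.1, size=(25, 64)), np.zeros(25)
W2, b2 = rng.normal(scale=0.1, size=(1, 25)), np.zeros(1)
x, y = rng.normal(size=64), 1.0
alpha = 0.1

# Forward pass through both layers.
a = sigmoid(W1 @ x + b1)
p = sigmoid(W2 @ a + b2)

# Backpropagate the cross-entropy loss to ALL parameters,
# including the first-layer weights W1, b1 -- this is fine-tuning.
delta2 = p - y                          # output-layer error
delta1 = (W2.T @ delta2) * a * (1 - a)  # hidden-layer error

W2 -= alpha * np.outer(delta2, a)
b2 -= alpha * delta2
W1 -= alpha * np.outer(delta1, x)
b1 -= alpha * delta1
```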

    When fine-tuning is used, the original unsupervised feature learning steps (i.e., training the autoencoder and the logistic classifier) are sometimes called pre-training. The effect of fine-tuning is that the labeled data can be used to modify the weights W^{(1)} as well, so that adjustments can be made to the features a extracted by the layer of hidden units.

    If we are using fine-tuning, we will usually do so with a network built using the replacement representation. (If you are not using fine-tuning, however, then the concatenation representation can sometimes give much better performance.)

    When should we use fine-tuning? It is typically used only if you have a large labeled training set; in this setting, fine-tuning can significantly improve the performance of your classifier. However, if you have a large unlabeled dataset (for unsupervised feature learning/pre-training) and only a relatively small labeled training set, then fine-tuning is significantly less likely to help.

  • Original article: https://www.cnblogs.com/sprint1989/p/3977295.html