Self-Taught Learning to Deep Networks

In this section, we describe how you can fine-tune and further improve the learned features using labeled data. When you have a large amount of labeled training data, this can significantly improve your classifier's performance.

In self-taught learning, we first trained a sparse autoencoder on the unlabeled data. Then, given a new example $\textstyle x$, we used the hidden layer to extract features $\textstyle a$. This is illustrated in the following diagram:

We are interested in solving a classification task, where our goal is to predict labels $\textstyle y$. We have a labeled training set $\textstyle \{ (x_l^{(1)}, y^{(1)}), (x_l^{(2)}, y^{(2)}), \ldots, (x_l^{(m_l)}, y^{(m_l)}) \}$ of $\textstyle m_l$ labeled examples. We showed previously that we can replace the original features $\textstyle x^{(i)}$ with features $\textstyle a^{(i)}$ computed by the sparse autoencoder (the "replacement" representation). This gives us a training set $\textstyle \{ (a^{(1)}, y^{(1)}), \ldots, (a^{(m_l)}, y^{(m_l)}) \}$. Finally, we train a logistic classifier to map from the features $\textstyle a^{(i)}$ to the classification label $\textstyle y^{(i)}$.
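
The two steps above (extract features with the hidden layer, then fit a logistic classifier on them) can be sketched in NumPy as follows. This is a minimal illustration, not the tutorial's own code: the toy data, the layer sizes, the bias vector `b1`, and the randomly drawn `W1` standing in for weights that a sparse autoencoder would have learned are all assumptions made so the example runs on its own.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def extract_features(X, W1, b1):
    """Hidden-layer activations a = sigmoid(W1 x + b1), one row per example."""
    return sigmoid(X @ W1.T + b1)

def cross_entropy(p, y):
    """Mean logistic loss; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def train_logistic(A, y, lr=0.5, steps=500):
    """Batch gradient descent on the mean cross-entropy loss."""
    w = np.zeros(A.shape[1])
    b = 0.0
    for _ in range(steps):
        err = sigmoid(A @ w + b) - y       # dLoss/dz for each example
        w -= lr * (A.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

# Toy labeled set (assumption): 200 examples, 8 raw input features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Stand-in for autoencoder weights (assumption): 5 hidden units.
W1 = rng.normal(scale=0.5, size=(5, 8))
b1 = np.zeros(5)

A = extract_features(X, W1, b1)            # replacement representation
w2, b2 = train_logistic(A, y)              # classifier on a, not on x
final_loss = cross_entropy(sigmoid(A @ w2 + b2), y)
```

Note that `W1` and `b1` are never updated here: only the classifier weights `w2` and `b2` are fit to the labels.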

We can draw our logistic regression unit (shown in orange) as follows:

Now, consider the overall classifier (i.e., the input-output mapping) that we have learned using this method. In particular, let us examine the function that our classifier uses to map from a new test example $\textstyle x$ to a new prediction $\textstyle p(y = 1 \mid x)$. We can draw a representation of this function by putting together the two pictures from above. In particular, the final classifier looks like this:
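
Written as a formula, the composed mapping can be sketched as below. This assumes a sigmoid activation $\textstyle \sigma$ for both the hidden units and the logistic output unit, and bias terms $\textstyle b^{(1)}$ and $\textstyle b^{(2)}$, which this section does not mention explicitly:

$$
a = \sigma\!\left(W^{(1)} x + b^{(1)}\right), \qquad
p(y = 1 \mid x) = \sigma\!\left(W^{(2)} a + b^{(2)}\right)
= \sigma\!\left(W^{(2)}\, \sigma\!\left(W^{(1)} x + b^{(1)}\right) + b^{(2)}\right),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}.
$$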

The parameters of this model were trained in two stages: The first layer of weights $\textstyle W^{(1)}$, mapping from the input $\textstyle x$ to the hidden unit activations $\textstyle a$, was trained as part of the sparse autoencoder training process. The second layer of weights $\textstyle W^{(2)}$, mapping from the activations $\textstyle a$ to the output $\textstyle y$, was trained using logistic regression (or softmax regression).

But the form of our overall/final classifier is clearly just a whole big neural network. So, having trained up an initial set of parameters for our model (training the first layer using an autoencoder, and the second layer via logistic/softmax regression), we can further modify all the parameters in our model to try to further reduce the training error. In particular, we can fine-tune the parameters, meaning perform gradient descent (or use L-BFGS) from the current setting of the parameters to try to reduce the training error on our labeled training set $\textstyle \{ (x_l^{(1)}, y^{(1)}), (x_l^{(2)}, y^{(2)}), \ldots, (x_l^{(m_l)}, y^{(m_l)}) \}$.
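
A minimal sketch of fine-tuning with plain batch gradient descent is given below, assuming sigmoid units in both layers and a cross-entropy training error. The toy data, layer sizes, bias terms, learning rate, and the randomly drawn `*_pre` parameters (standing in for the values left by the two pre-training stages) are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, w2, b2):
    """Return hidden activations A and predictions p(y = 1 | x)."""
    A = sigmoid(X @ W1.T + b1)
    p = sigmoid(A @ w2 + b2)
    return A, p

def cross_entropy(p, y):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def fine_tune(X, y, W1, b1, w2, b2, lr=0.1, steps=200):
    """Gradient descent on ALL parameters, starting from the given values."""
    W1, b1, w2 = W1.copy(), b1.copy(), w2.copy()
    m = len(y)
    for _ in range(steps):
        A, p = forward(X, W1, b1, w2, b2)
        d_out = (p - y) / m                        # error at the output unit
        d_hid = np.outer(d_out, w2) * A * (1 - A)  # backpropagated to hidden layer
        grad_w2, grad_b2 = A.T @ d_out, d_out.sum()
        grad_W1, grad_b1 = d_hid.T @ X, d_hid.sum(axis=0)
        W1 -= lr * grad_W1                         # first layer is updated too
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2
    return W1, b1, w2, b2

# Toy labeled set and stand-in pre-trained parameters (assumptions).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
W1_pre = rng.normal(scale=0.5, size=(5, 8))
b1_pre = np.zeros(5)
w2_pre = rng.normal(scale=0.1, size=5)
b2_pre = 0.0

loss_before = cross_entropy(forward(X, W1_pre, b1_pre, w2_pre, b2_pre)[1], y)
W1, b1, w2, b2 = fine_tune(X, y, W1_pre, b1_pre, w2_pre, b2_pre)
loss_after = cross_entropy(forward(X, W1, b1, w2, b2)[1], y)
```

The key difference from training only the classifier is the `d_hid` term: the labeled error is propagated back through the hidden layer, so the first-layer weights move as well.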

When fine-tuning is used, sometimes the original unsupervised feature learning steps (i.e., training the autoencoder and the logistic classifier) are called pre-training. The effect of fine-tuning is that the labeled data can be used to modify the weights $\textstyle W^{(1)}$ as well, so that adjustments can be made to the features $\textstyle a$ extracted by the layer of hidden units.

If we are using fine-tuning, we will usually do so with a network built using the replacement representation. (If you are not using fine-tuning, however, the concatenation representation can sometimes give much better performance.)
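
For reference, the two representations differ only in what is handed to the classifier. The sketch below assumes that the concatenation representation stacks the raw input next to the learned features; the array sizes and the random `W1` standing in for learned autoencoder weights are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                  # 4 examples, 8 raw features
W1 = rng.normal(scale=0.5, size=(5, 8))      # stand-in for learned weights
b1 = np.zeros(5)

A = sigmoid(X @ W1.T + b1)                   # hidden-layer features

replacement = A                              # classifier sees only a
concatenation = np.hstack([X, A])            # classifier sees both x and a
```

With the replacement representation the classifier input is exactly the hidden layer's output, which is why the pre-trained pieces stack directly into one network that can be fine-tuned.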

When should we use fine-tuning? It is typically used only if you have a large labeled training set; in this setting, fine-tuning can significantly improve the performance of your classifier. However, if you have a large unlabeled dataset (for unsupervised feature learning/pre-training) and only a relatively small labeled training set, then fine-tuning is significantly less likely to help.

Original article: https://www.cnblogs.com/sprint1989/p/3977295.html