(Paper Analysis) Machine Learning -- Learning from labeled and unlabeled data

Learning from labeled and unlabeled data

Main idea:

Unlabeled data provides information about the structure of the domain, for example how the data is distributed.

Main algorithms and ideas:

1. Self-Training

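A classifier is trained on the labeled data and then used to classify the unlabeled data. The most confident unlabeled points (those whose predictions the classifier trusts most), together with their predicted labels, are added to the training set. This process is repeated until convergence.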
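The loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the nearest-centroid base classifier, the softmax-style confidence, the `threshold` value, and all function names are assumptions chosen to keep the example self-contained.

```python
import math

def fit_centroids(points, labels):
    """Assumed base classifier: one mean vector (centroid) per class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict_with_confidence(centroids, p):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    scores = {y: math.exp(-math.dist(p, c)) for y, c in centroids.items()}
    label = max(scores, key=scores.get)
    return label, scores[label] / sum(scores.values())

def self_train(points, labels, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly move confidently classified points into the training set."""
    points, labels, pool = list(points), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(points, labels)
        preds = [(p, *predict_with_confidence(centroids, p)) for p in pool]
        confident = [(p, y) for p, y, conf in preds if conf >= threshold]
        if not confident:
            break  # converged: no remaining point is trusted enough
        for p, y in confident:
            points.append(p)
            labels.append(y)
        pool = [p for p, _, conf in preds if conf < threshold]
    return fit_centroids(points, labels), pool

# Two labeled points, five unlabeled ones; (5, 5) sits between the classes.
model, leftover = self_train(
    [(0.0, 0.0), (10.0, 10.0)], ["a", "b"],
    [(1.0, 1.0), (9.0, 9.0), (0.0, 2.0), (8.0, 10.0), (5.0, 5.0)],
)
```

In this toy run the ambiguous point stays in `leftover`, which illustrates why a confidence threshold matters: a wrong self-assigned label would otherwise be fed back into training.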

2. Co-Training

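The features describing the objects are split into two sets (views), each of which is sufficient to train a good classifier, and the two sets are conditionally independent given the class. The two classifiers are trained iteratively, each on its own feature set, and they teach each other by using the unlabeled examples they predict most confidently, together with the predicted labels for those examples.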
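A simplified sketch of this "teach each other" loop is given below. It assumes each example is a pair `(view1, view2)` of single numeric features and uses a tiny nearest-mean classifier per view; these choices, the function names, and the rule of adding one example per view per round are illustrative assumptions rather than the original co-training procedure.

```python
import math

def fit(values, labels):
    """Per-class mean of a 1-D feature (assumed nearest-mean classifier)."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    return {y: sum(vs) / len(vs) for y, vs in groups.items()}

def predict(means, v):
    """Return (label, confidence) from a softmax over negative distances."""
    scores = {y: math.exp(-abs(v - m)) for y, m in means.items()}
    label = max(scores, key=scores.get)
    return label, scores[label] / sum(scores.values())

def co_train(labeled, labels, unlabeled, rounds=5):
    """labeled and unlabeled are lists of (view1, view2) pairs of floats."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            # Train this view's classifier on its own feature only.
            model = fit([x[view] for x in labeled], labels)
            # Pick the pool example this classifier is most confident about.
            best = max(pool, key=lambda x: predict(model, x[view])[1])
            # Its predicted label joins the shared training set, so the
            # other view's classifier learns from it in the next step.
            labels.append(predict(model, best[view])[0])
            labeled.append(best)
            pool.remove(best)
    models = [fit([x[v] for x in labeled], labels) for v in (0, 1)]
    return models, labeled, labels

models, pts, ys = co_train(
    [(0.0, 0.0), (10.0, 10.0)], ["neg", "pos"],
    [(1.0, 2.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)],
)
```

Because both classifiers write into one shared labeled set, a point that only the first view is sure about still becomes training data for the second view.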

3. Transductive SVMs

4. Collective classification

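This approach uses the link structure between labeled and unlabeled data to improve classification accuracy. We can assume that the predicted label of an example is influenced by the predicted labels of the examples it is related to.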
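The text does not name a specific algorithm, so the sketch below shows just one simple possibility: iterative neighbor voting on a graph, where each unlabeled node repeatedly takes the majority label of its linked nodes. The function name `collective_classify`, the edge-list input, and the toy graph are assumptions made for illustration.

```python
from collections import Counter

def collective_classify(edges, known, max_iters=20):
    """edges: iterable of (u, v) links; known: dict node -> trusted label."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    labels = dict(known)
    for _ in range(max_iters):
        changed = False
        for node in sorted(neighbors):
            if node in known:
                continue  # labeled examples keep their given label
            votes = Counter(labels[n] for n in neighbors[node] if n in labels)
            if not votes:
                continue  # no labeled neighbor yet; try again next pass
            best = max(sorted(votes), key=votes.get)  # deterministic on ties
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break  # predictions are stable
    return labels

# Two tightly linked groups joined by a single c-d link.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("c", "d")]
result = collective_classify(
    edges, {"a": "red", "b": "red", "e": "blue", "f": "blue"})
```

Here the unlabeled nodes `c` and `d` each follow the majority of their own group, even though they are linked to each other.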

5. Another idea

Using Weighted Nearest Neighbor to Benefit from Unlabeled Data

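Train a classifier on the labeled data. Use this classifier to classify the unlabeled data, attaching a confidence weight to each prediction. We call the unlabeled data, once it has been labeled by this original classifier, pre-labeled data. Next, we combine the labeled data and the pre-labeled data into a new set. When a test sample arrives, we use k-nearest neighbors to find its k closest points in the new set. The labels of the points in this set are already known (although we trust those labels to different degrees, so they must be weighted), so the k neighbors can vote to decide which class the test sample belongs to.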
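The procedure above might look roughly like this in code. The sketch assumes a nearest-centroid base classifier, a weight of 1.0 for truly labeled points, and the base classifier's confidence as the weight of each pre-labeled point; none of these specifics, nor the function names, come from the original text.

```python
import math
from collections import defaultdict

def centroid_classifier(labeled):
    """Assumed base classifier, trained on (point, label) pairs."""
    groups = defaultdict(list)
    for p, y in labeled:
        groups[y].append(p)
    centroids = {y: [sum(c) / len(ps) for c in zip(*ps)]
                 for y, ps in groups.items()}

    def classify(p):
        scores = {y: math.exp(-math.dist(p, c)) for y, c in centroids.items()}
        label = max(scores, key=scores.get)
        return label, scores[label] / sum(scores.values())

    return classify

def build_training_set(labeled, unlabeled):
    """Combine labeled data (weight 1.0) with pre-labeled data (weight = confidence)."""
    classify = centroid_classifier(labeled)
    train = [(p, y, 1.0) for p, y in labeled]
    train += [(p, *classify(p)) for p in unlabeled]  # pre-labeled points
    return train

def weighted_knn_predict(train, query, k=3):
    """Each of the k nearest neighbors votes with its trust weight."""
    nearest = sorted(train, key=lambda t: math.dist(t[0], query))[:k]
    votes = defaultdict(float)
    for _, label, weight in nearest:
        votes[label] += weight
    return max(votes, key=votes.get)

train = build_training_set(
    [((0.0, 0.0), "a"), ((4.0, 4.0), "b")],
    [(1.0, 1.0), (3.0, 3.0), (1.0, 0.0), (3.0, 4.0)],
)
prediction = weighted_knn_predict(train, (0.5, 0.5), k=3)
```

Weighting the votes means a pre-labeled neighbor that the base classifier was unsure about counts for less than a neighbor with a true label.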

Original article: https://www.cnblogs.com/jian-hello/p/3552113.html