  • Inductive vs. Transductive Learning

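    Inductive learning is what we do when we train a neural network in the usual way: the test set takes no part in training, and once the model is trained it is used to make predictions on the test set.

    In transductive learning, the test set (without its labels) also takes part in training. That distinction is the key point to remember.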

    Induction and transduction are two terms you have probably come across many times in books and articles on machine learning. This article looks at the differences between these two learning approaches and at how to choose between them for a given use case.

    Understanding the Definitions

    Transduction is reasoning from observed, specific (training) cases to specific (test) cases. In contrast, induction is reasoning from observed training cases to general rules, which are then applied to the test cases.

    Let’s break down these two definitions.

    Induction

    Induction is reasoning from observed training cases to general rules, which are then applied to the test cases.

    Inductive learning is the same as what we commonly know as traditional supervised learning. We build and train a machine learning model based on a labelled training dataset we already have. Then we use this trained model to predict the labels of a testing dataset which we have never encountered before.
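    The train-then-predict workflow described above can be sketched as follows. This is a minimal illustration, not a recommended model: the nearest-centroid rule, the function names `fit_centroids` and `predict`, and all data points are assumptions made up for this example.

```python
from collections import defaultdict

def fit_centroids(points, labels):
    """Learn one centroid (mean point) per class from labelled training data."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (x, y), label in zip(points, labels):
        sums[label][0] += x
        sums[label][1] += y
        counts[label] += 1
    return {label: (s[0] / counts[label], s[1] / counts[label])
            for label, s in sums.items()}

def predict(centroids, point):
    """Apply the learned rule: assign the class of the closest centroid."""
    px, py = point
    return min(
        centroids,
        key=lambda c: (centroids[c][0] - px) ** 2 + (centroids[c][1] - py) ** 2,
    )

# Hypothetical labelled training set (invented for illustration).
train_points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
train_labels = ["red", "red", "green", "green"]

# Training uses the labelled data only.
model = fit_centroids(train_points, train_labels)

# The test points were never seen during training.
test_points = [(0.5, 1.0), (5.5, 4.0)]
predictions = [predict(model, p) for p in test_points]
print(predictions)
```

    The general rule here is the pair of centroids stored in `model`. Once it has been learned, it can be applied to any point, whether or not that point existed at training time.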

    Transduction

    Transduction is reasoning from observed, specific (training) cases to specific (test) cases.

    In contrast to inductive learning, transductive learning techniques observe all the data beforehand, both the training and the testing datasets. We learn from the labelled training dataset and then predict the labels of the testing dataset. Even though we do not know the labels of the testing dataset, we can make use of the patterns and additional information present in its unlabelled points during the learning process.

    Example transductive learning approaches include transductive SVM (TSVM) and graph-based label propagation algorithms (LPA).

    What are the Differences?

    Now that the definitions of inductive and transductive learning are clear, let’s look at the differences. The definitions already state most of them, but going through them one by one makes them clearer.

    The main difference is that during transductive learning, you have already encountered both the training and testing datasets when training the model. However, inductive learning encounters only the training data when training the model and applies the learned model on a dataset which it has never seen before.

    Transduction does not build a reusable predictive model. If a new data point is added to the testing dataset, we have to re-run the algorithm from the beginning: train on all the data again and then predict the labels. Inductive learning, on the other hand, builds a predictive model, so when new data points arrive there is no need to re-run the algorithm from the beginning.

    In simpler terms, inductive learning tries to build a general model, based on an observed set of training data points, that can predict any new data point. Here you can predict any point in the input space, not just the unlabelled points you already have. In contrast, transductive learning fits only the training and testing data points it has already observed. It predicts the labels of the unlabelled points using the knowledge of the labelled points and additional information.

    Transductive learning can become costly when new data points arrive through an input stream: each time a new data point arrives, you have to re-run everything. Inductive learning, on the other hand, builds a predictive model once, and new data points can then be labelled quickly with far less computation.
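    The cost difference on a stream can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `fit_inductive` is a plain 1-nearest-neighbour predictor, and `transduce` is a simple self-training-style stand-in for a real transductive method such as TSVM or label propagation. The points are invented.

```python
import math

def nearest_label(point, labelled):
    """Return the label of the labelled point closest to `point`."""
    return min(labelled, key=lambda item: math.dist(point, item[0]))[1]

def fit_inductive(labelled):
    """Inductive: build a reusable predictor from the labelled data only."""
    frozen = list(labelled)
    return lambda point: nearest_label(point, frozen)

def transduce(labelled, unlabelled):
    """Transductive stand-in: label only the points passed in.

    Repeatedly labels the unlabelled point closest to any already-labelled
    point, then treats it as labelled, so labels spread through the test set.
    """
    known = list(labelled)
    remaining = list(unlabelled)
    result = {}
    while remaining:
        best = min(remaining,
                   key=lambda p: min(math.dist(p, q) for q, _ in known))
        label = nearest_label(best, known)
        result[best] = label
        known.append((best, label))
        remaining.remove(best)
    return result

# Hypothetical labelled points and a hypothetical stream of new points.
labelled = [((0.0, 0.0), "red"), ((10.0, 0.0), "green")]
stream = [(1.0, 0.0), (9.0, 0.0), (4.0, 0.0)]

# Inductive: fit once, then each arrival is a single cheap prediction.
predict = fit_inductive(labelled)
inductive_labels = [predict(p) for p in stream]

# Transductive: every arrival triggers a full re-run over all points seen.
seen, transductive_runs = [], 0
for p in stream:
    seen.append(p)
    transductive_labels = transduce(labelled, seen)
    transductive_runs += 1

print(inductive_labels, transductive_runs, transductive_labels)
```

    The inductive predictor is built once and reused for every arrival, while the transductive routine is executed from scratch once per arrival, which is where the extra cost comes from.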

    Example Walkthrough

    Consider the set of points shown in Figure 1. There are four labelled points, A, B, C and D. Our goal is to label (colour) the remaining unlabelled (uncoloured) points, numbered from 1 to 14. If we use inductive learning for this task, we have to build a supervised learning model from these 4 labelled points alone.

    At a glance, we can see that there are two separate clusters. However, in inductive learning, since we have very few training samples, it is quite hard to build a predictive model that captures the complete structure of the data. For example, if a nearest-neighbour approach is used, points close to the border such as 12 and 14 may be coloured red instead of green, because they are closer to the red points A and B than to the green points C and D (as shown in Figure 2).
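    The nearest-neighbour failure described above can be reproduced with a few lines of code. The figures are not available here, so the coordinates below are invented to mimic the described situation; they are not the values from the original figures.

```python
import math

# Hypothetical coordinates, invented to mimic the situation in the text.
labelled = {
    "A": ((0.0, 0.0), "red"),
    "B": ((1.0, 0.0), "red"),
    "C": ((6.0, 3.0), "green"),
    "D": ((7.0, 3.0), "green"),
}
# Assumed position of border point 12: it sits at the near end of the green
# cluster, but no green labelled point is close to it.
point_12 = (3.0, 2.0)

def one_nearest_neighbour(point, labelled):
    """Return (name, label) of the labelled point closest to `point`."""
    name = min(labelled, key=lambda n: math.dist(point, labelled[n][0]))
    return name, labelled[name][1]

name, label = one_nearest_neighbour(point_12, labelled)
print(name, label)  # B red
```

    With only four labelled points, the classifier has no way of knowing that point 12 is linked to the green cluster through other unlabelled points, so it simply picks the closest labelled point, which is red.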

    If we have some additional information about the data points such as connectivity information between the points based on features like similarity (as shown in Figure 3), we can use this additional information while training the model and labelling the unlabelled points.

    (Figure 4 is unavailable; the image in the original article does not display.)

    For example, we can use a transductive learning approach such as a semi-supervised graph-based label propagation algorithm to label the unlabelled points as shown in Figure 4, using the structural information of all the labelled and unlabelled points. Points along the border such as 12 and 14 are connected to more green points than red points, and hence they get labelled as green, rather than red.
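    A simplified propagation scheme on a small graph shows how connectivity changes the outcome for a border point. This is only a sketch under stated assumptions: the edge list is hypothetical (the original figures are unavailable), only a handful of the 14 points are included, and the update rule (each unlabelled node repeatedly takes the average score of its neighbours while labelled nodes stay fixed) is one simple variant, not necessarily the exact algorithm the text has in mind.

```python
# Hypothetical similarity graph: node names follow the text, edges are invented.
edges = [
    ("A", "1"), ("B", "1"), ("A", "2"), ("B", "2"), ("1", "2"),
    ("2", "12"),                                  # single link across the border
    ("12", "13"), ("12", "14"), ("13", "14"),
    ("13", "C"), ("14", "D"), ("C", "D"),
]
# Labelled nodes: red is encoded as -1, green as +1.
seeds = {"A": -1.0, "B": -1.0, "C": 1.0, "D": 1.0}

def propagate(edges, seeds, iterations=500):
    """Spread scores over the graph while keeping the labelled nodes fixed."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    # Unlabelled nodes start at 0 (undecided).
    scores = {node: seeds.get(node, 0.0) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]       # clamp the known labels
            else:
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores = updated
    return scores

scores = propagate(edges, seeds)
labels = {node: ("green" if score > 0 else "red")
          for node, score in scores.items() if node not in seeds}
print(labels)
```

    In this hypothetical graph, point 12 has one neighbour on the red side and two on the green side, so its score settles above zero and it is labelled green. The unlabelled points take part in the computation, which is what makes the approach transductive.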

    Final Thoughts

    We have discussed the differences between inductive and transductive learning and have gone through an example. Now that you have a basic idea of inductive and transductive learning approaches and their differences, you can make use of this knowledge when you are developing your next machine learning model.