  • Four types of losses

    1. Classification losses

    One sample is fed in at a time; the model predicts its class, and the classification loss is computed from the predicted class and the ground-truth label.

    2. Pairwise losses

    Two samples are fed in at a time, and the dataset records whether the two are similar. The loss is computed from the model's outputs on the two samples together with this similarity label.

    3. Triplet losses

    Three samples are fed in at a time, (x, x_-, x_+), where x_- and x_+ are a negative (dissimilar) and a positive (similar) sample of x, respectively. The loss is computed from the model's outputs on these three samples and the corresponding similarity information.

    4. Quadruplet losses

    Four distinct samples are fed in at a time, comprising one similar pair and one dissimilar pair. The loss is computed from the model's outputs on the four samples and the corresponding similarity information.

    Open questions

    • Strengths, weaknesses, and applicable scenarios of the different losses.

    References

    • Ustinova E., Lempitsky V. Learning Deep Embeddings with Histogram Loss. 2016.

    Classification losses. It has been observed in [8] and confirmed later in multiple works (e.g. [15])
    that deep networks trained for classification can be used for deep embedding. In particular, it is
    sufficient to consider an intermediate representation arising in one of the last layers of the deep
    network. The normalization is added post-hoc. Many of the works mentioned below pre-train their
    embeddings as a part of the classification networks.
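    The post-hoc normalization mentioned above can be sketched as follows (a minimal numpy sketch of my own, not code from the paper; the function name `l2_normalize` is a hypothetical one chosen for illustration):

    ```python
    import numpy as np

    def l2_normalize(features, eps=1e-12):
        """Post-hoc L2 normalization of an intermediate layer's activations,
        so that inner products between embeddings behave like cosine
        similarities. `eps` guards against division by a zero norm."""
        features = np.asarray(features, dtype=float)
        norms = np.linalg.norm(features, axis=-1, keepdims=True)
        return features / np.maximum(norms, eps)
    ```

    In this reading, "deep embedding from a classification network" simply means taking the activations of one of the last layers and normalizing them this way, with no retraining.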
    Pairwise losses. Methods that use pairwise losses sample pairs of training points and score them
    independently. The pioneering work on deep embeddings [3] penalizes the deviation from the unit
    cosine similarity for positive pairs and the deviation from -1 or -0.9 for negative pairs. Perhaps,
    the most popular of pairwise losses is the contrastive loss [5, 20], which minimizes the distances in
    the positive pairs and tries to maximize the distances in the negative pairs as long as these distances
    are smaller than some margin M. Several works pointed to the fact that attempting to collapse all
    positive pairs may lead to excessive overfitting and therefore suggested losses that mitigate this
    effect, e.g. a double-margin contrastive loss [12], which drops to zero for positive pairs as long as
    their distances fall beyond the second (smaller) margin. Finally, several works use non-hinge based
    pairwise losses such as log-sum-exp and cross-entropy on the similarity values that softly encourage
    the similarity to be high for positive values and low for negative values (e.g. [24, 27]). The main
    problem with pairwise losses is that the margin parameters might be hard to tune, especially since
    the distributions of distances or similarities can be changing dramatically as the learning progresses.
    While most works “skip” the burn-in period by initializing the embedding to a network pre-trained for classification [24], [22] further demonstrated the benefit of mixing in the classification loss during
    the fine-tuning stage (which brings in another parameter).
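    The single-margin contrastive loss described above can be written as a short sketch (a common squared-hinge formulation; the exact form varies across papers, and the margin value here is arbitrary):

    ```python
    import numpy as np

    def contrastive_loss(x1, x2, is_positive, margin=1.0):
        """Single-margin contrastive loss on one pair of embeddings.

        Positive pairs: minimize the squared distance (collapse the pair).
        Negative pairs: penalize only while the distance is below `margin`.
        """
        d = np.linalg.norm(np.asarray(x1, float) - np.asarray(x2, float))
        if is_positive:
            return d ** 2
        return max(0.0, margin - d) ** 2
    ```

    The margin M is exactly the parameter the text flags as hard to tune: the distribution of distances shifts during training, so a fixed M may be too loose early on and too tight later.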
    Triplet losses. While pairwise losses care about the absolute values of distances of positive and
    negative pairs, the quality of embeddings ultimately depends on the relative ordering between positive
    and negative distances (or similarities). Indeed, the embedding meets the needs of most practical
    applications as long as the similarities of positive pairs are greater than similarities of negative pairs
    [19, 26]. The most popular class of losses for metric learning therefore consider triplets of points
    (x0, x+, x-), where (x0, x+) forms a positive pair and (x0, x-) forms a negative pair, and measure the
    difference in their distances or similarities. Triplet-based loss can then e.g. be aggregated over all
    triplets using a hinge function of these differences. Triplet-based losses are popular for large-scale
    embedding learning [4] and in particular for deep embeddings [13, 14, 17, 21, 28]. Setting the margin
    in the triplet hinge loss remains a challenge, as does sampling “correct” triplets, since the
    majority of them quickly become associated with zero loss. On the other hand, focusing sampling on
    the hardest triplets can prevent efficient learning [17]. Triplet-based losses generally make learning
    less constrained than pairwise losses. This is because for a low-loss embedding, the characteristic
    distance separating positive and negative pairs can vary across the embedding space (depending on
    the location of x0), which is not possible for pairwise losses. In some situations, such added flexibility
    can increase overfitting.
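    The hinge aggregation over triplets mentioned above can be sketched as follows (a minimal illustration, assuming Euclidean distances; the margin value is arbitrary):

    ```python
    import numpy as np

    def triplet_loss(x0, x_pos, x_neg, margin=0.2):
        """Hinge triplet loss: zero once the negative distance exceeds the
        positive distance by at least `margin`. Only the relative ordering
        of the two distances matters, not their absolute values."""
        x0, x_pos, x_neg = (np.asarray(v, float) for v in (x0, x_pos, x_neg))
        d_pos = np.linalg.norm(x0 - x_pos)
        d_neg = np.linalg.norm(x0 - x_neg)
        return max(0.0, d_pos - d_neg + margin)
    ```

    Note that both pairs share the anchor x0, which is what gives triplet losses their extra flexibility (and, per the text, their sampling problem: most random triplets quickly reach zero loss).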
    Quadruplet losses. Quadruplet-based losses are similar to triplet-based losses as they are computed
    by looking at the differences in distances/similarities of positive pairs and negative pairs. In the case
    of quadruplet-based losses, the compared positive and negative pairs do not share a common point
    (as they do for triplet-based losses). Quadruplet-based losses do not allow the flexibility of triplet-based losses discussed above (as they include comparisons of positive and negative pairs located in
    different parts of the embedding space). At the same time, they are not as rigid as pairwise losses, as
    they only penalize the relative ordering for negative pairs and positive pairs. Nevertheless, despite
    these appealing properties, quadruplet-based losses remain rarely-used and confined to “shallow”
    embeddings [9, 30]. We are unaware of deep embedding approaches using quadruplet losses. A
    potential problem with quadruplet-based losses in the large-scale setting is that the number of all
    quadruplets is even larger than the number of triplets. Among all groups of losses, our approach
    is most related to quadruplet-based ones, and can be seen as a way to organize learning of deep
    embeddings with a quadruplet-based loss in an efficient and (almost) parameter-free manner.
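    A quadruplet loss has the same hinge form as a triplet loss, except that the positive and negative pairs share no common point (a minimal sketch under that reading; the margin value is arbitrary):

    ```python
    import numpy as np

    def quadruplet_loss(a, b, c, d, margin=0.2):
        """Hinge quadruplet loss: (a, b) is a positive pair and (c, d) a
        negative pair disjoint from it. Penalize whenever the positive
        distance fails to beat the negative distance by `margin`."""
        a, b, c, d = (np.asarray(v, float) for v in (a, b, c, d))
        d_pos = np.linalg.norm(a - b)
        d_neg = np.linalg.norm(c - d)
        return max(0.0, d_pos - d_neg + margin)
    ```

    Because the two pairs can sit in different parts of the embedding space, this comparison enforces a globally consistent separation of positive and negative distances, which is why the text calls quadruplet losses less flexible than triplet losses but less rigid than pairwise ones.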

  • Original post: https://www.cnblogs.com/CSLaker/p/9829051.html