  • Noise Contrastive Estimation

    These notes follow "Notes on Noise Contrastive Estimation and Negative Sampling".
    A single training sample has the form:

    \[x_i \to [y_i^0,\cdots,y_i^{k}]\]

    where \(y_i^0\) is the true (labeled) word and \(y_i^1,\cdots,y_i^{k}\) are the indices of \(k\) noise words, drawn from the unigram distribution \(q(w)\) of the dataset.
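The noise-sampling step can be sketched as follows. This is a minimal illustration, not the original author's code: the helper names `unigram_distribution` and `draw_noise_samples` and the toy corpus are assumptions, and \(q(w)\) is taken to be the raw relative frequency of each word.

```python
import random
from collections import Counter

def unigram_distribution(tokens):
    """Return {word: q(word)}, estimated from raw token counts."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def draw_noise_samples(q, k, rng):
    """Draw k noise words i.i.d. from q, with replacement."""
    words = list(q)
    weights = [q[w] for w in words]
    return rng.choices(words, weights=weights, k=k)

# Toy corpus (hypothetical), 8 tokens in total.
tokens = "the cat sat on the mat the end".split()
q = unigram_distribution(tokens)
noise = draw_noise_samples(q, k=5, rng=random.Random(0))
```

Because sampling is with replacement, the same word may be drawn more than once, and the true word itself may appear among the noise samples.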
    The probability that the true word is classified as data:

    \[p(y_i^0=1|x_i,\theta)=\frac{\exp(y_i^0 h_\theta)}{\exp(y_i^0 h_\theta) + k\,q(y_i^0)}\]

    The probability that a noise sample is classified as noise:

    \[p(y_i^t=0|x_i,\theta)=\frac{k\,q(y_i^t)}{\exp(y_i^t h_\theta) + k\,q(y_i^t)},\quad t=1,\cdots,k\]
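The two probabilities can be checked numerically with a short sketch. It assumes the product \(y h_\theta\) is already available as a scalar `score`; the function names `p_true` and `p_noise` are illustrative only.

```python
import math

def p_true(score, k, q_w):
    """p(y=1 | x, theta) = exp(score) / (exp(score) + k*q(w))."""
    e = math.exp(score)
    return e / (e + k * q_w)

def p_noise(score, k, q_w):
    """p(y=0 | x, theta) = k*q(w) / (exp(score) + k*q(w))."""
    e = math.exp(score)
    return k * q_w / (e + k * q_w)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# With score = 0 and k*q(w) = 1, both classes are equally likely.
p1 = p_true(0.0, k=5, q_w=0.2)
p0 = p_noise(0.0, k=5, q_w=0.2)
```

For the same word the two expressions sum to 1, and `p_true(score, k, q_w)` equals `sigmoid(score - math.log(k * q_w))`, so each term is a logistic classifier on the score shifted by \(\log(k\,q(w))\).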

    The objective for this sample, a log-likelihood to be maximized (its negative is the cost to minimize):

    \[l_{nce}=\log p(y_i^0=1|x_i,\theta)+\sum_{t=1}^k \log p(y_i^t=0|x_i,\theta)\]

    The overall objective over the dataset of \(N\) samples:

    \[\mathcal{L}_{nce}=\frac{1}{N}\sum_{i=1}^N\left\{\log p(y_i^0=1|x_i,\theta)+\sum_{t=1}^k \log p(y_i^t=0|x_i,\theta)\right\}\]
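The per-sample and dataset-level formulas can be sketched in plain Python. The function names and the tuple layout `(true_score, true_q, noise_scores, noise_qs)` are assumptions made for illustration, and the scores \(y h_\theta\) are again treated as precomputed scalars. The direct use of `math.exp` may overflow for very large scores; a practical implementation would likely use a numerically stable log-sigmoid instead.

```python
import math

def log_p_true(score, k, q_w):
    """log p(y=1 | x, theta) for the true word."""
    return score - math.log(math.exp(score) + k * q_w)

def log_p_noise(score, k, q_w):
    """log p(y=0 | x, theta) for one noise word."""
    return math.log(k * q_w) - math.log(math.exp(score) + k * q_w)

def nce_sample_objective(true_score, true_q, noise_scores, noise_qs, k):
    """l_nce: log-probability of the true word plus those of the k noise words."""
    total = log_p_true(true_score, k, true_q)
    for s, q_w in zip(noise_scores, noise_qs):
        total += log_p_noise(s, k, q_w)
    return total

def nce_dataset_objective(samples, k):
    """L_nce: mean of l_nce over all N samples."""
    return sum(nce_sample_objective(*sample, k) for sample in samples) / len(samples)

# Two identical toy samples (hypothetical values) with k = 2 noise words each.
sample = (0.0, 0.5, [0.0, 0.0], [0.5, 0.5])
l_nce = nce_sample_objective(*sample, 2)
L_nce = nce_dataset_objective([sample, sample], k=2)
```

Training maximizes `L_nce` with respect to the parameters that produce the scores, or equivalently minimizes `-L_nce`.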

    [Noise-Contrastive Estimation of Unnormalized Statistical Models with Applications to Natural Image Statistics]

    [Word2vec Parameter Learning Explained]

    [Efficient Estimation of Word Representations in Vector Space]

    [Distributed Representations of Words and Phrases and their Compositionality]

    [Notes on Noise Contrastive Estimation and Negative Sampling]

  • Original article: https://www.cnblogs.com/ZJUT-jiangnan/p/5934647.html