  • Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks-paper

     Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

    Authors:
    Kai Sheng Tai Stanford University
    Richard Socher MetaMind
    Christopher D. Manning Stanford University

    Datasets:

    1) Stanford Sentiment Treebank: sentiment is annotated in five classes

    2) Sentences Involving Compositional Knowledge (SICK): each sentence pair has a relatedness score

    1 Introduction

    Most models for distributed representations of phrases and sentences (that is, models where real-valued vectors are used to represent meaning) fall into one of three classes:

    bag-of-words models: the order of the words in the sentence is not captured

    sequence models

    tree-structured models: these incorporate syntactic structure

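    Compared with the standard LSTM, the Tree-LSTM has the following properties:
    (1) a Tree-LSTM unit may depend on multiple child nodes
    (2) there may be multiple forget gates; their number depends on the number of children

    The paper presents two Tree-LSTM variants:
    (1) Child-Sum Tree-LSTMs
    (2) N-ary Tree-LSTMs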

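    The paper describes a generalization of the standard LSTM to tree structures; like the sequential LSTM, the resulting model can be used to represent the meaning of a sentence.
    Difference:

    the standard LSTM composes its hidden state from the input at the current time step and the hidden state of the LSTM unit in the previous time step,

    the tree-structured LSTM, or Tree-LSTM, composes its state from an input vector and the hidden states of arbitrarily many child units.

    The standard LSTM is a special case of the Tree-LSTM: it is a Tree-LSTM in which every internal node has exactly one child.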
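    The special-case claim above can be checked with a small sketch. It assumes a one-dimensional hidden state and made-up scalar parameters (the `PARAMS` table, `Node`, `child_sum_step` and `run_lstm` are illustrative names, not from the paper or any library), and it follows the Child-Sum formulation as I understand it from the paper: the children's hidden states are summed for the input, output and update gates, while each child gets its own forget gate.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical scalar parameters (W, U, b) for gates i, f, o and update u.
PARAMS = {
    "i": (0.5, 0.1, 0.0),
    "f": (0.3, 0.2, 1.0),
    "o": (0.4, -0.1, 0.0),
    "u": (0.7, 0.6, 0.0),
}


@dataclass
class Node:
    x: float                                   # 1-d stand-in for a word vector
    children: List["Node"] = field(default_factory=list)


def gate(name: str, x: float, h: float, squash=sigmoid) -> float:
    w, u, b = PARAMS[name]
    return squash(w * x + u * h + b)


def child_sum_step(x: float, child_states: List[Tuple[float, float]]) -> Tuple[float, float]:
    """One Child-Sum Tree-LSTM unit; child_states holds (h_k, c_k) pairs."""
    h_tilde = sum(h for h, _ in child_states)  # summed child hidden states
    i = gate("i", x, h_tilde)
    o = gate("o", x, h_tilde)
    u = gate("u", x, h_tilde, math.tanh)
    # One forget gate per child, computed from that child's own h_k.
    c = i * u + sum(gate("f", x, h_k) * c_k for h_k, c_k in child_states)
    return o * math.tanh(c), c


def encode(node: Node) -> Tuple[float, float]:
    """Bottom-up pass: encode the children first, then the node itself."""
    return child_sum_step(node.x, [encode(ch) for ch in node.children])


def run_lstm(xs: List[float]) -> Tuple[float, float]:
    """Standard sequential LSTM with the same parameters, zero initial state."""
    h, c = 0.0, 0.0
    for x in xs:
        i = gate("i", x, h)
        f = gate("f", x, h)
        o = gate("o", x, h)
        u = gate("u", x, h, math.tanh)
        c = i * u + f * c
        h = o * math.tanh(c)
    return h, c
```

    Encoding the chain `Node(0.9, [Node(-0.5, [Node(0.2)])])` with `encode` gives the same `(h, c)` as `run_lstm([0.2, -0.5, 0.9])`, while a node with two children simply contributes two forget-gate terms to `c`.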

    2 Long Short-Term Memory Networks

    Two commonly-used variants of the basic LSTM architecture:

    the Bidirectional LSTM: At each time step, the hidden state of the Bidirectional LSTM is the concatenation of the forward and backward hidden states.

    the Multilayer LSTM (also known as the stacked or deep LSTM): the idea is to let the higher layers capture longer-term dependencies of the input sequence.

    3 Tree-Structured LSTMs

    The paper proposes two architectures:

    the Child-Sum Tree-LSTM

    and the N-ary Tree-LSTM.
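    For reference, the Child-Sum transition equations are reproduced below as I recall them from the paper; the notation (C(j) for the set of children of node j, x_j for the input at node j) follows the paper, but the transcription here should be checked against the original.

```latex
\begin{aligned}
\tilde{h}_j &= \sum_{k \in C(j)} h_k \\
i_j &= \sigma\!\left(W^{(i)} x_j + U^{(i)} \tilde{h}_j + b^{(i)}\right) \\
f_{jk} &= \sigma\!\left(W^{(f)} x_j + U^{(f)} h_k + b^{(f)}\right) \\
o_j &= \sigma\!\left(W^{(o)} x_j + U^{(o)} \tilde{h}_j + b^{(o)}\right) \\
u_j &= \tanh\!\left(W^{(u)} x_j + U^{(u)} \tilde{h}_j + b^{(u)}\right) \\
c_j &= i_j \odot u_j + \sum_{k \in C(j)} f_{jk} \odot c_k \\
h_j &= o_j \odot \tanh(c_j)
\end{aligned}
```

    The N-ary variant applies when every node has at most N ordered children. Instead of summing the child states, it gives each child position its own U matrices, so that the forget gate for child k becomes $f_{jk} = \sigma\big(W^{(f)} x_j + \sum_{\ell=1}^{N} U^{(f)}_{k\ell} h_{j\ell} + b^{(f)}\big)$.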

    Under the Tree-RNN framework, the vectorial representation associated with each node of a tree is composed as a function of the vectors corresponding to the children of the node. The choice of composition function gives rise to numerous variants of this basic framework.

    Tree-RNNs have been used to parse images of natural scenes (Socher et al., 2011), compose phrase representations from word vectors (Socher et al., 2012), and classify the sentiment polarity of sentences (Socher et al., 2013).

     

    4 Models

    Two applications of the Tree-LSTM:
    (1) Classification

    h_j is the embedding of node j computed by the Tree-LSTM
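    The formulas for this step did not survive in these notes. As I recall them from the paper, the label of node j is predicted by a softmax classifier on top of h_j, where {x}_j denotes the inputs observed at the nodes of the subtree rooted at j:

```latex
\begin{aligned}
\hat{p}_\theta\!\left(y \mid \{x\}_j\right) &= \operatorname{softmax}\!\left(W^{(s)} h_j + b^{(s)}\right) \\
\hat{y}_j &= \arg\max_{y}\; \hat{p}_\theta\!\left(y \mid \{x\}_j\right)
\end{aligned}
```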

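    (2) Semantic relatedness of sentence pairs

    h_L and h_R are the representations that the Tree-LSTM produces for the two sentences; the paper's formulas for this task operate on this pair to score the semantic relatedness of the two sentences.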
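    The formulas for this step are also missing from the notes. The sketch below is based on my reading of the paper and should be treated as an assumption: the pair (h_L, h_R) is turned into an element-wise product and an absolute difference, a small network maps these to a distribution over the integer scores 1..K (K = 5 for SICK), and the predicted score is the expectation of that distribution. The training target is a sparse distribution whose expectation equals the gold score y. The function names are illustrative, and the network between the features and the distribution is omitted.

```python
import math
from typing import List, Tuple


def pair_features(h_left: List[float], h_right: List[float]) -> Tuple[List[float], List[float]]:
    """Element-wise product and absolute difference of two sentence vectors."""
    h_mul = [a * b for a, b in zip(h_left, h_right)]
    h_abs = [abs(a - b) for a, b in zip(h_left, h_right)]
    return h_mul, h_abs


def target_distribution(y: float, num_classes: int = 5) -> List[float]:
    """Sparse distribution over scores 1..num_classes with expectation y.

    Assumes 1 <= y <= num_classes. The mass is split between floor(y)
    and floor(y) + 1 in proportion to how close y is to each.
    """
    p = [0.0] * num_classes
    lo = math.floor(y)
    p[lo - 1] = lo - y + 1          # weight on floor(y)
    if y > lo:
        p[lo] = y - lo              # weight on floor(y) + 1
    return p


def expected_score(p: List[float]) -> float:
    """Predicted relatedness: r^T p with r = [1, 2, ..., K]."""
    return sum(r * p_r for r, p_r in enumerate(p, start=1))
```

    For a gold score of 3.6, `target_distribution` puts weight 0.4 on score 3 and 0.6 on score 4, and `expected_score` of that distribution recovers 3.6 up to floating-point error.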

    6 Results

    Metrics:

    1) Pearson's r

    2) Spearman's ρ

    3) MSE (mean squared error)
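    As a reminder of what the three metrics measure, here is a minimal standard-library sketch. The function names are my own, and Spearman's ρ is computed as Pearson's r on average ranks, which is the usual definition when ties are present.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's r: linear correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


def mse(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Mean squared error between predictions and gold scores."""
    return sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(pred)
```

    Pearson's r rewards predictions that are linearly related to the gold scores, Spearman's ρ only looks at whether the ordering is right, and MSE penalizes the absolute size of the errors.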

    6.1 Sentiment Classification

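    Fine-grained: 5-class sentiment classification.

    Binary: positive/negative sentiment classification.

    Fine-tuning helps the model draw the finer-grained distinctions.

    For fine-grained sentiment analysis the Bi-LSTM does better than the LSTM, but for binary classification the two perform about the same. The conjecture is that the fine-grained task needs more complex interactions between the input vector representations and the hidden state, whereas for binary classification the LSTM is already sufficient to retain the state that matters.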

    6.2 Semantic Relatedness

    --------------------------------------------------

    The Stanford Sentiment Treebank:

    The treebank format looks like this
    (0 (1 You) (2 (3 can) (4 (5 (6 run) (7 (8 this) (9 code))) (10 (11 with) (12 (13 (14 our) (15 (16 trained) (17 model))) (18 (19 on) (20 (21 (22 text) (23 files)) (24 (25 with) (26 (27 the) (28 (29 following) (30 command)))))))))))
    This is the sentiment treebank form obtained by running the Stanford model on the sentence “You can run this code with our trained model on text files with the following command”.

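    The first element inside each pair of parentheses is the head of a rule. For example, for a rule with a single node on each side:
    (1 You): 1 -> You, where 1 is a non-terminal symbol and You is a terminal symbol. The difference from the standard Penn Treebank is that the number stands for the sentiment strength of the node, on a five-level scale. (In the example above the leading numbers run from 0 to 30, so there they appear to be node indices rather than five-level sentiment labels.)

    (0 (1 You) (2 (3 can)…) :
    In this rule the right-hand side has two nodes, giving a standard binary tree: 0 -> 1, 2.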
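    A bracketed string of this kind can be read with a few lines of recursive parsing. The sketch below is not the Stanford tooling; `TreeNode`, `parse_tree` and `leaves` are illustrative names, and it assumes that every node starts with an integer label and that leaves carry exactly one word.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    label: int                                  # number right after "("
    word: Optional[str] = None                  # set only on leaves
    children: List["TreeNode"] = field(default_factory=list)


def parse_tree(text: str) -> TreeNode:
    """Parse a string such as "(5 (6 run) (7 (8 this) (9 code)))"."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse() -> TreeNode:
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError(f"expected '(' at token {pos}")
        pos += 1
        node = TreeNode(label=int(tokens[pos]))
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(parse())   # internal node: recurse
            else:
                node.word = tokens[pos]         # leaf: terminal word
                pos += 1
        pos += 1                                # consume ")"
        return node

    root = parse()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the root tree")
    return root


def leaves(node: TreeNode) -> List[str]:
    """Words of the subtree in left-to-right order."""
    if not node.children:
        return [node.word]
    return [w for child in node.children for w in leaves(child)]
```

    Applied to the subtree `(5 (6 run) (7 (8 this) (9 code)))` from the example, `parse_tree` returns a node labeled 5 with two children, and `leaves` yields the words run, this, code in order.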

  • Original article: https://www.cnblogs.com/rosyYY/p/10186294.html