  • Sentiment Analysis(1)-Dependency Tree-based Sentiment Classification using CRFs with Hidden Variables

    The content is from this paper: Dependency Tree-based Sentiment Classification using CRFs with Hidden Variables, by Tetsuji Nakagawa. 

    A typical approach to sentiment classification is to use supervised machine learning algorithms with bag-of-words features. A subjective sentence is represented as the set of words it contains, ignoring word order and the head-modifier relations between words. However, sentiment classification differs from traditional topic-based text classification. Topic-based text classification is generally a linearly separable problem: for example, when a document contains certain domain-specific words, it probably belongs to that domain. In sentiment classification, by contrast, sentiment polarities can be reversed. A sentence that contains positive (or negative) polarity words does not necessarily have the same polarity as a whole, so we need to consider interactions between words instead of handling words independently. 
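
    The limitation described above can be illustrated with a minimal sketch. The tiny polarity lexicon and the function names below are hypothetical and are not taken from the paper; the point is only that a bag-of-words representation cannot distinguish sentences whose polarity is reversed by word order or negation.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration only; not from the paper.
POLARITY = {"good": 1, "great": 1, "bad": -1, "boring": -1}

def bag_of_words(sentence):
    """Represent a sentence as unordered word counts (word order is lost)."""
    return Counter(sentence.lower().split())

def lexicon_score(bag):
    """Sum word polarities independently, ignoring interactions between words."""
    return sum(POLARITY.get(word, 0) * count for word, count in bag.items())

# Opposite meanings, identical bags of words.
s1 = "the acting was good , not bad"
s2 = "the acting was bad , not good"
same_bag = bag_of_words(s1) == bag_of_words(s2)

# Negation reverses the polarity, but the independent score does not change.
score_plain = lexicon_score(bag_of_words("the plot was good"))
score_negated = lexicon_score(bag_of_words("the plot was not good"))

print(same_bag, score_plain, score_negated)
```

    Any classifier that sees only these bags receives the same input for s1 and s2, which is why the paper argues for modeling interactions between words.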

    One issue with combining sentence composition and machine learning is that, in general corpora for sentiment classification, only the whole sentence is labeled with its polarity; the individual components of the sentence are not labeled.  

    From: Phrase Dependency Parsing for Opinion Mining 

    Previous work on opinion mining can be divided into two directions: sentiment classification and sentiment-related information extraction. The former is the task of identifying positive and negative sentiment in a text, which can be a passage, a sentence, a phrase, or even a word. The latter focuses on extracting the elements that compose a sentiment text. These elements include the source of the opinion (who expresses it), the target of the opinion (what receives it), and the opinion expression (what delivers it). In this paper, we define an opinion unit as a triple consisting of a product feature, an expression of opinion, and an emotional attitude (positive or negative). 
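
    As a rough illustration, the opinion unit defined above could be held in a small record type. The class name, field names, and example values here are assumptions made for this sketch; the paper only specifies the three components of the triple.

```python
from dataclasses import dataclass
from typing import Literal

@dataclass(frozen=True)
class OpinionUnit:
    """Hypothetical container for the (feature, expression, attitude) triple."""
    product_feature: str                       # may span several words
    opinion_expression: str                    # the text that delivers the opinion
    attitude: Literal["positive", "negative"]  # emotional attitude

# Made-up example values, not taken from any corpus.
unit = OpinionUnit(
    product_feature="battery life",
    opinion_expression="lasts all day",
    attitude="positive",
)
print(unit)
```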

    Because a product feature cannot always be represented by a single word, dependency parsing, which provides dependency relations only between words, might not be the best approach here. Previous work on relation extraction usually uses the head word to represent the whole phrase and extracts features from the word-level dependency tree. This solution is problematic because such methods cannot use the information provided by the phrase itself. Moreover, experimental results show that the relation extraction task can benefit from dependencies within a phrase. 

    Currently, mainstream dependency parsing operates on lexical elements: relations are built between single words. A major information loss of the word-level dependency tree, compared with a constituent tree, is that it does not explicitly provide local structures and syntactic categories. On the other hand, a dependency tree provides connections between distant words, which are useful for extracting long-distance relations.  

    In practice, for a given domain of product reviews, a language model is built on easily acquired unlabeled data. Each candidate NP or VP chunk in the output of the shallow parser is scored by the model and is cut off if its score is below a threshold. 
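
    The filtering step above can be sketched as follows. The notes do not say which kind of language model or scoring function is used, so this sketch assumes an add-one smoothed unigram model and an average log-probability score; the function names, toy corpus, and threshold value are all hypothetical.

```python
import math
from collections import Counter

def train_unigram(corpus):
    """Build an add-one smoothed unigram model from unlabeled sentences.

    Assumption: the source does not specify the model type; a unigram
    model is used here only to keep the sketch short.
    """
    counts = Counter(w for sent in corpus for w in sent.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves probability mass for unseen words

    def log_prob(word):
        return math.log((counts[word] + 1) / (total + vocab))

    return log_prob

def score_chunk(chunk, log_prob):
    """Average per-word log-probability of a non-empty candidate chunk."""
    words = chunk.lower().split()
    return sum(log_prob(w) for w in words) / len(words)

def filter_chunks(chunks, log_prob, threshold):
    """Keep chunks whose score reaches the threshold; cut off the rest."""
    return [c for c in chunks if score_chunk(c, log_prob) >= threshold]

# Toy stand-in for the unlabeled in-domain review data.
unlabeled_reviews = [
    "the battery life is great",
    "battery life could be better",
    "the screen is bright",
]
log_prob = train_unigram(unlabeled_reviews)

# Candidate chunks as a shallow parser might emit them (made-up examples).
candidates = ["battery life", "my cousin"]
kept = filter_chunks(candidates, log_prob, threshold=-2.5)
print(kept)
```

    With this toy data the in-domain chunk scores higher than the out-of-domain one, so only the former survives the cutoff. A real system would tune the threshold on held-out data.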

    Enhanced SenticNet with Affective Labels for Concept-Based Opinion Mining 

     

    Recent research shows that concept-based sentiment analysis and opinion mining outperform word-based methods. This concept-focused approach relies on polarity and affective information for commonsense knowledge concepts.

     

  • Original article: https://www.cnblogs.com/wintor12/p/3777662.html