
Sentiment Analysis: Using Recurrent Neural Networks

Similar to searching for synonyms and analogies, text classification is another downstream application of word embeddings. In this section, we will apply pretrained word vectors (GloVe) and a bidirectional recurrent neural network with multiple hidden layers, as shown in Figure 1, to determine whether a text sequence of indefinite length expresses positive or negative sentiment.

     

Figure 1. This section feeds pretrained GloVe embeddings into an RNN-based architecture for sentiment analysis.

from d2l import mxnet as d2l
from mxnet import gluon, init, np, npx
from mxnet.gluon import nn, rnn

npx.set_np()

batch_size = 64
train_iter, test_iter, vocab = d2l.load_data_imdb(batch_size)

    1. Using a Recurrent Neural Network Model

In this model, each word first obtains a feature vector from the embedding layer. Then, a bidirectional recurrent neural network encodes the feature sequence to obtain sequence information. Finally, the encoded sequence information is passed through a fully connected layer to produce the output. Specifically, we concatenate the hidden states of the bidirectional LSTM at the initial and final time steps and pass this concatenation, as the encoded feature-sequence information, to the output layer for classification. In the BiRNN class implemented below, the Embedding instance is the embedding layer, the LSTM instance is the hidden layer that encodes the sequence, and the Dense instance is the output layer that produces the classification result.

class BiRNN(nn.Block):
    def __init__(self, vocab_size, embed_size, num_hiddens,
                 num_layers, **kwargs):
        super(BiRNN, self).__init__(**kwargs)
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # Set bidirectional to True to get a bidirectional recurrent neural
        # network
        self.encoder = rnn.LSTM(num_hiddens, num_layers=num_layers,
                                bidirectional=True, input_size=embed_size)
        self.decoder = nn.Dense(2)

    def forward(self, inputs):
        # The shape of inputs is (batch size, number of words). Because LSTM
        # needs to use sequence as the first dimension, the input is
        # transposed and the word features are then extracted. The output
        # shape is (number of words, batch size, word vector dimension)
        embeddings = self.embedding(inputs.T)
        # Since the input (embeddings) is the only argument passed into
        # rnn.LSTM, it only returns the hidden states of the last hidden
        # layer at different time steps (outputs). The shape of outputs is
        # (number of words, batch size, 2 * number of hidden units)
        outputs = self.encoder(embeddings)
        # Concatenate the hidden states of the initial time step and final
        # time step to use as the input of the fully connected layer. Its
        # shape is (batch size, 4 * number of hidden units)
        encoding = np.concatenate((outputs[0], outputs[-1]), axis=1)
        outs = self.decoder(encoding)
        return outs
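
To make the shape bookkeeping in forward concrete, here is a quick sanity check on toy dimensions (the sizes below are made up for illustration and differ from those used in the rest of this section):

toy_net = BiRNN(vocab_size=10, embed_size=8, num_hiddens=16, num_layers=1)
toy_net.initialize()
X = np.zeros((4, 5))  # (batch size, number of words)
toy_net(X).shape      # (4, 2): one score per sentiment class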

We create a bidirectional recurrent neural network with two hidden layers.

embed_size, num_hiddens, num_layers, ctx = 100, 100, 2, d2l.try_all_gpus()
net = BiRNN(len(vocab), embed_size, num_hiddens, num_layers)
net.initialize(init.Xavier(), ctx=ctx)

    1.1. Loading Pre-trained Word Vectors

Because the training dataset for sentiment classification is not very large, the model is prone to overfitting. To deal with this, we will directly use word vectors pretrained on a larger corpus as the feature vectors of all words. Here, we load a 100-dimensional GloVe word vector for each word in the vocabulary vocab.

    glove_embedding = d2l.TokenEmbedding('glove.6b.100d')

Query the word vectors for the words in the vocabulary.

    embeds = glove_embedding[vocab.idx_to_token]

    embeds.shape

    (49339, 100)
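
Individual tokens can be queried the same way. Note that in d2l's TokenEmbedding, tokens absent from the GloVe vocabulary map to the zero vector (worth verifying against the d2l version you use); the two example words below are chosen only for illustration:

vecs = glove_embedding[['beautiful', 'terrible']]
vecs.shape  # (2, 100)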

Then we will use these word vectors as the feature vectors of each word in the reviews. Note that the dimensionality of the pretrained word vectors must match the embedding layer output size embed_size of the model we created. In addition, we will no longer update these word vectors during training.

net.embedding.weight.set_data(embeds)
net.embedding.collect_params().setattr('grad_req', 'null')
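
As an optional sanity check (not required for training), we can confirm that the embedding layer is frozen and holds vectors of the expected shape; ctx[0] below assumes, as above, that the parameters were initialized on at least one device:

net.embedding.weight.grad_req            # 'null': no gradient will be computed
net.embedding.weight.data(ctx[0]).shape  # (49339, 100)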

    2. Training and Evaluating the Model

Now we can start training.

lr, num_epochs = 0.01, 5
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': lr})
loss = gluon.loss.SoftmaxCrossEntropyLoss()
d2l.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, ctx)

    loss 0.299, train acc 0.872, test acc 0.842

    626.7 examples/sec on [gpu(0), gpu(1)]

     

Finally, we define the prediction function.

#@save
def predict_sentiment(net, vocab, sentence):
    # Map the words to indices and add a batch dimension before inference
    sentence = np.array(vocab[sentence.split()], ctx=d2l.try_gpu())
    label = np.argmax(net(sentence.reshape(1, -1)), axis=1)
    return 'positive' if label == 1 else 'negative'

Then we use the trained model to classify the sentiment of two simple sentences.

    predict_sentiment(net, vocab, 'this movie is so great')

    'positive'

    predict_sentiment(net, vocab, 'this movie is so bad')

    'negative'
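
To score several reviews in one go, we can simply loop over predict_sentiment (a convenience sketch; the review texts below are made up):

reviews = ['the plot is predictable but the acting is wonderful',
           'a complete waste of two hours']
[(review, predict_sentiment(net, vocab, review)) for review in reviews]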

3. Summary

• Text classification transforms a text sequence of indefinite length into a text category; it is a downstream application of word embedding.
• We can apply pretrained word vectors and recurrent neural networks to classify the sentiment of a text.