  • Long Short-Term Memory (LSTM): A Brief Introduction to the Formulas

    Long short-term memory:

    The idea is to make short-term memory last for a long time.

    Paper Reference:

    A Critical Review of Recurrent Neural Networks for Sequence Learning

    LSTM Gate Model

    Three Types of Gate

    Input Gate:

    Controls how much of the current input \(x_t\) and the previous output \(h_{t-1}\) will enter the new cell.

    \[ i_t=\sigma(W^i x_t+U^i h_{t-1}+b^i) \]

    Forget Gate:

    Decides whether to erase (set to zero) or keep individual components of the memory.

    \[ f_t=\sigma(W^f x_t+U^f h_{t-1}+b^f) \]

    Cell Update:

    Transforms the current input and the previous output into a candidate value that can be merged into the current state.

    \[ g_t=\phi(W^g x_t+U^g h_{t-1}+b^g) \]

    Output Gate:

    Scales the output from the cell.

    \[ o_t=\sigma(W^o x_t+U^o h_{t-1}+b^o) \]

    Internal State update:

    Computes the current timestep's state using the gated previous state and the gated input.

    \[ s_t=g_t\cdot i_t+s_{t-1}\cdot f_t \]
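
    A small numeric illustration of the state update may help. The values below are made up for the example (they are not from the referenced paper); each component shows one gating case: write only, keep only, and an even mix.

```python
import numpy as np

# Hypothetical values chosen for illustration only.
s_prev = np.array([2.0, -1.0, 0.5])   # previous internal state s_{t-1}
g_t = np.array([0.3, 0.8, -0.6])      # candidate update g_t (a tanh output, so in (-1, 1))
i_t = np.array([1.0, 0.0, 0.5])       # input gate: 1 = let the candidate in, 0 = block it
f_t = np.array([0.0, 1.0, 0.5])       # forget gate: 0 = erase the old state, 1 = keep it

# Element-wise state update: s_t = g_t * i_t + s_{t-1} * f_t
s_t = g_t * i_t + s_prev * f_t
print(s_t)  # approximately [0.3, -1.0, -0.05]
```

    Component 0 is fully overwritten by the candidate, component 1 keeps the old state untouched, and component 2 blends both halves.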

    Hidden Layer:

    The output of the LSTM: a \(\tanh\)-squashed transformation of the current state, scaled by the output gate.

    \[ h_t=\phi(s_t)\cdot o_t \]

    Here \(\cdot\) denotes element-wise multiplication, \(\phi(x)=\tanh(x)\), and \(\sigma(x)=\mathrm{sigmoid}(x)\):

    \[ \phi(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}},\quad \sigma(x)=\frac{1}{1+e^{-x}} \]
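
    Putting the gate equations together, a single LSTM timestep could be sketched in NumPy roughly as follows. This is a minimal sketch, not a reference implementation: the names `lstm_step`, `init_params`, and the `params` dictionary layout are assumptions made for this example, the weights are random, and the output is computed as \(h_t=\tanh(s_t)\cdot o_t\), i.e. the output gate scaling the squashed state.

```python
import numpy as np

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, s_prev, params):
    """One LSTM timestep.

    `params` (hypothetical layout) maps each of 'i', 'f', 'g', 'o'
    to a tuple (W, U, b) with shapes (H, D), (H, H), (H,).
    """
    def pre(k):
        W, U, b = params[k]
        return W @ x_t + U @ h_prev + b

    i_t = sigmoid(pre("i"))           # input gate
    f_t = sigmoid(pre("f"))           # forget gate
    g_t = np.tanh(pre("g"))           # cell update (candidate)
    o_t = sigmoid(pre("o"))           # output gate
    s_t = g_t * i_t + s_prev * f_t    # internal state update
    h_t = np.tanh(s_t) * o_t          # hidden output
    return h_t, s_t

def init_params(input_dim, hidden_dim, rng):
    """Small random weights and zero biases, for demonstration only."""
    return {
        k: (rng.normal(scale=0.1, size=(hidden_dim, input_dim)),
            rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
            np.zeros(hidden_dim))
        for k in "ifgo"
    }

rng = np.random.default_rng(0)
params = init_params(input_dim=3, hidden_dim=4, rng=rng)
h, s = np.zeros(4), np.zeros(4)
for x_t in rng.normal(size=(5, 3)):   # a toy sequence of 5 inputs
    h, s = lstm_step(x_t, h, s, params)
print(h.shape, s.shape)  # (4,) (4,)
```

    Because \(h_t\) is a sigmoid value times a tanh value, every component of `h` stays strictly inside (-1, 1), while the internal state `s` is not bounded in the same way.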

    Parallel Computing

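    The input gate, forget gate, cell update, and output gate can be computed in parallel, because each one depends only on \(x_t\) and \(h_{t-1}\). Here \(W\) stacks the per-gate matrices \([W^k, U^k]\) row-wise, and each function is applied to its own block of the result (biases omitted for brevity).

    \[ \begin{bmatrix} i_t\\ f_t\\ g_t\\ o_t \end{bmatrix} =\begin{bmatrix}\sigma\\ \sigma\\ \phi\\ \sigma\end{bmatrix} \times W \times[x_t,h_{t-1}] \]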
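
    The stacked form can be checked against the per-gate form with a short NumPy sketch. The function name `fused_gates`, the row ordering i, f, g, o, and the explicit bias vector `b` are assumptions of this example; the bias is included only so that the result matches the per-gate equations term by term.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fused_gates(x_t, h_prev, W, b):
    """Compute all four gates with a single matrix multiply.

    W has shape (4*H, D+H) with row blocks ordered i, f, g, o;
    b has shape (4*H,).
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b   # all pre-activations at once
    i_t = sigmoid(z[0:H])
    f_t = sigmoid(z[H:2 * H])
    g_t = np.tanh(z[2 * H:3 * H])
    o_t = sigmoid(z[3 * H:4 * H])
    return i_t, f_t, g_t, o_t

rng = np.random.default_rng(0)
D, H = 3, 4
Ws = {k: rng.normal(size=(H, D)) for k in "ifgo"}   # W^k
Us = {k: rng.normal(size=(H, H)) for k in "ifgo"}   # U^k
bs = {k: rng.normal(size=H) for k in "ifgo"}        # b^k

# Stack [W^k, U^k] row-wise into one matrix, and the biases into one vector.
W = np.vstack([np.hstack([Ws[k], Us[k]]) for k in "ifgo"])
b = np.concatenate([bs[k] for k in "ifgo"])

x_t, h_prev = rng.normal(size=D), rng.normal(size=H)
i_t, f_t, g_t, o_t = fused_gates(x_t, h_prev, W, b)

# Per-gate reference for the input gate.
i_ref = sigmoid(Ws["i"] @ x_t + Us["i"] @ h_prev + bs["i"])
print(np.allclose(i_t, i_ref))  # True
```

    A single large matrix multiply is usually friendlier to BLAS libraries and GPUs than four small ones, which is the practical reason for this formulation.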

    LSTM network for Semantic Analysis

    Model Architecture
    Model: LSTM layer --> Average Pooling --> Logistic Regression

    Input sequence:

    \[ x_0,x_1,x_2,\cdots,x_n \]

    Representation sequence:

    \[ h_0,h_1,h_2,\cdots,h_n \]

    This representation sequence is then averaged over all \(n+1\) timesteps, resulting in the representation \(h\):

    \[ h=\frac{1}{n+1}\sum\limits_{i=0}^{n}{h_i} \]
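
    The pooling and classification stages can be sketched as below. The hidden states `hs` and the classifier parameters `w`, `b` are hypothetical numbers chosen so the arithmetic is easy to follow, and a binary logistic regression is assumed since the text does not specify the number of classes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_pool(hs):
    """Average the hidden states over time. hs has shape (n+1, H)."""
    return hs.mean(axis=0)

def predict_proba(hs, w, b):
    """Binary logistic regression on the pooled representation."""
    return sigmoid(w @ mean_pool(hs) + b)

# Hypothetical LSTM outputs h_0, h_1, h_2 with hidden size 2.
hs = np.array([[0.2, -0.4],
               [0.6,  0.0],
               [0.1,  0.7]])
h = mean_pool(hs)                      # approximately [0.3, 0.1]

# Hypothetical classifier weights and bias.
w, b = np.array([1.0, -2.0]), 0.0
p = predict_proba(hs, w, b)            # sigmoid(0.3 - 0.2) = sigmoid(0.1), about 0.525
print(h, p)
```

    Dividing by the number of timesteps keeps the scale of `h` independent of the sequence length, which a plain sum would not.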

    Bidirectional LSTM

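    It appears to work only for sequences with a fixed endpoint, i.e. the whole sequence must be available before processing. Moreover, in a conventional online learning setting we cannot actually access future information.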
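
    To make the data dependency concrete, here is a rough sketch of a bidirectional LSTM built from two independent LSTMs, one reading the sequence forward and one reading it backward. The helper names `run_lstm` and `run_bilstm`, the stacked weight layout (rows ordered i, f, g, o), and the random weights are all assumptions of this example. What the sketch needs is the complete sequence before the backward pass can start; it is run here on two different lengths.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_lstm(xs, W, b):
    """Run an LSTM over xs (shape (T, D)) and return hidden states (T, H).

    W has shape (4*H, D+H) and b has shape (4*H,), rows ordered i, f, g, o.
    """
    H = b.shape[0] // 4
    h, s = np.zeros(H), np.zeros(H)
    out = []
    for x_t in xs:
        z = W @ np.concatenate([x_t, h]) + b
        i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[3 * H:])
        g = np.tanh(z[2 * H:3 * H])
        s = g * i + s * f
        h = np.tanh(s) * o
        out.append(h)
    return np.stack(out)

def run_bilstm(xs, fwd, bwd):
    """Concatenate forward and backward hidden states at each timestep."""
    h_fwd = run_lstm(xs, *fwd)
    # The backward LSTM reads the reversed sequence; its outputs are
    # reversed again so that row t lines up with timestep t.
    h_bwd = run_lstm(xs[::-1], *bwd)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)

rng = np.random.default_rng(0)
D, H = 3, 4

def make_weights():
    return rng.normal(scale=0.1, size=(4 * H, D + H)), np.zeros(4 * H)

fwd, bwd = make_weights(), make_weights()
for T in (2, 5):
    xs = rng.normal(size=(T, D))
    print(run_bilstm(xs, fwd, bwd).shape)  # (2, 8) then (5, 8)
```

    The backward half of row 0 already depends on every later input, which is why this construction does not fit a streaming setting where inputs arrive one at a time.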

  • Original article: https://www.cnblogs.com/ZJUT-jiangnan/p/5506613.html