Sequence Models (2): Recurrent Neural Networks (RNN)

I. What RNNs are for, and a brief overview:

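Problems an RNN can solve:

The training inputs are continuous sequences of varying length, for example time-based sequences such as segments of continuous speech or continuous handwriting. These sequences are long and differ in length, so it is hard to split them directly into independent samples for training with a DNN/CNN.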

T time steps:

First, consider a single RNN cell:

A simple implementation of RNN forward propagation:

Code for the above:
    import numpy as np

    # Define the RNN parameters.
    X = [1, 2]
    state = [0.0, 0.0]
    w_cell_state = np.asarray([[0.1, 0.2], [0.3, 0.4]])
    w_cell_input = np.asarray([0.5, 0.6])
    b_cell = np.asarray([0.1, -0.1])
    w_output = np.asarray([[1.0], [2.0]])
    b_output = 0.1

    # Run the forward pass, one time step per loop iteration.
    for i in range(len(X)):
        before_activation = np.dot(state, w_cell_state) + X[i] * w_cell_input + b_cell
        state = np.tanh(before_activation)
        final_output = np.dot(state, w_output) + b_output
        print("before activation: ", before_activation)
        print("state: ", state)
        print("output: ", final_output)
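The loop above can also be factored into a single-cell step and a driver that walks a sequence of any length, which mirrors the single-cell and T-time-step views described earlier. The sketch below is one possible arrangement, not the original author's code: the function names `rnn_cell_forward` and `rnn_forward` are made up for illustration, and the parameter values are copied from the example above.

```python
import numpy as np


def rnn_cell_forward(x_t, h_prev, w_state, w_input, b_cell):
    """One time step: new hidden state from the input and the previous state."""
    return np.tanh(np.dot(h_prev, w_state) + x_t * w_input + b_cell)


def rnn_forward(xs, h0, w_state, w_input, b_cell, w_out, b_out):
    """Run the same cell over a sequence of scalar inputs of any length."""
    h = np.asarray(h0, dtype=float)
    states, outputs = [], []
    for x_t in xs:
        h = rnn_cell_forward(x_t, h, w_state, w_input, b_cell)
        states.append(h)
        outputs.append(np.dot(h, w_out) + b_out)
    return np.array(states), np.array(outputs)


# Same parameter values as the example above.
w_state = np.asarray([[0.1, 0.2], [0.3, 0.4]])
w_input = np.asarray([0.5, 0.6])
b_cell = np.asarray([0.1, -0.1])
w_out = np.asarray([[1.0], [2.0]])
b_out = 0.1

states, outputs = rnn_forward([1, 2], [0.0, 0.0],
                              w_state, w_input, b_cell, w_out, b_out)
print(states.shape, outputs.shape)  # (2, 2) (2, 1)

# The same weights handle a longer sequence without any change.
long_states, _ = rnn_forward([1, 2, 3, 4, 5], [0.0, 0.0],
                             w_state, w_input, b_cell, w_out, b_out)
print(long_states.shape)  # (5, 2)
```

Because the weights are reused at every step, sequences of different lengths need no change to the parameters, only a different number of loop iterations.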
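II. The RNN model:

In the figure above, the left side shows the RNN model without unrolling over time; unrolled along the time sequence, it becomes the right side. We focus on the right side.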

This figure depicts the RNN model around sequence index $t$.

III. RNN forward propagation algorithm
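1. For any sequence index $t$, the hidden state is computed from the current input and the previous hidden state.
2. At sequence index $t$, the model's output is computed from the current hidden state.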
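The two steps above can be written compactly. The following is a sketch in a common notation, which is an assumption here: $U$, $W$, $b$ are taken to be the input weights, recurrent weights, and bias of the hidden layer, $V$ and $c$ the weights and bias of the output layer, $x^{(t)}$, $h^{(t)}$, $o^{(t)}$ the input, hidden state, and output at index $t$, and $\hat{y}^{(t)}$ the prediction. The activations follow the tanh and softmax choices this article makes for the loss derivation.

$$
\begin{aligned}
h^{(t)} &= \tanh\left(U x^{(t)} + W h^{(t-1)} + b\right) \\
o^{(t)} &= V h^{(t)} + c \\
\hat{y}^{(t)} &= \mathrm{softmax}\left(o^{(t)}\right)
\end{aligned}
$$

In the NumPy example earlier in this article, `w_cell_input`, `w_cell_state`, and `b_cell` play the roles of $U$, $W$, and $b$ (with a row-vector convention), and `w_output`, `b_output` play the roles of $V$ and $c$; that example stops at the linear output and does not apply softmax.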
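IV. RNN backpropagation

To simplify the description, we take the loss function to be the log loss, the output activation to be softmax, and the hidden-layer activation to be tanh.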

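(1) For an RNN, there is a loss at every position $t$ of the sequence, so the final loss $L$ is the sum of the per-position losses $L^{(t)}$ over the $T$ time steps:

$$L = \sum_{t=1}^{T} L^{(t)}$$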

(2) where

(3) $W, U, b$

Update rules for each parameter:
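As a rough illustration of how such gradients and updates can be computed, here is a minimal NumPy sketch of backpropagation through time under the choices stated above (log loss, softmax output, tanh hidden layer). Several things are assumptions rather than the article's own derivation: the names `V` and `c` for the output-layer weights and bias, the column-vector convention, integer class labels as targets, and a plain gradient-descent update with a hypothetical `learning_rate`.

```python
import numpy as np


def softmax(o):
    e = np.exp(o - np.max(o))  # subtract the max for numerical stability
    return e / e.sum()


def forward(xs, ys, params, h0):
    """Forward pass; returns the summed log loss, hidden states, predictions."""
    U, W, V, b, c = params
    hs, y_hats, loss = [h0], [], 0.0
    for x, y in zip(xs, ys):
        h = np.tanh(U @ x + W @ hs[-1] + b)
        y_hat = softmax(V @ h + c)
        loss -= np.log(y_hat[y])  # log loss at this position
        hs.append(h)
        y_hats.append(y_hat)
    return loss, hs, y_hats


def backward(xs, ys, params, hs, y_hats):
    """Backpropagation through time; gradients in the order U, W, V, b, c."""
    U, W, V, b, c = params
    dU, dW, dV = np.zeros_like(U), np.zeros_like(W), np.zeros_like(V)
    db, dc = np.zeros_like(b), np.zeros_like(c)
    dh_next = np.zeros_like(b)  # gradient flowing back from step t + 1
    for t in reversed(range(len(xs))):
        do = y_hats[t].copy()
        do[ys[t]] -= 1.0  # softmax with log loss: prediction minus one-hot target
        dV += np.outer(do, hs[t + 1])
        dc += do
        dh = V.T @ do + dh_next
        dz = dh * (1.0 - hs[t + 1] ** 2)  # tanh'(z) = 1 - tanh(z)^2
        dU += np.outer(dz, xs[t])
        dW += np.outer(dz, hs[t])  # hs[t] is the previous hidden state
        db += dz
        dh_next = W.T @ dz
    return [dU, dW, dV, db, dc]


# Small random problem (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 4, 2, 5
shapes = [(n_hidden, n_in), (n_hidden, n_hidden), (n_out, n_hidden),
          (n_hidden,), (n_out,)]
params = [rng.normal(scale=0.5, size=s) for s in shapes]
xs = [rng.normal(size=n_in) for _ in range(T)]
ys = [int(rng.integers(n_out)) for _ in range(T)]
h0 = np.zeros(n_hidden)

loss, hs, y_hats = forward(xs, ys, params, h0)
grads = backward(xs, ys, params, hs, y_hats)

# One plain gradient-descent step applied to every parameter.
learning_rate = 0.1
new_params = [p - learning_rate * g for p, g in zip(params, grads)]
print("loss before update:", loss)
print("loss after update: ", forward(xs, ys, new_params, h0)[0])
```

The hidden-layer gradients accumulate a contribution from every time step because the same `U`, `W`, and `b` are reused at each step, and `dh_next` carries the dependence of later losses on earlier hidden states.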

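V. Applications of RNNs:

(1) Many-to-many [same number of inputs and outputs]

(2) Many-to-one

(3) One-to-many:

The input is fed in only at the start of the sequence.

Or: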
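The three input/output patterns in this section can be sketched with one shared cell. This is an illustrative sketch only: the function names are made up, the weights are random, and `one_to_many` follows the "input only at the start" variant by feeding a zero vector at later steps, which is an assumption about how the remaining steps are driven.

```python
import numpy as np


def step(x, h, U, W, b):
    """One recurrent step with a tanh hidden layer."""
    return np.tanh(U @ x + W @ h + b)


def many_to_many(xs, h0, U, W, b, V, c):
    """One output per input step (same number of inputs and outputs)."""
    h, outs = h0, []
    for x in xs:
        h = step(x, h, U, W, b)
        outs.append(V @ h + c)
    return outs


def many_to_one(xs, h0, U, W, b, V, c):
    """Read the whole sequence, then emit one output from the last state."""
    h = h0
    for x in xs:
        h = step(x, h, U, W, b)
    return V @ h + c


def one_to_many(x, h0, U, W, b, V, c, n_steps):
    """Feed the input at the first step only; later steps get a zero input."""
    h, outs = h0, []
    for t in range(n_steps):
        x_t = x if t == 0 else np.zeros_like(x)
        h = step(x_t, h, U, W, b)
        outs.append(V @ h + c)
    return outs


rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 4, 2, 5
U = rng.normal(size=(n_hidden, n_in))
W = rng.normal(size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)
V = rng.normal(size=(n_out, n_hidden))
c = np.zeros(n_out)
h0 = np.zeros(n_hidden)
xs = [rng.normal(size=n_in) for _ in range(T)]

seq_outputs = many_to_many(xs, h0, U, W, b, V, c)
last_output = many_to_one(xs, h0, U, W, b, V, c)
generated = one_to_many(xs[0], h0, U, W, b, V, c, n_steps=4)
print(len(seq_outputs), last_output.shape, len(generated))  # 5 (2,) 4
```

The recurrence is identical in all three cases; the patterns differ only in which steps receive an input and which steps produce an output.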

Excerpted from: https://www.cnblogs.com/pinard/p/6509630.html

ReLU + RNN paper: Improving performance of recurrent neural network with relu nonlinearity

https://blog.csdn.net/qq_32284189/article/details/82225121