
Usage of the tensorflow.nn.bidirectional_dynamic_rnn() function

Notes on tensorflow.nn.bidirectional_dynamic_rnn(), written while reading the Attention-over-Attention source code:

First, the function signature:

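def bidirectional_dynamic_rnn(
    cell_fw,  # forward RNN cell
    cell_bw,  # backward RNN cell
    inputs,  # input tensor
    sequence_length=None,  # actual length of each input sequence (optional; defaults to the maximum sequence length)
    initial_state_fw=None,  # initial state of the forward RNN (optional)
    initial_state_bw=None,  # initial state of the backward RNN (optional)
    dtype=None,  # data type of the initial state and the outputs (required if no initial state is given)
    parallel_iterations=None,
    swap_memory=False,
    time_major=False,
    # Determines the layout of the input and output tensors: if True, they must have shape `[max_time, batch_size, depth]`.
    # If False, they must have shape `[batch_size, max_time, depth]`.
    scope=None
)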
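
To make the two layouts selected by time_major concrete, here is a minimal plain-Python sketch in which nested lists stand in for tensors. The helper name to_time_major is hypothetical and is not a TensorFlow function; it only illustrates swapping the first two axes.

```python
def to_time_major(batch_major):
    """Swap the first two axes: [batch_size, max_time, depth] -> [max_time, batch_size, depth]."""
    batch_size = len(batch_major)
    max_time = len(batch_major[0])
    return [[batch_major[b][t] for b in range(batch_size)] for t in range(max_time)]


# Toy input with batch_size=2, max_time=3, depth=1.
batch_major = [[[1], [2], [3]],
               [[4], [5], [6]]]
time_major = to_time_major(batch_major)
# time_major[t][b] is the same vector as batch_major[b][t].
```

With time_major=False (the default) the function expects the first layout, so no such rearrangement is needed.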

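Return value:

A tuple: (outputs, output_states)
 outputs = (output_fw, output_bw)
 output_states = (output_state_fw, output_state_bw)

Here,

outputs is (output_fw, output_bw), a tuple made up of the forward cell's output tensor and the backward cell's output tensor. With time_major=False, each tensor has shape [batch_size, max_time, depth], where depth is the cell's output size. In my experiments I joined the two with tf.concat(outputs, 2).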
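
As a rough illustration of what concatenating along axis 2 does to the shapes, the sketch below uses nested lists in place of tensors. The helper concat_last_axis and the toy values are made up for this example; in TensorFlow the single call tf.concat(outputs, 2) does the job.

```python
def concat_last_axis(output_fw, output_bw):
    """Join forward and backward vectors at every time step (the last axis)."""
    return [
        [fw_vec + bw_vec for fw_vec, bw_vec in zip(fw_seq, bw_seq)]
        for fw_seq, bw_seq in zip(output_fw, output_bw)
    ]


# Toy outputs with batch_size=1, max_time=2 and 2 units per direction.
output_fw = [[[0.1, 0.2], [0.3, 0.4]]]
output_bw = [[[0.5, 0.6], [0.7, 0.8]]]
merged = concat_last_axis(output_fw, output_bw)
# Each time step now holds 4 values: 2 from the forward cell, 2 from the backward cell.
```

The batch and time dimensions are unchanged; only the depth grows to the sum of the two directions.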

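output_states is (output_state_fw, output_state_bw), a tuple holding the final states of the forward and the backward RNN.

With LSTM cells, output_state_fw and output_state_bw are each an LSTMStateTuple.

An LSTMStateTuple consists of (c, h), the memory cell state and the hidden state.

c_fw, h_fw = output_state_fw

c_bw, h_bw = output_state_bw

Finally, concatenate the c states and the h states separately, and use tf.contrib.rnn.LSTMStateTuple() to build the decoder's initial state from the results.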
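
The merging step can be sketched without TensorFlow as follows. The namedtuple here is only a stand-in for tf.contrib.rnn.LSTMStateTuple (which is likewise a (c, h) named tuple), and concat_rows and merge_bidirectional_state are hypothetical helper names; states are assumed to have shape [batch_size, units].

```python
from collections import namedtuple

# Stand-in for tf.contrib.rnn.LSTMStateTuple: a named tuple with fields c and h.
LSTMStateTuple = namedtuple("LSTMStateTuple", ["c", "h"])


def concat_rows(a, b):
    """Concatenate two [batch_size, units] nested lists along the last axis."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]


def merge_bidirectional_state(state_fw, state_bw):
    """Build one (c, h) state whose width is the sum of both directions."""
    return LSTMStateTuple(
        c=concat_rows(state_fw.c, state_bw.c),
        h=concat_rows(state_fw.h, state_bw.h),
    )


# Toy final states with batch_size=1 and 2 units per direction.
output_state_fw = LSTMStateTuple(c=[[1.0, 2.0]], h=[[3.0, 4.0]])
output_state_bw = LSTMStateTuple(c=[[5.0, 6.0]], h=[[7.0, 8.0]])
decoder_initial_state = merge_bidirectional_state(output_state_fw, output_state_bw)
```

Note that a decoder cell fed with this state would need as many units as the two encoder directions combined (4 in this toy case).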

# LSTM cell for the forward direction
lstm_fw_cell = tf.nn.rnn_cell.BasicLSTMCell(embedding_size, forget_bias=1.0)
# LSTM cell for the backward direction
lstm_bw_cell = tf.nn.rnn_cell.BasicLSTMCell(embedding_size, forget_bias=1.0)

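At first glance, though, the two cells passed in look exactly the same. They are two separate instances with the same configuration, and the direction is handled inside bidirectional_dynamic_rnn: for the backward cell, the function reverses the input sequence with array_ops.reverse_sequence, runs the cell over the reversed input, and reverses its output again so that it lines up with the forward output.
So in practice all we need to do is pass in the two cells as arguments: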

(outputs, output_states) = tf.nn.bidirectional_dynamic_rnn(lstm_fw_cell, lstm_bw_cell, 
                                                           embedded_chars,  dtype=tf.float32)
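
The length-aware reversal applied internally for the backward direction can be sketched in plain Python as below. This is only an illustration of the behaviour of reverse_sequence when sequence lengths are given, not TensorFlow's implementation; the toy batch and its padding value 0 are assumptions.

```python
def reverse_sequence(batch, seq_lengths):
    """Reverse the first seq_lengths[i] steps of each sequence; leave the padding in place."""
    return [seq[:n][::-1] + seq[n:] for seq, n in zip(batch, seq_lengths)]


# Two sequences padded with 0 to max_time=4; their real lengths are 3 and 2.
batch = [[1, 2, 3, 0],
         [4, 5, 0, 0]]
reversed_batch = reverse_sequence(batch, [3, 2])
# reversed_batch == [[3, 2, 1, 0], [5, 4, 0, 0]]
restored = reverse_sequence(reversed_batch, [3, 2])
```

Applying the same reversal a second time restores the original order, which is why the backward outputs can be realigned with the forward ones time step by time step.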

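embedded_chars is the input tensor, with shape [batch_size, max_time, depth]. batch_size is the batch size used by the model. For text, max_time is the sentence length (usually that of the longest sentence, with shorter sentences padded), and depth is the dimension of the word vectors.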
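
A minimal sketch of the padding step, assuming sentences are lists of token ids and that id 0 is reserved for padding (both are assumptions of this example, as is the helper name pad_batch):

```python
def pad_batch(sentences, pad_id=0):
    """Right-pad token-id lists to the length of the longest sentence."""
    max_time = max(len(s) for s in sentences)
    padded = [s + [pad_id] * (max_time - len(s)) for s in sentences]
    lengths = [len(s) for s in sentences]
    return padded, lengths


sentences = [[7, 8, 9], [4, 5]]
padded, lengths = pad_batch(sentences)
# padded has shape [batch_size, max_time]; an embedding lookup would add the depth axis.
```

The lengths list is the kind of value that can be passed as sequence_length so that the padded positions are not treated as real input.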

A bidirectional GRU works much like the LSTM case:

  with tf.variable_scope('document', initializer=orthogonal_initializer()):  # initializer that generates orthogonal matrices
    fwd_cell = tf.contrib.rnn.GRUCell(FLAGS.hidden_size)  # forward GRU cell
    back_cell = tf.contrib.rnn.GRUCell(FLAGS.hidden_size)  # backward GRU cell

    doc_len = tf.reduce_sum(doc_mask, reduction_indices=1)  # sum the mask along axis 1 to get each document's actual length
    h, _ = tf.nn.bidirectional_dynamic_rnn(
        fwd_cell, back_cell, doc_emb, sequence_length=tf.to_int64(doc_len), dtype=tf.float32)
        # doc_len may come out as another type, so it is cast to a 64-bit integer
        # doc_emb has already been given the shape [batch_size, max_time, depth]
        # dtype is the data type of the outputs

    h_doc = tf.concat(h, 2)
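
To illustrate the role of doc_len, the sketch below derives lengths from a 0/1 mask and then zeroes the outputs past each sequence's real length, which is what dynamic RNNs do when sequence_length is supplied. The helper names and toy values are hypothetical, and nested lists stand in for tensors.

```python
def lengths_from_mask(mask):
    """Sum each row of a 0/1 mask to get the real length of every sequence."""
    return [sum(row) for row in mask]


def zero_past_length(outputs, lengths):
    """Zero every output vector located at or beyond its sequence's real length."""
    return [
        [vec if t < n else [0.0] * len(vec) for t, vec in enumerate(seq)]
        for seq, n in zip(outputs, lengths)
    ]


# Toy mask with batch_size=2, max_time=3; the second document has one padded step.
doc_mask = [[1, 1, 1], [1, 1, 0]]
doc_len = lengths_from_mask(doc_mask)
outputs = [[[0.1], [0.2], [0.3]], [[0.4], [0.5], [0.6]]]
masked = zero_past_length(outputs, doc_len)
```

Without the lengths, the padded time steps would be processed like real tokens and would influence the backward direction in particular, since it starts from the end of the sequence.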

See also:

The right way to use variable-length bidirectional RNNs: https://blog.csdn.net/lijin6249/article/details/78955175

Usage of the tensorflow.nn.bidirectional_dynamic_rnn() function: https://blog.csdn.net/wuzqChom/article/details/75453327


Original post: https://www.cnblogs.com/gaofighting/p/9673338.html