Usage of RNN, RNNCell, LSTM, LSTMCell, GRU, and GRUCell in PyTorch

First, of course, everything is covered in the official documentation:

RNN: https://pytorch.org/docs/stable/generated/torch.nn.RNN.html

RNNCell: https://pytorch.org/docs/stable/generated/torch.nn.RNNCell.html

LSTM: https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html

LSTMCell: https://pytorch.org/docs/stable/generated/torch.nn.LSTMCell.html

GRU: https://pytorch.org/docs/stable/generated/torch.nn.GRU.html

GRUCell: https://pytorch.org/docs/stable/generated/torch.nn.GRUCell.html

What follows is just my own set of notes.

LSTM and LSTMCell serve as the examples.

LSTM structure

LSTM: definition and the dimensions of its inputs, outputs, and weights

LSTM parameters:
- input_size: number of features in the input x
- hidden_size: number of features in the hidden state h
- num_layers: number of stacked layers; default 1
- batch_first: if True, input and output are (batch, seq, feature), otherwise (seq, batch, feature); default False. It does not affect the shape of h_0 and c_0
- bidirectional: default False
Input:
- input: tensor of shape (L, N, H_in) when batch_first=False, otherwise (N, L, H_in)
- h_0: tensor of shape (D*num_layers, N, H_out); defaults to zeros if (h_0, c_0) is not provided
- c_0: tensor of shape (D*num_layers, N, H_cell); defaults to zeros if (h_0, c_0) is not provided
where:

N = batch size

L = sequence length

D = 2 if bidirectional=True otherwise 1

H_in = input_size

H_cell = hidden_size

H_out = proj_size if proj_size>0 otherwise hidden_size; in practice it is usually just hidden_size
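
To make the difference between H_out and H_cell concrete, here is a minimal sketch that sets proj_size and prints the resulting shapes. The sizes (5, 3, 10, 20, 8) are arbitrary values picked for illustration, and it assumes a PyTorch version in which nn.LSTM accepts proj_size.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumptions, not from the notes above)
L, N, H_in, hidden_size, proj_size = 5, 3, 10, 20, 8

# With proj_size > 0 the hidden state is projected down to proj_size,
# while the cell state keeps hidden_size features.
lstm = nn.LSTM(H_in, hidden_size, num_layers=1, proj_size=proj_size)

x = torch.randn(L, N, H_in)      # (L, N, H_in) because batch_first=False
output, (h_n, c_n) = lstm(x)     # h_0 and c_0 default to zeros

print(output.shape)  # (L, N, D*H_out)           = (5, 3, 8)
print(h_n.shape)     # (D*num_layers, N, H_out)  = (1, 3, 8)
print(c_n.shape)     # (D*num_layers, N, H_cell) = (1, 3, 20)
```

Without proj_size, all three last dimensions would simply be hidden_size.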

Output:
- output: (L, N, D*H_out) when batch_first=False; a sequence of length L, [h_1[-1], h_2[-1], ..., h_L[-1]], i.e. the last layer's hidden state at every time step
- h_n: tensor of shape (D*num_layers, N, H_out)
- c_n: tensor of shape (D*num_layers, N, H_cell)
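
The shapes above can be checked with a small sketch. It uses a 2-layer bidirectional LSTM (so D = 2) with arbitrary sizes chosen for illustration, and also shows how output relates to h_n: the forward direction finishes at the last time step, the backward direction at the first one.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumptions)
L, N, H_in, H_out, num_layers = 5, 3, 10, 20, 2

lstm = nn.LSTM(H_in, H_out, num_layers=num_layers, bidirectional=True)
x = torch.randn(L, N, H_in)
output, (h_n, c_n) = lstm(x)

print(output.shape)  # (L, N, D*H_out)          = (5, 3, 40)
print(h_n.shape)     # (D*num_layers, N, H_out) = (4, 3, 20)
print(c_n.shape)     # (D*num_layers, N, H_out) = (4, 3, 20)

# h_n is ordered layer by layer, forward direction before backward,
# so the last two entries belong to the last layer.
h_last_fwd = h_n[-2]
h_last_bwd = h_n[-1]

# The forward half of output ends at t = L-1, the backward half ends at t = 0.
fwd_match = torch.allclose(output[-1, :, :H_out], h_last_fwd)
bwd_match = torch.allclose(output[0, :, H_out:], h_last_bwd)
print(fwd_match, bwd_match)
```

For a unidirectional LSTM this reduces to output[-1] being the same as h_n[-1].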
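Variables:

This seems to have changed in newer versions; the current documentation lists the per-layer weight_ih_l[k], weight_hh_l[k], bias_ih_l[k], and bias_hh_l[k].
- all_weights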
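
Since the documented variables differ between versions, one way to see what an nn.LSTM actually holds is to list its named_parameters(). This is only a sketch with illustrative sizes (10, 20, 2); the names printed are whatever the installed PyTorch exposes.

```python
import torch.nn as nn

# Illustrative sizes only (assumptions)
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)

# Collect every learnable parameter with its shape
shapes = {name: tuple(p.shape) for name, p in lstm.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)

# Layer 0 maps input_size -> 4*hidden_size,
# layer 1 takes the hidden state of layer 0 as its input.
print(shapes["weight_ih_l0"])  # (4*hidden_size, input_size)  = (80, 10)
print(shapes["weight_ih_l1"])  # (4*hidden_size, hidden_size) = (80, 20)
```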
Examples:
>>> import torch
>>> import torch.nn as nn
>>> rnn = nn.LSTM(10, 20, 2)            # (input_size, hidden_size, num_layers)
>>> input = torch.randn(5, 3, 10)       # (time_steps, batch, input_size)
>>> h0 = torch.randn(2, 3, 20)          # (num_layers, batch, hidden_size)
>>> c0 = torch.randn(2, 3, 20)
>>> output, (hn, cn) = rnn(input, (h0, c0))  # output: (time_steps, batch, hidden_size)
>>> # output[-1] equals hn[-1]: the last layer's hidden state at the final time step
LSTM Cell

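An LSTMCell is a single unit of an LSTM: it computes one time step, and an LSTM is many LSTMCell steps chained together.

Structure

Compared with LSTM, the time-step index t is gone.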

Parameters:
- Only input_size and hidden_size; there is no num_layers
Inputs:
- input: (batch, input_size)
- h_0: (batch, hidden_size)
- c_0: (batch, hidden_size)
Outputs:
- h_1: (batch, hidden_size)
- c_1: (batch, hidden_size)
Variables:
- weight_ih: input-hidden weights, of shape (4*hidden_size, input_size); the weight left-multiplies the input (W*input), and the four gate matrices are stacked, hence 4*hidden_size
- weight_hh: hidden-hidden weights, of shape (4*hidden_size, hidden_size)
- bias_ih: input-hidden bias, of shape (4*hidden_size)
- bias_hh: hidden-hidden bias, of shape (4*hidden_size)
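
To show where the factor 4 comes from, the following sketch recomputes one LSTMCell step by hand from the four variables above and compares it with the module's own result. The sizes and the fixed seed are arbitrary; the gate order (input, forget, cell, output) follows the layout described in the PyTorch documentation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
input_size, hidden_size, batch = 10, 20, 3  # illustrative sizes (assumptions)
cell = nn.LSTMCell(input_size, hidden_size)

x = torch.randn(batch, input_size)
h0 = torch.randn(batch, hidden_size)
c0 = torch.randn(batch, hidden_size)

with torch.no_grad():
    h1, c1 = cell(x, (h0, c0))

    # Inputs are stored as rows here, so W * input becomes x @ W.T
    gates = x @ cell.weight_ih.T + cell.bias_ih + h0 @ cell.weight_hh.T + cell.bias_hh
    # Split the 4*hidden_size columns into the four gates
    i, f, g, o = gates.chunk(4, dim=1)
    c1_manual = torch.sigmoid(f) * c0 + torch.sigmoid(i) * torch.tanh(g)
    h1_manual = torch.sigmoid(o) * torch.tanh(c1_manual)

h_ok = torch.allclose(h1, h1_manual, atol=1e-5)
c_ok = torch.allclose(c1, c1_manual, atol=1e-5)
print(h_ok, c_ok)
```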
Example:
>>> import torch
>>> import torch.nn as nn
>>> rnn = nn.LSTMCell(10, 20)           # (input_size, hidden_size)
>>> input = torch.randn(2, 3, 10)       # (time_steps, batch, input_size)
>>> hx = torch.randn(3, 20)             # (batch, hidden_size)
>>> cx = torch.randn(3, 20)
>>> output = []
>>> for i in range(2):
...     hx, cx = rnn(input[i], (hx, cx))
...     output.append(hx)
...
>>> output = torch.stack(output, dim=0) # (time_steps, batch, hidden_size)
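
As a rough check that an LSTM really is an LSTMCell applied step by step, the sketch below copies the weights of a single-layer, unidirectional nn.LSTM into an nn.LSTMCell and runs both on the same input. The sizes and the seed are arbitrary, and the comparison uses a small tolerance because the two code paths may differ by floating-point rounding.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
L, N, input_size, hidden_size = 4, 3, 10, 20  # illustrative sizes (assumptions)

lstm = nn.LSTM(input_size, hidden_size, num_layers=1)
cell = nn.LSTMCell(input_size, hidden_size)

# Copy the layer-0 weights of the LSTM into the cell so both compute the same function
with torch.no_grad():
    cell.weight_ih.copy_(lstm.weight_ih_l0)
    cell.weight_hh.copy_(lstm.weight_hh_l0)
    cell.bias_ih.copy_(lstm.bias_ih_l0)
    cell.bias_hh.copy_(lstm.bias_hh_l0)

x = torch.randn(L, N, input_size)
h = torch.zeros(N, hidden_size)  # same zero initial state the LSTM uses by default
c = torch.zeros(N, hidden_size)

with torch.no_grad():
    out_lstm, (h_n, c_n) = lstm(x)
    steps = []
    for t in range(L):
        h, c = cell(x[t], (h, c))  # one time step per call
        steps.append(h)
    out_cell = torch.stack(steps, dim=0)

same_output = torch.allclose(out_lstm, out_cell, atol=1e-5)
same_final = torch.allclose(h_n[0], h, atol=1e-5) and torch.allclose(c_n[0], c, atol=1e-5)
print(same_output, same_final)
```

The explicit loop is mainly useful when something has to happen between time steps; otherwise nn.LSTM handles the whole sequence in one call.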

Original post: https://www.cnblogs.com/lfri/p/15044391.html
