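Below are some of the classes and functions I have come across while using PyTorch. To make them easier to understand and use, I have excerpted parts of the official documentation.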
torch.nn.modules.conv.Conv1d
Source: https://pytorch.org/docs/stable/_modules/torch/nn/modules/conv.html#Conv1d
class Conv1d(_ConvNd):
    r"""Applies a 1D convolution over an input signal composed of several input planes."""
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1,
                 bias=True, padding_mode='zeros'):
        kernel_size = _single(kernel_size)
        stride = _single(stride)
        padding = _single(padding)
        dilation = _single(dilation)
        super(Conv1d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _single(0), groups, bias, padding_mode)
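Class description: applies a 1D convolution over an input signal composed of several input planes.
The official documentation describes the constructor parameters as follows: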
in_channels (int): Number of channels in the input image
out_channels (int): Number of channels produced by the convolution
kernel_size (int or tuple): Size of the convolving kernel
stride (int or tuple, optional): Stride of the convolution. Default: 1
padding (int or tuple, optional): Zero-padding added to both sides of the input. Default: 0
dilation (int or tuple, optional): Spacing between kernel elements. Default: 1
groups (int, optional): Number of blocked connections from input channels to output channels. Default: 1
bias (bool, optional): If True, adds a learnable bias to the output. Default: True
padding_mode (string, optional): Accepted values 'zeros' and 'circular'. Default: 'zeros'
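In plain terms (originally a machine translation, lightly corrected):
in_channels (int): number of channels in the input
out_channels (int): number of channels produced by the convolution
kernel_size (int or tuple): size of the convolution kernel
stride (int or tuple, optional): step by which the kernel moves across the input
padding (int or tuple, optional): amount of zero-padding added to both sides of the input
dilation (int or tuple, optional): spacing between kernel elements
groups (int, optional): number of groups the input and output channels are split into; channels are only connected within their own group
bias (bool, optional): if True, a learnable bias is added to the output
padding_mode (string, optional): accepts the literal strings 'zeros' and 'circular'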
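To make the parameters concrete, here is a minimal sketch of a Conv1d layer applied to a random tensor. The sizes (16 input channels, 33 output channels, length 50, batch of 20) are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

# 16 input channels, 33 output channels, kernel of length 3, stride 2
m = nn.Conv1d(16, 33, 3, stride=2)

# Conv1d expects input of shape (batch, in_channels, length)
x = torch.randn(20, 16, 50)
y = m(x)

print(m.kernel_size)   # (3,)  the int was expanded to a 1-tuple
print(m.weight.shape)  # torch.Size([33, 16, 3])
print(y.shape)         # torch.Size([20, 33, 24])
```

The output length is 24 because (50 - 3) // 2 + 1 = 24 when there is no padding and no dilation.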
torch.nn.modules.conv.Conv2d
Source: https://pytorch.org/docs/stable/_modules/torch/nn/modules/conv.html#Conv2d
class Conv2d(_ConvNd):
    """Applies a 2D convolution over an input signal composed of several input planes."""
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1,
                 bias=True, padding_mode='zeros'):
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        padding = _pair(padding)
        dilation = _pair(dilation)
        super(Conv2d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _pair(0), groups, bias, padding_mode)
Class description: applies a 2D convolution over an input signal composed of several input planes.
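The _pair calls in the constructor mean that an int argument is expanded to a 2-tuple, so the same value is used for height and width. A small sketch that inspects the stored attributes; the channel counts and sizes are arbitrary illustration values.

```python
import torch.nn as nn

# Ints are expanded to (value, value)
square = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)
print(square.kernel_size, square.stride, square.padding, square.dilation)
# (3, 3) (2, 2) (1, 1) (1, 1)

# Tuples are kept as (height, width)
rect = nn.Conv2d(3, 8, kernel_size=(3, 5), stride=(2, 1), padding=(1, 2))
print(rect.kernel_size, rect.stride, rect.padding)
# (3, 5) (2, 1) (1, 2)
```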
The official documentation describes the constructor parameters as follows:
in_channels (int): Number of channels in the input image
out_channels (int): Number of channels produced by the convolution
kernel_size (int or tuple): Size of the convolving kernel
stride (int or tuple, optional): Stride of the convolution. Default: 1
padding (int or tuple, optional): Zero-padding added to both sides of the input. Default: 0
dilation (int or tuple, optional): Spacing between kernel elements. Default: 1
groups (int, optional): Number of blocked connections from input channels to output channels. Default: 1
bias (bool, optional): If True, adds a learnable bias to the output. Default: True
padding_mode (string, optional): Accepted values 'zeros' and 'circular'. Default: 'zeros'
The parameters have the same meaning as for Conv1d above; the only difference is that tuple arguments take two values, one for height and one for width.
Below is a further explanation of what the parameters mean, taken from a blog post.
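- stride: controls the stride of the cross-correlation. It can be a single int or a tuple of two ints (int, int).
- padding: controls how much padding is added. With padding=0 nothing is added and the kernel is applied to the original image; with padding=1 one row (or column) is added on each of the four sides. The values that get filled in are determined by padding_mode and are usually zeros.
- dilation: controls the spacing between kernel points, also known as the "à trous" algorithm. An illustration is available on GitHub under "Dilated convolution animations".
- groups (channel groups): controls how input channels are connected to output channels. Normally there is a single group, but groups can be set to any value from 1 to in_channels, as long as both in_channels and out_channels are divisible by it: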
At groups=1, all inputs are convolved to all outputs.
At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.
At groups=in_channels, each input channel is convolved with its own set of filters (of size ⌊out_channels / in_channels⌋).
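The groups=2 case can be checked directly. The sketch below builds one grouped convolution and two ordinary convolutions that each see half of the input channels, copies the grouped weights into them, and compares the results. The channel counts (4 in, 6 out) and the 8x8 input are arbitrary illustration values, and bias is turned off only to keep the comparison short.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1, 4, 8, 8)

grouped = nn.Conv2d(4, 6, kernel_size=3, groups=2, bias=False)
# Weight shape is (out_channels, in_channels // groups, kH, kW)
print(grouped.weight.shape)  # torch.Size([6, 2, 3, 3])

# Two independent layers, each seeing half the input channels
conv_a = nn.Conv2d(2, 3, kernel_size=3, bias=False)
conv_b = nn.Conv2d(2, 3, kernel_size=3, bias=False)
with torch.no_grad():
    conv_a.weight.copy_(grouped.weight[:3])  # filters of the first group
    conv_b.weight.copy_(grouped.weight[3:])  # filters of the second group

# Run them side by side and concatenate along the channel dimension
side_by_side = torch.cat([conv_a(x[:, :2]), conv_b(x[:, 2:])], dim=1)
same = torch.allclose(grouped(x), side_by_side, atol=1e-5)
print(same)
```

Because each filter only spans in_channels // groups channels, a grouped layer has fewer weights than an ungrouped layer with the same channel counts.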
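Below is the official documentation's description of the input and output shapes for nn.Conv2d.
Input: (N, C_in, H_in, W_in)
Output: (N, C_out, H_out, W_out), where
H_out = ⌊(H_in + 2 × padding[0] − dilation[0] × (kernel_size[0] − 1) − 1) / stride[0] + 1⌋
W_out = ⌊(W_in + 2 × padding[1] − dilation[1] × (kernel_size[1] − 1) − 1) / stride[1] + 1⌋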
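The output-size rule from the nn.Conv2d documentation can be tried out without PyTorch. This is a minimal sketch in plain Python; conv_output_size is a hypothetical helper name, not part of PyTorch, and it handles one spatial dimension at a time.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (hypothetical helper)."""
    # floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

print(conv_output_size(50, 3, stride=2))                         # 24
print(conv_output_size(50, 3, stride=2, padding=4, dilation=3))  # 26
print(conv_output_size(100, 5, stride=1, padding=2))             # 100
print(conv_output_size(28, 3, padding=1))                        # 28
```

The last call shows a common choice: a 3x3 kernel with padding=1 and stride=1 leaves the spatial size unchanged.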
And here is the example given in the official documentation:
>>> import torch
>>> import torch.nn as nn
>>> # With square kernels and equal stride
>>> m = nn.Conv2d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
>>> m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
>>> # non-square kernels and unequal stride and with padding and dilation
>>> m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
>>> input = torch.randn(20, 16, 50, 100)
>>> output = m(input)
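In the example above only the last layer is actually applied to the input. As a rough check, the following sketch runs all three layers on the same tensor and prints the resulting shapes; the dictionary keys are labels made up for this illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(20, 16, 50, 100)  # (batch, channels, height, width)

layers = {
    "square kernel, stride 2": nn.Conv2d(16, 33, 3, stride=2),
    "with padding": nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2)),
    "with padding and dilation": nn.Conv2d(
        16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1)
    ),
}

# Apply each layer to the same input and record the output shape
shapes = {name: tuple(layer(x).shape) for name, layer in layers.items()}
for name, shape in shapes.items():
    print(name, shape)
# square kernel, stride 2 (20, 33, 24, 49)
# with padding (20, 33, 28, 100)
# with padding and dilation (20, 33, 26, 100)
```

The batch size (20) is untouched, the channel dimension becomes out_channels (33), and only the height and width depend on kernel_size, stride, padding and dilation.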