PyTorch: initialization functions implemented in torch.nn.init

Reference: the official PyTorch documentation (torch.nn.init)

1. Uniform distribution

torch.nn.init.uniform_(tensor, a=0.0, b=1.0)

Description:

Fills the input Tensor with values drawn from the uniform distribution $\mathcal{U}(a, b)$.

Parameters:
- tensor – an n-dimensional torch.Tensor
- a – the lower bound of the uniform distribution
- b – the upper bound of the uniform distribution

Example:

    import torch

    w = torch.empty(2, 2)  # uninitialized memory, so the "before" values are arbitrary
    print('before init w = ', w)
    torch.nn.init.uniform_(w, a=0.0, b=1.0)  # fills w in place
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[1.4013e-45, 0.0000e+00],
        [0.0000e+00, 0.0000e+00]])
after init w = 
 tensor([[0.8658, 0.3711],
        [0.8950, 0.1419]])
```
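A 2×2 tensor is too small to see the distribution, so the following is a minimal sketch that checks the range and mean empirically on a larger tensor. The bounds `a = -0.5`, `b = 0.5`, the tensor size, and the seed are arbitrary choices for illustration, not values from the original post.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

a, b = -0.5, 0.5  # illustrative bounds (assumed, not from the post)
w = torch.empty(1000, 1000)
torch.nn.init.uniform_(w, a=a, b=b)

# Every sample should lie between a and b, and the sample mean
# should be close to the midpoint (a + b) / 2.
w_min = w.min().item()
w_max = w.max().item()
w_mean = w.mean().item()
print(f"min={w_min:.4f}, max={w_max:.4f}, mean={w_mean:.4f}")
```

With a million samples the mean should land very close to 0, and the extremes should sit just inside the two bounds.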
2. Normal (Gaussian) distribution

torch.nn.init.normal_(tensor, mean=0.0, std=1.0)

Description:

Fills the input Tensor with values drawn from the normal distribution $\mathcal{N}(\text{mean}, \text{std}^{2})$.

Parameters:
- tensor – an n-dimensional torch.Tensor
- mean – the mean of the normal distribution
- std – the standard deviation of the normal distribution

Example:

    import torch

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.normal_(w, mean=10, std=0.01)  # values cluster tightly around 10
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[2.3877e-38, 1.0010e+01],
        [2.2421e-44, 0.0000e+00]])
after init w = 
 tensor([[10.0128, 10.0086],
        [10.0064,  9.9983]])
```
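Note that the second argument is the standard deviation, not the variance. As a rough check, the sketch below reuses `mean=10` and `std=0.01` from the example above on a larger tensor and compares the sample statistics; the tensor size and seed are assumptions made for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

w = torch.empty(1000, 1000)
torch.nn.init.normal_(w, mean=10.0, std=0.01)

# Sample statistics should be close to the requested mean and std.
sample_mean = w.mean().item()
sample_std = w.std().item()
print(f"mean={sample_mean:.4f}, std={sample_std:.4f}")
```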
3. Constant initialization

torch.nn.init.constant_(tensor, val)

Description:

Fills the input Tensor with the value $\text{val}$.

Parameters:
- tensor – an n-dimensional torch.Tensor
- val – the value to fill the tensor with

Example:

    import torch

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.constant_(w, 18)  # every element becomes 18
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[1.4013e-45, 0.0000e+00],
        [0.0000e+00, 0.0000e+00]])
after init w = 
 tensor([[18., 18.],
        [18., 18.]])
```
4. All-ones initialization

torch.nn.init.ones_(tensor)

Description:

Fills the input Tensor with the scalar value 1.

Parameters:
- tensor – an n-dimensional torch.Tensor

Example:

    import torch

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.ones_(w)
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[9.1477e-41, 0.0000e+00],
        [8.4078e-44, 0.0000e+00]])
after init w = 
 tensor([[1., 1.],
        [1., 1.]])
```
5. All-zeros initialization

torch.nn.init.zeros_(tensor)

Description:

Fills the input Tensor with the scalar value 0.

Parameters:
- tensor – an n-dimensional torch.Tensor

Example:

    import torch

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.zeros_(w)
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[9.1477e-41, 0.0000e+00],
        [4.4842e-44, 0.0000e+00]])
after init w = 
 tensor([[0., 0.],
        [0., 0.]])
```
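The constant fills in sections 3 to 5 can also be applied directly to a layer's parameters instead of a bare tensor. The sketch below is one possible usage, assuming a small `nn.Linear(4, 3)` layer and a weight value of 0.5 chosen only so that the result is easy to verify by hand.

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 3)  # weight shape (3, 4), bias shape (3,)

# The init functions modify the parameters in place.
nn.init.constant_(layer.weight, 0.5)
nn.init.zeros_(layer.bias)

# With an all-ones input, each output is 4 * 0.5 + 0 = 2.0.
out = layer(torch.ones(1, 4)).detach()
print(out)
```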
6. Identity matrix initialization

torch.nn.init.eye_(tensor)

Description:

Fills the 2-dimensional input Tensor with the identity matrix. Preserves the identity of the inputs in Linear layers, where as many inputs are preserved as possible.

Parameters:
- tensor – a 2-dimensional torch.Tensor

Example:

    import torch

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.eye_(w)  # ones on the diagonal, zeros elsewhere
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[1., 1.],
        [1., 1.]])
after init w = 
 tensor([[1., 0.],
        [0., 1.]])
```
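The phrase "as many inputs are preserved as possible" is easier to see with a non-square tensor. The following sketch assumes a 3×5 weight (3 outputs, 5 inputs, the layout used by Linear layers) and an arbitrary input vector; only the first three inputs can pass through.

```python
import torch

w = torch.empty(3, 5)  # (out_features, in_features), sizes chosen for illustration
torch.nn.init.eye_(w)  # ones on the main diagonal, zeros elsewhere

x = torch.arange(1.0, 6.0)  # input vector [1, 2, 3, 4, 5]
y = w @ x                   # the first three inputs pass through unchanged
print(w)
print(y)
```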
7. Xavier uniform distribution

torch.nn.init.xavier_uniform_(tensor, gain=1.0)

Description:

Fills the input Tensor with values according to the method described in "Understanding the difficulty of training deep feedforward neural networks" (Glorot, X. & Bengio, Y., 2010), using a uniform distribution. The resulting tensor will have values sampled from $\mathcal{U}(-a, a)$ where

$a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}$

Parameters:
- tensor – an n-dimensional torch.Tensor
- gain – an optional scaling factor

Example:

    import torch
    import torch.nn as nn

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[1.4013e-45, 0.0000e+00],
        [0.0000e+00, 0.0000e+00]])
after init w = 
 tensor([[ 0.6120, -0.9743],
        [-1.5010,  0.5827]])
```
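For the 2×2 example above, fan_in = fan_out = 2 and the ReLU gain is $\sqrt{2}$, so the bound is $\sqrt{2} \times \sqrt{6/4} \approx 1.732$, which is consistent with the printed values. The sketch below repeats that check on a larger weight; the 64×128 shape and the seed are assumptions for illustration.

```python
import math
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

fan_out, fan_in = 64, 128  # for a 2-D weight, shape is (fan_out, fan_in)
w = torch.empty(fan_out, fan_in)
gain = torch.nn.init.calculate_gain('relu')
torch.nn.init.xavier_uniform_(w, gain=gain)

# Bound of the uniform distribution: gain * sqrt(6 / (fan_in + fan_out))
bound = gain * math.sqrt(6.0 / (fan_in + fan_out))
max_abs = w.abs().max().item()
print(f"bound={bound:.4f}, max |w|={max_abs:.4f}")
```

The largest absolute value should come close to the bound without exceeding it.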
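Example:

    import torch.nn as nn

    gain = nn.init.calculate_gain('relu')  # recommended gain for ReLU
    print(gain)

Output:
```
1.4142135623730951
```
Example:

    gain = nn.init.calculate_gain('sigmoid')  # recommended gain for sigmoid
    print(gain)

Output: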
```
1
```
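`calculate_gain` also accepts other nonlinearity names, and for `'leaky_relu'` an optional second argument giving the negative slope. The sketch below collects a few of these values; the selection of nonlinearities and the slope of 0.2 are arbitrary choices for illustration.

```python
import math
import torch.nn as nn

gains = {
    'linear': nn.init.calculate_gain('linear'),
    'tanh': nn.init.calculate_gain('tanh'),
    'relu': nn.init.calculate_gain('relu'),
    # For leaky ReLU the gain depends on the negative slope:
    # sqrt(2 / (1 + slope**2))
    'leaky_relu(0.2)': nn.init.calculate_gain('leaky_relu', 0.2),
}

for name, value in gains.items():
    print(f"{name}: {value:.4f}")
```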
8. Xavier normal distribution

torch.nn.init.xavier_normal_(tensor, gain=1.0)

Description:

Fills the input Tensor with values according to the method described in "Understanding the difficulty of training deep feedforward neural networks" (Glorot, X. & Bengio, Y., 2010), using a normal distribution. The resulting tensor will have values sampled from $\mathcal{N}(0, \text{std}^{2})$ where

$\text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}$

Parameters:
- tensor – an n-dimensional torch.Tensor
- gain – an optional scaling factor

Example:

    import torch
    import torch.nn as nn

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.xavier_normal_(w, gain=nn.init.calculate_gain('relu'))
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[0., 0.],
        [0., 0.]])
after init w = 
 tensor([[ 0.9703,  1.0088],
        [ 1.1271, -0.0602]])
```
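For tensors with more than two dimensions, such as convolution weights, fan_in and fan_out also include the kernel size. The sketch below assumes a weight of shape (64, 32, 3, 3), computes the two fans by hand, and compares the sample standard deviation with the formula above; the shape, the default gain of 1.0, and the seed are illustrative assumptions.

```python
import math
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

out_channels, in_channels, kh, kw = 64, 32, 3, 3
w = torch.empty(out_channels, in_channels, kh, kw)
torch.nn.init.xavier_normal_(w, gain=1.0)

# Each fan is the channel count times the receptive field size (kh * kw).
fan_in = in_channels * kh * kw    # 32 * 9 = 288
fan_out = out_channels * kh * kw  # 64 * 9 = 576

expected_std = 1.0 * math.sqrt(2.0 / (fan_in + fan_out))
sample_std = w.std().item()
print(f"expected std={expected_std:.4f}, sample std={sample_std:.4f}")
```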
9. He (Kaiming) uniform distribution

torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

Description:

Fills the input Tensor with values according to the method described in "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification" (He, K. et al., 2015), using a uniform distribution. The resulting tensor will have values sampled from $\mathcal{U}(-\text{bound}, \text{bound})$ where

$\text{bound} = \text{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}$

Parameters:
- tensor – an n-dimensional torch.Tensor
- a – the negative slope of the rectifier used after this layer (only used with 'leaky_relu')
- mode – either 'fan_in' (default) or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backward pass.
- nonlinearity – the non-linear function (nn.functional name), recommended to use only with 'relu' or 'leaky_relu' (default).

Example:

    import torch

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[-3.6893e+19,  1.3658e+00],
        [ 2.2421e-44,  0.0000e+00]])
after init w = 
 tensor([[-0.8456,  1.3498],
        [-0.8480, -1.1506]])
```
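The `mode` argument decides which fan enters the bound. As a rough illustration, the sketch below initializes two 64×128 weights, one per mode, and compares the largest absolute value with the bound from the formula above. The shape, the seed, and the helper `kaiming_bound` are assumptions introduced here, not part of torch.nn.init.

```python
import math
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

gain = math.sqrt(2.0)  # gain for nonlinearity='relu'

def kaiming_bound(fan):
    """Hypothetical helper: bound = gain * sqrt(3 / fan_mode)."""
    return gain * math.sqrt(3.0 / fan)

# Weight shape is (fan_out, fan_in) = (64, 128).
w_in = torch.empty(64, 128)
torch.nn.init.kaiming_uniform_(w_in, mode='fan_in', nonlinearity='relu')
w_out = torch.empty(64, 128)
torch.nn.init.kaiming_uniform_(w_out, mode='fan_out', nonlinearity='relu')

bound_in, bound_out = kaiming_bound(128), kaiming_bound(64)
max_in = w_in.abs().max().item()
max_out = w_out.abs().max().item()
print(f"fan_in:  bound={bound_in:.4f}, max |w|={max_in:.4f}")
print(f"fan_out: bound={bound_out:.4f}, max |w|={max_out:.4f}")
```

Because fan_out is smaller than fan_in for this shape, the 'fan_out' bound is larger by a factor of $\sqrt{2}$.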
10. He (Kaiming) normal distribution

torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

Description:

Fills the input Tensor with values according to the method described in "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification" (He, K. et al., 2015), using a normal distribution. The resulting tensor will have values sampled from $\mathcal{N}(0, \text{std}^{2})$ where

$\text{std} = \frac{\text{gain}}{\sqrt{\text{fan\_mode}}}$

Parameters:
- tensor – an n-dimensional torch.Tensor
- a – the negative slope of the rectifier used after this layer (only used with 'leaky_relu')
- mode – either 'fan_in' (default) or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backward pass.
- nonlinearity – the non-linear function (nn.functional name), recommended to use only with 'relu' or 'leaky_relu' (default).

Example:

    import torch

    w = torch.empty(2, 2)
    print('before init w = ', w)
    torch.nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu')
    print('after init w = ', w)

Output:
```
before init w = 
 tensor([[-0.8456,  1.3498],
        [-0.8480, -1.1506]])
after init w = 
 tensor([[-1.0357, -1.1732],
        [ 0.1517,  0.4935]])
```
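The parameter `a` only matters when `nonlinearity='leaky_relu'`, where it changes the gain to $\sqrt{2 / (1 + a^{2})}$. The sketch below assumes a slope of `a = 0.2` and a 256×512 weight, both chosen for illustration, and compares the sample standard deviation with the formula above.

```python
import math
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

a = 0.2                    # negative slope of the leaky ReLU (assumed value)
fan_out, fan_in = 256, 512
w = torch.empty(fan_out, fan_in)
torch.nn.init.kaiming_normal_(w, a=a, mode='fan_in', nonlinearity='leaky_relu')

gain = math.sqrt(2.0 / (1.0 + a ** 2))     # gain for leaky ReLU with slope a
expected_std = gain / math.sqrt(fan_in)    # std = gain / sqrt(fan_mode)
sample_std = w.std().item()
print(f"expected std={expected_std:.4f}, sample std={sample_std:.4f}")
```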
Author: 希望每天涨粉. When reposting, please cite the original link: https://www.cnblogs.com/BlairGrowing/p/15428616.html