  • PyTorch Learning (Part 1)

    PyTorch official website
    PyTorch official tutorials
    PyTorch official documentation
    PyTorch Chinese documentation / tutorials
    Dive into Deep Learning (PyTorch edition)

    Introduction

    I ran a small test and found that on the CPU, PyTorch is much faster than TensorFlow. I also found that TensorFlow installed via conda is faster than the pip build, while for PyTorch there is no obvious difference between the two. I had previously seen people say that the conda build of TensorFlow is specially optimized, and it appears to be true.

    Find the minimum of the Himmelblau function below, f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2:

    Test code (run in both environments):

    import torch
    import tensorflow as tf
    import time
    import numpy as np
    
    def himmelblau(x):
        return (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2
    
    import plotly.graph_objects as go
    x = np.arange(-6, 6, 0.1)
    y = np.arange(-6, 6, 0.1)
    # print('x,y range:', x.shape, y.shape)
    X, Y = np.meshgrid(x, y)
    fig = go.Figure(data=go.Surface(z=himmelblau([X,Y])))
    fig.write_image('figure2.svg')
    fig.write_html('first_figure.html', auto_open=True)
    
    
    tic = time.time()
    x = torch.tensor([0., 0.], requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=1e-3)
    for step in range(20000):
        pred = himmelblau(x)
        optimizer.zero_grad() # clear the accumulated gradients
        pred.backward()
        optimizer.step() # each call to step() applies one optimizer update to x
        
        if step % 2000 == 0:
            print('step{}: x = {}, f(x) = {}'.format(step, x.detach().numpy(), pred.item()))
    toc = time.time()
    print('time:',toc-tic)
    
    tic = time.time()
    x = tf.Variable([0., 0.])  # anything differentiated through GradientTape must be a tf.Variable (or explicitly watched)
    optimizer = tf.optimizers.Adam(learning_rate=1e-3)
    for step in range(20000):
        with tf.GradientTape() as tape:
            tape.watch([x])  # not strictly necessary here, since x is already a tf.Variable
            pred = himmelblau(x)
            
        grads = tape.gradient(pred, [x])
        optimizer.apply_gradients(zip(grads, [x])) # unlike PyTorch, TF computes the gradients first and applies them all in one call
        # x -= 0.001*grads
        
        if step % 2000 == 0:
            print('step{}: x = {}, f(x) = {}'.format(step, x.numpy(), pred.numpy()))
    toc = time.time()
    print('time:',toc-tic)
    

    Output with the conda-installed packages (the first timing is PyTorch, the second is TensorFlow):

    step0: x = [0.001 0.001], f(x) = 170.0
    step2000: x = [2.3331807 1.9540695], f(x) = 13.730916023254395
    step4000: x = [2.982008 2.0270984], f(x) = 0.014858869835734367
    step6000: x = [2.9999835 2.0000222], f(x) = 1.1074007488787174e-08
    step8000: x = [2.9999938 2.0000083], f(x) = 1.5572823031106964e-09
    step10000: x = [2.9999979 2.0000029], f(x) = 1.8189894035458565e-10
    step12000: x = [2.9999993 2.000001 ], f(x) = 1.6370904631912708e-11
    step14000: x = [2.9999998 2.0000002], f(x) = 1.8189894035458565e-12
    step16000: x = [3. 2.], f(x) = 0.0
    step18000: x = [3. 2.], f(x) = 0.0
    time: 8.470422983169556
    step0: x = [0.001 0.001], f(x) = 170.0
    step2000: x = [2.3331852 1.9540718], f(x) = 13.730728149414062
    step4000: x = [2.9820085 2.0270977], f(x) = 0.01485812570899725
    step6000: x = [2.9999835 2.0000222], f(x) = 1.1074007488787174e-08
    step8000: x = [2.9999938 2.0000083], f(x) = 1.5572823031106964e-09
    step10000: x = [2.9999979 2.0000029], f(x) = 1.8189894035458565e-10
    step12000: x = [2.9999995 2.0000007], f(x) = 9.322320693172514e-12
    step14000: x = [3. 2.0000002], f(x) = 9.094947017729282e-13
    step16000: x = [3. 2.], f(x) = 0.0
    step18000: x = [3. 2.], f(x) = 0.0
    time: 43.112674951553345

    Output with the pip-installed packages (the first timing is PyTorch, the second is TensorFlow):

    step0: x = [0.001 0.001], f(x) = 170.0
    step2000: x = [2.3331807 1.9540695], f(x) = 13.730916023254395
    step4000: x = [2.982008 2.0270984], f(x) = 0.014858869835734367
    step6000: x = [2.9999835 2.0000222], f(x) = 1.1074007488787174e-08
    step8000: x = [2.9999938 2.0000083], f(x) = 1.5572823031106964e-09
    step10000: x = [2.9999979 2.0000029], f(x) = 1.8189894035458565e-10
    step12000: x = [2.9999993 2.000001 ], f(x) = 1.6370904631912708e-11
    step14000: x = [2.9999998 2.0000002], f(x) = 1.8189894035458565e-12
    step16000: x = [3. 2.], f(x) = 0.0
    step18000: x = [3. 2.], f(x) = 0.0
    time: 8.337981462478638
    step0: x = [0.001 0.001], f(x) = 170.0
    step2000: x = [2.3331852 1.9540718], f(x) = 13.730728149414062
    step4000: x = [2.9820085 2.0270977], f(x) = 0.01485812570899725
    step6000: x = [2.9999835 2.0000222], f(x) = 1.1074007488787174e-08
    step8000: x = [2.9999938 2.0000083], f(x) = 1.5572823031106964e-09
    step10000: x = [2.9999979 2.0000029], f(x) = 1.8189894035458565e-10
    step12000: x = [2.9999995 2.0000007], f(x) = 9.322320693172514e-12
    step14000: x = [3. 2.0000002], f(x) = 9.094947017729282e-13
    step16000: x = [3. 2.], f(x) = 0.0
    step18000: x = [3. 2.], f(x) = 0.0
    time: 54.814427614212036

    Installation

    Create a new environment:

    conda create --name torch python=3.7
    
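
    Activate the environment so that the following packages are installed into it:

    conda activate torch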

    Install some packages you may need later (optional, depending on your own situation):

    conda install numpy
    conda install spyder
    conda install jupyter notebook
    

    Install PyTorch:

    conda install pytorch torchvision cpuonly -c pytorch # CPU version
    

    The command for the GPU version depends on your CUDA version; check the exact install command on the PyTorch official website.
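
    After installing, a quick sanity check from Python:

    import torch
    print(torch.__version__)          # the installed PyTorch version
    print(torch.cuda.is_available())  # False on the CPU-only build, True if a usable GPU build is installed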

    Automatic differentiation

    requires_grad

    If you set a tensor's attribute .requires_grad to True, all operations on that tensor will be tracked.

    import torch
    import torch.nn.functional as F
    
    x = torch.ones(1)
    w = torch.full([1], 2.)  # should be: w = torch.full([1], 2., requires_grad=True)
    mse = F.mse_loss(x, x+w)
    torch.autograd.grad(mse, [w])
    

    RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

    The graph for mse has already been built at this point (PyTorch builds the graph dynamically, as the operations are executed). If you call requires_grad_() only afterwards:

    x = torch.ones(1)
    w = torch.full([1], 2.)
    mse = F.mse_loss(x, x+w)
    w.requires_grad_()
    torch.autograd.grad(mse, [w])
    

    it still raises the same error:

    RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

    This is because the graph for mse was already built with w not requiring grad, so the graph has to be rebuilt; in other words, create mse after calling w.requires_grad_():

    x = torch.ones(1)
    w = torch.full([1], 2.)
    w.requires_grad_()
    mse = F.mse_loss(x, x+w)
    grad = torch.autograd.grad(mse, [w])  # returns a tuple containing the gradient with respect to each input variable
    print(grad)
    

    (tensor([4.]),)

    backward()

    You can also call .backward() to compute all the gradients automatically. The gradients are accumulated into the .grad attribute of each leaf tensor.

    x = torch.ones(1)
    w = torch.full([1], 2.)
    w.requires_grad_()
    mse = F.mse_loss(x, x+w)
    # grad = torch.autograd.grad(mse, [w])  # equivalent to the line below
    mse.backward()   # returns nothing; instead, the gradient is attached to each variable's .grad attribute
    print(w.grad)
    

    tensor([4.])

    After backward() is called, PyTorch frees the buffers of the graph; calling backward() a second time raises:

    RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.

    To keep the graph, set retain_graph=True:

    torch.autograd.grad(mse, [w], retain_graph=True)
    or
    mse.backward(retain_graph=True)
    
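
    A minimal sketch putting the two points together, reusing the x/w example above: the first backward() keeps the graph alive with retain_graph=True, and the second call accumulates into w.grad:

    import torch
    import torch.nn.functional as F

    x = torch.ones(1)
    w = torch.full([1], 2., requires_grad=True)
    mse = F.mse_loss(x, x + w)

    mse.backward(retain_graph=True)  # keep the graph so backward can run again
    print(w.grad)                    # tensor([4.])
    mse.backward()                   # works because the graph was retained
    print(w.grad)                    # tensor([8.]) -- gradients accumulate in .grad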

    detach()

    To stop a tensor from having its history tracked, call .detach() to separate it from the computation history and prevent future computations on it from being tracked.
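
    A small sketch of .detach() (the variable names are just for illustration): the detached tensor enters later computations as a constant, so no gradient flows back through it:

    x = torch.ones(1, requires_grad=True)
    y = (x * 2).detach()      # y shares data with x*2 but is cut off from the graph
    print(y.requires_grad)    # False
    out = x * 3 + y           # y acts as a constant here
    out.backward()
    print(x.grad)             # tensor([3.]) -- nothing flows back through y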

    To prevent history tracking (and memory usage), you can also wrap a block of code in with torch.no_grad():. This is particularly useful when evaluating a model, because the model may have trainable parameters with requires_grad = True, but we don't need gradients for them during evaluation.
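
    For example (a minimal sketch):

    x = torch.ones(1, requires_grad=True)
    with torch.no_grad():
        y = x * 2             # no history is recorded inside this block
    print(y.requires_grad)    # False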
