Reposted from: Affine transformations in PyTorch (affine_grid)
Reference: A detailed walkthrough of Spatial Transformer Networks (STN)
Suppose we have the following image:
![](http://upload-images.jianshu.io/upload_images/5798456-5285414fb5239b43.png?imageMogr2/auto-orient/strip|imageView2/2/w/183/format/webp)
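Below we translate, rotate, transpose, and scale this image, each time once with hand-written code and once with PyTorch. The mathematics behind these operations is not covered in detail in this article.
First, load the image (note that all of the code below is run in Jupyter):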
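from torchvision import transforms
from PIL import Image
import matplotlib.pyplot as plt

%matplotlib inline

img_path = "<path to the image file>"
# ToTensor() yields a float tensor of shape (C, H, W) with values in [0, 1]
img_torch = transforms.ToTensor()(Image.open(img_path))

# matplotlib expects (H, W, C), so move the channel axis to the end
plt.imshow(img_torch.numpy().transpose(1, 2, 0))
plt.show()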
Translation
Manual approach
For example, suppose we want to shift the image 50 px to the right and 100 px down.
import numpy as np
import torch

theta = np.array([
    [1, 0, 50],
    [0, 1, 100]
])
# Transform 1: handles scaling/rotation; here it is [[1,0],[0,1]], which leaves the image unchanged
t1 = theta[:, [0, 1]]
# Transform 2: handles translation
t2 = theta[:, [2]]

_, h, w = img_torch.size()
new_img_torch = torch.zeros_like(img_torch, dtype=torch.float)
for x in range(w):
    for y in range(h):
        pos = np.array([[x], [y]])
        npos = t1 @ pos + t2
        nx, ny = npos[0][0], npos[1][0]
        # Copy the pixel only if its new position is still inside the image
        if 0 <= nx < w and 0 <= ny < h:
            new_img_torch[:, ny, nx] = img_torch[:, y, x]
plt.imshow(new_img_torch.numpy().transpose(1, 2, 0))
plt.show()
The image becomes:
![](https://upload-images.jianshu.io/upload_images/5798456-52725f141110d242.png?imageMogr2/auto-orient/strip|imageView2/2/w/188/format/webp)
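Image translation - 1
PyTorch approach
Shift right by 0.2 and down by 0.4 (in normalized coordinates, where the image spans [-1, 1] on each axis):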
from torch.nn import functional as F

theta = torch.tensor([
    [1, 0, -0.2],
    [0, 1, -0.4]
], dtype=torch.float)
grid = F.affine_grid(theta.unsqueeze(0), img_torch.unsqueeze(0).size())
output = F.grid_sample(img_torch.unsqueeze(0), grid)
new_img_torch = output[0]
plt.imshow(new_img_torch.numpy().transpose(1, 2, 0))
plt.show()
The resulting image:
![](http://upload-images.jianshu.io/upload_images/5798456-1933576668ffe355.png?imageMogr2/auto-orient/strip|imageView2/2/w/188/format/webp)
Image translation - 2
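Summary:
- Translating an image with PyTorch takes only two steps:
  - Create the grid:
    grid = torch.nn.functional.affine_grid(theta, size)
    Adjusting size also sets the size of the resulting image (equivalent to a resize);
  - Resample with grid_sample:
    outputs = torch.nn.functional.grid_sample(inputs, grid, mode='bilinear')
- The third column of theta holds the translation in normalized coordinates. Shifting right or down takes a negative value, because theta maps output coordinates back to input coordinates.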
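To relate the pixel shifts of the manual version to the normalized values used here, the conversion can be sketched as follows. This is a minimal sketch, assuming PyTorch 1.3 or later (where the `align_corners` argument exists) and `align_corners=False`; the helper `pixel_shift_theta` is a hypothetical name, not part of PyTorch. A normalized axis covers 2 units across `width` pixels, so a shift of `dx` pixels corresponds to `2 * dx / width`, negated because theta maps output coordinates to input coordinates.

```python
import torch
from torch.nn import functional as F

def pixel_shift_theta(dx, dy, height, width):
    """Hypothetical helper: theta that moves the content dx px right and dy px down."""
    return torch.tensor([
        [1.0, 0.0, -2.0 * dx / width],
        [0.0, 1.0, -2.0 * dy / height],
    ])

# Small synthetic image (N, C, H, W) standing in for img_torch.unsqueeze(0)
img = torch.arange(2 * 4 * 6, dtype=torch.float).reshape(1, 2, 4, 6)
theta = pixel_shift_theta(dx=2, dy=1, height=4, width=6)
grid = F.affine_grid(theta.unsqueeze(0), img.size(), align_corners=False)
shifted = F.grid_sample(img, grid, align_corners=False)

# Reference result: the same shift done by slicing, with zeros filling the gap
expected = torch.zeros_like(img)
expected[:, :, 1:, 2:] = img[:, :, :-1, :-2]
print(torch.allclose(shifted, expected, atol=1e-3))
```

With `align_corners=True` the pixel spacing would be `2 / (width - 1)` instead, so the conversion factor changes accordingly.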
By setting size we can resize the image:
from torch.nn import functional as F

theta = torch.tensor([
    [1, 0, -0.2],
    [0, 1, -0.4]
], dtype=torch.float)
# Change the output size; note that size() returns (N, C, H, W)
N, C, H, W = img_torch.unsqueeze(0).size()
size = torch.Size((N, C, H // 2, W // 3))
grid = F.affine_grid(theta.unsqueeze(0), size)
output = F.grid_sample(img_torch.unsqueeze(0), grid)
new_img_torch = output[0]
plt.imshow(new_img_torch.numpy().transpose(1, 2, 0))
plt.show()
![](https://upload-images.jianshu.io/upload_images/5798456-9cb89845b5fd7f46.png?imageMogr2/auto-orient/strip|imageView2/2/w/140/format/webp)
Effect of changing size
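The link between size and the output shape can be checked without an image file. The snippet below is a small sketch on a random tensor, assuming PyTorch 1.3 or later for `align_corners`; the tensor shape is made up for illustration. The grid has one (x, y) pair per output pixel, so its spatial shape is what determines the shape of the resampled result.

```python
import torch
from torch.nn import functional as F

img = torch.rand(1, 3, 60, 90)  # stand-in for img_torch.unsqueeze(0): (N, C, H, W)
theta = torch.tensor([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]]).unsqueeze(0)  # identity transform

n, c, h, w = img.size()
grid = F.affine_grid(theta, torch.Size((n, c, h // 2, w // 3)), align_corners=False)
out = F.grid_sample(img, grid, align_corners=False)

print(grid.shape)  # (N, H_out, W_out, 2): one sampling location per output pixel
print(out.shape)   # (N, C, H_out, W_out)
```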
Scaling
Manual approach
Enlarge the image to twice its size:
import numpy as np
import torch

theta = np.array([
    [2, 0, 0],
    [0, 2, 0]
])
t1 = theta[:, [0, 1]]
t2 = theta[:, [2]]

_, h, w = img_torch.size()
new_img_torch = torch.zeros_like(img_torch, dtype=torch.float)
for x in range(w):
    for y in range(h):
        pos = np.array([[x], [y]])
        npos = t1 @ pos + t2
        nx, ny = npos[0][0], npos[1][0]
        if 0 <= nx < w and 0 <= ny < h:
            new_img_torch[:, ny, nx] = img_torch[:, y, x]
plt.imshow(new_img_torch.numpy().transpose(1, 2, 0))
plt.show()
Result:
![](https://upload-images.jianshu.io/upload_images/5798456-820530e427101dd6.png?imageMogr2/auto-orient/strip|imageView2/2/w/188/format/webp)
Scaling up - 1
Because no interpolation is used, many of the pixels in between stay black.
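One common way to avoid such holes is inverse mapping: iterate over the output pixels and look up where each one comes from, instead of pushing input pixels forward. The following is a rough NumPy sketch of that idea, not the original author's code; `warp_inverse_nearest` is a hypothetical helper, and it uses a nearest-neighbor lookup rather than true interpolation.

```python
import numpy as np

def warp_inverse_nearest(img, theta):
    """Warp a (C, H, W) array with a 2x3 pixel-space affine matrix via inverse mapping."""
    _, h, w = img.shape
    a_inv = np.linalg.inv(theta[:, :2])        # inverse of the scaling/rotation part
    t = theta[:, 2:]                           # translation part, shape (2, 1)
    ys, xs = np.mgrid[0:h, 0:w]
    dst = np.stack([xs.ravel(), ys.ravel()])   # output (x, y) positions, shape (2, H*W)
    src = a_inv @ (dst - t)                    # matching input positions
    sx = np.floor(src[0] + 0.5).astype(int)    # round to the nearest input pixel
    sy = np.floor(src[1] + 0.5).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(img)
    out[:, dst[1, valid], dst[0, valid]] = img[:, sy[valid], sx[valid]]
    return out

theta = np.array([[2, 0, 0],
                  [0, 2, 0]])
img = np.arange(16, dtype=float).reshape(1, 4, 4)  # tiny stand-in image
out = warp_inverse_nearest(img, theta)
print(out[0])
```

With the image from this article, the call would look like `warp_inverse_nearest(img_torch.numpy(), theta)`. Every output pixel whose source lies inside the input gets a value, so the black gaps disappear.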
PyTorch approach
from torch.nn import functional as F

theta = torch.tensor([
    [0.5, 0  , 0],
    [0  , 0.5, 0]
], dtype=torch.float)
grid = F.affine_grid(theta.unsqueeze(0), img_torch.unsqueeze(0).size())
output = F.grid_sample(img_torch.unsqueeze(0), grid)
new_img_torch = output[0]
plt.imshow(new_img_torch.numpy().transpose(1, 2, 0))
plt.show()
Result:
![](https://upload-images.jianshu.io/upload_images/5798456-3abddfdb11c78536.png?imageMogr2/auto-orient/strip|imageView2/2/w/188/format/webp)
Scaling up - 2
Conclusion: as the result shows, affine_grid scales the image about its center.
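It may help to see why the PyTorch version uses 0.5 where the manual version uses 2. The grid stores, for every output pixel, the normalized input location to sample from, so theta describes the output-to-input mapping. The sketch below prints a few grid entries; it assumes PyTorch 1.3 or later and uses `align_corners=True` only so that the corner values come out as round numbers.

```python
import torch
from torch.nn import functional as F

theta = torch.tensor([[0.5, 0.0, 0.0],
                      [0.0, 0.5, 0.0]]).unsqueeze(0)
grid = F.affine_grid(theta, torch.Size((1, 1, 5, 5)), align_corners=True)

top_left = grid[0, 0, 0]        # input (x, y) sampled for the top-left output pixel
center = grid[0, 2, 2]          # input (x, y) sampled for the central output pixel
bottom_right = grid[0, -1, -1]  # input (x, y) sampled for the bottom-right output pixel
print(top_left, center, bottom_right)
```

The corners land at about (-0.5, -0.5) and (0.5, 0.5) while the center stays at (0, 0): only the central half of the input is sampled and stretched over the whole output, which is why the image appears enlarged about its center.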
Rotation
Manual approach
Rotate the image by 30 degrees:
import numpy as np
import torch
import math

angle = 30 * math.pi / 180
theta = np.array([
    [math.cos(angle), math.sin(-angle), 0],
    [math.sin(angle), math.cos(angle) , 0]
])
t1 = theta[:, [0, 1]]
t2 = theta[:, [2]]

_, h, w = img_torch.size()
new_img_torch = torch.zeros_like(img_torch, dtype=torch.float)
for x in range(w):
    for y in range(h):
        pos = np.array([[x], [y]])
        npos = t1 @ pos + t2
        # The rotated coordinates are floats, so truncate them to integer pixel indices
        nx, ny = int(npos[0][0]), int(npos[1][0])
        if 0 <= nx < w and 0 <= ny < h:
            new_img_torch[:, ny, nx] = img_torch[:, y, x]
plt.imshow(new_img_torch.numpy().transpose(1, 2, 0))
plt.show()
Result:
![](https://upload-images.jianshu.io/upload_images/5798456-4265aac391f480f6.png?imageMogr2/auto-orient/strip|imageView2/2/w/188/format/webp)
Rotation - 1
PyTorch approach
from torch.nn import functional as F
import math
angle = -30*math.pi/180
theta = torch.tensor([
[math.cos(angle),math.sin(-angle),0],
[math.sin(angle),math.cos(angle) ,0]
], dtype=torch.float)
grid = F.affine_grid(theta.unsqueeze(0), img_torch.unsqueeze(0).size())
output = F.grid_sample(img_torch.unsqueeze(0), grid)
new_img_torch = output[0]
plt.imshow(new_img_torch.numpy().transpose(1,2,0))
plt.show()
Result:
![](https://upload-images.jianshu.io/upload_images/5798456-79097d85e572b88e.png?imageMogr2/auto-orient/strip|imageView2/2/w/188/format/webp)
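Rotation - 2
PyTorch rotates about the image center, and the image is stretched during the rotation, because the normalized coordinates span [-1, 1] on both axes regardless of the aspect ratio. With a rotation angle of 90°, the image becomes: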
![](https://upload-images.jianshu.io/upload_images/5798456-48b991eafba74af0.png?imageMogr2/auto-orient/strip|imageView2/2/w/188/format/webp)
Result of rotating by 90°
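If the stretching is unwanted, one possible remedy is to compensate for the aspect ratio inside theta. The sketch below is an assumption-laden illustration rather than the original author's method: `rotation_theta` is a hypothetical helper, the output keeps the input size (so content that rotates out of the frame is cropped), and PyTorch 1.3 or later is assumed for `align_corners`. The off-diagonal terms are scaled by height/width and width/height so that the rotation is rigid in pixel space.

```python
import math
import torch
from torch.nn import functional as F

def rotation_theta(angle_deg, height, width):
    """Hypothetical helper: rotation matrix corrected for the image aspect ratio."""
    a = math.radians(angle_deg)
    return torch.tensor([
        [math.cos(a), -math.sin(a) * height / width, 0.0],
        [math.sin(a) * width / height, math.cos(a), 0.0],
    ], dtype=torch.float)

img = torch.arange(8, dtype=torch.float).reshape(1, 1, 2, 4)  # tiny image, H=2, W=4
theta = rotation_theta(90, height=2, width=4).unsqueeze(0)
grid = F.affine_grid(theta, img.size(), align_corners=False)
rotated = F.grid_sample(img, grid, align_corners=False)

# The central 2x2 block should be a rigid quarter turn of the input's central block
central_in = img[0, 0, :, 1:3]
central_out = rotated[0, 0, :, 1:3]
print(torch.allclose(central_out, torch.rot90(central_in, 1, (0, 1)), atol=1e-4))
```

In this tiny example the two outer columns of the result are zero, since the rotated content no longer covers them.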
Transpose
Manual approach
import numpy as np
import torch

theta = np.array([
    [0, 1, 0],
    [1, 0, 0]
])
t1 = theta[:, [0, 1]]
t2 = theta[:, [2]]

_, h, w = img_torch.size()
new_img_torch = torch.zeros_like(img_torch, dtype=torch.float)
for x in range(w):
    for y in range(h):
        pos = np.array([[x], [y]])
        npos = t1 @ pos + t2
        nx, ny = npos[0][0], npos[1][0]
        if 0 <= nx < w and 0 <= ny < h:
            new_img_torch[:, ny, nx] = img_torch[:, y, x]
plt.imshow(new_img_torch.numpy().transpose(1, 2, 0))
plt.show()
Result:
![](https://upload-images.jianshu.io/upload_images/5798456-85420eeeea133f87.png?imageMogr2/auto-orient/strip|imageView2/2/w/188/format/webp)
Image transpose - 1
PyTorch approach
By choosing size appropriately, we can keep the image from being squashed:
from torch.nn import functional as F

theta = torch.tensor([
    [0, 1, 0],
    [1, 0, 0]
], dtype=torch.float)
N, C, H, W = img_torch.unsqueeze(0).size()
# Swap H and W in the output size so the transposed image keeps its proportions
grid = F.affine_grid(theta.unsqueeze(0), torch.Size((N, C, W, H)))
output = F.grid_sample(img_torch.unsqueeze(0), grid)
new_img_torch = output[0]
plt.imshow(new_img_torch.numpy().transpose(1, 2, 0))
plt.show()
Result:
![](https://upload-images.jianshu.io/upload_images/5798456-8bf31af3f4067add.png?imageMogr2/auto-orient/strip|imageView2/2/w/370/format/webp)
Image transpose - 2
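As a sanity check, the result of this theta with a swapped size can be compared against a plain tensor transpose. The snippet below is a small sketch on a synthetic tensor, assuming PyTorch 1.3 or later and `align_corners=False`; the tensor shape is made up for illustration.

```python
import torch
from torch.nn import functional as F

img = torch.arange(24, dtype=torch.float).reshape(1, 1, 4, 6)  # stand-in for img_torch.unsqueeze(0)
theta = torch.tensor([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0]]).unsqueeze(0)

n, c, h, w = img.size()
grid = F.affine_grid(theta, torch.Size((n, c, w, h)), align_corners=False)
out = F.grid_sample(img, grid, align_corners=False)

# Swapping the last two axes directly should give the same values
same = torch.allclose(out, img.transpose(2, 3), atol=1e-4)
print(out.shape, same)
```

For a pure transpose, `img.transpose(2, 3)` is of course simpler; the affine form is mainly useful when the transpose is combined with other transformations in a single theta.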