Heuristics-driven
Pad and Crop
即:
import torchvision.transforms as T
T.Compose([
T.Pad(4, padding_mode='reflect'),
T.RandomCrop(32), # for CIFAR-10 and CIFAR-100
T.RandomHorizontalFlip(),
T.ToTensor()
])
注: T.RandomCrop(200)
Cutout
形象点说, 就是给图片挖几个洞(不过作者的解释挺有趣的, 输入层的连续dropout, 这个想法有趣).
# uoguelph-mlrg cutout
# https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
# 原文代码
import torch
import numpy as np
class Cutout(object):
"""Randomly mask out one or more patches from an image.
Args:
n_holes (int): Number of patches to cut out of each image.
length (int): The length (in pixels) of each square patch.
"""
def __init__(self, n_holes, length):
self.n_holes = n_holes
self.length = length
def __call__(self, img):
"""
Args:
img (Tensor): Tensor image of size (C, H, W).
Returns:
Tensor: Image with n_holes of dimension length x length cut out of it.
"""
h = img.size(1)
w = img.size(2)
mask = np.ones((h, w), np.float32)
for n in range(self.n_holes):
y = np.random.randint(h)
x = np.random.randint(w)
y1 = np.clip(y - self.length // 2, 0, h)
y2 = np.clip(y + self.length // 2, 0, h)
x1 = np.clip(x - self.length // 2, 0, w)
x2 = np.clip(x + self.length // 2, 0, w)
mask[y1: y2, x1: x2] = 0.
mask = torch.from_numpy(mask)
mask = mask.expand_as(img)
img = img * mask
return img
MixUp
MixUp 就是将两个图片按照一定比例相加
[ ilde{x} = lambda x_A + (1 - lambda) x_B,
]
(lambda sim mathrm{Beta}(alpha, alpha), alpha in (0, +infty)), 同时
[ ilde{y} = lambda y_A + (1 - lambda) y_B.
]
CutMix
CutMix 就是将一个图片裁剪掉一块, 然后用另一张图片的对应位置填补:
[ ilde{x} = M circ x_A + (1 - M) circ x_B,
]
其中(M in {0, 1}^{H imes W}) 是掩码矩阵, 按照如下方式生成:
[r_x sim U[0, W], : r_y = U[0, H], \
r_w = Wsqrt{1 - lambda} , : r_h = Hsqrt{1 - lambda},
]
得到一个bounding box, 在其中的(M_{ij}= 0), 否则为(1), (lambda)控制框的大小的:
[ frac{r_w r_h}{HW} = 1 - lambda.
]
此时, 标签需要发生相应变化:
[ ilde{y} = lambda y_{A} + (1 - lambda)y_B,
]
这里标签是one-hot形式(或者概率向量).
AugMix
Data-driven
AutoAugment
根据数据集选择合理的augmentations策略.
RandAugment
从(K)中augmentations中随机选择(N)(replace=True), 每个的magnitude都是(M).
其中(M, N)是通过网格搜索找到的(太粗暴了吧)
DeepAugment
假设(f(cdot; heta))是一个Image-to-Image的网络, 则DeepAugment是指对( heta)添加扰动, 比如扰动权重, 改变激活函数等等.
和这个思想类似的还有AdversarialAugment.