
Using the torch packages

1. torchvision.transforms is PyTorch's image preprocessing package. Multiple steps are usually chained together with Compose, for example:

    transforms.Compose([transforms.CenterCrop(10),
                        transforms.ToTensor(),])

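2. Functions in transforms

    Resize: resize the given image to the given size;
    Normalize: normalize a tensor image with mean and standard deviation;
    ToTensor: convert a PIL image or numpy.ndarray (H*W*C) in the range [0,255] to a torch.Tensor (C*H*W) in the range [0.0,1.0];
    ToPILImage: convert a tensor to a PIL image;
    Scale: deprecated; use Resize instead;
    CenterCrop: crop the central region of the image;
    RandomCrop: crop at a random location;
    RandomHorizontalFlip: flip the given PIL image horizontally with probability 0.5;
    RandomVerticalFlip: flip the given PIL image vertically with probability 0.5;
    RandomResizedCrop: crop the PIL image to a random size and aspect ratio, then resize the crop to the given size;
    Grayscale: convert the image to grayscale;
    RandomGrayscale: convert the image to grayscale with a given probability;
    FiveCrop: crop the image into four corners and the center;
    TenCrop: the five crops of FiveCrop plus the flipped version of each;
    Pad: pad the image;
    ColorJitter: randomly change the brightness, contrast and saturation of the image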

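Random cropping: transforms.RandomCrop
class torchvision.transforms.RandomCrop(size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant')
Purpose: crop the image at a random location to the given size.
Parameters:
size - (sequence or int). A sequence is read as (h, w); an int gives a square crop of (size, size).
padding - (sequence or int, optional). How many pixels of padding to add before cropping.
With an int, that many pixels are added on all four sides; for example, padding=4 adds 4 pixels to the top, bottom, left and right, so a 32*32 image becomes 40*40.
With a sequence of 2 numbers, the first is the left/right padding and the second is the top/bottom padding. With 4 numbers, they are left, top, right, bottom.
fill - (int or tuple). The fill value (only used when the padding mode is constant). An int fills every channel with that value; a tuple of length 3 gives the fill values for the R, G and B channels.
padding_mode - the padding mode. Four modes are available: 1. constant: pad with a constant value. 2. edge: pad with the pixel values at the image edge. 3. reflect: mirror the image at the border without repeating the edge pixel. 4. symmetric: mirror the image at the border, repeating the edge pixel.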

class torchvision.transforms.RandomHorizontalFlip
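Randomly flips the given PIL.Image horizontally with probability 0.5: half of the time the image is flipped, half of the time it is left unchanged.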

class torchvision.transforms.ToTensor
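Converts a PIL.Image with values in [0,255], or a numpy.ndarray of shape (H,W,C), into a torch.FloatTensor of shape [C,H,W] with values in [0,1.0].
import numpy as np
from torchvision import transforms

data = np.random.randint(0, 255, size=300)  # default integer dtype, not np.uint8
img = data.reshape(10, 10, 3)
print(img.shape)
img_tensor = transforms.ToTensor()(img)  # convert to a tensor
print(img_tensor)  # values are NOT divided by 255, because the dtype is not np.uint8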

The values are scaled to [0.0, 1.0] only if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1) or if the numpy.ndarray has dtype = np.uint8. In the other cases, tensors are returned without scaling.

class torchvision.transforms.Normalize(mean, std)
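Given a mean (R, G, B) and a standard deviation (R, G, B), normalizes the tensor channel by channel: Normalized_image = (image - mean) / std.
MNIST is grayscale, so each argument holds a single value: transforms.Normalize((0.1307,), (0.3081,))
For CIFAR10: transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
For ImageNet: transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])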

torchvision.datasets
torchvision.datasets contains the following datasets:

MNIST
COCO (Captioning and Detection)
LSUN Classification
ImageFolder
Imagenet-12
CIFAR10 and CIFAR100
STL10
Datasets expose the following API:

__getitem__ __len__

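Because all of the datasets above are subclasses of torch.utils.data.Dataset, they can be passed to torch.utils.data.DataLoader, which loads samples in parallel with multiple workers (Python multiprocessing, not threads).

For example: torch.utils.data.DataLoader(coco_cap, batch_size=args.batchSize, shuffle=True, num_workers=args.nThreads)
Using torch.utils.data.DataLoader
A data loader combines a dataset with a sampler and can use several worker processes to load the data.
During training it splits the training data into mini-batches and yields one batch per iteration until every sample has been served; in other words, it prepares the data that feeds the training loop.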
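
A minimal sketch of the batching behaviour, using an in-memory TensorDataset with ten made-up samples. num_workers=0 keeps loading in the main process so the example runs anywhere.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten synthetic samples: one feature each, with an integer label.
features = torch.arange(10, dtype=torch.float32).view(10, 1)
labels = torch.arange(10)
dataset = TensorDataset(features, labels)

loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

# One pass over the loader yields every sample exactly once, batch by batch.
batch_sizes = []
seen = []
for x, y in loader:
    batch_sizes.append(x.shape[0])
    seen.extend(y.tolist())

print(batch_sizes)   # [4, 4, 2]
print(sorted(seen))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The last batch is smaller because 10 is not a multiple of 4; passing drop_last=True would discard it instead.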


Each dataset's constructor has a slightly different API as needed, but they all take the following keyword arguments:

- transform: a function that takes the raw image and returns a transformed version (see the torchvision.transforms part above).

- target_transform: a function that takes the target and transforms it. For example, it can take a caption string and return the indices of its words.
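
How these two keyword arguments fit together with __getitem__ and __len__ can be sketched with a toy dataset. ToyCaptionDataset, its samples and the vocab mapping are all hypothetical names invented for this illustration; the real torchvision datasets follow the same pattern but load images from disk.

```python
from torch.utils.data import Dataset


class ToyCaptionDataset(Dataset):
    """Hypothetical in-memory dataset of (number, word) pairs."""

    def __init__(self, transform=None, target_transform=None):
        self.samples = [(1.0, "cat"), (2.0, "dog"), (3.0, "cat")]
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        x, target = self.samples[index]
        # Each hook is applied only if the caller supplied one.
        if self.transform is not None:
            x = self.transform(x)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return x, target


vocab = {"cat": 0, "dog": 1}  # word -> index, as in the caption example
ds = ToyCaptionDataset(transform=lambda x: x * 10, target_transform=vocab.get)
print(len(ds))  # 3
print(ds[1])    # (20.0, 1)
```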
MNIST
dset.MNIST(root, train=True, transform=None, target_transform=None, download=False)
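Parameters: - root: the root directory that contains processed/training.pt and processed/test.pt - train: True = training set, False = test set - download: True = download the dataset from the internet and put it under root. If the dataset has already been downloaded, the processed data (mnist.py contains the relevant functions) is placed in the processed folder.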

COCO
Requires the COCO API to be installed.

Captions:
dset.CocoCaptions(root="dir where images are", annFile="json annotation file", [transform, target_transform])
Example:

import torchvision.datasets as dset
import torchvision.transforms as transforms
cap = dset.CocoCaptions(root = 'dir where images are',
                        annFile = 'json annotation file',
                        transform=transforms.ToTensor())

print('Number of samples: ', len(cap))
img, target = cap[3] # load 4th sample

print("Image Size: ", img.size())
print(target)
Output:

Number of samples: 82783
Image Size: (3L, 427L, 640L)
[u'A plane emitting smoke stream flying over a mountain.',
u'A plane darts across a bright blue sky behind a mountain covered in snow',
u'A plane leaves a contrail above the snowy mountain top.',
u'A mountain that has a plane flying overheard in the distance.',
u'A mountain view with a plume of smoke in the background']
Detection:
dset.CocoDetection(root="dir where images are", annFile="json annotation file", [transform, target_transform])
LSUN
dset.LSUN(db_path, classes='train', [transform, target_transform])
Parameters: - db_path = root directory of the dataset files - classes = 'train' (all categories, training set), 'val' (all categories, validation set), 'test' (all categories, test set), or a list of categories to load such as ['bedroom_train', 'church_train', …]

ImageFolder
A generic data loader for datasets whose files are organized like this:

root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png

root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png
dset.ImageFolder(root="root folder path", [transform, target_transform])
It has the following members:

self.classes - a list of the class names
self.class_to_idx - a mapping from class name to class index
self.imgs - a list of (img-path, class) tuples
Imagenet-12
This is simply implemented with an ImageFolder dataset.

The data is preprocessed as described here

Here is an example

CIFAR
dset.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)

dset.CIFAR100(root, train=True, transform=None, target_transform=None, download=False)
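Parameters: - root: the directory that contains cifar-10-batches-py - train: True = training set, False = test set - download: True = download the data from the internet and put it under root. If the dataset has already been downloaded, nothing is done.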

STL10
dset.STL10(root, split='train', transform=None, target_transform=None, download=False)
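Parameters: - root: the directory that contains stl10_binary - split: 'train' = training set, 'test' = test set, 'unlabeled' = unlabeled set, 'train+unlabeled' = training set plus unlabeled set (samples without a label are marked -1) - download: True = download the data from the internet and put it under root. If the dataset has already been downloaded, nothing is done.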

detach()

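When training a network, we may want to keep some of its parameters fixed and adjust only the rest, or train only a branch of the network without letting its gradient affect the gradient of the main network.

In those cases we use detach() to cut off backpropagation through some branches.

detach()
Returns a new Variable that is detached from the current computation graph but still points to the same storage as the original. The only difference is that requires_grad is False: the result never needs its gradient computed and has no grad.

Even if its requires_grad is later set back to True, no gradient flows through it back into the original graph.

We can keep computing with this new Variable; when we later backpropagate, the pass stops at the Variable on which detach() was called and cannot continue back any further.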

import torch
a = torch.tensor([1,2,3.],requires_grad=True)  # the trailing '.' makes the tensor floating-point, which requires_grad needs
print(a.grad)
out = a.sigmoid()
out.sum().backward()
print(a.grad)

None
tensor([0.1966, 0.1050, 0.0452])


import torch
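a = torch.tensor([1,2,3.],requires_grad=True)  # the trailing '.' makes the tensor floating-point, which requires_grad needs
print(a.grad)
out = a.sigmoid()
print(out)
# detach() returns c with requires_grad=False
c = out.detach()
print(c)  # no grad_fn
# c has not been modified, so backward() through out is unaffected
out.sum().backward()
print(a.grad)

# backpropagating from the newly created Variable raises the RuntimeError shown in the output

c.sum().backward()
print(a.grad)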

None
tensor([0.7311, 0.8808, 0.9526], grad_fn=<SigmoidBackward>)
tensor([0.7311, 0.8808, 0.9526])
tensor([0.1966, 0.1050, 0.0452])

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn


import torch
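a = torch.tensor([1,2,3.],requires_grad=True)  # the trailing '.' makes the tensor floating-point, which requires_grad needs
print(a.grad)
out = a.sigmoid()
print(out)
# detach() returns c with requires_grad=False
c = out.detach()
print(c)  # no grad_fn
c.zero_()
print(c)
print(out)  # modifying c in place also changes the values of out
# c shares storage with out and was modified in place, so backward() through out now fails
out.sum().backward()
print(a.grad)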

None
tensor([0.7311, 0.8808, 0.9526], grad_fn=<SigmoidBackward>)
tensor([0.7311, 0.8808, 0.9526])
tensor([0., 0., 0.])
tensor([0., 0., 0.], grad_fn=<SigmoidBackward>)  # out is zeroed too, but it still has its grad_fn

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [3]], which is output 0 of SigmoidBackward, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
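
The motivating case from the start of this section, training a branch without letting its gradient reach the main network, can be sketched as follows. The two nn.Linear layers and their sizes are hypothetical stand-ins for a real backbone and side branch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
backbone = nn.Linear(4, 3)  # stands in for the "main network" (hypothetical)
branch = nn.Linear(3, 1)    # stands in for the side branch (hypothetical)

x = torch.randn(2, 4)
features = backbone(x)

# detach() cuts the graph here: the branch sees the values but not their history.
branch_loss = branch(features.detach()).sum()
branch_loss.backward()

print(backbone.weight.grad)            # None: nothing flowed back into the backbone
print(branch.weight.grad is not None)  # True: the branch still receives gradients
```

Removing the .detach() call would let the same backward() fill in backbone.weight.grad as well.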

What is the difference between torch.randn and torch.rand?

>>> torch.rand(2,3)
tensor([[0.1619, 0.0097, 0.2034],
[0.6225, 0.1300, 0.3960]])
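
As a rough check of the difference: torch.rand draws from the uniform distribution on [0, 1), while torch.randn draws from the standard normal distribution (mean 0, standard deviation 1). The sample size and seed below are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)
u = torch.rand(10000)   # uniform on [0, 1)
n = torch.randn(10000)  # standard normal

# Uniform samples never leave [0, 1); normal samples can be negative or larger than 1.
print(u.min().item() >= 0.0, u.max().item() < 1.0)  # True True

# Sample statistics: roughly 0.5 for the uniform mean, roughly 0 and 1 for the normal mean and std.
print(u.mean().item(), n.mean().item(), n.std().item())
```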


Original article: https://www.cnblogs.com/tingtin/p/12286820.html