
    Introduction

    What does understanding language mean?

    This “understanding” of text comes mainly from transforming texts into usable computational representations: discrete or continuous combinatorial structures such as vectors, tensors, graphs, and trees.

    Computational Graphs

    Technically, a computational graph is an abstraction that models mathematical expressions. Consider the expression:

    \[ \boldsymbol{y}=\boldsymbol{w}\boldsymbol{x}+\boldsymbol{b} \]

    We can then represent the original expression using a directed acyclic graph (DAG) in which
    the nodes are the mathematical operations, like multiplication and addition.

    [figure: computational graph for y = wx + b]
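    As a concrete sketch (my addition, not from the original text): if the input tensors require gradients, PyTorch records exactly this DAG while the expression is evaluated, and each intermediate result points back to the operation node that produced it via grad_fn.

    import torch

    w = torch.tensor(2.0, requires_grad=True)
    x = torch.tensor(3.0)
    b = torch.tensor(1.0, requires_grad=True)

    y = w * x + b    # a multiplication node feeding an addition node
    print(y)         # tensor(7., grad_fn=<AddBackward0>)
    print(y.grad_fn) # the addition node at the root of the DAG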

    PyTorch Basics

    Unlike Theano, Caffe, and TensorFlow, PyTorch implements a tape-based
    automatic differentiation method that allows us to define and execute
    computational graphs dynamically.

    Static frameworks like Theano, Caffe, and TensorFlow require the computational graph to be first declared, compiled, and then executed. Although this leads to extremely efficient implementations (useful in production and mobile settings), it can become quite cumbersome during research and development. Modern frameworks like Chainer, DyNet, and PyTorch implement dynamic computational graphs to allow for a more flexible, imperative style of development, without needing to compile the models before every execution. Dynamic computational graphs are especially useful in modeling NLP tasks for which each input could potentially result in a different graph structure.
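    To make the last point concrete, here is a minimal sketch (mine, using a toy running-sum "model") in which ordinary Python control flow changes the graph structure per input, with no recompilation step:

    import torch

    def running_sum(seq):
        w = torch.ones(1, requires_grad=True)   # a parameter, so the graph is tracked
        total = torch.zeros(1)
        for token in seq:                       # the loop length depends on the input,
            total = total + w * token           # so each input builds its own graph
        return total

    print(running_sum(torch.tensor([1.0, 2.0, 3.0])))       # tensor([6.], grad_fn=<AddBackward0>)
    print(running_sum(torch.tensor([1.0, 2.0, 3.0, 4.0])))  # a longer input yields a deeper graph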

    Tensor

    A tensor of order zero is just a number, or a scalar. A tensor of order one (1st-order tensor) is an array of numbers, or a vector. Similarly, a 2nd-order tensor is an array of vectors, or a matrix.

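    A quick illustration of the three orders (my sketch), using dim() to report the order:

    import torch

    scalar = torch.tensor(3.14)        # 0th-order tensor: a single number
    vector = torch.tensor([1.0, 2.0])  # 1st-order tensor: an array of numbers
    matrix = torch.ones(2, 3)          # 2nd-order tensor: an array of vectors

    print(scalar.dim(), vector.dim(), matrix.dim())  # 0 1 2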

    Create Tensors

    import torch

    def describe(x):
        print("Type: {}".format(x.type()))
        print("Shape: {}".format(x.shape))
        print("Values: {}".format(x))

    # 1. construct an uninitialized tensor (values are whatever is in memory)
    describe(torch.Tensor(2,3))

    # 2. uniform random on the interval [0,1)
    describe(torch.rand(2,3))

    # 3. random values from the standard normal distribution
    describe(torch.randn(2,3))
    
    

    Any PyTorch method whose name ends in an underscore (_) is an in-place operation; that is, it modifies the content in place without
    creating a new object:

    x = torch.ones(2,3)
    x.fill_(5) # x has been modified in place

    # creating and initializing a tensor from lists
    x = torch.Tensor([[1,2,3],[2,3,4]])

    # creating and initializing a tensor from NumPy; note that the type of x_t
    # is torch.DoubleTensor rather than the default FloatTensor
    import numpy as np
    x = np.random.rand(2,3)
    x_t = torch.from_numpy(x)

    Tensor Types and Size

    x = torch.FloatTensor([[1,2,3]]) 
    # Type: torch.FloatTensor
    
    x = x.long()
    # torch.LongTensor
    
    x = torch.tensor([[1,2,3]],dtype=torch.int64)
    # Type: torch.LongTensor
    
    

    We use the shape property and size() method of a tensor object to access the
    measurements of its dimensions. The two ways of accessing these measurements
    are mostly synonymous.
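    For example (a small sketch of my own):

    x = torch.rand(2,3)
    print(x.shape)    # torch.Size([2, 3])
    print(x.size())   # torch.Size([2, 3]) -- the same information
    print(x.size(0))  # 2, the size of a single dimension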

    Tensor Operations

    x = torch.arange(6) 
    # tensor([0, 1, 2, 3, 4, 5])
    
    x = x.view(2,3)
    # tensor([[0, 1, 2],
    #        [3, 4, 5]])
    
    torch.sum(x,dim=0)
    # tensor([3, 5, 7])  reduces along dimension 0 (sums down the rows)

    torch.sum(x,dim=1)
    # tensor([ 3, 12])  reduces along dimension 1 (sums across the columns)
    

    Indexing, Slicing, and Joining

    
    describe(x[:1,:2]) 
    # Type: torch.LongTensor
    # Shape: torch.Size([1, 2])
    # Values: tensor([[0, 1]])
    
    describe(x[0,1])
    # Type: torch.LongTensor
    # Shape: torch.Size([])
    # Values: 1
    
    # complex indexing
    indices = torch.LongTensor([0,2]) 
    describe(torch.index_select(x,dim=1,index=indices))
    # Type: torch.LongTensor
    # Shape: torch.Size([2, 2])
    # Values: tensor([[0, 2],
    #         [3, 5]])
    
    indices = torch.LongTensor([0,0,0])
    describe(torch.index_select(x,dim=0,index=indices))                    
    # Type: torch.LongTensor
    # Shape: torch.Size([3, 3])
    # Values: tensor([[0, 1, 2],
    #         [0, 1, 2],
    #         [0, 1, 2]])
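    As an aside (my note, not the book's): the same selections can be written with NumPy-style advanced indexing, which also accepts an index tensor:

    indices = torch.LongTensor([0,2])
    x[:, indices]                 # equivalent to torch.index_select(x, dim=1, index=indices)
    x[torch.LongTensor([0,0,0])]  # equivalent to index_select along dim=0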
    

    Concatenating Tensors

    x.shape
    # torch.Size([2, 3])
    
    describe(torch.cat([x,x],dim=0))                                       
    # Type: torch.LongTensor
    # Shape: torch.Size([4, 3])
    # Values: tensor([[0, 1, 2],
    #         [3, 4, 5],
    #         [0, 1, 2],
    #         [3, 4, 5]])
    
    describe(torch.cat([x,x],dim=1))                                       
    # Type: torch.LongTensor
    # Shape: torch.Size([2, 6])
    # Values: tensor([[0, 1, 2, 0, 1, 2],
    #         [3, 4, 5, 3, 4, 5]])
    
    describe(torch.stack([x,x]))                                           
    # Type: torch.LongTensor
    # Shape: torch.Size([2, 2, 3])
    # Values: tensor([[[0, 1, 2],
    #          [3, 4, 5]],
    #         [[0, 1, 2],
    #          [3, 4, 5]]])
    

    Linear Algebra on Tensors: Multiplication

    x = torch.arange(6).view(2,3)
    # different from the examples in the book: here x is a LongTensor,
    # whereas x in the book's examples was a FloatTensor

    x2 = torch.ones(3,2)

    x2[:,1] += 1
    # tensor([[1., 2.],
    #         [1., 2.],
    #         [1., 2.]])

    torch.mm(x,x2)
    # this raises RuntimeError: Expected object of type torch.LongTensor but
    # found type torch.FloatTensor for argument #2 'mat2'
    # so we should change x to a float tensor: x = torch.arange(6.0).view(2,3)
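    A corrected version of the multiplication (my sketch; recreating x with float values as suggested above, or casting with x.float(), both work):

    x = torch.arange(6.0).view(2,3)  # FloatTensor this time
    x2 = torch.ones(3,2)
    x2[:,1] += 1
    print(torch.mm(x, x2))
    # tensor([[ 3.,  6.],
    #         [12., 24.]])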
    
    

    Tensors and Computational Graphs

    In the computational graph setting, gradients
    exist for each parameter in the model and can be thought of as the parameter's
    contribution to the error signal.

    x = torch.ones(2,2,requires_grad=True)
    y = (x+2)*(x+5) + 3
    # tensor([[21., 21.],
    #        [21., 21.]], grad_fn=<AddBackward>)
    
    z = y.mean()
    # tensor(21., grad_fn=<MeanBackward1>)
    
    z.backward()
    print(x.grad is None)
    # False
    x.grad
    # tensor([[2.2500, 2.2500],
    # [2.2500, 2.2500]])
    
    

    How is the gradient of x calculated?

    \[ z = \frac{1}{4}\sum_{i} y_i \]

    \[ y = (x+2)(x+5) + 3 \]

    \[ \frac{\partial z}{\partial x} = \frac{1}{4}(2x+7) \]

    Substituting the value of x (here every element is 1) gives the gradient of x: (2·1+7)/4 = 2.25.
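    A quick numerical check (my addition) that autograd agrees with the analytic formula:

    x = torch.ones(2,2,requires_grad=True)
    z = ((x+2)*(x+5) + 3).mean()
    z.backward()
    analytic = (2*x.detach() + 7) / 4
    print(torch.allclose(x.grad, analytic))  # True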

    CUDA Tensors

    torch.cuda.is_available()
    # device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # select a specific GPU
    device = torch.device("cuda:6")
    x = torch.rand(2,2).to(device)
    # Type: torch.cuda.FloatTensor

    note: To operate on CUDA and non-CUDA objects together, we need to ensure that they are on the same device. If we don’t, the computation will break.
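    A minimal sketch of the rule (mine), written so it runs whether or not CUDA is available:

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    x = torch.rand(2,2).to(device)
    y = torch.rand(2,2)   # still on the CPU
    # x + y               # on a GPU machine this raises a RuntimeError (mixed devices)
    y = y.to(device)
    print(x + y)          # fine: both tensors live on the same device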

    # restrict which GPUs the process is allowed to see
    CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py
    

    Exercises

    # 1. Create a 2D tensor and then add a dimension of size 1 inserted at dimension 0.
    a = torch.ones(2,2)
    b = a.unsqueeze(0)
    # torch.Size([1, 2, 2])
    # 2. Remove the extra dimension you just added to the previous tensor.
    b.squeeze(0)
    
    # 3. Create a random tensor of shape 5x3 in the interval [3, 7)
    3 + torch.rand(5,3)*(7-3)
    
    # 4. Create a tensor with values from a normal distribution (mean=0, std=1).
    a = torch.rand(3,3)
    a.normal_()  # fills a in place with samples from N(0,1); torch.randn(3,3) is equivalent
    
    # 5. Retrieve the indexes of all the nonzero elements in the tensor
    a = torch.Tensor([1, 1, 1, 0, 1]) 
    torch.nonzero(a) 
    # tensor([[0],
    #         [1],
    #         [2],
    #         [4]])
    
    # 6. Create a random tensor of size (3,1) and then horizontally stack four copies together.
    a = torch.rand(3,1) 
    # tensor([[0.5543],
    #         [0.1504],
    #         [0.6194]])
    
    a.expand(3,4) 
    # tensor([[0.5543, 0.5543, 0.5543, 0.5543],
    #         [0.1504, 0.1504, 0.1504, 0.1504],
    #         [0.6194, 0.6194, 0.6194, 0.6194]])
    
    # 7. Return the batch matrix-matrix product of two three-dimensional matrices (a=torch.rand(3,4,5), b=torch.rand(3,5,4)).
    a = torch.rand(3,4,5)
    b = torch.rand(3,5,4)
    torch.bmm(a,b)
    
    # 8. Return the batch matrix-matrix product of a 3D matrix and a 2D matrix (a=torch.rand(3,4,5),b=torch.rand(5,4)).
    a=torch.rand(3,4,5)
    b=torch.rand(5,4)
    torch.bmm(a,b.unsqueeze(0).expand(a.size(0),*b.size()))
    

    note: expand can only expand a dimension of size 1.
    *b.size() (or *b.shape) unpacks the shape tuple into separate arguments; this unpacking syntax is only valid when passing arguments. For example, if shape=(1,2), then f(*shape) is equivalent to calling f(1,2).
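    A small illustration (mine) of both points:

    b = torch.rand(5,4)
    expanded = b.unsqueeze(0).expand(3, *b.size())  # (1,5,4) -> (3,5,4); only the size-1 dim expands
    print(expanded.shape)  # torch.Size([3, 5, 4])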

  • Original post: https://www.cnblogs.com/curtisxiao/p/10674541.html