
    Introduction

    What does understanding language mean?

    This “understanding” of text is derived mainly by transforming texts into usable computational representations: discrete or continuous combinatorial structures such as vectors, tensors, graphs, and trees.

    Computational Graphs

    Technically, a computational graph is an abstraction that models mathematical expressions. Let’s see how the
    computational graph models expressions. Consider the expression:

    \[ \boldsymbol{y} = \boldsymbol{w}\boldsymbol{x} + \boldsymbol{b} \]

    We can then represent the original expression using a directed acyclic graph (DAG) in which
    the nodes are the mathematical operations, like multiplication and addition.

    [Figure: computational graph for y = wx + b]
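
    A minimal sketch (my own, not from the book) of how PyTorch builds this graph implicitly as the expression is evaluated; each intermediate result records the operation that produced it in its grad_fn attribute:

    import torch

    w = torch.tensor(2.0, requires_grad=True)
    b = torch.tensor(1.0, requires_grad=True)
    x = torch.tensor(3.0)

    y = w * x + b     # the DAG: a multiplication node feeding an addition node
    print(y)          # tensor(7., grad_fn=<AddBackward0>)
    print(y.grad_fn)  # the addition node at the root of the graph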

    PyTorch Basics

    Unlike Theano, Caffe, and TensorFlow, PyTorch implements a tape-based
    automatic differentiation method that allows us to define and execute
    computational graphs dynamically.

    Static frameworks like Theano, Caffe, and TensorFlow require the computational graph to be first declared, compiled, and then executed. Although this leads to extremely efficient implementations (useful in production and mobile settings), it can become quite cumbersome during research and development. Modern frameworks like Chainer, DyNet, and PyTorch implement dynamic computational graphs to allow for a more flexible, imperative style of development, without needing to compile the models before every execution. Dynamic computational graphs are especially useful in modeling NLP tasks for which each input could potentially result in a different graph structure.
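
    To make the dynamic case concrete (a sketch of my own, not from the book), the graph is traced as ordinary Python control flow runs, so its size can differ from input to input, e.g. when looping over a variable-length token sequence:

    import torch

    def encode(sequence, h0, w):
        # accumulate a hidden state over a variable-length sequence;
        # the number of graph nodes depends on len(sequence)
        h = h0
        for x in sequence:
            h = torch.tanh(w * x + h)
        return h

    w = torch.tensor(0.5, requires_grad=True)
    h0 = torch.zeros(())
    print(encode(torch.tensor([1.0, 2.0]), h0, w))       # 2-step graph
    print(encode(torch.tensor([1.0, 2.0, 3.0]), h0, w))  # 3-step graph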

    Tensor

    A tensor of order zero is just a number, or a scalar. A tensor of order one (1st-order tensor) is an array of numbers, or a vector. Similarly, a 2nd-order tensor is an array of vectors, or a matrix.

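    A quick sketch of these orders (my own illustration); the dim() method reports a tensor's order:

    import torch

    scalar = torch.tensor(3.14)        # 0th-order tensor
    vector = torch.tensor([1.0, 2.0])  # 1st-order tensor
    matrix = torch.ones(2, 3)          # 2nd-order tensor

    print(scalar.dim(), vector.dim(), matrix.dim())  # 0 1 2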

    Create Tensors

    import torch

    def describe(x):
        print("Type: {}".format(x.type()))
        print("Shape: {}".format(x.shape))
        print("Values: {}".format(x))

    # 1. construct an uninitialized tensor (its values are whatever happens
    #    to be in memory, not properly random)
    describe(torch.Tensor(2,3))

    # 2. uniform random on the interval [0,1)
    describe(torch.rand(2,3))

    # 3. random values from the standard normal distribution
    describe(torch.randn(2,3))
    
    

    Any PyTorch method with an underscore (_)
    refers to an in-place operation; that is, it modifies the content in place without
    creating a new object:

    x = torch.ones(2,3)
    x.fill_(5) # x has been changed in place

    # creating and initializing a tensor from lists
    x = torch.Tensor([[1,2,3],[2,3,4]])

    # creating and initializing a tensor from NumPy. Note that the type of x_t
    # is torch.DoubleTensor rather than the default FloatTensor, because NumPy
    # arrays default to float64
    import numpy as np
    x = np.random.rand(2,3)
    x_t = torch.from_numpy(x)
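
    Worth noting (a detail added here, not in the original notes): torch.from_numpy shares memory with the source array, so mutating the NumPy array is visible through the tensor:

    x = np.zeros((2,2))
    x_t = torch.from_numpy(x)
    x[0,0] = 7.0
    print(x_t[0,0])
    # tensor(7., dtype=torch.float64) -- same underlying buffer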
    

    Tensor Types and Size

    x = torch.FloatTensor([[1,2,3]]) 
    # Type: torch.FloatTensor
    
    x = x.long()
    # torch.LongTensor
    
    x = torch.tensor([[1,2,3]],dtype=torch.int64)
    # Type: torch.LongTensor
    
    

    We use the shape property and size() method of a tensor object to access the
    measurements of its dimensions. The two ways of accessing these measurements
    are mostly synonymous.
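
    For example (a quick check of my own), both report the same torch.Size object:

    x = torch.rand(2,3)
    print(x.shape)    # torch.Size([2, 3])
    print(x.size())   # torch.Size([2, 3])
    print(x.size(1))  # 3 -- size() also accepts a dimension index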

    Tensor Operations

    x = torch.arange(6) 
    # tensor([0, 1, 2, 3, 4, 5])
    
    x = x.view(2,3)
    # tensor([[0, 1, 2],
    #        [3, 4, 5]])
    
    torch.sum(x,dim=0)
    # tensor([3, 5, 7])   reduces along dimension 0 (down the columns)

    torch.sum(x,dim=1)
    # tensor([ 3, 12])    reduces along dimension 1 (across the rows)
    

    Indexing, Slicing, and Joining

    
    describe(x[:1,:2]) 
    # Type: torch.LongTensor
    # Shape: torch.Size([1, 2])
    # Values: tensor([[0, 1]])
    
    describe(x[0,1])
    # Type: torch.LongTensor
    # Shape: torch.Size([])
    # Values: 1
    
    # complex indexing
    indices = torch.LongTensor([0,2]) 
    describe(torch.index_select(x,dim=1,index=indices))
    # Type: torch.LongTensor
    # Shape: torch.Size([2, 2])
    # Values: tensor([[0, 2],
    #         [3, 5]])
    
    indices = torch.LongTensor([0,0,0])
    describe(torch.index_select(x,dim=0,index=indices))                    
    # Type: torch.LongTensor
    # Shape: torch.Size([3, 3])
    # Values: tensor([[0, 1, 2],
    #         [0, 1, 2],
    #         [0, 1, 2]])
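
    As an aside (my own note, not from the book), torch.index_select along a dimension matches integer-tensor indexing in that dimension:

    indices = torch.LongTensor([0,2])
    print(torch.equal(torch.index_select(x, dim=1, index=indices), x[:, indices]))
    # True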
    

    Concatenating Tensors

    x.shape
    # torch.Size([2, 3])
    
    describe(torch.cat([x,x],dim=0))                                       
    # Type: torch.LongTensor
    # Shape: torch.Size([4, 3])
    # Values: tensor([[0, 1, 2],
    #         [3, 4, 5],
    #         [0, 1, 2],
    #         [3, 4, 5]])
    
    describe(torch.cat([x,x],dim=1))                                       
    # Type: torch.LongTensor
    # Shape: torch.Size([2, 6])
    # Values: tensor([[0, 1, 2, 0, 1, 2],
    #         [3, 4, 5, 3, 4, 5]])
    
    describe(torch.stack([x,x]))                                           
    # Type: torch.LongTensor
    # Shape: torch.Size([2, 2, 3])
    # Values: tensor([[[0, 1, 2],
    #          [3, 4, 5]],
    #         [[0, 1, 2],
    #          [3, 4, 5]]])
    

    Linear Algebra on Tensors: Multiplication

    x = torch.arange(6).view(2,3)
    # different from the examples in the book: here the type of x is LongTensor,
    # whereas x in the book's examples was FloatTensor

    x2 = torch.ones(3,2)

    x2[:,1] += 1
    # tensor([[1., 2.],
    #         [1., 2.],
    #         [1., 2.]])

    torch.mm(x,x2)
    # this raises RuntimeError: Expected object of type torch.LongTensor but
    # found type torch.FloatTensor for argument #2 'mat2'
    # so we should change x to torch.arange(6.0).view(2,3)
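
    A working version of the multiplication, with x created as a float tensor from the start (the dtype fix suggested above):

    x = torch.arange(6.0).view(2,3)  # FloatTensor
    x2 = torch.ones(3,2)
    x2[:,1] += 1
    print(torch.mm(x,x2))
    # tensor([[ 3.,  6.],
    #         [12., 24.]])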
    
    

    Tensors and Computational Graphs

    In the computational graph setting, gradients
    exist for each parameter in the model and can be thought of as the parameter's
    contribution to the error signal.

    x = torch.ones(2,2,requires_grad=True)
    y = (x+2)*(x+5) + 3
    # tensor([[21., 21.],
    #        [21., 21.]], grad_fn=<AddBackward>)
    
    z = y.mean()
    # tensor(21., grad_fn=<MeanBackward1>)
    
    z.backward()
    print(x.grad is None)
    # False
    x.grad
    # tensor([[2.2500, 2.2500],
    # [2.2500, 2.2500]])
    
    

    How is the gradient of x calculated?

    \[ z = \frac{1}{4}\sum_i y_i \]

    \[ y = (x+2)(x+5) + 3 \]

    \[ \frac{\partial z}{\partial x} = \frac{1}{4}(2x+7) \]

    Substituting the value of x (every entry is 1) gives the gradient: (1/4)(2·1+7) = 2.25.
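
    A quick sanity check (my own addition) that this analytical gradient matches what autograd computed:

    x = torch.ones(2,2,requires_grad=True)
    z = ((x+2)*(x+5) + 3).mean()
    z.backward()
    analytical = (2*x.detach() + 7) / 4
    print(torch.allclose(x.grad, analytical))
    # True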

    CUDA Tensors

    torch.cuda.is_available()
    # device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # choose which GPU to use; on a CUDA machine torch.device(6) is
    # equivalent to torch.device("cuda:6")
    device = torch.device(6)
    x = torch.rand(2,2).to(device)
    # Type: torch.cuda.FloatTensor
    

    note: To operate on CUDA and non-CUDA objects, we need to ensure that they are on the same device. If we don’t, the computations will break.
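
    A sketch of the failure and the fix (assuming a machine with at least one GPU):

    cpu_t = torch.rand(2,2)
    gpu_t = torch.rand(2,2).to(torch.device("cuda"))
    # cpu_t + gpu_t would raise a RuntimeError about mismatched devices;
    # move both operands to the same device first:
    result = cpu_t.to("cuda") + gpu_t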

    # restrict which GPUs are visible to the process when launching a script
    CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py
    

    Exercise

    # 1. Create a 2D tensor and then add a dimension of size 1 inserted at dimension 0.
    a = torch.ones(2,2)
    b = a.unsqueeze(0)
    # torch.Size([1, 2, 2])
    # 2. Remove the extra dimension you just added to the previous tensor.
    b.squeeze(0)
    
    # 3. Create a random tensor of shape 5x3 in the interval [3, 7)
    3 + torch.rand(5,3)*(7-3)
    
    # 4. Create a tensor with values from a normal distribution (mean=0, std=1).
    a = torch.rand(3,3)
    a.normal_()
    
    # 5. Retrieve the indexes of all the nonzero elements in the tensor
    a = torch.Tensor([1, 1, 1, 0, 1]) 
    torch.nonzero(a) 
    # tensor([[0],
    #         [1],
    #         [2],
    #         [4]])
    
    # 6. Create a random tensor of size (3,1) and then horizontally stack four copies together.
    a = torch.rand(3,1) 
    # tensor([[0.5543],
    #         [0.1504],
    #         [0.6194]])
    
    a.expand(3,4) 
    # tensor([[0.5543, 0.5543, 0.5543, 0.5543],
    #         [0.1504, 0.1504, 0.1504, 0.1504],
    #         [0.6194, 0.6194, 0.6194, 0.6194]])
    
    # 7. Return the batch matrix-matrix product of two three-dimensional matrices (a=torch.rand(3,4,5), b=torch.rand(3,5,4)).
    torch.bmm(a,b)
    
    # 8. Return the batch matrix-matrix product of a 3D matrix and a 2D matrix (a=torch.rand(3,4,5),b=torch.rand(5,4)).
    a=torch.rand(3,4,5)
    b=torch.rand(5,4)
    torch.bmm(a,b.unsqueeze(0).expand(a.size(0),*b.size()))
    

    note: expand can only expand a dimension of size 1; see the sketch below.
    *b.size() (equivalently *b.shape) unpacks the shape into positional arguments; this syntax only works when passing arguments to a call. For example, if shape=(1,2), then f(*shape) is equivalent to f(1,2).
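
    A small sketch of the expand constraint (my own example): a size-1 dimension broadcasts freely, but a non-singleton dimension cannot change size.

    a = torch.rand(3,1)
    print(a.expand(3,4).shape)  # torch.Size([3, 4]) -- the size-1 dim broadcasts
    # a.expand(4,4) would raise a RuntimeError: dimension 0 has size 3, not 1,
    # so it cannot be expanded to a different size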
