
    A First Look at Theano 


     To get a feel for what Theano actually is, let's run a simple experiment: write an addition function.

    >>> import theano.tensor as T
    >>> from theano import function
    >>> x = T.dscalar('x')
    >>> y = T.dscalar('y')
    >>> z = x + y
    >>> f = function([x, y], z)

    Then call it like an ordinary Python function:

    >>> f(2, 3)
    array(5.0)
    >>> f(16.3, 12.1)
    array(28.4)
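    Conceptually, the compiled function f does nothing more than add two float64 scalars. An eager NumPy sketch of the same computation (the name add is my own, not Theano's) looks like this:

```python
import numpy as np

def add(x, y):
    # eager stand-in for the compiled Theano function f above;
    # both inputs are coerced to float64 scalars, as T.dscalar declares
    return np.float64(x) + np.float64(y)

total = add(16.3, 12.1)   # same result as f(16.3, 12.1)
```

    The difference is that Theano builds a symbolic graph first and compiles it, while NumPy evaluates immediately.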

    In Theano, every symbol must be given a type. In the code above, T.dscalar is a type: d stands for double, and scalar means a 0-dimensional value. 

    dscalar is not a class, so x and y are not instances of dscalar; they are instances of TensorVariable. As far as types are concerned, however, x and y do belong to dscalar:

    >>> type(x)                                              # so what exactly is the difference?
    <class 'theano.tensor.basic.TensorVariable'>
    >>> x.type
    TensorType(float64, scalar)
    >>> T.dscalar
    TensorType(float64, scalar)
    >>> x.type is T.dscalar
    True
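    A rough NumPy analogy for this class-versus-type distinction (an illustration only, not Theano code): a NumPy scalar's Python class and its dtype are two separate things, just as a TensorVariable and its x.type are.

```python
import numpy as np

# a NumPy scalar: its Python class and its dtype are distinct
x = np.float64(1.5)
cls = type(x)   # the Python class of the object (cf. TensorVariable)
dt  = x.dtype   # the element type it carries (cf. x.type in Theano)
```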
    The same result can be obtained with eval. eval is less flexible than function, but it works fine for quick checks and saves you the import of function.
    >>> import theano.tensor as T
    >>> x = T.dscalar('x')
    >>> y = T.dscalar('y')
    >>> z = x + y
    >>> z.eval({x : 16.3, y : 12.1})
    array(28.4)
    eval is short for "evaluate".

    More examples


    Let's try something a bit more involved:

                             s(x) = 1 / (1 + e^{-x})

    >>> x = T.dmatrix('x')
    >>> s = 1 / (1 + T.exp(-x))
    >>> logistic = function([x], s)
    >>> logistic([[0, 1], [-1, -2]])
    array([[ 0.5       ,  0.73105858],
           [ 0.26894142,  0.11920292]])

    This S-shaped function, which squashes any real input into the interval (0, 1), is the logistic (or sigmoid) function.
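    For comparison, the same values can be checked eagerly with plain NumPy (a sketch, not Theano code; the name logistic_np is my own):

```python
import numpy as np

def logistic_np(x):
    """Element-wise logistic function, computed eagerly with NumPy."""
    x = np.asarray(x, dtype='float64')
    return 1.0 / (1.0 + np.exp(-x))

out = logistic_np([[0, 1], [-1, -2]])
# matches the Theano output above: 0.5, 0.73105858, 0.26894142, 0.11920292
```

    T.exp in the Theano version plays the same role as np.exp here, except that it builds a graph node instead of computing immediately.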

    Getting started for real


    import cPickle, gzip, numpy
    import theano
    import theano.tensor as T
    # cPickle: the C implementation of the pickle (serialization) module
    # gzip: reading gzip-compressed files

    # Load the dataset
    f = gzip.open('mnist.pkl.gz', 'rb')
    train_set, valid_set, test_set = cPickle.load(f)  # the pickle holds three distinct (x, y) pairs
    f.close()
    def shared_dataset(data_xy):
        """ Function that loads the dataset into shared variables
    
        The reason we store our dataset in shared variables is to allow
        The reason we store our dataset in shared variables is to allow
        Theano to copy it into the GPU memory (when code is run on GPU;
        GPU stands for Graphics Processing Unit).
        Since copying data into the GPU is slow, copying a minibatch every time
        it is needed (the default behaviour if the data is not in a shared
        variable) would lead to a large decrease in performance.
        """
        data_x, data_y = data_xy
        shared_x = theano.shared(numpy.asarray(data_x, dtype=theano.config.floatX))               # asarray: convert the input to a NumPy ndarray
        shared_y = theano.shared(numpy.asarray(data_y, dtype=theano.config.floatX))
        # When storing data on the GPU it has to be stored as floats
        # therefore we will store the labels as ``floatX`` as well
        # (``shared_y`` does exactly that). But during our computations
        # we need them as ints (we use labels as index, and if they are
        # floats it doesn't make sense) therefore instead of returning
        # ``shared_y`` we will have to cast it to int. This little hack
        # lets us get around this issue
        return shared_x, T.cast(shared_y, 'int32')                                              # cast the float labels back to int32 for indexing
    
    test_set_x, test_set_y = shared_dataset(test_set)
    valid_set_x, valid_set_y = shared_dataset(valid_set)
    train_set_x, train_set_y = shared_dataset(train_set)
    
    batch_size = 500    # size of the minibatch
    
    # accessing the third minibatch of the training set
    
    data  = train_set_x[2 * batch_size: 3 * batch_size]
    label = train_set_y[2 * batch_size: 3 * batch_size]
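    The slicing above generalizes to a loop over all minibatches. Here is a minimal NumPy sketch with dummy data (the array sizes are made up for illustration; real Theano code would pass the batch index into a compiled function instead):

```python
import numpy as np

batch_size = 500
n_examples = 2000
train_x = np.zeros((n_examples, 784))   # dummy stand-in for train_set_x (MNIST images are 28*28 = 784)
train_y = np.zeros(n_examples)          # dummy stand-in for train_set_y

n_batches = n_examples // batch_size
batches = []
for index in range(n_batches):
    # minibatch number `index`, exactly like the slice for index = 2 above
    data  = train_x[index * batch_size: (index + 1) * batch_size]
    label = train_y[index * batch_size: (index + 1) * batch_size]
    batches.append((data.shape, label.shape))
```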
    Original post: https://www.cnblogs.com/Iknowyou/p/3579250.html