  • theano scan optimization

    Selected from the Theano documentation.

    Optimizing Scan performance

    Minimizing Scan Usage

    Perform as much of the computation as possible outside of Scan. This may increase memory usage, but it also reduces the overhead introduced by Scan.
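
As a rough illustration of this advice, the sketch below uses plain NumPy and an ordinary Python loop as a stand-in for Scan. The recurrence, the weight names `W_in` and `W_rec`, and the sizes are all hypothetical and chosen only for this example. The input projection does not depend on the previous state, so it can be computed for all time steps in one matrix product before the loop, at the cost of holding the projected inputs in memory.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_in, n_hid = 5, 4, 3                # hypothetical sizes
X = rng.standard_normal((n_steps, n_in))      # one input row per time step
W_in = rng.standard_normal((n_in, n_hid))     # hypothetical input weights
W_rec = rng.standard_normal((n_hid, n_hid))   # hypothetical recurrent weights


def run_inside(X):
    """Compute everything inside the loop, including the input projection."""
    h = np.zeros(n_hid)
    for x_t in X:
        h = np.tanh(x_t @ W_in + h @ W_rec)
    return h


def run_hoisted(X):
    """Project all inputs once, outside the loop, then run the recurrence."""
    X_proj = X @ W_in  # shape (n_steps, n_hid): more memory, one larger product
    h = np.zeros(n_hid)
    for p_t in X_proj:
        h = np.tanh(p_t + h @ W_rec)
    return h


# Both variants compute the same final state.
same = np.allclose(run_inside(X), run_hoisted(X))
```

Only the part of the step that truly depends on the previous state stays in the loop. In Theano the same idea would mean computing the projection on the full sequence first and passing the result to Scan as a sequence.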

    Explicitly passing inputs of the inner function to scan

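    Variables defined outside of Scan, such as shared variables, can be used inside the inner function without being passed in. It is more efficient, however, to pass them explicitly as non-sequence inputs.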

    Examples: Gibbs Sampling

    Version One (shared variables used implicitly inside the inner function):

    import theano
    from theano import tensor as T
    
    W = theano.shared(W_values) # we assume that ``W_values`` contains the
                                # initial values of your weight matrix
    
    bvis = theano.shared(bvis_values)
    bhid = theano.shared(bhid_values)
    
    trng = T.shared_randomstreams.RandomStreams(1234)
    
    def OneStep(vsample) :
        hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
        hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
        vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
        return trng.binomial(size=vsample.shape, n=1, p=vmean,
                             dtype=theano.config.floatX)
    
    sample = theano.tensor.vector()
    values, updates = theano.scan(OneStep, outputs_info=sample, n_steps=10)
    gibbs10 = theano.function([sample], values[-1], updates=updates)
    

    Version Two (shared variables passed explicitly as non_sequences):

    import theano
    from theano import tensor as T
    
    W = theano.shared(W_values) # we assume that ``W_values`` contains the
                                # initial values of your weight matrix
    
    bvis = theano.shared(bvis_values)
    bhid = theano.shared(bhid_values)
    
    trng = T.shared_randomstreams.RandomStreams(1234)
    
    # OneStep, with explicit use of the shared variables (W, bvis, bhid)
    def OneStep(vsample, W, bvis, bhid):
        hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
        hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
        vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
        return trng.binomial(size=vsample.shape, n=1, p=vmean,
                             dtype=theano.config.floatX)
    
    sample = theano.tensor.vector()
    
    # The new scan, with the shared variables passed as non_sequences
    values, updates = theano.scan(fn=OneStep,
                                  outputs_info=sample,
                                  non_sequences=[W, bvis, bhid],
                                  n_steps=10)
    
    gibbs10 = theano.function([sample], values[-1], updates=updates)
    

    Deactivating garbage collection in Scan

    Deactivating garbage collection in Scan allows it to reuse memory between executions instead of always having to allocate new memory. Scan reuses memory between iterations of the same execution but frees the memory after the last iteration. To deactivate it, set the flag:
    theano.config.scan.allow_gc = False

    Graph Optimizations

    There are patterns that Theano cannot optimize on its own. The LSTM tutorial provides an example of an optimization that Theano cannot perform. Instead of performing many matrix multiplications between the matrix (x_t) and each of the shared matrices (W_i), (W_c), (W_f) and (W_o), the matrices (W_{*}) are merged into a single shared matrix (W), and the graph performs a single larger matrix multiplication between (W) and (x_t). The resulting matrix is then sliced to obtain the results that the small individual matrix multiplications would have produced. This replaces several small matrix multiplications with a single larger one and thus improves performance at the cost of potentially higher memory usage.
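
The merge-then-slice idea can be checked numerically. The following is a minimal NumPy sketch, not the tutorial's code: the sizes, the random weights, and the `x_t @ W` orientation (chosen to match `theano.dot(vsample, W)` in the Gibbs example) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3                      # hypothetical sizes
x_t = rng.standard_normal(n_in)         # input at one time step

# Hypothetical per-gate weight matrices, each of shape (n_in, n_hid).
W_i, W_c, W_f, W_o = (rng.standard_normal((n_in, n_hid)) for _ in range(4))

# Baseline: four small matrix products.
separate = [x_t @ W_k for W_k in (W_i, W_c, W_f, W_o)]

# Merged: concatenate the weights column-wise and multiply once.
W = np.concatenate([W_i, W_c, W_f, W_o], axis=1)   # shape (n_in, 4 * n_hid)
merged = x_t @ W

# Slice the merged result back into the four per-gate results.
sliced = [merged[k * n_hid:(k + 1) * n_hid] for k in range(4)]

all_match = all(np.allclose(a, b) for a, b in zip(separate, sliced))
```

The merged matrix `W` is stored in addition to, or instead of, the four smaller matrices, and the merged product is held in full before slicing, which is where the extra memory can come from.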

  • Original post: https://www.cnblogs.com/ZJUT-jiangnan/p/6062755.html