zoukankan html css js c++ java

theano scan optimization

Optimizing `Scan` performance

Minimizing Scan Usage

performan as much of the computation as possible outside of Scan. This may have the effect increasing memory usage but also reduce the overhead introduce by Scan.

Explicitly passing inputs of the inner function to scan

It's more efficient to explicitly pass parameter as non-sequence inputs.

Examples: Gibbs Sampling

Version One:

import theano
from theano import tensor as T

W = theano.shared(W_values) # we assume that ``W_values`` contains the
                            # initial values of your weight matrix

bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)

trng = T.shared_randomstreams.RandomStreams(1234)

def OneStep(vsample) :
    hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
    hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
    vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
    return trng.binomial(size=vsample.shape, n=1, p=vmean,
                         dtype=theano.config.floatX)

sample = theano.tensor.vector()
values, updates = theano.scan(OneStep, outputs_info=sample, n_steps=10)
gibbs10 = theano.function([sample], values[-1], updates=updates)

Version Two:

W = theano.shared(W_values) # we assume that ``W_values`` contains the
                            # initial values of your weight matrix

bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)

trng = T.shared_randomstreams.RandomStreams(1234)

# OneStep, with explicit use of the shared variables (W, bvis, bhid)
def OneStep(vsample, W, bvis, bhid):
    hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
    hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
    vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
    return trng.binomial(size=vsample.shape, n=1, p=vmean,
                     dtype=theano.config.floatX)

sample = theano.tensor.vector()

# The new scan, with the shared variables passed as non_sequences
values, updates = theano.scan(fn=OneStep,
                              outputs_info=sample,
                              non_sequences=[W, bvis, bhid],
                              n_steps=10)

gibbs10 = theano.function([sample], values[-1], updates=updates)

Deactivating garbage collecting in Scan

Deactivating garbage collecting in Scan can allow it to reuse memory between executins instead of always having to allocate new memory. Scan reuses memory between iterations of the same execution but frees the memory after the last iteration.
config.scan.allow_gc=False

Graph Optimizations

There are patterns that Theano can't optimize. the LSTM tutorial provides an example of optimization that theano can't perform. Instead of performing many matrix multiplications between matrix (x_t) and each of the shared msatrices (W_i,W_c,W_f) and (W_o), the matrixes (W_{*}) are merged into a single shared (W) and the graph performans a single larger matrix multiplication between (W) and (x_t). The resulting matrix is then sliced to obtain the results of that the small individial matrix multiplications by a single larger one and thus improves performance at the cost of a potentially higher memory usage.

查看全文

相关阅读:
IIS代理
 NODEJS
js图表插件
 注册nodejs程序为windows服务
 中断子系统7_中断出口处理
 Leetcode: Sort List
jquery 鼠标经过放大图片
 在Tomcat上运行ADF Essentials应用
 简谈HTML5与APP技术应用
 Boost的Serialization和SmartPoint搭配使用

原文地址：https://www.cnblogs.com/ZJUT-jiangnan/p/6062755.html