  • TensorFlow / Torch tips

    • Apply weight decay (L2 regularization) via REGULARIZATION_LOSSES
    # Attach an L2 penalty to every trainable variable; apply_regularization
    # sums the penalties and adds the result to the
    # GraphKeys.REGULARIZATION_LOSSES collection.
    weights = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
    for w in weights:
        print(w)
    l2r = tf.contrib.layers.l2_regularizer(0.001)
    tf.contrib.layers.apply_regularization(l2r, weights)
    tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)

    ## cross-entropy loss

    tf.add_to_collection('losses', cross_entropy_mean)

    target_loss = tf.add_n(tf.get_collection('losses'), name='cross_entropy_loss')

    # configure the optimizer: total loss = cross-entropy + L2 penalty
    target_loss = target_loss + tf.add_n(
        tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES), name='l2_loss')
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(
        target_loss, global_step=global_step)
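
    A minimal self-contained sketch of the pattern above (assuming TF 1.x; the toy one-layer network, the placeholder names x/y, and the 0.001 decay factor are illustrative assumptions, not part of the original):

    import tensorflow as tf

    # toy inputs/labels, shapes chosen only for illustration
    x = tf.placeholder(tf.float32, [None, 784])
    y = tf.placeholder(tf.int64, [None])

    # a single fully connected layer
    w = tf.get_variable('w', [784, 10])
    b = tf.get_variable('b', [10], initializer=tf.zeros_initializer())
    logits = tf.matmul(x, w) + b

    # cross-entropy term goes into the 'losses' collection
    cross_entropy_mean = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
    tf.add_to_collection('losses', cross_entropy_mean)

    # L2 weight decay on all trainable variables
    l2r = tf.contrib.layers.l2_regularizer(0.001)
    tf.contrib.layers.apply_regularization(
        l2r, tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))

    # total loss = data loss + regularization losses
    target_loss = tf.add_n(tf.get_collection('losses'), name='cross_entropy_loss')
    target_loss += tf.add_n(
        tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES), name='l2_loss')

    global_step = tf.Variable(0, trainable=False, name='global_step')
    train_step = tf.train.AdamOptimizer(1e-3).minimize(
        target_loss, global_step=global_step)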



    • Learning rate decay
    global_step = tf.Variable(0, trainable=False, name='global_step')
    learning_rate = tf.train.exponential_decay(opts.learning_rate, global_step, 10000, 0.96, staircase=True)
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(target_loss, global_step=global_step)
    • Learning rate decay: differences between TensorFlow and Torch
    torch:  
     -- (3) learning rate decay (annealing)
       local clr = lr / (1 + state.t*lrd)
    
       state.t = state.t + 1
    
    https://github.com/torch/optim/blob/master/adam.lua
    
    tensorflow:
    decayed_learning_rate = learning_rate *
                            decay_rate ^ (global_step / decay_steps)
    
    https://www.tensorflow.org/versions/r0.11/api_docs/python/train/decaying_the_learning_rate
    

    In Torch this decay is applied once per batch; say lrd = 0.001.

    The TensorFlow analogue would then seem to be: set decay_steps = 1 and decay_rate = 1 - lrd = 0.999, which roughly approximates the Torch schedule?

    Actually no: TensorFlow already has the exact equivalent, tf.train.inverse_time_decay.
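
    A quick sketch of the exact mapping (assuming TF 1.x; lr and lrd here are the Torch-side values):

    import tensorflow as tf

    lr, lrd = 0.001, 0.001        # Torch: clr = lr / (1 + state.t * lrd)

    global_step = tf.Variable(0, trainable=False, name='global_step')

    # With decay_steps=1 and decay_rate=lrd this evaluates to
    # lr / (1 + lrd * global_step), i.e. Torch's per-batch decay,
    # provided global_step is incremented each batch via minimize(..., global_step).
    learning_rate = tf.train.inverse_time_decay(lr, global_step,
                                                decay_steps=1, decay_rate=lrd)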

    • TensorFlow's softmax vs. Torch's LogSoftmax

    tf.nn.softmax 

     exp(logits) / reduce_sum(exp(logits), dim)

    tf.log(tf.nn.softmax(logits)) is not implemented the same way as Torch's LogSoftmax; Torch computes LogSoftmax in a numerically stable (max-shifted) form:

    https://github.com/torch/nn/blob/master/lib/THNN/generic/LogSoftMax.c

    http://blog.csdn.net/lanchunhui/article/details/51248184 
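
    A sketch of the difference (assuming TF 1.x; the example logits are arbitrary):

    import tensorflow as tf

    logits = tf.constant([[1.0, 2.0, 3.0]])

    # naive form: can lose precision / overflow for large-magnitude logits
    naive = tf.log(tf.nn.softmax(logits))

    # numerically stable form, equivalent to what Torch's LogSoftMax computes:
    # log_softmax(x) = (x - max(x)) - log(sum(exp(x - max(x))))
    shifted = logits - tf.reduce_max(logits, axis=-1, keep_dims=True)
    stable = shifted - tf.log(tf.reduce_sum(tf.exp(shifted), axis=-1, keep_dims=True))

    # TensorFlow also ships this directly as tf.nn.log_softmax(logits)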

    • saver

    http://www.jianshu.com/p/8487db911d9a 
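
    A minimal save/restore sketch (assuming TF 1.x; the single variable and the checkpoint path 'model.ckpt' are placeholders):

    import tensorflow as tf

    w = tf.get_variable('w', [10, 10])
    saver = tf.train.Saver()            # saves all saveable variables by default

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        save_path = saver.save(sess, 'model.ckpt', global_step=1000)

    with tf.Session() as sess:
        saver.restore(sess, save_path)  # restores values; no initializer needed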

    • Dropout: differences between TensorFlow and Torch
    torch:
    Furthermore, the outputs are scaled by a factor of 1/(1-p) during training. 
    
    tensorflow:
    With probability keep_prob, outputs the input element scaled up by 1 / keep_prob, otherwise outputs 0. The scaling is so that the expected sum is unchanged.

    So Torch's dropout_rate = p corresponds to TensorFlow's keep_prob = 1 - p.
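
    A sketch of the mapping (x is a placeholder tensor and p = 0.4 an arbitrary example rate):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 128])

    p = 0.4                          # Torch: nn.Dropout(0.4), scales kept units by 1/(1-p)
    keep_prob = 1.0 - p              # TensorFlow equivalent
    y = tf.nn.dropout(x, keep_prob)  # scales kept units by 1/keep_prob = 1/(1-p)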

    • Parameter ordering (weight layout)

    conv: Torch outputs*inputs*kh*kw, TF kh*kw*inputs*outputs

    deconv: Torch inputs*outputs*kh*kw, TF kh*kw*outputs*inputs

    Mobile & MPS: outputs*kh*kw*inputs; note that for deconv the kh*kw kernel must additionally be rotated 180 degrees.
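
    A numpy sketch of converting Torch-layout weights to the TF layout (the shapes are hypothetical; only the axis permutation matters):

    import numpy as np

    # conv: Torch (out, in, kh, kw)  ->  TF (kh, kw, in, out)
    torch_conv_w = np.zeros((64, 32, 3, 3))
    tf_conv_w = torch_conv_w.transpose(2, 3, 1, 0)      # -> (3, 3, 32, 64)

    # deconv: Torch (in, out, kh, kw)  ->  TF (kh, kw, out, in)
    torch_deconv_w = np.zeros((64, 32, 3, 3))
    tf_deconv_w = torch_deconv_w.transpose(2, 3, 1, 0)  # -> (3, 3, 32, 64)

    # Mobile/MPS layout (out, kh, kw, in); for deconv, also rotate the
    # kh*kw kernel by 180 degrees, e.g. mps_w[:, ::-1, ::-1, :]
    mps_conv_w = torch_conv_w.transpose(0, 2, 3, 1)     # -> (64, 3, 3, 32)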
