zoukankan html css js c++ java

【Caffe篇】--Caffe solver层从初始到应用

一、前述

solve主要是定义求解过程，超参数的

二、具体

#往往loss function是非凸的，没有解析解,我们需要通过优化方法来求解。
#caffe提供了六种优化算法来求解最优参数，在solver配置文件中，通过设置type类型来选择。

    Stochastic Gradient Descent (type: "SGD"),
    AdaDelta (type: "AdaDelta"),
    Adaptive Gradient (type: "AdaGrad"),
    Adam (type: "Adam"),
    Nesterov’s Accelerated Gradient (type: "Nesterov") and
    RMSprop (type: "RMSProp")


net: "examples/mnist/lenet_train_test.prototxt"  #网络配置文件位置
test_iter: 100
test_interval: 500
base_lr: 0.01#基础学习率
momentum: 0.9
type: SGD
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 20000
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
solver_mode: CPU

net: "examples/mnist/lenet_train_test.prototxt" #网络位置
train_net: "examples/hdf5_classification/logreg_auto_train.prototxt" #也可以分别设定train和test
test_net: "examples/hdf5_classification/logreg_auto_test.prototxt"

test_iter: 100 #迭代了多少个测试样本呢？ batch*test_iter 假设有5000个测试样本，一次测试想跑遍这5000个则需要设置test_iter×batch=5000

test_interval: 500 #测试间隔。也就是每训练500次，才进行一次测试。


base_lr: 0.01 #base_lr用于设置基础学习率

lr_policy: "inv" #学习率调整的策略 希望学习率越来越小

        - fixed:　　 保持base_lr不变.
        - step: 　　 如果设置为step,则还需要设置一个stepsize,  返回 base_lr * gamma ^ (floor(iter / stepsize)),其中iter表示当前的迭代次数
        - exp:   　　返回base_lr * gamma ^ iter， iter为当前迭代次数
        - inv:　　    如果设置为inv,还需要设置一个power, 返回base_lr * (1 + gamma * iter) ^ (- power)
        - multistep: 如果设置为multistep,则还需要设置一个stepvalue。这个参数和step很相似，step是均匀等间隔变化，而multistep则是根据                                             stepvalue值变化
        - poly: 　　  学习率进行多项式误差, 返回 base_lr (1 - iter/max_iter) ^ (power)
        - sigmoid:　学习率进行sigmod衰减，返回 base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))

momentum ：0.9 #动量 一般是固定为0.9

display: 100 #每训练100次，在屏幕上显示一次。如果设置为0，则不显示。

max_iter: 20000 #最大迭代次数，2W次就停止了

snapshot: 5000 #快照。将训练出来的model和solver状态进行保存，snapshot用于设置训练多少次后进行保存
snapshot_prefix: "examples/mnist/lenet" 

solver_mode: CPU #设置运行模式。默认为GPU,如果你没有GPU,则需要改成CPU,否则会出错。

查看全文

相关阅读:
The Python Standard Library
Python 中的round函数
 Python文件类型
 Python中import的用法
 Python Symbols 各种符号
 python 一行写多个语句
 免费SSL证书(https网站)申请，便宜SSL https证书申请
 元宇宙游戏Axie龙头axs分析
 OLE DB provider "SQLNCLI10" for linked server "x.x.x.x" returned message "No transaction is active.".
The operation could not be performed because OLE DB provider "SQLNCLI10" for linked server "xxx.xxx.xxx.xxx" was unable to begin a distributed transaction.

原文地址：https://www.cnblogs.com/LHWorldBlog/p/9247070.html